
Programming and web data collection

ECTS : 3

Course content:

This course covers essential Python programming techniques for web data collection in applied economic analysis. Students will learn practical methods to extract structured data from online sources, starting with basics such as HTML/CSS, HTTP requests, XPath, CSS selectors, browser emulation, and public/private APIs (World Bank, INSEE, IMF).
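As a rough illustration of the selector-based extraction mentioned above, the sketch below pulls rows out of a small table using the limited XPath subset in Python's standard library. The page fragment, the table id `gdp`, the class `row`, and all values are hypothetical placeholders. Real pages are rarely well-formed XML, so in practice a tolerant parser such as BeautifulSoup would likely replace `xml.etree.ElementTree` here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a downloaded page.
# Country names and figures are placeholders, not real statistics.
PAGE = """
<html>
  <body>
    <table id="gdp">
      <tr><th>Country</th><th>Year</th><th>Growth</th></tr>
      <tr class="row"><td>Country A</td><td>2022</td><td>2.5</td></tr>
      <tr class="row"><td>Country B</td><td>2022</td><td>1.8</td></tr>
    </table>
  </body>
</html>
"""


def extract_rows(page):
    """Return one dict per data row of the (assumed) table with id 'gdp'."""
    root = ET.fromstring(page.strip())
    rows = []
    # XPath-style query: data rows only, skipping the header row.
    for tr in root.findall(".//table[@id='gdp']/tr[@class='row']"):
        country, year, growth = (td.text.strip() for td in tr.findall("td"))
        rows.append({"country": country, "year": int(year), "growth": float(growth)})
    return rows


records = extract_rows(PAGE)
```

The query string is the part that carries over to other tools: the same path idea applies whether the selector is written as XPath or as a CSS selector.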

Advanced topics include hidden APIs, overcoming technical obstacles (session management, blocking points), and large-scale data extraction. Students will gain expertise using libraries like requests, BeautifulSoup, and pandas for JSON/XML handling, data cleaning, and pipeline creation.
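The JSON handling and cleaning steps named above might look roughly like the following sketch. The payload shape, its field names (`records`, `country`, `date`, `value`), and the values are assumptions made for illustration and do not reproduce any specific API. The standard library is used so the example runs without installs; with pandas, the cleaned rows could be passed to a DataFrame instead.

```python
import json

# Hypothetical API-style payload; field names and values are placeholders.
RAW = """
{"records": [
  {"country": "Country A", "date": "2021", "value": "1.5"},
  {"country": "Country A", "date": "2022", "value": null},
  {"country": " Country B ", "date": "2022", "value": "3.0"}
]}
"""


def clean(raw):
    """Parse the payload, drop missing observations, and normalise types."""
    payload = json.loads(raw)
    rows = []
    for item in payload["records"]:
        if item["value"] is None:
            continue  # skip observations with no reported value
        rows.append({
            "country": item["country"].strip(),  # remove stray whitespace
            "year": int(item["date"]),
            "value": float(item["value"]),
        })
    # Stable ordering makes downstream comparisons and merges predictable.
    rows.sort(key=lambda r: (r["country"], r["year"]))
    return rows


cleaned = clean(RAW)
```

Keeping parsing, filtering, and type conversion in one small function is one possible way to make each stage of a collection pipeline easy to test in isolation.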

The course also emphasizes ethics, legal compliance, privacy, and responsible data use. Practical exercises and real-world examples will enable students to develop robust solutions for collecting and analyzing economic data from the web.

Skills to be acquired:

Course Objectives: 

Targeted competencies:

Assessment method:

The assessment consists of a written exam and an oral presentation of a group project.

Bibliography and recommended reading:

Python and Scraping

Ethical, Legal, and Practical Considerations

Document subject to updates - 01/04/2026
Université Paris Dauphine - PSL - Place du Maréchal de Lattre de Tassigny - 75775 PARIS Cedex 16