First semester

New Data Sources

Objectives

– Become familiar with a range of new data sources that can be used for official statistics
– Get an introduction to the methods and tools needed to exploit these data sources
– Carry out a literature review and a critical analysis of a given theme, to better understand the contributions and limitations of particular sources

Outline

The proliferation of new types of data collected by private companies or by government agencies poses a challenge for official statistics. What are these data sources? In which areas can they meet criteria of quality and representativeness while complementing administrative or survey data? What recent IT innovations have made it possible to exploit them?

In this course, we will look at a number of data sources newly exploited by INSEE and by official statistics in general: mobile phone data, bank account and bank card transaction data (the Big Data angle), satellite images, textual data, and several examples of Internet-derived data (Twitter and other social networks, Se Loger, etc.), which can be obtained by web scraping or via APIs. We will examine the purposes for which these data can be used in official statistics, and the tools and methods available to exploit them.

Prerequisites

Not specified