
PA - C8 - MAP670M: Missing Data and Causality


Machine learning has made great progress in building powerful predictive models, but these models rely on correlations between variables. They do not provide an understanding of the underlying mechanisms, nor do they indicate how to intervene on a system to achieve a given goal. Causal concepts are fundamental for identifying levers for action, formulating recommendations, and answering questions such as "What would happen if we had acted differently?"


Questions of causal inference arise in many areas (socio-economics, politics, psychology, medicine, etc.). Depending on the context, which drug should be used to improve a patient's health? Which marketing strategy for product placement should be used to influence consumer buying behavior? The formalism of causal inference makes it possible to study these questions as classical statistical inference problems. The gold standard for estimating the effect of a treatment is the randomized controlled trial (RCT), which is, for example, mandatory for the authorization of new drugs in pharmaceutical and medical research. However, RCTs are generally very expensive in both time and money, and in some areas, such as economics or political science, it is often not possible to run one, for example to assess the effectiveness of a given policy.


The aim of this course is to present the available methods for performing causal inference from observational data. We focus on both the theoretical framework and practical aspects (available software solutions).


Numerus clausus: 30

Instructor: Julie Josse julie.josse@polytechnique.edu

TA: Imke Mayer imke.mayer@polytechnique.edu


Class Time:

Mondays 3 February, 10 February, 24 February, 2 March and 9 March,

8:30-12:30


Office Hours:

Mondays 5:00-6:00 PM and by appointment


Grading – 2.5 ECTS:

Final project (100%)

The project can be either practical (a data analysis) or a presentation based on recent research papers.


Topics covered:

- The Neyman-Rubin potential outcome causal model for observational studies

- Matching, propensity scores

- Efficiency and double robustness, double machine learning

- Estimating treatment effect heterogeneity, learning decision rules


- Causal discovery: causal models, graphical models and Markov conditions



Hernán, Miguel A., and James M. Robins. Causal Inference. Chapman & Hall/CRC

Imbens, Guido W., and Donald B. Rubin. Causal Inference in Statistics, Social, and Biomedical Sciences

Peters, Jonas, Dominik Janzing, and Bernhard Schölkopf. Elements of Causal Inference: Foundations and Learning Algorithms

Article: Guo, Ruocheng, Lu Cheng, Jundong Li, P. Richard Hahn, and Huan Liu. A Survey of Learning Causality with Data: Problems and Methods

Grade format

Numerical, out of 20

Letter grade / reduced grade

For students in the Master 2 Mathématiques et Applications - Mathématiques pour les Sciences du Vivant degree

- A resit is allowed (the higher of the two grades is kept)
- The course unit (UE) is passed if the final grade >= 10
- ECTS credits earned: 4 ECTS

For students in the Data Sciences degree

- A resit is allowed (the higher of the two grades is kept)
- The course unit (UE) is passed if the final grade >= 10
- ECTS credits earned: 2.5 ECTS