PA - C8 - MAP535 : Regression

Field > Applied Mathematics.



The objective of this course is to introduce linear regression. Regression plays a key role in many problems, and it is essential for a data scientist to understand both the theory and the practice of regression analysis. It is also an important vehicle for addressing the central challenges of statistical learning: model selection, penalization, resampling (bootstrap, cross-validation), robustness, outlier detection, and methods for detecting deviations from an assumed model. The course also serves as motivation to sharpen the understanding of statistical techniques, covering both estimation and testing.


1.    Introduction to statistical learning
Regression: Learning objectives and applications
Linear models: interpretation, examples
Properties of least-squares estimators (bias, variance)
Case study: univariate and multivariate regression
Multivariate Linear Regression: Parametric case
Construction of least-squares estimators
2.    Parametric true model
Distribution of least-squares estimators
Asymptotic properties
Gaussian case (distribution of the parameters, confidence regions)
Confidence intervals and tests
Classical regression diagnostics (leverage points)
Case study / Algo: understanding multiple linear regression with R (lm summary, detecting outliers, understanding classical regression diagnostics)
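
To make the least-squares and leverage topics above concrete, here is a minimal sketch for the one-covariate case. The course case study uses R's lm; Python is used here only for illustration. The function names and the toy data are assumptions made up for this example, and the 2p/n cutoff is a common rule of thumb rather than a rule stated by the course.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def leverages(x):
    """Diagonal of the hat matrix for a model with an intercept and one covariate."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]


# Toy data (made up for illustration); the last point sits far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 10.2]

intercept, slope = fit_simple_ols(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
h = leverages(x)

# Rule of thumb (an assumption here): flag leverage above 2p/n, with p = 2 parameters.
threshold = 2 * 2 / len(x)
high_leverage = [i for i, hi in enumerate(h) if hi > threshold]
print("intercept:", intercept, "slope:", slope)
print("leverages:", h, "flagged indices:", high_leverage)
```

The leverages always sum to the number of fitted parameters (here 2), which is a quick sanity check on the computation.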

3.    Residual analysis (homoscedasticity, non-linear dependence)
Outlier detection (leverage effects, influence, introduction to robust statistics)
Functional model: introduction to non-parametric regression, from parameters to functions
Multiple models for a single problem: function classes, model selection
Variable choice / Basis / Spline
Bias / Variance (Approximation error / Estimation Error)
Case study: spline regression

4.    Model Selection and Resampling
Approximation Error / Estimation Error
Learning Error / Generalization Error
Resampling-based methods: jackknife, bootstrap, and cross-validation
Case study: model selection with CV
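
As an illustration of model selection with cross-validation, the sketch below compares two candidate models (a constant predictor and a straight line) by their k-fold estimate of the generalization error. Everything in it is an assumption made for this example: the helper names, the toy data, and the deterministic interleaved folds (in practice the folds are usually drawn at random).

```python
def fit_mean(x_train, y_train):
    """Constant model: always predicts the training mean of y."""
    mean_y = sum(y_train) / len(y_train)
    return lambda x: mean_y


def fit_line(x_train, y_train):
    """Least-squares straight line fitted on the training data."""
    n = len(x_train)
    x_mean = sum(x_train) / n
    y_mean = sum(y_train) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x_train)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x_train, y_train))
    slope = sxy / sxx
    return lambda x: y_mean + slope * (x - x_mean)


def k_fold_cv_error(fit, x, y, k):
    """Mean squared prediction error over k held-out folds (interleaved split)."""
    n = len(x)
    total = 0.0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        predictor = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        total += sum((y[i] - predictor(x[i])) ** 2 for i in test_idx)
    return total / n


# Toy data (made up): roughly y = 2x + 1 with small perturbations.
x = [float(i) for i in range(12)]
y = [1.3, 2.8, 5.1, 6.6, 9.2, 11.0, 12.9, 15.4, 16.7, 19.1, 20.8, 23.2]

cv_errors = {
    "mean": k_fold_cv_error(fit_mean, x, y, k=4),
    "line": k_fold_cv_error(fit_line, x, y, k=4),
}
best_model = min(cv_errors, key=cv_errors.get)
print(cv_errors, "selected:", best_model)
```

Each model is refitted on the training folds only, so the error is always measured on points the fit has not seen. This is what separates the generalization error from the learning error.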

5.    Model Selection and Unbiased Risk Estimation
Unbiased Risk Estimation
AIC/BIC Penalization and Exhaustive Exploration
Forward / Backward and Stochastic Exploration
Multiple tests

6.    Model Selection and Penalization
Restricted Model and Penalization
Ridge and Lasso
Numerical algorithms: gradient descent and coordinate descent
Case study: Coordinate Descent and Lasso
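
The following is a minimal sketch of cyclic coordinate descent for the Lasso, assuming the objective (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 with no intercept (data taken as centered). The function names, the fixed number of sweeps, and the toy orthogonal design are assumptions chosen for illustration, not the course's reference implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 one coordinate at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy orthogonal design (made up): y = 3 * x1 + 0.1 * x2 exactly.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

beta_lasso = lasso_coordinate_descent(X, y, lam=0.5)
beta_ols = lasso_coordinate_descent(X, y, lam=0.0)
print("lasso:", beta_lasso, "least squares:", beta_ols)
```

With this orthogonal design each coefficient is the least-squares coefficient shrunk by lam, so the large coefficient is reduced and the small one is set exactly to zero. This is the variable-selection effect of the Lasso penalty.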

Course language: English

Grading format

Numerical, out of 20

Letter grade / reduced grade scale

For students of the Data Science for Business degree

Resit is allowed (the resit grade is retained)
    The course unit is validated if the converted final grade is >= C
    • ECTS credits earned: 4 ECTS

    The grade obtained counts toward your GPA.

For students of the Echanges PEI program

Resit is allowed (the resit grade is retained)

For students of the Diplôme d'ingénieur de l'Ecole polytechnique

Resit is allowed (the resit grade is retained)
    The course unit is validated if the converted final grade is >= C
    • ECTS credits earned: 5 ECTS