
Scientific courses - APM_51435_EP: Linear algebra and regression

Field > Applied mathematics.

Description

Objectives

The objective of this course is to introduce linear regression. Regression plays a key role in many problems, and it is essential for a data scientist to understand both the theory and the practice of regression analysis. It is also an important vehicle for addressing the statistical challenges of statistical learning: model selection, penalization, resampling (bootstrap, cross-validation), robustness, outlier detection, and methods for detecting deviations from an assumed model. The course also serves as motivation to sharpen the understanding of statistical techniques, covering both estimation and tests.

Syllabus

1.    Introduction to statistical learning
Regression: Learning objectives and applications
Linear models: interpretation, examples
Properties of least-squares estimators (bias, variance)
Case study: univariate and multivariate regression
Multivariate linear regression: parametric case
Construction of least-squares estimators
2.    Parametric true model
Distribution of least-squares estimators
Asymptotic properties
Gaussian case (distribution of the parameters, confidence regions)
Confidence intervals and tests
Classical regression diagnostics (leverage points)
Case study / Algo: understanding multiple linear regression with R (lm summary, detecting outliers, understanding classical regression diagnostics)

3.    Residual analysis (homoscedasticity, non-linear dependence)
Outlier detection (leverage effects, influence, introduction to robust statistics)
Functional model
Introduction to non-parametric regression: from parameters to functions
Multiple models for a single problem
Function classes, model selection
Variable choice / Basis / Spline
Bias / Variance (Approximation error / Estimation error)
Case study: spline regression

4.    Model Selection and Resampling
Approximation Error / Estimation Error
Learning Error / Generalization Error
Resampling-based methods: jackknife, bootstrap, and cross-validation
Case study: model selection with CV

5.    Model Selection and Unbiased Risk Estimation
Unbiased Risk Estimation
AIC/BIC Penalization and Exhaustive Exploration
Forward / Backward and Stochastic Exploration
Multiple tests

6.    Model Selection and Penalization
Restricted Model and Penalization
Ridge and Lasso
Numerical algorithms: Gradient Descent and Coordinate Descent
Case study: Coordinate Descent and Lasso
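
As an illustration of the last topic, coordinate descent for the Lasso can be sketched in a few lines of plain Python. This is a minimal sketch and not course material: the function names, the fixed number of sweeps, and the scaling of the objective, (1/(2n)) * ||y - X b||^2 + lam * ||b||_1, are all assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator: shrink value toward zero by threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Hypothetical helper: cyclic coordinate descent for the Lasso.

    Assumed objective: (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1.
    X is a list of n rows with p features each, y a list of n responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    # Mean squared value of each column (the per-coordinate curvature).
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that ignores beta[j].
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data with two orthogonal columns (illustrative values only).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
```

On this toy data the first coefficient is shrunk relative to its least-squares value, while the second, weaker one is set exactly to zero. That sparsity is what distinguishes the Lasso penalty from Ridge.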



Course language: English

Grading format

Numerical, out of 20

Letter grade / reduced grade

For students of the Titre d'Ingénieur diplômé de l'École polytechnique degree

A resit is allowed (the resit grade is kept)
    The course unit (UE) is validated if the converted final grade >= C
    • ECTS credits earned: 5 ECTS

The grade obtained counts toward your GPA.

For students of the MScT-Data Science for Business degree

Your assessment requirements:

Midterm + assignment to submit on Moodle

A resit is allowed (the resit grade is kept)
    The course unit (UE) is validated if the final grade >= 10
    • ECTS credits earned: 5 ECTS

The grade obtained counts toward your GPA.

For students of the MScT-Double Degree Data and Finance (DDDF) degree

Your assessment requirements:

Midterm + assignment to submit on Moodle

A resit is allowed (the resit grade is kept)
    The course unit (UE) is validated if the final grade >= 10
    • ECTS credits earned: 5 ECTS

The grade obtained counts toward your GPA.
