Scientific courses - MAP535: Linear algebra and regression

Field > Applied Mathematics.

Description

Objectives

The objective of this course is to introduce linear regression. Regression plays a key role in many problems, and it is essential for a data scientist to understand both the theory and the practice of regression analysis. It is also an important vehicle for addressing the main challenges of statistical learning: model selection, penalization, resampling (bootstrap, cross-validation), robustness, detection of outliers, and methods to detect deviations from an assumed model. The course also serves as motivation to sharpen the understanding of statistical techniques, covering both estimation and tests.

Syllabus

1.    Introduction to statistical learning
Regression: Learning objectives and applications
Linear models: interpretation, examples
Properties of least-squares estimators (bias, variance)
Case study: univariate and multivariate regression
Multivariate linear regression: parametric case
Construction of least-squares estimators
2.    Parametric true model
Distribution of least-squares estimators
Asymptotic properties
Gaussian case (distribution of the parameters, confidence regions)
Confidence intervals and tests
Classical regression diagnostics (leverage points)
Case study / Algo: understanding multiple linear regression with R (lm summary, detecting outliers, understanding classical regression diagnostics)
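
As a rough illustration of the least-squares and leverage items above, here is a minimal sketch in plain Python (the syllabus itself names R; Python is used here only for illustration). It fits a univariate regression y = a + b*x and computes the leverage of each observation, i.e. the diagonal of the hat matrix. The function name simple_ols and the toy data are assumptions made for this example, not course material.

```python
def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope, leverages)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Leverage of observation i in simple regression with an intercept:
    # h_ii = 1/n + (x_i - x_bar)^2 / Sxx (diagonal entry of the hat matrix).
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return intercept, slope, leverages


# Hypothetical toy data: the last point lies far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [2.0, 4.0, 6.0, 8.0, 20.0]
intercept, slope, leverages = simple_ols(x, y)
```

The leverages always sum to the number of fitted parameters (two here), and the isolated point at x = 10 carries by far the largest one, which is what makes it a leverage point.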

3.    Residual analysis (homoscedasticity, non-linear dependence)
Outlier detection (leverage effects, influence, introduction to robust statistics)
Functional model
Introduction to non-parametric regression: from parameters to functions
Multiple models for a single problem
Function classes, model selection
Variable choice / Basis / Spline
Bias / Variance (Approximation Error / Estimation Error)
Case study: spline regression

4.    Model Selection and Resampling
Approximation Error / Estimation Error
Learning Error / Generalization Error
Resampling-based methods: jackknife, bootstrap, and cross-validation
Case study: model selection with CV
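
To make the idea of model selection by cross-validation concrete, the following is a small sketch in plain Python, under stated assumptions: two hypothetical candidate models (a constant and a straight line), synthetic data with hand-picked noise values, and interleaved folds. None of these choices comes from the course; they only illustrate how a generalization error estimate can be used to pick a model.

```python
def fit_constant(xs, ys):
    """Intercept-only model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Simple linear regression fitted by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def kfold_cv_error(fit, xs, ys, k):
    """Mean squared prediction error estimated by k-fold cross-validation."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # interleaved folds (arbitrary choice)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        predict = fit(train_x, train_y)  # fit on the training part only
        squared_error += sum((ys[i] - predict(xs[i])) ** 2 for i in held_out)
    return squared_error / n


# Synthetic data: y = 1 + 2x plus small, hand-picked perturbations.
xs = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.2, -0.05, 0.1, -0.15, 0.05]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]

candidates = {"constant": fit_constant, "line": fit_line}
cv_errors = {name: kfold_cv_error(fit, xs, ys, k=5) for name, fit in candidates.items()}
best_model = min(cv_errors, key=cv_errors.get)
```

Each observation is predicted exactly once by a model that never saw it, so the averaged error estimates the generalization error rather than the learning error. On this data the linear model is selected.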

5.    Model Selection and Unbiased Risk Estimation
Unbiased Risk Estimation
AIC/BIC Penalization and Exhaustive Exploration
Forward / Backward and Stochastic Exploration
Multiple tests

6.    Model Selection and Penalization
Restricted Model and Penalization
Ridge and Lasso
Numerical algorithm: Gradient Descent and Coordinate Descent
Case study: Coordinate Descent and Lasso
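
As a hedged illustration of the last case study, here is a minimal coordinate descent sketch for the Lasso in plain Python. It assumes the objective (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 with no intercept (data assumed centered); the course may use a different scaling or implementation. The function names and the toy design are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator, the proximal map of the absolute value."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual (coordinate j removed).
            rho = sum(c * r for c, r in zip(col, residual)) / n + z * beta[j]
            new_value = soft_threshold(rho, lam) / z
            delta = new_value - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_value
    return beta


# Hypothetical toy design with orthogonal columns; y = 3 * col_1 + 0.1 * col_2.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]
beta_lasso = lasso_coordinate_descent(X, y, lam=0.5)
beta_ols = lasso_coordinate_descent(X, y, lam=0.0)
```

Because the columns of this toy design are orthogonal, the Lasso solution is the soft-thresholded least-squares solution: the large coefficient shrinks from about 3 to about 2.5, and the small one is set exactly to zero, which is the variable selection effect of the L1 penalty.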



Course language: English

Grading format

Numerical, out of 20

Letter grade / reduced grade

For students of the MScT-Double Degree Data and Finance (DDDF) program

A resit is allowed (the resit grade is retained)
    The course unit is validated if the final grade >= 10
    • ECTS credits earned: 4 ECTS

    The grade obtained counts toward your GPA.

For students of the MScT-Data Science for Business program

A resit is allowed (the resit grade is retained)
    The course unit is validated if the final grade >= 10
    • ECTS credits earned: 4 ECTS

    The grade obtained counts toward your GPA.

For students of the Titre d’Ingénieur diplômé de l’École polytechnique program

A resit is allowed (the resit grade is retained)
    The course unit is validated if the final grade >= 10
    • ECTS credits earned: 5 ECTS

    The grade obtained counts toward your GPA.
