Description
Course description: This course develops tools for analyzing statistical problems in high-dimensional settings, where the number of variables may exceed the sample size. This contrasts with classical statistical theory, which focuses on the asymptotic behavior of estimators as the sample size grows while the number of variables stays fixed. We will show that, in high-dimensional problems, powerful statistical methods can be constructed under structural assumptions such as sparsity or low rank. The emphasis will be on the non-asymptotic theory underlying these developments.
Topics covered:
-- Sparsity and thresholding in the Gaussian sequence model.
-- High-dimensional linear regression: Lasso, BIC, Dantzig selector, Square Root Lasso. Oracle inequalities and variable selection properties.
-- Estimation of high-dimensional low-rank matrices. Matrix completion.
-- Inhomogeneous random graph model. Community detection and estimation in the stochastic block model.
Prerequisites: Solid knowledge of probability theory, mathematical statistics, and linear algebra. Notions of convex optimization.
Resources:
Alexandre Tsybakov. High-dimensional Statistics. Lecture Notes.
Grading: The grade is determined by a final exam. Extra points can be earned through optional homework.
Grading format
Numerical, out of 20
Letter grade / reduced grade
For students of the Data Sciences diploma:
The resit is allowed (the maximum of the two grades is kept)
ECTS credits earned: 3 ECTS