Description
The lecture will mostly be based on the book "Probabilistic Machine Learning: An Introduction" by Kevin Murphy. It will cover various learning algorithms, mostly in the supervised setting (both for classification and regression), but also in some unsupervised settings. The following topics will be discussed:
- Introduction to machine learning
- Least squares regression
- Classification with logistic regression
- Stochastic gradient methods
- Principal component analysis
- Support vector machines
- Trees and ensemble methods
- Neural networks
- Clustering methods
Objectifs pédagogiques
This lecture provides an introduction to various concepts and algorithms in Machine Learning -- a field where data is used to gain experience or make predictions (as opposed to fields where data is produced as the result of a model, e.g. when modeling natural phenomena using partial differential equations). Tasks include classification, regression, ranking, clustering, and dimensionality reduction. These can be tackled in a supervised or unsupervised manner, depending on whether the data is labelled or not, and sometimes in an active way, as in reinforcement learning.
From a mathematical viewpoint, Machine Learning is a very broad field: it uses elements of linear algebra and optimization to devise numerical methods, and techniques from scientific computing and computer science to implement efficient algorithms. From a more abstract perspective, elements of statistical learning theory provide rigorous foundations for the various methods at hand, by formalizing concepts such as risk minimization, the bias-complexity tradeoff, overfitting, cross-validation, etc.
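As a minimal sketch of one of these concepts, cross-validation estimates the generalization risk by averaging a model's score over held-out folds. The example below assumes scikit-learn is available; the synthetic dataset and the choice of logistic regression are illustrative, not taken from the course.

```python
# Hedged sketch: k-fold cross-validation to estimate generalization
# performance. Dataset and model are illustrative choices.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression()
# 5-fold cross-validation: each fold is held out once for scoring
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())  # average accuracy over the 5 held-out folds
```

A large gap between training accuracy and this cross-validated score is a classic symptom of overfitting.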
Hands-on sessions in the form of Jupyter notebooks complement the lecture. They allow students to assess the successes and limitations of the most popular methods, first on toy synthetic data (for ease of visualization and a complete understanding of the behavior of the algorithms), and then on more relevant datasets such as MNIST or, in order to depart somewhat from the models too often seen in introductory Machine Learning courses, examples from physics such as the Ising model.
From a practical perspective, simple methods will be implemented from scratch, while more advanced techniques will use Scikit-learn, except for neural network models, for which PyTorch will be used.
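To illustrate the "from scratch" versus library workflow, the sketch below solves ordinary least squares via the normal equations with NumPy and checks the result against Scikit-learn's `LinearRegression`. The synthetic data and coefficient values are illustrative assumptions, not course material.

```python
# Hedged sketch: least squares "from scratch" (normal equations)
# compared with Scikit-learn. Data is synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

# From scratch: append an intercept column and solve X^T X w = X^T y
Xb = np.hstack([X, np.ones((100, 1))])
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Library version (fits an intercept by default)
lr = LinearRegression().fit(X, y)
print(np.allclose(w[:3], lr.coef_, atol=1e-6))
```

Both routes recover the same coefficients on well-conditioned data; the library version adds numerical safeguards and a uniform API.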
Degree(s) concerned
Sustainable development goals
SDG 15: Life on land.
For students of the degree M1 APPMS - Mathématiques Appliquées et Statistiques
Prerequisites: basic courses in probability and statistics.
Grading format
Numerical grade out of 20; letter grade (reduced scale).
For students of the degree M1 APPMS - Mathématiques Appliquées et Statistiques
Assessment methods:
- midterm exam (5 points); no documents allowed.
- project, possibly followed by a discussion of its content with the instructor (5 points)
- final exam (10 points); handwritten notes and course materials are allowed (no other documents).
All electronic devices are forbidden (calculators, tablets, etc.).
The resit is a written exam in which students must redo exercises seen in class or in the year's exams (midterm/final). No documents are allowed. If the resit is passed, the course is validated with a grade of 10/20.
The resit is allowed (maximum of the two grades). ECTS credits earned: 6 ECTS.
Detailed program