
PA - C8 - DS-UPSUD-3 : NLP


Course description: This class is an introduction to Natural Language Processing (NLP). It gives a high-level overview of this vibrant field, which is evolving quickly owing to recent advances in deep learning models. It covers some of the main algorithmic developments, notably those aimed at processing structured linguistic data such as syntactic parse trees or semantic graphs, as well as the deep neural architectures used to learn numerical representations for words and phrases. Finally, it offers a glimpse of the most recent work on building NLP systems in multilingual contexts.


Main themes:

  • Statistical NLP: A brief retrospective 
  • Words and the Lexicon
  • The art of language modeling
  • The essence of NLP: Models for Structured Data
  • Shallow Semantics and Representation Learning
  • Multilingualism and Machine Translation


Language: English


Numerus Clausus: 50


Recommended readings:


  • Jacob EISENSTEIN. Introduction to Natural Language Processing. The MIT Press, 2019.
  • Yoav GOLDBERG. Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers, 2017. 287 pages. ISBN 978-1-62705-298-6.
  • Julia HIRSCHBERG and Christopher D. MANNING. Advances in natural language processing. Science Magazine, 2016.


Prerequisites: Basic statistics and optimization. Familiarity with formal language theory (automata and grammars) is a big plus.


Grading: Final quiz.

Minimum / maximum enrollment:


Relevant degree(s)

Grade format

Numeric, out of 20

Letter grade / reduced grade scale

For students in the Data Sciences degree

A resit is allowed (the higher of the two grades is kept)
    The course unit is passed if the final grade >= 10
    • ECTS credits earned: 3 ECTS