Description
Course Objective
The course aims to familiarize students with advanced machine learning and data mining methods for designing and developing solutions for data sets characterized by complexity and large volume (big data).
Real-world cases from the Web and social media/networks will be presented.
Course contents
Data Preprocessing: Linear and nonlinear dimensionality reduction, spectral methods, Feature selection, Cross-validation.
Supervised learning: Linear regression, support vector machines (SVMs).
Unsupervised learning: Gaussian mixture models, EM algorithm, spectral clustering.
Learning in Graphs: ranking algorithms, evaluation measures, degeneracy and community mining methods.
Text Mining
Feature extraction (measures), indexing (pros and cons) with regard to Bigtable, retrieval functions (tf, idf, BM25 intuition, etc.), ad hoc retrieval – filtering, classification
Big Data
An introduction (Hadoop, MapReduce), NoSQL databases, algorithms in MapReduce
Graph mining and community evaluation algorithms, with applications in social networks.
References
- Bayesian Reasoning and Machine Learning, David Barber, University College London, Cambridge University Press, February 2012, ISBN 978-0-521-51814-7
- Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer, 1st ed., 2006, XX, 740 p., ISBN 978-0-387-31073-2