21210229 - Statistical learning

• Bayesian networks. Structural and parameter learning. Estimation of association between variables.
• Introduction to causality.
• Sampling techniques.
• Quality analysis of the information in big data. Weighting, calibrating and sampling techniques for going from big data to smart data.
Prerequisites: students need to have passed the two previous Statistics courses.

Programme

The course will cover the following topics:
- Introduction to the main models of statistical learning;
- Prediction and classification problems: review of linear regression and of the main methods of unsupervised classification;
- Supervised classification: k-nearest neighbours;
- Misclassification error;
- Resampling methods: cross-validation and the bootstrap;
- Decision tree-based methods: regression trees, classification trees, bagging, random forests, boosting;
- Introduction to semi-supervised classification methods;
- Use of the statistical environment R.

Core Documentation

The teaching materials will be available on the course Moodle page and in the course Teams class.

Type of delivery of the course

The course consists of 40 hours of lectures, combining theoretical and practical lessons (the latter using RStudio).

Type of evaluation

Oral examination.