21810762 - BIG DATA AND MACHINE LEARNING

The course aims to provide students with the basic methodological and application knowledge needed to solve machine learning problems and to analyze big data.

Students acquire theoretical and practical skills that allow them to use and develop machine learning tools to analyze big data.

Programme

- Characteristics of big data.
- Programming models for big data: Hadoop MapReduce and Apache Spark.
- Machine learning algorithms.
- Apache Spark with R: sparklyr, dplyr, ggplot2.

Core Documentation

Slides provided by the teacher

Jared Dean. Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners, 2014, Wiley.

Readings and lecture notes provided by the teacher.

Reference Bibliography

Ankam, Venkat. Big Data Analytics. Packt Publishing Ltd, 2016.

Dietrich, D. Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data. Wiley, 2015.

Type of delivery of the course

The course consists of 36 hours divided into lectures and exercises. The exercises illustrate applications to real data.

Type of evaluation

Achievement of the course aims is assessed through an exam that includes an oral interview. The mark is expressed out of thirty, and the pass mark is 18. The oral interview lasts approximately 25 minutes and consists of theoretical questions on the main methods, models and, more generally, the notions covered in the course programme. In particular, it evaluates the ability to apply the taught methods correctly, as well as rigour and clarity of expression.