The course introduces methods and models for extracting relevant information from large amounts of data, with particular attention to
statistical learning in both predictive and non-predictive settings (supervised and unsupervised learning). To build practical skills in
the analysis and modeling of real data, the lectures are supplemented by R exercises in the computer lab.
Program:
Introduction to data mining and statistical learning.
Data visualization techniques.
Regression and classification: multiple linear regression, logistic regression, discriminant analysis, and K-nearest neighbors.
Non-linear methods (flexible regression): polynomial regression, regression splines, smoothing splines, generalized additive models.
Unsupervised learning: association rules, principal component analysis, clustering methods.