STATISTICAL LEARNING THEORY
Enrollment year
2020/2021
Academic year
2020/2021
Regulations
DM270
Academic discipline
ING-INF/05 (DATA PROCESSING SYSTEMS)
Department
DEPARTMENT OF MATHEMATICS "FELICE CASORATI"
Course
MATHEMATICS
Curriculum
COMMON TRACK
Year of study
Period
1st semester (01/10/2020 - 20/01/2021)
ECTS
6
Lesson hours
45 lesson hours
Language
English
Activity type
WRITTEN TEST
Teacher
DE NICOLAO GIUSEPPE (lecturer in charge) - 2 ECTS
DE NICOLAO GIUSEPPE (lecturer in charge) - 4 ECTS
Prerequisites
Matrix algebra; elements of probability: scalar and vector random variables; elements of statistics: estimators and their properties.
Learning outcomes
Knowledge of the main learning methods for classification and regression, and of their properties and limitations. Ability to translate an experimental learning problem into a statistical formulation and to select an appropriate method for its solution.
Course contents
Introduction: Supervised and Unsupervised Learning.
Statistical Learning: Statistical Learning and Regression, Curse of Dimensionality and Parametric Models, Assessing Model Accuracy and Bias-Variance Trade-off, Classification Problems and K-Nearest Neighbors.
Linear Regression: Simple Linear Regression and Confidence Intervals, Hypothesis Testing, Multiple Linear Regression, Model Selection, Interactions and Nonlinearity.
Classification: Introduction to Classification, Logistic Regression and Maximum Likelihood, Linear Discriminant Analysis and Bayes Theorem, Naive Bayes.
Resampling Methods: Estimating Prediction Error and Validation Set Approach, K-fold Cross-Validation, Cross-Validation: The Right and Wrong Ways, The Bootstrap.
Linear Model Selection and Regularization: Linear Model Selection and Best Subset Selection, Stepwise Selection, Estimating Test Error Using Mallows’ Cp, AIC, BIC, Adjusted R-squared, Cross-Validation, Shrinkage Methods and Ridge Regression, The Lasso, Principal Components Regression and Partial Least Squares.
Moving Beyond Linearity: Polynomial Regression, Piecewise Polynomials and Splines, Smoothing Splines, Local Regression and Generalized Additive Models.
Tree-Based Methods: Decision Trees, Classification Trees and Comparison with Linear Models, Bootstrap Aggregation (Bagging) and Random Forests, Boosting.
Support Vector Machines: Support Vector Classifier, Kernels and Support Vector Machines.
Unsupervised Learning: Unsupervised Learning and Principal Components Analysis, K-means Clustering.
The Fallacies of Learning: Regression to Mediocrity, Covariate Shift, Statistical Significance vs. Practical Significance, Correlation Is Not Causation, Observational vs. Experimental Studies.
Teaching methods
Lectures, practical class.
Recommended or required readings
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning. Springer Series in Statistics. New York: Springer.
Assessment methods
Written examination: two theory-based questions and two practical ones.
Further information
Sustainable development goals - Agenda 2030