DEPARTMENT OF ECONOMICS AND BUSINESS SCIENCES
Degree programme
INTERNATIONAL BUSINESS AND ENTREPRENEURSHIP - MANAGEMENT INTERNAZIONALE E IMPRENDITORIALITÀ
First semester (24/09/2018 - 21/12/2018)
66 hours of classroom teaching
COMBINED WRITTEN AND ORAL EXAM
Knowledge of basic statistical concepts such as inference, confidence intervals, hypothesis testing, and the simple regression model.
For the coding part, some familiarity with Python, or with similar software such as R, is recommended; it can also be acquired through online courses (e.g. Coursera).
Knowledge of the most relevant statistical methods for the analysis of large data sets. Students will learn how to run a complete analysis workflow in Python: from the collection and management of large data sets, through the choice of the most appropriate models, to the final interpretation and contextualization of the results.
Programme and contents
The aim of this course is to study and apply the most relevant statistical models for the analysis of large data sets.
The perspective is mainly applied: choosing and applying suitable models to exploit the full information content of (large) data sets, with particular attention to the correct and contextualized interpretation of the final results. In addition, the course focuses on some frameworks for managing large data sets, such as MapReduce for data clustering.
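As a minimal illustration of how a clustering step can be phrased in MapReduce terms, the sketch below runs one k-means iteration on a single machine: the map step assigns each point to its nearest centroid, and the reduce step averages the points in each group. It is only a sketch under stated assumptions, not the implementation used in the course; the function names (map_point, reduce_cluster, kmeans_iteration) and the toy points are hypothetical, and a real MapReduce framework would distribute the same two steps across many workers.

```python
from collections import defaultdict


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    distances = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    return distances.index(min(distances)), point


def reduce_cluster(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One k-means pass expressed as map, shuffle (group by key), reduce."""
    groups = defaultdict(list)
    for point in points:
        key, value = map_point(point, centroids)
        groups[key].append(value)
    # A centroid with no assigned points keeps its previous position.
    return [
        reduce_cluster(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]


# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 8.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_centroids = kmeans_iteration(points, centroids)
```

Repeating kmeans_iteration until the centroids stop moving gives the usual k-means procedure; each pass corresponds to one MapReduce job.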
The course makes interactive use of open source software such as Python, so that students learn the complete analysis workflow in practice.
Particular emphasis will be given to social network data, textual data, and business and financial case studies.
Some of the models that will be covered are: Naive Bayes classifier, Latent Dirichlet Allocation, clustering algorithms, penalized regression, Support Vector Machines.
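To give a flavour of the first model in the list applied to textual data, the sketch below implements a multinomial Naive Bayes classifier with add-one (Laplace) smoothing using only the Python standard library. It is a sketch under stated assumptions rather than course material: the function names (train_naive_bayes, classify), the whitespace tokenization, and the four-sentence toy corpus are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Collect per-label document and word counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior for the given text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy corpus, invented for this example only.
toy_corpus = [
    ("stock prices rise on strong earnings", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins the final match", "sport"),
    ("player scores in the match", "sport"),
]
model = train_naive_bayes(toy_corpus)
prediction = classify("interest rates and stock prices", model)
```

Working with log probabilities instead of raw products avoids numerical underflow on longer documents, which matters once the corpus is large.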
The class integrates theoretical lectures with Python-based practicals, in which students learn how to implement and analyze the most appropriate models for the available data.
Each week, a tutor will help students acquire all the necessary theoretical and practical knowledge.
All the material used during the lectures (slides, scripts, data) will be available on the Kiro platform.
Reference texts
1) Coelho and Richert, Building Machine Learning Systems with Python, Second edition, Packt Publishing.
2) Kevin Sheppard, Introduction to Python for Econometrics, Statistics and Data Analysis, PDF version available.
Assessment methods
Midterm on the first models covered.
Final project: an analysis based on real case studies.
Some lectures will be taught by experts in the big data field.