MD2SL Teaching Program


  • Introduction to optimality conditions

  • Introduction to unconstrained local optimization methods

  • Stochastic gradient and variants

  • Basic constrained optimization methods

  • Global optimization

  • Exact global optimization methods

  • Heuristic global optimization methods

  • Bayesian optimization

Numerical Calculus and Linear Algebra

Coming soon

Probability and Stochastic Processes

  1. Probability:

    • Discrete random variables: Probability distributions, probability mass functions, cumulative distribution functions, mean and variance. Discrete models.

    • Joint probability distribution, Marginal distributions, Conditional probability, conditional mean and variance. Discrete models.

    • Continuous random variables: Probability distributions, probability density functions, cumulative distribution functions, mean and variance. Conditional probability. Continuous models.

    • Convergence theorems and normal approximation. Poisson Process and applications.

  2. Stochastic Processes:

    • Introduction to Markov Chains and their transition matrix.

    • Classification of states, invariant distributions.

    • Simulated annealing and Metropolis algorithm.

    • Birth-and-death chains on finite state spaces.

Statistical Inference and Modelling

  1. Inference and linear models:

    • Statistical thinking

    • Frequentist (classical) inference

    • Exploring associations

    • Significance tests

    • Prediction

  2. Generalized linear models:

    • Non-normal responses

    • Regression with a binary response

    • Binary data

    • The general linear logistic model

    • Inference and prediction

    • Generalized linear models

    • Contingency tables and Poisson models

    • Log-linear models

    • The Ising model in 3 binary variables

Algorithms and Programming in Python and R for Data Science

  1. Python:

    • Introduction to Python and simple data

    • Python Modules and Functions

    • Selections and Iterations

    • Recursion and Strings

    • Lists and Dictionaries

    • Classes and Objects, Files

    • Analysis of Algorithms

    • Sorting and Searching

  2. R:

    • Introduction to R: the R console, R packages, .R files

    • Elementary objects of R: vectors, matrices, arrays, lists; object types (numeric, character, logical, factor)

    • Basic mathematical functions; user-defined functions

    • The data frame: definition and manipulation

    • Data import and data export in R (.txt files, Excel files, Stata/SAS/SPSS files, .RData files)

    • Manipulations of objects - 1: variable recoding, time variables, missing data, record linkage

    • Manipulations of objects - 2: descriptive statistical analyses (tables, summary measures, basic plots)

Introduction to Machine Learning

  • Supervised versus unsupervised ML, essential probability theory, statistics, and distributions for ML, Bayesian versus frequentist interpretations for ML

  • Linear models for supervised regression and classification

  • The bias-variance decomposition, overfitting, underfitting, and model regularization

  • Maximum Likelihood Estimation (MLE), the expectation-maximization (EM) algorithm, Maximum a Posteriori (MAP) versus Bayesian inference

  • Connectionist models and introduction to artificial neural networks

  • From neurons to artificial neural networks: training as a non-linear optimization problem

  • Backpropagation and gradient-based methods

  • Linear Support Vector Machines (SVMs)

  • Non-linear SVMs and radial basis function networks

  • Using the LIBSVM library

Statistical Learning

  1. Introduction to statistical learning:

    • Statistical point of view of machine learning

    • Data generating process

    • Monte Carlo simulations

  2. Graphical models:

    • Networks and concentration graph models

    • DAGs and Bayesian networks

  3. Supervised statistical learning based on trees:

    • CART algorithm

    • Bagging and random forests

    • Boosted trees

    • BART

  4. Interpretable statistical learning:

    • Predicting vs explaining

    • Interpretability, transparency, fairness

Machine Learning

  • Introduction to supervised learning and regression.

  • Classification problems.

  • Online learning: the perceptron learning algorithm.

  • Gradient descent and stochastic gradient descent: analysis, MATLAB implementation, backpropagation.

  • Unsupervised learning. MATLAB implementation of principal component analysis and spectral clustering.

  • Introduction to statistical learning theory.

  • Structural risk minimization and support vector machines.

  • Trade-off between sample size and precision of supervision.

  • A comparison of approximation error bounds for neural networks and linear approximators.

  • Application of neural networks to optimal control problems.

  • Radial basis function interpolating networks and their application to surrogate modeling and optimization.

  • Connection between supervised learning and reinforcement learning.

Deep Learning, Neural Networks, and Reinforcement Learning

  • Sequence learning and recurrent networks

  • Attention mechanisms

  • Graph learning

  • Explainable machine learning

  • Explainable deep learning

Geo-spatial and Network Data Modelling

  1. Network data modelling:

    • Introduction to network data

    • Network representation: types of relations, graph representation, matrix representation

    • Hints on network visualization

    • Descriptive analysis of network data: network statistics

    • Descriptive analysis of network data: nodal statistics

    • Exponential Random Graph models

    • Stochastic blockmodels

    • Latent space models

  2. Geo-spatial data modelling:

    • Introduction to spatial and geographical data

    • Stochastic spatial processes and their properties

    • Analysis of point process data

    • Analysis of geostatistical data (random surfaces)

    • Analysis of areal data (lattice data)

    • Spatial interaction data: gravity models

    • Introduction to Geographical Information Systems

Complex System Analysis

  • Dynamical systems in 1D, 2D and 3D

  • Fixed points and stability

  • Bifurcation theory

  • Discrete maps

  • Chaos

  • Turing instability in reaction-diffusion models

  • Examples and applications

Text Mining and NLP

Coming soon

Network and Media Analysis

  1. Introduction to complex networks:

    • network definition;

    • network representation;

    • degree and average nearest-neighbor degree (ANND).

  2. Introduction to Twitter data:

    • data structure

  3. User features and power law distributions:

    • information per user: tweets, retweets, followers and friends

    • power law distributions: scale free networks

    • verified users

  4. Gonzalez-Bailon user classification

  5. The retweet network:

    • building the retweet network from data;

    • visualizing the network;

    • assigning attributes to nodes.

  6. Centrality measures:

    • Hubs and Authorities;

    • PageRank;

    • Node betweenness

  7. Community detection algorithms:

    • Girvan-Newman and the definition of Modularity;

    • The importance of the null model;

    • Louvain community detection

Analytics in Economics and Business

  • General introduction

  • New Tricks for Econometrics and Artificial Intelligence

  • Statistical Learning with Sparsity: The Lasso and Generalizations

  • Classification and Regression Trees

  • Matrix Completion and Networks

  • Using Big Data for Measurement and Research

  • Neural Networks

  • Mining Text and Images

Bayesian Inference and Causal Machine Learning

Coming soon

Hands-on Labs

  1. Hands-on R and Stata for Data Science:

    • Introduction to R and Stata

    • Data Modeling for policy evaluation

    • Data Modeling: inference and predictive analysis

    • Data Modeling: causal machine learning

  2. Hands-on Python for Data Science:

    • Introduction to Python for data science

    • Unsupervised learning

    • Dimensionality reduction

    • Neural networks and deep learning

    • Support Vector Machine

Experiments and Real-World Evidence in Economics

  1. From theory to data (and the way back). Introduction to behavioral and experimental economics.

  2. Learning from the data. Correlation is not causation. In search of practical ways to go beyond correlations in social and economic phenomena.

    • The controlled solution: Experiments (online, in the laboratory, in the field).

    • The less controlled solution: Natural and Quasi-experiments.

  3. Statistical analysis of experimental data. Mediator variables, moderator variables, specific statistical tests, multiple testing of hypotheses.

  4. Case studies.

    • Examples of controlled experiments and their analysis (e.g., risky behaviors, addiction, strategic behaviors, moral dilemmas, marketing, persuasion, nudging).

    • Examples of natural experiments and their analysis (e.g., Italian clemency bill and criminal behaviors).

    • Examples of quasi-experiments and their analysis (e.g., evaluating educational programs in primary schools).

Policy Evaluation and Impact Analysis

  1. Introduction to microeconometrics:

    • Structure, Endogeneity, and Identification Problems

    • Least-squares, Probit, and Logit Estimators

    • Static panel data

    • Dynamic panel data

  2. The Evaluation Problem:

    • Randomization and Matching Models

    • Difference-in-differences Estimators

    • Instrumental Variables

    • Regression Discontinuity Design

  3. Causality and Non-linear Models:

    • Quantile Regressions

    • Multinomial Models

    • Models for Count Data

    • Survival/Duration Analyses

    • Models with Control Functions

Business Analytics

  • Non-parametric time series analysis

  • ARIMA models

  • GARCH models for heteroskedasticity

  • Forecasting methods and assessment of forecast accuracy

  • Introduction to multivariate time series analysis: VAR models

Optimization of Financial Portfolios

  1. Financial assets, returns, statistical features of returns

  2. Portfolio choice criteria: expected utility vs. Markowitz mean-variance

  3. Mean-variance portfolio selection in action

  4. Further topics: dealing with high-dimensional portfolios; constraints on concentration and turnover; the Black-Litterman model; sensitivity w.r.t. inputs ("estimation risk"); mean-VaR and mean-CVaR portfolio selection

  5. Portfolio optimization in MATLAB: the 'quadprog' function and the 'Portfolio' object from the Financial Toolbox

Health Analytics and Data-Driven Medicine

  • Causal inference in healthcare with MEPS data

  • Predictive healthcare and patient outcomes (digital records, diagnostic procedures, and interventions)

  • Clinical trials and prescription behavior: market analysis and regulation

  • Epidemiology and COVID-19

Environmental and Genomic Data Analysis

Coming soon

Ethics and Law for Data Science

Coming soon