# EDUCATIONAL OFFER 2020/21

## TEACHING PROGRAM

### Optimization

Introduction to optimality conditions

Introduction to unconstrained local optimization methods

Stochastic gradient and variants

Basic constrained optimization methods

Global optimization

Exact global optimization methods

Heuristic global optimization methods

Bayesian optimization

### Numerical Calculus and Linear Algebra

Coming soon

### Probability and Stochastic Processes

Probability:

Discrete random variables: Probability distributions, probability mass functions, cumulative distribution functions, mean and variance. Discrete models.

Joint probability distributions, marginal distributions, conditional probability, conditional mean and variance. Discrete models.

Continuous random variables: Probability distributions, probability density functions, cumulative distribution functions, mean and variance. Conditional probability. Continuous models.

Convergence theorems and normal approximation. The Poisson process and applications.

Stochastic Processes:

Introduction to Markov chains and their transition matrices.

Classification of states, invariant distributions.

Simulated annealing and the Metropolis algorithm.

Birth-and-death chains on finite state spaces.

### Statistical Inference and Modelling

Inference and linear models:

Statistical thinking

Frequentist (classical) inference

Exploring associations

Significance tests

Prediction

Generalized linear models:

Non-normal responses

Regression with a binary response

Binary data

The general linear logistic model

Inference and prediction

Generalized linear models

Contingency tables and Poisson models

Log-linear models

The Ising model in 3 binary variables

### Algorithms and Programming in Python and R for Data Science

Python:

Introduction to Python and simple data

Python Modules and Functions

Selections and Iterations

Recursion and Strings

Lists and Dictionaries

Classes and Objects, Files

Analysis of Algorithms

Sorting and Searching

R:

Introduction to R: the R console, R packages, .R files

Elementary objects of R: vectors, matrices, arrays, lists; object types (numeric, character, logical, factor)

Basic mathematical functions; user-defined functions

The dataframe: definition and manipulation

Data import and export in R (.txt files, Excel files, Stata/SAS/SPSS files, .RData files)

Manipulations of objects - 1: variable recoding, time variables, missing data, record linkage

Manipulations of objects - 2: descriptive statistical analyses (tables, summary measures, basic plots)

### Introduction to Machine Learning

Supervised versus unsupervised ML, essential probability theory, statistics, and distributions for ML, Bayesian versus frequentist interpretations for ML

Linear models for supervised regression and classification

The bias-variance decomposition, overfitting, underfitting, and model regularization

Maximum Likelihood Estimation (MLE), the expectation-maximization (EM) algorithm, Maximum a Posteriori (MAP) versus Bayesian inference

Connectionist models and introduction to artificial neural networks

From neurons to artificial neural networks: training as a non-linear optimization problem

Backpropagation and gradient-based methods

Linear Support Vector Machines (SVMs)

Non-linear SVMs and radial basis function networks

Using the LIBSVM library

### Statistical Learning

Introduction to statistical learning:

Statistical point of view of machine learning

Data generating process

Monte Carlo simulations

Graphical models:

Networks and concentration graph models

DAGs and Bayesian networks

Supervised statistical learning based on trees:

CART algorithm

Bagging and random forests

Boosted trees

Bayesian additive regression trees (BART)

Interpretable statistical learning:

Predicting vs explaining

Interpretability, transparency, fairness

### Machine Learning

Introduction to supervised learning and regression.

Classification problems.

Online learning: the perceptron learning algorithm.

Gradient descent and stochastic gradient descent: analysis, MATLAB implementation, backpropagation.

Unsupervised learning. MATLAB implementation of principal component analysis and spectral clustering.

Introduction to statistical learning theory.

Structural risk minimization and support vector machines.

Trade-off between sample size and precision of supervision.

A comparison of approximation error bounds for neural networks and linear approximators.

Application of neural networks to optimal control problems.

Radial basis function interpolating networks and their application to surrogate modeling and optimization.

Connection between supervised learning and reinforcement learning.

### Deep Learning, Neural Networks, and Reinforcement Learning

Sequence learning and recurrent networks

Attention mechanisms

Graph learning

Explainable machine learning

Explainable deep learning

### Geo-spatial and Network Data Modelling

Network data modelling:

Introduction to network data

Network representation: types of relations, graph representation, matrix representation

Hints on network visualization

Descriptive analysis of network data: network statistics

Descriptive analysis of network data: nodal statistics

Exponential Random Graph models

Stochastic blockmodels

Latent space models

Geo-spatial data modelling:

Introduction to spatial and geographical data

Stochastic spatial processes and their properties

Analysis of point process data

Analysis of geostatistical data (random surfaces)

Analysis of areal data (lattice data)

Spatial interaction data: gravity models

Introduction to Geographical Information Systems

### Complex System Analysis

Dynamical systems in 1D, 2D and 3D

Fixed points and stability

Bifurcation theory

Discrete maps

Chaos

Turing instability in reaction-diffusion models

Examples and applications

### Text Mining and NLP

Coming soon

### Network and Media Analysis

Introduction to complex networks:

network definition;

network representation;

degree and average nearest-neighbor degree (ANND).

Introduction to Twitter data:

data structure

User features and power law distributions:

information per user: tweets, retweets, followers and friends

power law distributions: scale-free networks

verified users

Gonzalez-Bailon user classification

The retweet network:

building the retweet network from data;

visualizing the network;

assigning attributes to nodes.

Centrality measures:

Hubs and authorities;

PageRank;

Node betweenness.

Community detection algorithms:

Girvan-Newman and the definition of modularity;

The importance of the null model;

Louvain community detection.

### Analytics in Economics and Business

General introduction

New Tricks for Econometrics and Artificial Intelligence

Statistical Learning with Sparsity: The Lasso and Generalizations

Classification and Regression Trees

Matrix Completion and Networks

Using Big Data for Measurement and Research

Neural Networks

Mining Text and Images

### Bayesian Inference and Causal Machine Learning

Coming soon

### Hands-on Labs

Hands-on R and STATA for Data Science:

Introduction to R and STATA

Data Modeling for policy evaluation:

Data Modeling: inference and predictive analysis

Data Modeling: causal machine learning

Hands-on Python for Data Science:

Introduction to Python for data science:

Unsupervised learning

Dimensionality reduction

Neural networks and deep learning

Support Vector Machine

### Experiments and Real-World Evidence in Economics

1. From theory to data (and the way back). Introduction to behavioral and experimental economics.

2. Learning from the data. Correlation is not causation. In search of practical ways to go beyond correlations in social and economic phenomena.

The controlled solution: Experiments (online, in the laboratory, in the field).

The less controlled solution: Natural and Quasi-experiments.

3. Statistical analysis of experimental data. Mediator variables, moderator variables, specific statistical tests, multiple hypothesis testing.

4. Case studies.

Examples of controlled experiments and their analysis (e.g., risky behaviors, addiction, strategic behaviors, moral dilemmas, marketing, persuasion, nudging).

Examples of natural experiments and their analysis (e.g., Italian clemency bill and criminal behaviors).

Examples of quasi-experiments and their analysis (e.g., evaluating educational programs in primary schools).

### Policy Evaluation and Impact Analysis

Introduction to microeconometrics:

Structure, Endogeneity, and Identification Problems

Least-squares, Probit, and Logit Estimators

Static panel data

Dynamic panel data

The Evaluation Problem:

Randomization and Matching Models

Difference-in-Differences Estimators

Instrumental Variables

Regression Discontinuity Design

Causality and Non-linear Models:

Quantile Regressions

Multinomial Models

Models for Count Data

Survival/Duration Analyses

Models with Control Functions

### Business Analytics

Non-parametric time series analysis

ARIMA models

GARCH models for heteroskedasticity

Forecasting methods and assessment of forecast accuracy

Introduction to multivariate time series analysis: VAR models

### Optimization of Financial Portfolios

Financial assets, returns, statistical features of returns

Portfolio choice criteria: expected utility vs. Markowitz mean-variance

Mean-variance portfolio selection in action

Further topics: dealing with high-dimensional portfolios; constraints on concentration and turnover; the Black-Litterman model; sensitivity w.r.t. inputs ("estimation risk"); mean-VaR and mean-CVaR portfolio selection

Portfolio optimization in MATLAB: the 'quadprog' function and the 'Portfolio' object from the Financial Toolbox

### Health Analytics and Data-Driven Medicine

Causal inference in healthcare with MEPS data

Predictive healthcare and patient outcomes (digital records, diagnostic procedures and interventions)

Clinical trials and prescription behavior: market analysis and regulation

Epidemiology and COVID-19

### Environmental and Genomic Data Analysis

Coming soon

### Ethics and Law for Data Science

Coming soon