An Introduction to Machine Learning for Public Policy

Preamble

Machine learning has become an increasingly integral part of public policy. It is applied to policy problems that call for predictive rather than causal inference. Solving these prediction policy problems requires tools tuned to minimizing prediction error, as well as frameworks for ensuring that models are efficient and fair. ML4PP introduces the theory and application of machine learning algorithms, with a focus on policy applications and issues. The goals of this course include:

Developing a basic understanding of the statistical theory underlying common supervised machine learning algorithms
Developing skills necessary to train and assess the performance of selected popular machine learning algorithms for solving public policy problems
Gaining an understanding of the benefits and risks of applying machine learning algorithms to public policy problems

The course consists of 6 sessions, each combining a technical introductory lecture with a hands-on application of the topics to a real-world policy problem. Students will work with the programming language R, but coding is not the primary focus of the course.

To end the course, we will meet online for a Collaborative Policy Challenge, which will be delivered by a colleague from an International Organisation. Working in interdisciplinary teams, we will propose a possible solution to the challenge and get feedback from our peers and policy experts.

Course textbook (e-book available for free):

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R (2nd ed.). Springer.

Course contents

1. Gentle Introduction to R, RStudio and Python

Introduction to the course
Introduction to the R statistical programming language with the RStudio IDE
Introduction to the Python programming language with Visual Studio Code

Instructors: Stephan, Alex and Michelle (who will give you a warm welcome!)

2. Introduction to Machine Learning for Public Policy

Prediction Policy problems
Inference vs. prediction for policy analysis
Assessing accuracy: bias-variance tradeoff
Training error vs. test error
Feature selection: brief introduction to Lasso
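
The distinction between training error and test error listed above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not course material: the data-generating process (y = 2x plus Gaussian noise), the sample sizes, and the two models being compared, a least-squares line and a 1-nearest-neighbour predictor that memorises the training data.

```python
import random

def make_data(xs, rng, noise_sd=1.0):
    # Hypothetical data-generating process: y = 2x + Gaussian noise.
    return [(x, 2.0 * x + rng.gauss(0.0, noise_sd)) for x in xs]

def fit_line(data):
    # Ordinary least squares with one feature; returns a prediction function.
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_one_nn(data):
    # 1-nearest-neighbour: predict the outcome of the closest training point.
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]

def mse(predict, data):
    # Mean squared prediction error on a data set.
    return sum((y - predict(x)) ** 2 for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_data([i / 10 for i in range(50)], rng)
test = make_data([i / 10 + 0.05 for i in range(50)], rng)  # unseen x values

line = fit_line(train)
one_nn = fit_one_nn(train)

results = {
    "line": (mse(line, train), mse(line, test)),
    "one_nn": (mse(one_nn, train), mse(one_nn, test)),
}
for name, (train_mse, test_mse) in results.items():
    print(f"{name}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The 1-nearest-neighbour model reproduces every training outcome exactly, so its training error is zero, yet its error on unseen data is not. This is why model performance is assessed on data held out from training.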

Instructor: Michelle González Amador

Readings:

Mandatory

An Introduction to Statistical Learning, Chapters 2, 3 (Regression), 5 (Cross-validation) and 6 (for more about Lasso).
Athey, S. (2017). Beyond prediction: Using big data for policy problems. Science, 355(6324), 483-485.
Kleinberg, J., Ludwig, J., Mullainathan, S. and Obermeyer, Z., 2015. Prediction policy problems. American Economic Review, 105(5), pp.491-95.

Optional readings

Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J. and Mullainathan, S., 2017. Human decisions and machine predictions. The Quarterly Journal of Economics, 133(1), pp.237-293.
Hanna, R., & Olken, B. A. (2018). Universal basic incomes versus targeted transfers: Anti-poverty programs in developing countries. Journal of Economic Perspectives, 32(4), 201-26. (exercise application)
McBride, L., & Nichols, A. (2018). Retooling poverty targeting using out-of-sample validation and machine learning. The World Bank Economic Review, 32(3), 531-550.

3. Classification

Logistic regression
Confusion matrix
Performance metrics: Accuracy, Recall, Precision (…)
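
As a rough illustration of how the confusion matrix relates to the performance metrics listed above, here is a minimal Python sketch for a binary outcome. The labels are invented for illustration (1 = household classified as poor, 0 = not poor), and the function names `confusion_matrix` and `metrics` are hypothetical helpers, not part of any library used in the course.

```python
def confusion_matrix(actual, predicted):
    # Count the four cells for a binary outcome coded 1 (positive) / 0 (negative).
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["fn" if a == 1 else "tn"] += 1
    return counts

def metrics(c):
    # Assumes at least one actual positive and one predicted positive,
    # otherwise recall or precision would divide by zero.
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,      # share of correct predictions
        "recall": c["tp"] / (c["tp"] + c["fn"]),      # share of actual positives found
        "precision": c["tp"] / (c["tp"] + c["fp"]),   # share of predicted positives that are correct
    }

# Hypothetical labels: 1 = poor, 0 = not poor.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cm = confusion_matrix(actual, predicted)
scores = metrics(cm)
print(cm)
print(scores)
```

In a targeting context, recall tracks how many truly poor households the model reaches, while precision tracks how many of the households it selects are truly poor. The two can diverge even when accuracy looks high.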

Instructor: Dr. Stephan Dietrich

Readings:

An Introduction to Statistical Learning, Chapter 4
Athey, S., & Imbens, G. W. (2019). Machine learning methods that economists should know about. Annual Review of Economics, 11, 685-725.
McBride, L., & Nichols, A. (2018). Retooling poverty targeting using out-of-sample validation and machine learning. The World Bank Economic Review, 32(3), 531-550.

Optional Readings