
S&DS 365 is a second course in machine learning at the advanced undergraduate or beginning graduate level. The course assumes familiarity with the basic ideas and techniques in machine learning, for example as covered in S&DS 265. The course treats methods together with mathematical frameworks that provide intuition and justifications for how and when the methods work. Assignments give students hands-on experience with machine learning techniques, to build the skills needed to adapt approaches to new problems. Topics include nonparametric regression and classification, kernel methods, risk bounds, nonparametric Bayesian approaches, attention and language models, generative models, sparsity and manifolds, and reinforcement learning. Programming is central to the course, and is based on the Python programming language and Jupyter notebooks.
As prerequisites, students are expected to have a background in probability and statistics, at the level of S&DS 242 (Theory of Statistics), familiarity with the core ideas from linear algebra, for example through Math 222 (Linear Algebra with Applications), and computational skills at the level of S&DS 265 (Introductory Machine Learning) or CPSC 200 (Introduction to Information Systems). Background material can be found at the Introductory Machine Learning (iML, S&DS 265) course site.
Computing for the course uses Python in Jupyter notebooks. The recommended way to run these notebooks is in the cloud on Google Colab. The notebooks can also be run locally using Anaconda with the IML environment, which includes the packages needed; instructions for installing this environment are available on Yale Canvas.
Complementary readings refer to sections in the book Probabilistic Machine Learning: An Introduction, by Kevin Murphy, MIT Press, 2022. Part I “Foundations” gives a good treatment of background in probability, statistics, and linear algebra that is useful for this course. (But this part of the book also covers much more than we need.)
Assignments and quizzes are posted and due on Wednesday in a given week.
Lectures: Monday/Wednesday 1:00-2:15pm, HQ L02 (Humanities Quadrangle L02)
| Week | Dates | Topics | Demos & Tutorials | Lecture Slides | Readings & Notes | Assignments & Exams |
|---|---|---|---|---|---|---|
| 1 | Aug 27, 29 | Sparse regression | | Wed: Course overview; Fri: Sparse regression | Google Colab Basics; PML Section 11.4; Notes on linear regression | |
| 2 | Sep 3 | Smoothing and kernels | | Wed: Lasso (continued) and smoothing | PML Sections 16.3, 17.1; Notes on computing the lasso | Quiz 1 |
| 3 | Sep 8, 10 | Density estimation and Mercer kernels | | Mon: Smoothing and density estimation; Wed: Mercer kernels | Risk bounds for local smoothing; Notes on Mercer kernels | |
| 4 | Sep 15, 17 | Neural networks and overparameterized models | TensorFlow playground | Mon: Neural networks; Wed: Double descent | PML Sections 13.1, 13.2; Notes on backpropagation; Notes on double descent | Quiz 2 |
| 5 | Sep 22, 24 | Convolutional neural networks | | Mon: Convolutional neural networks; Wed: CNNs and Gaussian Processes | PML Section 17.2; Notes on Bayesian inference; Notes on nonparametric Bayes | Assn 1 in |
| 6 | Sep 29, Oct 1 | Gaussian processes and approximate inference | | Mon: Gaussian processes; Wed: Recap of GPs; Introduction to approximate inference | Notes on simulation | Quiz 3 |
| 7 | Oct 6, 8 | Variational inference | | Mon: Variational inference; Wed: VAEs | PML Section 20.3; Notes on variational inference | Assn 2 in |
| 8 | Oct 13 | Midterm | Practice midterms | Oct 13: Midterm exam | ||
| 9 | Oct 20, 22 | Graphs and structure learning | | Mon: Sparsity and graphs; Wed: Discrete data and graph neural nets | Notes on graphs and structure learning; Graph neural networks; PML Section 23.4 | |
| 10 | Oct 27, 29 | Deep reinforcement learning | | Mon: Reinforcement learning; Wed: Deep reinforcement learning | Sutton and Barto, Section 6.5 | Assn 3 in |
| 11 | Nov 3, 5 | Policy methods | | Mon: Policy gradient methods; Wed: Policy gradients (continued) | Sutton and Barto, Sections 13.1-13.3, 13.5 | Quiz 4 |
| 12 | Nov 10, 12 | Sequential models | | Mon: HMMs and RNNs; Wed: RNNs, GRUs, LSTMs, and all that | TensorFlow: Text generation; Notes on HMMs and Kalman filters; PML Chapter 15 | |
| 13 | Nov 17, 19 | Sequence-to-sequence models and Transformers | | Mon: Sequence models and attention; Wed: Transformers, LLM scaling | PML Sections 15.4, 15.5 | Quiz 5; Assn 4 in |
| 14 | Nov 24, 26 | No class, Thanksgiving break | ||||
| 15 | Dec 1, 3 | The LLM pipeline; broader issues | Minimal LLM decoder | Mon: LLM finetuning, postprocessing; Wed: Course wrap up | | Assn 5 in |
| 17 | Dec 12, 9am | Final exam | Practice exams | Registrar: final exam schedule | | |