| 1 | 29-Mar | Introduction and MDP basics [annotated, scribe,video] | AJKS Ch 1.1-1.2 | HW0 out |
| 2 | 31-Mar | Markov Decision Processes I [annotated, scribe,video] | AJKS Ch 1.3-1.5 | |
| 3 | 5-Apr | Markov Decision Processes II [annotated, scribe,video] | AJKS Ch 2 | HW1 out |
| 4 | 7-Apr | MDP III and RL Algorithms I [annotated, scribe,video] | SB Ch 5-6 | |
| 5 | 12-Apr | RL Algorithms II [annotated, scribe,video] | SB Ch 9-10 | HW0 due |
| 6 | 14-Apr | RL Algorithms III and Exploration I: MAB [annotated,video] | SB Ch 13, AJKS Ch 9, AJKS Ch 5.1 | |
| 7 | 19-Apr | Exploration I: MAB and Linear Bandits [annotated, scribe,video] | AJKS Ch 5.1 | Project proposal due |
| 8 | 21-Apr | Exploration II: Linear Bandits [annotated, scribe,video] | AJKS Ch 5.2-5.3 | |
| 9 | 26-Apr | Exploration III: Tabular MDPs [annotated, scribe,video] | AJKS Ch 6 | HW2 out / HW1 due |
| 10 | 28-Apr | Exploration IV: Linear MDP [annotated, scribe,video] | AJKS Ch 7 | |
| 11 | 3-May | Wrap up exploration, Intro to Offline RL [annotated,video] | AJKS Ch 7.3-7.4, Lihong's perspective article | Midterm report due |
| 12 | 5-May | Offline RL: OPE in Bandits and RL [annotated, scribe,video] | (W., Agarwal, Dudik, 2016) (Jiang et al., 2016) | |
| 13 | 10-May | Offline RL: MIS and Fitted Q Iterations [annotated, scribe,video] | (Yin and W., 2019) (Duan and Wang, 2019) | HW2 due |
| 14 | 12-May | Offline RL: Uniform OPE [annotated,video] | (Yin et al., 2020) | |
| 15 | 17-May | Offline RL: Uniform OPE and optimal offline learning [annotated,video] | (Yin et al., 2020) | HW3 out |
| 16 | 19-May | Offline RL: Function approximation [annotated,video] | AJKS Ch 15 | |
| 17 | 24-May | Office Hours / Project Consultation | | |
| 18 | 26-May | Office Hours / Project Consultation | | |
| 19 | 31-May | No lecture, Memorial Day | | |
| 20 | 2-Jun | Mini-Symposium on Statistical RL | | HW3 due / Final project report due |