### [1. Introduction to Reinforcement Learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/01.%20Introduction%20to%20Reinforcement%20Learning)
* [1.1. What is Reinforcement Learning?](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/01.%20Introduction%20to%20Reinforcement%20Learning/1.1%20What%20is%20Reinforcement%20Learning.ipynb)
* 1.2. Reinforcement Learning Cycle
* 1.3. How RL differs from other ML Paradigms?
* 1.4. Elements of Reinforcement Learning
* 1.5. Agent Environment Interface
* 1.6. Types of RL Environments
* 1.7. Reinforcement Learning Platforms
* 1.8. Applications of Reinforcement Learning

### [2. Getting Started with OpenAI and Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow)
* [2.8. Training a Robot to Walk](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.08%20Training%20an%20Robot%20to%20Walk.ipynb)
* [2.9. Building a Video Game Bot](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.09%20Building%20a%20Video%20Game%20Bot%20.ipynb)

### [3. Markov Decision Process and Dynamic Programming](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming)
* 3.1. Markov Chain and Markov Process
* 3.2. Markov Decision Process
* 3.3. Rewards and Returns
* 3.4. Episodic and Continuous Tasks
* 3.5. Policy Function
* 3.6. State Value Function
* 3.7. State-Action Value Function (Q Function)
* 3.8. Bellman Equation and Optimality
* 3.9. Deriving Bellman Equation for Value and Q functions
* 3.10. Solving the Bellman Equation
* 3.11. Dynamic Programming
* [3.12. Solving Frozen Lake Problem using Value Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.12%20Value%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb)
* [3.13. Solving Frozen Lake Problem using Policy Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.13%20Policy%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb)
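
For a taste of what 3.12 implements, here is a minimal value-iteration sketch, assuming classic Gym's `FrozenLake-v0` environment and its `env.P` transition table (an illustration, not the notebook's exact code):

```python
import gym
import numpy as np

env = gym.make('FrozenLake-v0')
gamma, threshold = 0.99, 1e-6  # discount factor and convergence tolerance

# Value iteration: sweep all states, applying the Bellman optimality backup
# V(s) = max_a sum_s' p(s'|s,a) * (r + gamma * V(s')) until values stop changing.
V = np.zeros(env.observation_space.n)
while True:
    delta = 0.0
    for s in range(env.observation_space.n):
        q = [sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][a])
             for a in range(env.action_space.n)]
        delta = max(delta, abs(max(q) - V[s]))
        V[s] = max(q)
    if delta < threshold:
        break

# Extract the greedy policy from the converged value function.
policy = [int(np.argmax([sum(p * (r + gamma * V[s2]) for p, s2, r, _ in env.P[s][a])
                         for a in range(env.action_space.n)]))
          for s in range(env.observation_space.n)]
print(policy)
```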
### [4. Gaming with Monte Carlo Methods](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods)
* 4.1. Monte Carlo Methods
* [4.2. Estimating Value of Pi Using Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.2%20Estimating%20Value%20of%20Pi%20using%20Monte%20Carlo.ipynb)
* 4.3. Monte Carlo Prediction
* 4.4. First visit Monte Carlo
* 4.5. Every visit Monte Carlo
* [4.6. BlackJack with Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.6%20BlackJack%20with%20First%20visit%20MC.ipynb)
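
As a quick taste of 4.2's idea, a Monte Carlo estimate of pi by uniform sampling (a minimal sketch, not the notebook's code):

```python
import random

# Sample points uniformly in the unit square; the fraction landing inside
# the quarter circle of radius 1 approximates pi/4.
n = 1_000_000
inside = sum(random.random() ** 2 + random.random() ** 2 <= 1.0 for _ in range(n))
print(4 * inside / n)  # approaches 3.14159... as n grows
```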

### [5. Temporal Difference Learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/5.%20Temporal%20Difference%20Learning)
* [5.5. Solving the Taxi Problem using Q learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.5%20Taxi%20Problem%20-%20Q%20Learning.ipynb)
* 5.6. SARSA
* [5.7. Solving the Taxi Problem using SARSA](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.7%20Taxi%20Problem%20-%20SARSA.ipynb)
* 5.8. Difference Between Q learning and SARSA
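
The tabular Q-learning update at the heart of 5.5, as a bare-bones sketch (assumes the classic Gym API and the `Taxi-v2` environment; hyperparameters are illustrative):

```python
import gym
import numpy as np

env = gym.make('Taxi-v2')
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.4, 0.99, 0.1  # illustrative hyperparameters

for episode in range(5000):
    s, done = env.reset(), False
    while not done:
        # Epsilon-greedy: mostly exploit the current Q estimates, sometimes explore.
        a = env.action_space.sample() if np.random.rand() < epsilon else int(np.argmax(Q[s]))
        s2, r, done, _ = env.step(a)
        # TD update: nudge Q(s,a) toward the bootstrapped target r + gamma * max_a' Q(s',a').
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2
```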

### [6. Multi-Armed Bandit Problem](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/06.%20Multi-Armed%20Bandit%20Problem)
* [6.7. Identifying Right Advertisement Banner Using MAB](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/06.%20Multi-Armed%20Bandit%20Problem/6.7%20Identifying%20Right%20AD%20Banner%20Using%20MAB.ipynb)
* 6.8. Contextual Bandits
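
An epsilon-greedy bandit in miniature, in the spirit of 6.7 (the banner payout rates below are made up for illustration):

```python
import numpy as np

true_probs = np.array([0.3, 0.5, 0.7])  # hypothetical click-through rate per banner
counts = np.zeros(3)
values = np.zeros(3)                     # running mean reward per banner
epsilon = 0.1

for t in range(10_000):
    # Explore a random banner with probability epsilon, else exploit the best estimate.
    arm = np.random.randint(3) if np.random.rand() < epsilon else int(np.argmax(values))
    reward = float(np.random.rand() < true_probs[arm])   # simulated click
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

print(values)  # estimates approach true_probs for well-sampled banners
```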
### [7. Deep Learning Fundamentals](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/07.%20Deep%20Learning%20Fundamentals)
* 7.1. Artificial Neurons
* 7.2. Artificial Neural Network
* 7.3. Activation Functions
* 7.4. Deep Dive into ANN
* 7.5. Gradient Descent
* [7.6. Neural Networks in Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.6%20Neural%20Network%20Using%20Tensorflow.ipynb)
* 7.7. Recurrent Neural Network
* 7.8. Backpropagation Through Time
* 7.9. Long Short Term Memory RNN
* [7.10. Generating Song Lyrics using LSTM RNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.10%20Generating%20Song%20Lyrics%20Using%20LSTM%20RNN.ipynb)
* 7.11. Convolutional Neural Networks
* 7.12. CNN Architecture
* [7.13. Classifying Fashion Products Using CNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.13%20Classifying%20Fashion%20Products%20Using%20CNN.ipynb)
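
To make 7.5 concrete, gradient descent on a one-variable function (purely illustrative):

```python
# Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2 * (x - 3).
x, lr = 0.0, 0.1
for _ in range(100):
    x -= lr * 2 * (x - 3)  # step against the gradient
print(x)  # converges toward the minimum at x = 3
```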
### [8. Atari Games With Deep Q Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/08.%20Atari%20Games%20with%20DQN)
* 8.1. What is Deep Q Network?
* 8.2. Architecture of DQN
* 8.3. Convolutional Network
* 8.4. Experience Replay
* 8.5. Target Network
* 8.6. Clipping Rewards
* 8.7. DQN Algorithm
* [8.8. Building an Agent to Play Atari Games](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/08.%20Atari%20Games%20with%20DQN/8.8%20Building%20an%20Agent%20to%20Play%20Atari%20Games.ipynb)
* 8.9. Double DQN
* 8.10. Dueling Architecture
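
A minimal experience-replay buffer in the spirit of 8.4 (names and sizes are illustrative, not the notebook's exact code):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly for DQN training."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sampling breaks correlations between consecutive
        # transitions, which is what stabilizes DQN updates.
        return random.sample(self.buffer, batch_size)
```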
### [9. Playing Doom With Deep Recurrent Q Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/09.%20Playing%20Doom%20Game%20using%20DRQN)
* [9.5. Build an Agent to Play Doom Game using DRQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/09.%20Playing%20Doom%20Game%20using%20DRQN/9.5%20Doom%20Game%20Using%20DRQN.ipynb)
* 9.6. Deep Attention Recurrent Q Network
### [10. Asynchronous Advantage Actor Critic Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network)
* 10.1. Asynchronous Actor Critic Algorithm
* 10.2. The three A's
* 10.3. Architecture of A3C
* 10.4. Working of A3C
* [10.5. Drive up the Mountain with A3C](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network/10.5%20Drive%20up%20the%20Mountain%20Using%20A3C.ipynb)
* 10.6. Visualization in Tensorboard
### [11. Policy Gradients and Optimization](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/11.%20Policy%20Gradients%20and%20Optimization)
* 11.1. Policy Gradient
* [11.2. Lunar Lander Using Policy Gradient](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.2%20Lunar%20Lander%20Using%20Policy%20Gradients.ipynb)
* 11.3. Deep Deterministic Policy Gradient
* [11.4. Swinging up the Pendulum using DDPG](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.3%20Swinging%20Up%20the%20Pendulum%20Using%20DDPG.ipynb)
* 11.5. Trust Region Policy Optimization
* 11.6. Proximal Policy Optimization
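
The REINFORCE update underlying 11.1, on a toy one-step, two-action problem (a numpy sketch with made-up rewards):

```python
import numpy as np

theta = np.zeros(2)                   # per-action preferences; policy = softmax(theta)
true_rewards = np.array([1.0, 2.0])   # hypothetical expected reward per action
lr = 0.1

for t in range(2000):
    probs = np.exp(theta) / np.exp(theta).sum()
    a = np.random.choice(2, p=probs)
    r = true_rewards[a] + 0.1 * np.random.randn()  # noisy reward sample
    # For a softmax policy, grad log pi(a) = one_hot(a) - probs.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta += lr * r * grad_log_pi     # ascend the policy-gradient estimate

print(np.exp(theta) / np.exp(theta).sum())  # mass shifts toward the higher-reward action
```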
### [12. Capstone Project: Car Racing using DQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/12.%20Capstone%20Project:%20Car%20Racing%20using%20DQN)
### [13. Recent Advancements and Next Steps](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/13.%20Recent%20Advancements%20and%20Next%20Steps)
* 13.1. Imagination Augmented Agents
* 13.2. Learning From Human Preference
* [13.3. Deep Q Learning From Demonstrations](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/13.%20Recent%20Advancements%20and%20Next%20Steps/13.3%20Deep%20Q%20Learning%20From%20Demonstrations.ipynb)