
Commit ce43f40
Update README.md
1 parent ac79f29

File tree: 1 file changed (+79 -79)

README.md

Lines changed: 79 additions & 79 deletions

* [1.1. What is Reinforcement Learning?](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/01.%20Introduction%20to%20Reinforcement%20Learning/1.1%20What%20is%20Reinforcement%20Learning.ipynb)
* 1.2. Reinforcement Learning Cycle
* 1.3. How RL differs from other ML Paradigms?
* 1.4. Elements of Reinforcement Learning
* 1.5. Agent Environment Interface
* 1.6. Types of RL Environments
* 1.7. Reinforcement Learning Platforms
* 1.8. Applications of Reinforcement Learning

### [2. Getting Started with OpenAI and Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow)

* 2.1. Setting Up Your Machine
* 2.2. Installing Anaconda
* 2.3. Installing Docker
* 2.4. Installing OpenAI Gym and Universe
* 2.5. Common Error Fixes
* 2.6. OpenAI Gym
* [2.7. Basic Simulations](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.07%20Basic%20Simulations.ipynb) (see the sketch after this list)
* [2.8. Training a Robot to Walk](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.08%20Training%20an%20Robot%20to%20Walk.ipynb)
* [2.9. Building a Video Game Bot](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.09%20Building%20a%20Video%20Game%20Bot%20.ipynb)
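
A minimal sketch of the kind of simulation loop behind 2.7, assuming the classic pre-0.26 `gym` API (`env.reset` returns the state; `env.step` returns four values):

```python
import gym

# Create the CartPole environment and run one episode with random actions.
env = gym.make('CartPole-v0')
state = env.reset()

done = False
total_reward = 0.0
while not done:
    env.render()                          # draw the current frame
    action = env.action_space.sample()    # sample a random action
    state, reward, done, info = env.step(action)
    total_reward += reward

print('Episode reward:', total_reward)
env.close()
```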

### [3. Markov Decision Process and Dynamic Programming](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming)

* 3.1. Markov Chain and Markov Process
* 3.2. Markov Decision Process
* 3.3. Rewards and Returns
* 3.4. Episodic and Continuous Tasks
* 3.5. Policy Function
* 3.6. State Value Function
* 3.7. State-Action Value Function (Q Function)
* 3.8. Bellman Equation and Optimality
* 3.9. Deriving Bellman Equation for Value and Q functions
* 3.10. Solving the Bellman Equation
* 3.11. Dynamic Programming
* [3.12. Solving Frozen Lake Problem using Value Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.12%20Value%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb) (see the sketch after this list)
* [3.13. Solving Frozen Lake Problem using Policy Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.13%20Policy%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb)
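
A condensed sketch of value iteration as applied to Frozen Lake in 3.12, assuming the classic `gym` toy-text API where `env.P[s][a]` exposes the transition model as `(prob, next_state, reward, done)` tuples:

```python
import numpy as np
import gym

env = gym.make('FrozenLake-v0')

def value_iteration(env, gamma=0.99, iterations=1000, eps=1e-8):
    # Repeatedly apply the Bellman optimality backup until V stops changing.
    V = np.zeros(env.observation_space.n)
    for _ in range(iterations):
        V_old = V.copy()
        for s in range(env.observation_space.n):
            # Q(s, a) = sum over transitions of prob * (reward + gamma * V(s'))
            q_values = [sum(p * (r + gamma * V_old[s2])
                            for p, s2, r, _ in env.P[s][a])
                        for a in range(env.action_space.n)]
            V[s] = max(q_values)
        if np.max(np.abs(V - V_old)) < eps:
            break
    return V

print(value_iteration(env).reshape(4, 4))
```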

### [4. Gaming with Monte Carlo Methods](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods)

* 4.1. Monte Carlo Methods
* [4.2. Estimating Value of Pi Using Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.2%20Estimating%20Value%20of%20Pi%20using%20Monte%20Carlo.ipynb)
* 4.3. Monte Carlo Prediction
* 4.4. First visit Monte Carlo (see the sketch after this list)
* 4.5. Every visit Monte Carlo
* [4.6. BlackJack with Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.6%20BlackJack%20with%20First%20visit%20MC.ipynb)
* 4.7. Monte Carlo Control
* 4.8. Monte Carlo Exploration Starts
* 4.9. On Policy Monte Carlo Control
* 4.10. Off Policy Monte Carlo Control
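
A minimal sketch of first-visit Monte Carlo prediction from 4.4. Here `generate_episode(policy)` is a hypothetical helper (not from the book's notebooks) that rolls out one episode and returns a list of `(state, reward)` pairs:

```python
from collections import defaultdict

def first_visit_mc_prediction(generate_episode, policy, n_episodes=10000, gamma=1.0):
    # Estimate V(s) as the average return following the FIRST visit to s.
    value = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(n_episodes):
        episode = generate_episode(policy)   # [(state, reward), ...]
        G = 0.0
        first_returns = {}
        # Walk backwards so G accumulates the discounted return from each step;
        # later overwrites leave the return of the earliest (first) visit.
        for state, reward in reversed(episode):
            G = gamma * G + reward
            first_returns[state] = G
        for state, G in first_returns.items():
            counts[state] += 1
            value[state] += (G - value[state]) / counts[state]
    return value
```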

### [5. Temporal Difference Learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/05.%20Temporal%20Difference%20Learning)

* 5.1. Temporal Difference Learning
* 5.2. TD Prediction
* 5.3. TD Control
* 5.4. Q Learning (see the sketch after this list)
* [5.5. Solving the Taxi Problem using Q learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.5%20Taxi%20Problem%20-%20Q%20Learning.ipynb)
* 5.6. SARSA
* [5.7. Solving the Taxi Problem using SARSA](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.7%20Taxi%20Problem%20-%20SARSA.ipynb)
* 5.8. Difference Between Q learning and SARSA
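
A compact sketch of the tabular Q-learning update from 5.4, on the Taxi environment used in 5.5 (env id `Taxi-v2` as of the book's gym version; the classic four-value step API is assumed):

```python
import numpy as np
import gym

env = gym.make('Taxi-v2')
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.4, 0.999, 0.1

for episode in range(5000):
    state = env.reset()
    done = False
    while not done:
        # Epsilon-greedy: explore with probability epsilon, else act greedily.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done, _ = env.step(action)
        # Off-policy TD target: bootstrap from the best action in next_state.
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state
```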

### [6. Multi-Armed Bandit Problem](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/06.%20Multi-Armed%20Bandit%20Problem)

* [6.1. Multi-armed Bandit Problem](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/06.%20Multi-Armed%20Bandit%20Problem/6.1%20MAB%20-%20Various%20Exploration%20Strategies.ipynb)
* 6.2. Epsilon-Greedy Algorithm (see the sketch after this list)
* 6.3. Softmax Exploration Algorithm
* 6.4. Upper Confidence Bound Algorithm
* 6.5. Thompson Sampling Algorithm
* 6.6. Applications of MAB
* [6.7. Identifying Right Advertisement Banner Using MAB](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/06.%20Multi-Armed%20Bandit%20Problem/6.7%20Identifying%20Right%20AD%20Banner%20Using%20MAB.ipynb)
* 6.8. Contextual Bandits
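
A self-contained sketch of the epsilon-greedy strategy from 6.2 on a synthetic 10-armed Bernoulli bandit (the arm probabilities here are made up for illustration):

```python
import numpy as np

n_arms = 10
true_probs = np.random.rand(n_arms)   # hidden reward probability of each arm
counts = np.zeros(n_arms)             # pulls per arm
q_estimates = np.zeros(n_arms)        # running mean reward per arm
epsilon = 0.1

for t in range(10000):
    if np.random.rand() < epsilon:
        arm = np.random.randint(n_arms)    # explore a random arm
    else:
        arm = int(np.argmax(q_estimates))  # exploit the best arm so far
    reward = float(np.random.rand() < true_probs[arm])
    counts[arm] += 1
    # Incremental mean update avoids storing every past reward.
    q_estimates[arm] += (reward - q_estimates[arm]) / counts[arm]

print('estimated best arm:', np.argmax(q_estimates), '| true best arm:', np.argmax(true_probs))
```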

### [7. Deep Learning Fundamentals](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/07.%20Deep%20Learning%20Fundamentals)

* 7.1. Artificial Neurons
* 7.2. Artificial Neural Network
* 7.3. Activation Functions
* 7.4. Deep Dive into ANN
* 7.5. Gradient Descent (see the sketch after this list)
* [7.6. Neural Networks in Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.6%20Neural%20Network%20Using%20Tensorflow.ipynb)
* 7.7. Recurrent Neural Network
* 7.8. Backpropagation Through Time
* 7.9. Long Short Term Memory RNN
* [7.10. Generating Song Lyrics using LSTM RNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.10%20Generating%20Song%20Lyrics%20Using%20LSTM%20RNN.ipynb)
* 7.11. Convolutional Neural Networks
* 7.12. CNN Architecture
* [7.13. Classifying Fashion Products Using CNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.13%20Classifying%20Fashion%20Products%20Using%20CNN.ipynb)
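
A toy illustration of 7.1 and 7.5: training a single sigmoid neuron with plain gradient descent on synthetic data (NumPy only; no TensorFlow needed at this level):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary data: the label is 1 when the feature sum is positive.
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = (X.sum(axis=1) > 0).astype(float)

w, b, lr = np.zeros(2), 0.0, 0.5
for step in range(1000):
    pred = sigmoid(X @ w + b)
    # For sigmoid + cross-entropy, the error signal is simply (pred - y).
    grad = pred - y
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

print('training accuracy:', ((sigmoid(X @ w + b) > 0.5) == y).mean())
```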

### [8. Atari Games With Deep Q Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/08.%20Atari%20Games%20with%20DQN)

* 8.1. What is Deep Q Network?
* 8.2. Architecture of DQN
* 8.3. Convolutional Network
* 8.4. Experience Replay (see the sketch after this list)
* 8.5. Target Network
* 8.6. Clipping Rewards
* 8.7. DQN Algorithm
* [8.8. Building an Agent to Play Atari Games](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/08.%20Atari%20Games%20with%20DQN/8.8%20Building%20an%20Agent%20to%20Play%20Atari%20Games.ipynb)
* 8.9. Double DQN
* 8.10. Dueling Architecture
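
A minimal sketch of the experience replay buffer from 8.4 (names are illustrative; the book's notebook may structure this differently):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop off automatically

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive frames.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```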

### [9. Playing Doom With Deep Recurrent Q Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/09.%20Playing%20Doom%20Game%20using%20DRQN)

* 9.1. Deep Recurrent Q Network
* 9.2. Partially Observable MDP
* 9.3. Architecture of DRQN
* [9.4. Basic Doom Game](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/09.%20Playing%20Doom%20Game%20using%20DRQN/9.4%20Basic%20Doom%20Game.ipynb)
* [9.5. Build an Agent to Play Doom Game using DRQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/09.%20Playing%20Doom%20Game%20using%20DRQN/9.5%20Doom%20Game%20Using%20DRQN.ipynb)
* 9.6. Deep Attention Recurrent Q Network

### [10. Asynchronous Advantage Actor Critic Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network)

* 10.1. Asynchronous Actor Critic Algorithm
* 10.2. The three A's
* 10.3. Architecture of A3C
* 10.4. Working of A3C (see the sketch after this list)
* [10.5. Drive up the Mountain with A3C](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network/10.5%20Drive%20up%20the%20Mountain%20Using%20A3C.ipynb)
* 10.6. Visualization in Tensorboard
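
A small sketch of the advantage estimate at the heart of A3C (10.3 and 10.4): discounted returns minus the critic's value predictions. The reward and value numbers below are stand-ins for one worker's rollout:

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99, bootstrap=0.0):
    # R_t = r_t + gamma * R_{t+1}, computed backwards from the rollout's end.
    R = bootstrap
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    return np.array(returns[::-1])

rewards = [0.0, 0.0, 1.0, 0.0, 5.0]            # one rollout's rewards
values = np.array([0.9, 1.2, 1.5, 2.0, 4.0])   # critic's V(s) at each step
advantages = discounted_returns(rewards) - values
print(advantages)  # positive where the return beat the critic's estimate
```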

### [11. Policy Gradients and Optimization](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/11.%20Policy%20Gradients%20and%20Optimization)

* 11.1. Policy Gradient (see the sketch after this list)
* [11.2. Lunar Lander Using Policy Gradient](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.2%20Lunar%20Lander%20Using%20Policy%20Gradients.ipynb)
* 11.3. Deep Deterministic Policy Gradient
* [11.4. Swinging up the Pendulum using DDPG](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.3%20Swinging%20Up%20the%20Pendulum%20Using%20DDPG.ipynb)
* 11.5. Trust Region Policy Optimization
* 11.6. Proximal Policy Optimization
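
A sketch of the vanilla policy gradient (REINFORCE) update from 11.1 for a linear-softmax policy over discrete actions; `episode` is assumed to be a list of `(features, action, reward)` triples collected by running the policy:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, alpha=0.01, gamma=0.99):
    # theta has shape (n_actions, n_features); logits = theta @ features.
    G = 0.0
    for features, action, reward in reversed(episode):
        G = reward + gamma * G  # return from this step onwards
        probs = softmax(theta @ features)
        # grad of log pi(a|s) for a softmax-linear policy:
        # row b gets features * (1[b == action] - p_b).
        grad_log = -np.outer(probs, features)
        grad_log[action] += features
        # Ascend the gradient, weighted by the return.
        theta = theta + alpha * G * grad_log
    return theta
```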

### [12. Capstone Project: Car Racing using DQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/12.%20Capstone%20Project:%20Car%20Racing%20using%20DQN)

### [13. Recent Advancements and Next Steps](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/13.%20Recent%20Advancements%20and%20Next%20Steps)

* 13.1. Imagination Augmented Agents
* 13.2. Learning From Human Preference
* [13.3. Deep Q Learning From Demonstrations](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/13.%20Recent%20Advancements%20and%20Next%20Steps/13.3%20Deep%20Q%20Learning%20From%20Demonstrations.ipynb)
* [13.4. Hindsight Experience Replay](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/13.%20Recent%20Advancements%20and%20Next%20Steps/13.4%20Hindsight%20Experience%20Replay.ipynb) (see the sketch after this list)
* 13.5. Hierarchical Reinforcement Learning
* 13.6. Inverse Reinforcement Learning
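
A tiny sketch of the goal-relabeling idea behind 13.4: a failed episode is stored a second time with the state the agent actually reached substituted as the goal, so the trajectory becomes a success. States are assumed hashable here, and the helper name is illustrative:

```python
def her_relabel(episode):
    """episode: list of (state, action, next_state, goal) tuples from one rollout."""
    achieved_goal = episode[-1][2]  # where the agent actually ended up
    relabeled = []
    for state, action, next_state, goal in episode:
        # Pretend the achieved end state was the goal all along; the sparse
        # reward of 1 fires only when the relabeled goal is reached.
        reward = 1.0 if next_state == achieved_goal else 0.0
        relabeled.append((state, action, reward, next_state, achieved_goal))
    return relabeled
```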
