Algorithms for Policy Evaluation, Estimation of Action Values, Policy Improvement, Policy Iteration, Truncated Policy Evaluation, Truncated Policy Iteration, and Value Iteration. From Udacity's Deep Reinforcement Learning Nanodegree program.
reinforcement-learning openai-gym gym dynamic-programming policy-evaluation policy-iteration value-iteration bellman-equation frozenlake policy-improvement state-value-function action-value-function
Updated Apr 3, 2019 - Jupyter Notebook