This repo contains an implementation of the Proximal Policy Optimization (PPO) algorithm using the Keras library on a custom environment built with the Unity 3D engine.
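At the heart of PPO is the clipped surrogate objective. The snippet below is a minimal, illustrative sketch of that loss in TensorFlow/Keras-style code; it is not taken from train.py, and CLIP_EPSILON is just a placeholder value.

```python
import tensorflow as tf

CLIP_EPSILON = 0.2  # placeholder; the repo's actual setting may differ


def ppo_clipped_loss(old_log_probs, new_log_probs, advantages):
    """Clipped surrogate objective from the PPO paper (Schulman et al., 2017).

    All arguments are 1-D tensors of equal length (one entry per collected
    transition). Returns a scalar loss to minimize.
    """
    # Probability ratio r_t = pi_new(a|s) / pi_old(a|s), computed in log space.
    ratio = tf.exp(new_log_probs - old_log_probs)

    # Unclipped and clipped surrogate terms.
    surrogate1 = ratio * advantages
    surrogate2 = tf.clip_by_value(ratio, 1.0 - CLIP_EPSILON, 1.0 + CLIP_EPSILON) * advantages

    # PPO maximizes the minimum of the two terms; negate it to get a loss.
    return -tf.reduce_mean(tf.minimum(surrogate1, surrogate2))
```

In the full algorithm this term is combined with a value-function loss (and often an entropy bonus) and minimized over several epochs on each batch of collected rollouts.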
Important details about this repository:
- Unity engine version used to build the environment = 2019.3.15f1
- ML-Agents branch = release_1
- Environment binary:
  - For Windows = Learning-Agents--r1.exe
  - For Linux (headless/server build) = RL-agent.x86_64
  - For Linux (normal build) = will be uploaded soon.
The Windows environment binary is used in this code. If you want to use a Linux environment binary instead, change ENV_NAME in the train.py and test.py files to the correct path of that binary.
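Switching binaries can look roughly like the following. This is a hedged sketch: ENV_NAME is the constant mentioned above, UnityEnvironment comes from the mlagents_envs package installed in the setup steps below, and the exact paths and variable names used in train.py and test.py may differ.

```python
from mlagents_envs.environment import UnityEnvironment

# Path to the environment binary (Windows build by default).
ENV_NAME = "./Learning-Agents--r1.exe"
# ENV_NAME = "./RL-agent.x86_64"  # Linux headless/server build instead

# The binary is launched and connected to through the ML-Agents Python API.
env = UnityEnvironment(file_name=ENV_NAME, seed=1)
env.reset()
```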
- Introduction
- Environment Specific Details
- Setup Instructions
- Getting Started
- Motivation and Learning
- License
- Acknowledgements
- Check out this video to see the trained agent using its learned navigation skills to find the flag in a closed environment divided into nine different segments.
- And if you want to see this agent's training process, check out this video.
Here are some details you should know beforehand; without them you might get confused, because parts of the Keras implementation are environment-dependent.
- Observation/state space: vectorized (not image-based)
- Action space: continuous [shape (1, 2)] (not discrete)
- Reward System:
- (1.0/MaxStep) per step (MaxStep is the step limit at which the environment resets, whether or not the goal state is reached); the same reward is given if the agent crashes into a wall.
- +2 if the agent reaches the goal state.
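If you want to confirm these specs yourself, they can be queried through the ML-Agents low-level Python API. The sketch below is illustrative only and assumes the release_1 API (method and field names changed in later ML-Agents releases); the binary path is a placeholder.

```python
from mlagents_envs.environment import UnityEnvironment

env = UnityEnvironment(file_name="./Learning-Agents--r1.exe", seed=1)
env.reset()

# release_1 exposes the registered behaviors by name.
behavior_name = env.get_behavior_names()[0]
spec = env.get_behavior_spec(behavior_name)

print(spec.observation_shapes)  # vector observation shape(s), no camera input
print(spec.action_type)         # expected: ActionType.CONTINUOUS
print(spec.action_shape)        # expected: 2, matching the (1, 2) action above

env.close()
```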
Install ML-Agents from the release_1 branch of its GitHub repo. If you want to use a different branch, modify the Python API calls used to interact with the environment accordingly.
Clone these repos:
$ git clone --branch release_1 https://github.com/Unity-Technologies/ml-agents.git
$ git clone https://github.com/Dhyeythumar/PPO-algo-with-custom-Unity-environment.git

Create and activate a Python virtual environment:
$ python -m venv myvenv
$ myvenv\Scripts\activate

Install the dependencies:
$ pip install -e ./ml-agents/ml-agents-envs
$ pip install tensorflow
$ pip install keras
$ pip install tensorboardX
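Before launching the training, you can sanity-check that the binary and the Python API talk to each other. The loop below is a rough, illustrative sketch of one rollout with random actions under the release_1 API; train.py wraps this kind of interaction inside the PPO data-collection and update loop, so names and details here are not the repo's actual code.

```python
import numpy as np
from mlagents_envs.environment import UnityEnvironment

env = UnityEnvironment(file_name="./Learning-Agents--r1.exe", seed=1)
env.reset()
behavior_name = env.get_behavior_names()[0]

for _ in range(100):  # a few steps only; real training runs full episodes
    decision_steps, terminal_steps = env.get_steps(behavior_name)

    # One continuous action of shape (num_agents, 2); during training the
    # policy network produces this instead of random noise.
    action = np.random.uniform(-1.0, 1.0, size=(len(decision_steps), 2)).astype(np.float32)
    env.set_actions(behavior_name, action)
    env.step()

env.close()
```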
Now, to start the training process, use the following commands:

$ cd PPO-algo-with-custom-Unity-environment
$ python train.py

Activate TensorBoard:
$ tensorboard --logdir=./training_data/summaries --port 6006

This video by OpenAI inspired me to develop something in the field of reinforcement learning. So for the first phase, I decided to create a simple RL agent that can learn navigation skills.
After completing the first phase, I gained a much deeper knowledge of the RL domain and got the following questions answered:
- How to create custom 3D environments using the Unity engine?
- How to use ML-Agents (Unity's toolkit for reinforcement learning) to train the RL agents?
- And I also learned to implement the PPO algorithm using the Keras library. 😃
What's next? 🤔
I have started working on the next phase of this project, which will include a multi-agent environment setup, and I am also planning to increase the difficulty level. For more updates, stay tuned for the next video on my YouTube channel.
Licensed under the MIT License.