This repo contains an implementation of the Proximal Policy Optimization (PPO) algorithm, written with the Keras library, trained on a custom environment built with the Unity 3D engine.
Important details about this repository:
- Unity engine version used to build the environment = 2019.3.15f1
- ML-Agents branch = release_1
- Environment binary:
  - For Windows = Learning-Agents--r1 (.exe)
  - For Linux (headless/server build) = RL-agent (.x86_64)
  - For Linux (normal build) = RL-agent (.x86_64)
The Windows environment binary is used in this repo, but if you want to use a Linux environment binary, change ENV_NAME in the train.py & test.py scripts to the correct path pointing to the binaries stored over here.
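For example, a minimal sketch of that change (the surrounding code and directory layout are illustrative, not copied from train.py):

```python
# In train.py / test.py — point ENV_NAME at the environment binary you want to use.
ENV_NAME = "Learning-Agents--r1"              # Windows build (default in this repo)
# ENV_NAME = "./linux-build/RL-agent.x86_64"  # Linux build (headless or normal); path is illustrative
```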
- Introduction
- Environment Specific Details
- Setup Instructions
- Getting Started
- Motivation and Learning
- License
- Acknowledgements
- Check out this video to see the trained agent using its learned navigation skills to find the flag in a closed environment divided into nine segments.
- And if you want to see this agent's training process, check out this video.
These are some details you should know beforehand; without them, the parts of the Keras implementation that are environment-dependent may be confusing (see the sketch after this list for how to inspect these specs through the Python API).
- Observation/state space: vector observations (not image observations)
- Action space: continuous (not discrete)
- Action shape: (num of agents, 2); only one agent is alive at each env step, so the shape is (1, 2)
- Reward System:
  - (1.0/MaxStep) per step (MaxStep resets the env regardless of whether the goal state is reached); the same reward is given when the agent crashes into a wall.
  - +2 if the agent reaches the goal state.
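A minimal sketch of inspecting these specs through the release_1 low-level Python API (the binary path is illustrative, and these attribute names follow release_1; they changed in later ml-agents releases):

```python
import numpy as np
from mlagents_envs.environment import UnityEnvironment

# The file_name below is illustrative — point it at the binary you actually use.
env = UnityEnvironment(file_name="Learning-Agents--r1", seed=1)
env.reset()

# This environment exposes a single behavior (one agent alive per step).
behavior_name = env.get_behavior_names()[0]
spec = env.get_behavior_spec(behavior_name)
print(spec.observation_shapes)      # vector observation shapes
print(spec.is_action_continuous())  # True  -> continuous action space
print(spec.action_size)             # 2     -> per-agent action size

# One environment step with random continuous actions.
decision_steps, terminal_steps = env.get_steps(behavior_name)
actions = np.random.uniform(-1.0, 1.0, size=(len(decision_steps), spec.action_size))
env.set_actions(behavior_name, actions)
env.step()

env.close()
```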
Install the ML-Agents GitHub repo's release_1 branch. If you want to use a different branch, modify the Python API calls used to interact with the environment accordingly.
Clone the repos:
$ git clone --branch release_1 https://github.com/Unity-Technologies/ml-agents.git
$ git clone https://github.com/Dhyeythumar/PPO-algo-with-custom-Unity-environment.git
Create and activate the python virtual environment:
$ python -m venv myvenv
$ myvenv\Scripts\activate
Install the dependencies:
$ pip install -e ./ml-agents/ml-agents-envs
$ pip install tensorflow
$ pip install keras
$ pip install tensorboardX
Now to start the training process use the following commands:
$ cd PPO-algo-with-custom-Unity-environment
$ python train.py
Activate TensorBoard:
$ tensorboard --logdir=./training_data/summaries --port 6006
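The summaries directory above is where the training script is expected to write its logs. As a minimal sketch (run folder and tag names are illustrative), scalars can be written there with the tensorboardX package installed earlier:

```python
from tensorboardX import SummaryWriter

# Write under the same directory passed to --logdir above; run/tag names are illustrative.
writer = SummaryWriter("./training_data/summaries/run_1")
for episode, episode_reward in enumerate([0.4, 1.1, 1.9]):  # dummy reward values
    writer.add_scalar("Environment/Cumulative Reward", episode_reward, episode)
writer.close()
```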
This video by OpenAI inspired me to develop something in the field of reinforcement learning. So for the first phase, I decided to create a simple RL agent who can learn navigation skills.
After completing the first phase, I gained a much deeper knowledge of the RL domain and got answers to the following questions:
- How to create custom 3D environments using the Unity engine?
- How to use ML-Agents (Unity's toolkit for reinforcement learning) to train the RL agents?
- And I also learned how to implement the PPO algorithm using the Keras library (a minimal sketch of the core idea follows this list). 😃
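This is not the exact code from this repo, but a minimal sketch of the clipped surrogate objective that PPO optimizes, written with TensorFlow (the backend Keras runs on here); names and shapes are illustrative:

```python
import tensorflow as tf

def ppo_clipped_loss(old_log_probs, new_log_probs, advantages, clip_ratio=0.2):
    """Clipped surrogate objective from the PPO paper, returned as a loss to minimize."""
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s), computed in log space.
    ratio = tf.exp(new_log_probs - old_log_probs)
    clipped_ratio = tf.clip_by_value(ratio, 1.0 - clip_ratio, 1.0 + clip_ratio)
    # Pessimistic (minimum) of the unclipped and clipped objectives.
    surrogate = tf.minimum(ratio * advantages, clipped_ratio * advantages)
    return -tf.reduce_mean(surrogate)
```

The actor is updated by minimizing this loss, while a separate critic is fit to the returns; the clipping keeps each policy update close to the policy that collected the data.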
What's next? 🤔
I have started working on the next phase of this project, which will include a multi-agent environment setup, and I am also planning to increase the difficulty level. For more updates, stay tuned for the next video on my YouTube channel.
Licensed under the MIT License.