Greetings dear community members!

I would like to share my efforts, and the resulting failures, at training a chess agent using TorchRL, and to ask you for guidance.
A few weeks back I had the idea of training a chess AI using PPO with self-learning. I had already made extensive use of PyTorch as a deep learning library, so when I discovered TorchRL I was excited to try it out, even though it is clearly not stable yet (I hope it will become stable eventually). So, I gave it a go. After a week of coding and debugging, I understood first-hand why it isn't considered a mature library yet: I had to create many of my own subclasses for utilities. But here is, more or less, what came out of it:
In the days that followed, I ran many experiments trying to reach an AI that could at least consistently win against a randomly acting agent. I tried:
- Self-learning from scratch (starting from a random actor policy)
- Self-learning from pretrained weights (pretrained on the winners' moves from ~20k chess games; see the pretraining sketch below)
- Learning against a random agent from scratch
- Learning against a random agent from pretrained weights
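For the pretraining, I fit the two action heads (described further down) with plain cross-entropy on the winners' moves. Here is a minimal sketch of what that step looks like, assuming a flat 64-square board encoding plus a turn flag; the layer sizes and names are illustrative, not my exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for action_net_0 / action_net_1 (sizes are assumptions).
piece_head = nn.Sequential(nn.Linear(65, 256), nn.ReLU(), nn.Linear(256, 64))
target_head = nn.Sequential(nn.Linear(65 + 64, 256), nn.ReLU(), nn.Linear(256, 64))
opt = torch.optim.Adam([*piece_head.parameters(), *target_head.parameters()], lr=1e-3)

def pretrain_step(board, turn, from_sq, to_sq):
    # board: (B, 64) float encoding, turn: (B, 1) float,
    # from_sq / to_sq: (B,) long, taken from moves the eventual winner played.
    x = torch.cat([board, turn], dim=-1)
    loss_piece = F.cross_entropy(piece_head(x), from_sq)
    # Teacher forcing: condition the second head on the true source square.
    x1 = torch.cat([x, F.one_hot(from_sq, 64).float()], dim=-1)
    loss_target = F.cross_entropy(target_head(x1), to_sq)
    opt.zero_grad()
    (loss_piece + loss_target).backward()
    opt.step()
    return loss_piece.item(), loss_target.item()
```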
But nothing seemed to work. The actors I trained consistently performed poorly, and by poorly I mean:
- Around 80% of the time, the game ends in a draw.
- In the remaining games, the random agent wins about half the time or less, yet it always manages to win a match within a few non-draw games.
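For reference, this is roughly how I measure those numbers (a simplified sketch using python-chess; `random_agent` is the baseline opponent, my trained policy plugs in for `agent_move`, and the ply cap is an assumption):

```python
import random
import chess

def play_match(agent_move, agent_is_white=True, max_plies=300):
    # agent_move: callable board -> chess.Move; the opponent plays uniformly at random.
    board = chess.Board()
    while not board.is_game_over() and board.ply() < max_plies:
        our_turn = (board.turn == chess.WHITE) == agent_is_white
        move = agent_move(board) if our_turn else random.choice(list(board.legal_moves))
        board.push(move)
    return board.result()  # "1-0", "0-1", "1/2-1/2", or "*" if the ply cap was hit

# Sanity check with a random-vs-random baseline.
random_agent = lambda b: random.choice(list(b.legal_moves))
results = [play_match(random_agent) for _ in range(100)]
print(sum(r == "1/2-1/2" for r in results), "draws out of", len(results))
```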
Here is some information on how the actions are decided at each step by my agent:
- The board state and the turn info -> `action_net_0` -> which piece to move (`action_0`)
  - Turn is `False` for black and `True` for white
  - `action_net_0` is an MLP
- The board state, the turn info, and `action_0` -> `action_net_1` -> where to move the selected piece (`action_1`)
- The actions are always drawn from the legal moves given by the chess engine.
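In code, that two-stage selection looks roughly like this (a minimal sketch, not my exact model; the flat encoding, layer sizes, and function names are assumptions, and promotions are simplified):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import chess

action_net_0 = nn.Sequential(nn.Linear(65, 256), nn.ReLU(), nn.Linear(256, 64))
action_net_1 = nn.Sequential(nn.Linear(65 + 64, 256), nn.ReLU(), nn.Linear(256, 64))

def encode(board: chess.Board) -> torch.Tensor:
    # 64 signed piece codes plus the turn flag (True = white, False = black).
    x = torch.zeros(65)
    for sq, piece in board.piece_map().items():
        x[sq] = piece.piece_type if piece.color == chess.WHITE else -piece.piece_type
    x[64] = float(board.turn)
    return x

def select_move(board: chess.Board) -> chess.Move:
    x = encode(board)
    legal = list(board.legal_moves)
    # action_0: mask the source-square logits to pieces that have a legal move.
    mask_0 = torch.full((64,), float("-inf"))
    mask_0[[m.from_square for m in legal]] = 0.0
    a0 = torch.distributions.Categorical(logits=action_net_0(x) + mask_0).sample()
    # action_1: mask the target-square logits to legal destinations of action_0.
    mask_1 = torch.full((64,), float("-inf"))
    mask_1[[m.to_square for m in legal if m.from_square == a0.item()]] = 0.0
    x1 = torch.cat([x, F.one_hot(a0, 64).float()])
    a1 = torch.distributions.Categorical(logits=action_net_1(x1) + mask_1).sample()
    # Take the first legal move matching (from, to); promotions are simplified here.
    return next(m for m in legal
                if m.from_square == a0.item() and m.to_square == a1.item())
```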
Running my training loop is super simple. After installing the dependencies, all it takes is `pipenv run python ./src/train.py`.
Currently, I feel quite stuck and don't know what to do next to get it working as I wish. So now I would like to ask you RL experts: what changes would I need to make so that my agent can at least consistently win against a random actor?