
Conversation

@carsondenison
Contributor

@carsondenison commented Dec 3, 2021

Before submitting

  • I raised this issue on GitHub: "Found mistake in pytorch-lightning DQN example. How do I upload a fix?" (pytorch-lightning#10883). The user @awaelchli said to fork the repo, clone it to my local machine, make my changes, push to my fork, and then submit a pull request.
  • This is my first attempt to contribute to an open-source project, so please let me know if I've done anything incorrectly.
  • I could not find where the unit tests are kept for the example colab notebooks. I would be happy to upload my unit tests for the get_epsilon method, but I don't know where to put them in the codebase.

What does this PR do?

This fixes issue Lightning-AI/pytorch-lightning#10883, where the epsilon decay for the DQN's epsilon-greedy policy was implemented incorrectly. The original code uses self.hparams.eps_start for a single training step and then immediately switches to self.hparams.eps_end. The intended behavior is to linearly decrease epsilon from self.hparams.eps_start to self.hparams.eps_end over the first self.hparams.eps_last_frame steps. I wrote a small function, get_epsilon, which fixes this logic and returns the correct epsilon, as sketched below.
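
A minimal sketch of the linear-decay logic (the exact name and signature in the notebook may differ slightly; `self.global_step` is assumed to be the LightningModule's built-in step counter):

```python
def get_epsilon(self, start: float, end: float, frames: int) -> float:
    """Linearly anneal epsilon from `start` to `end` over the first `frames` steps."""
    if self.global_step > frames:
        return end
    return start - (self.global_step / frames) * (start - end)


# Called each training step, e.g.:
# epsilon = self.get_epsilon(self.hparams.eps_start, self.hparams.eps_end, self.hparams.eps_last_frame)
```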

I have also made a few minor changes on other lines, because the code would not run on my local machine without them. Specifically, the type hint for the __iter__ method of the RLDataset class was Tuple, but it should be Iterator[Tuple], because the method is a generator yielding (state, action, reward, done, new_state) tuples. Additionally, on line 264 (formerly 276), I got an error that the index passed to gather() must be of type int64, so I cast the actions tensor to long. Finally, I added logging of self.episode_reward and epsilon, so that I could verify that a) the agent still learns successfully, and b) epsilon decays as intended. A condensed sketch of these fixes follows.
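
For reference, the minor fixes look roughly like this (paraphrased from the example rather than copied verbatim; `buffer` and `sample_size` come from the surrounding notebook code):

```python
from typing import Iterator, Tuple

from torch.utils.data import IterableDataset


class RLDataset(IterableDataset):
    """Iterable dataset that samples experience tuples from the replay buffer."""

    def __init__(self, buffer, sample_size: int = 200) -> None:
        self.buffer = buffer
        self.sample_size = sample_size

    def __iter__(self) -> Iterator[Tuple]:  # was annotated `-> Tuple`
        states, actions, rewards, dones, new_states = self.buffer.sample(self.sample_size)
        for i in range(len(dones)):
            yield states[i], actions[i], rewards[i], dones[i], new_states[i]


# In the loss computation, gather() requires int64 indices, hence the cast:
# state_action_values = self.net(states).gather(1, actions.long().unsqueeze(-1)).squeeze(-1)
```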

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Sure did!

@Borda Borda added the bug / fix Something isn't working label Dec 3, 2021
@Borda Borda enabled auto-merge (squash) December 4, 2021 23:13
@Borda Borda merged commit b3762a2 into Lightning-AI:main Dec 4, 2021
