Add Q-Learning algorithm implementation with epsilon-greedy policy an… #13402
base: master
…d grid world demo
…d grid world demo code
Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.
algorithms-keeper commands and options
algorithms-keeper actions can be triggered by commenting on this PR:
- @algorithms-keeper review to trigger the checks for only added pull request files
- @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.
NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.
machine_learning/q_learning.py Outdated
current_state = (0, 0)


def get_q_value(state, action):
Please provide return type hint for the function: get_q_value. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: state
Please provide type hint for the parameter: action
machine_learning/q_learning.py Outdated
    return q_table[state][action]


def get_best_action(state, available_actions):
Please provide return type hint for the function: get_best_action. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: state
Please provide type hint for the parameter: available_actions
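The two requests above (type hints for get_q_value and get_best_action) could be addressed roughly as follows. This is only a sketch: the PR's real q_table layout is not visible in the review, so the nested defaultdict, the State and Action aliases, and the empty-list check are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical aliases; the PR's actual definitions are not shown in the review.
State = tuple[int, int]
Action = int

# Assumed layout: q_table[state][action] -> value estimate, defaulting to 0.0.
q_table: defaultdict[State, defaultdict[Action, float]] = defaultdict(
    lambda: defaultdict(float)
)


def get_q_value(state: State, action: Action) -> float:
    """
    Return the current Q-value estimate for a state-action pair.

    >>> get_q_value((0, 0), 1)
    0.0
    """
    return q_table[state][action]


def get_best_action(state: State, available_actions: list[Action]) -> Action:
    """
    Return a greedy action, breaking ties uniformly at random.

    >>> q_table[(0, 0)][2] = 1.5
    >>> get_best_action((0, 0), [0, 1, 2, 3])
    2
    """
    if not available_actions:
        raise ValueError("No available actions provided")
    max_q = max(q_table[state][a] for a in available_actions)
    best = [a for a in available_actions if q_table[state][a] == max_q]
    return random.choice(best)
```

Both functions carry a doctest as well, which is what the bot asks for on the later functions in this file.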
machine_learning/q_learning.py Outdated
    return random.choice(best)


def choose_action(state, available_actions):
Please provide return type hint for the function: choose_action. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: state
Please provide type hint for the parameter: available_actions
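For choose_action, a type-hinted epsilon-greedy selection might look like the sketch below. The EPSILON value, the epsilon keyword parameter, and the q_table layout are assumptions; only the function name and its two positional parameters come from the diff.

```python
import random
from collections import defaultdict

State = tuple[int, int]  # hypothetical alias
EPSILON = 0.1  # assumed exploration rate; the PR's value is not shown

# Assumed layout: q_table[state][action] -> value estimate, defaulting to 0.0.
q_table: defaultdict[State, defaultdict[int, float]] = defaultdict(
    lambda: defaultdict(float)
)


def choose_action(
    state: State, available_actions: list[int], epsilon: float = EPSILON
) -> int:
    """
    Epsilon-greedy selection: explore with probability epsilon, else act greedily.

    >>> q_table[(0, 0)][3] = 1.0
    >>> choose_action((0, 0), [0, 1, 2, 3], epsilon=0.0)
    3
    """
    if not available_actions:
        raise ValueError("No available actions provided")
    if random.random() < epsilon:
        return random.choice(available_actions)  # explore
    max_q = max(q_table[state][a] for a in available_actions)
    best = [a for a in available_actions if q_table[state][a] == max_q]
    return random.choice(best)  # exploit, ties broken at random
```

Exposing epsilon as a parameter keeps the doctest deterministic: with epsilon=0.0 the exploration branch can never fire.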
machine_learning/q_learning.py Outdated
    return get_best_action(state, available_actions)


def update(state, action, reward, next_state, next_available_actions, done=False):
Please provide return type hint for the function: update. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: state
Please provide type hint for the parameter: action
Please provide type hint for the parameter: reward
Please provide type hint for the parameter: next_state
Please provide type hint for the parameter: next_available_actions
Please provide type hint for the parameter: done
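A fully annotated update could be sketched as below, using the standard tabular Q-learning rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * max Q(s', a') - Q(s, a)). LEARNING_RATE, DISCOUNT_FACTOR, and the q_table layout are assumed values and structures, not taken from the PR; the parameter names match the signature in the diff.

```python
from collections import defaultdict

State = tuple[int, int]  # hypothetical alias
LEARNING_RATE = 0.1  # alpha, assumed
DISCOUNT_FACTOR = 0.9  # gamma, assumed

# Assumed layout: q_table[state][action] -> value estimate, defaulting to 0.0.
q_table: defaultdict[State, defaultdict[int, float]] = defaultdict(
    lambda: defaultdict(float)
)


def update(
    state: State,
    action: int,
    reward: float,
    next_state: State,
    next_available_actions: list[int],
    done: bool = False,
) -> None:
    """
    Apply one tabular Q-learning update to q_table[state][action].

    >>> update((0, 0), 1, 1.0, (0, 1), [0, 1, 2, 3], done=True)
    >>> round(q_table[(0, 0)][1], 2)
    0.1
    """
    # A terminal transition has no future value to bootstrap from.
    if done or not next_available_actions:
        max_q_next = 0.0
    else:
        max_q_next = max(q_table[next_state][a] for a in next_available_actions)
    old_q = q_table[state][action]
    td_target = reward + DISCOUNT_FACTOR * max_q_next
    new_q = old_q + LEARNING_RATE * (td_target - old_q)
    q_table[state][action] = new_q
```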
machine_learning/q_learning.py Outdated
    q_table[state][action] = new_q


def get_policy():
Please provide return type hint for the function: get_policy. If the function does not return a value, please provide the type hint as: def function() -> None:
machine_learning/q_learning.py Outdated
    return policy


def reset_env():
As there is no test file in this pull request nor any test function or class in the file machine_learning/q_learning.py, please provide doctest for the function reset_env
Please provide return type hint for the function: reset_env. If the function does not return a value, please provide the type hint as: def function() -> None:
machine_learning/q_learning.py Outdated
    return current_state


def get_available_actions_env():
As there is no test file in this pull request nor any test function or class in the file machine_learning/q_learning.py, please provide doctest for the function get_available_actions_env
Please provide return type hint for the function: get_available_actions_env. If the function does not return a value, please provide the type hint as: def function() -> None:
machine_learning/q_learning.py Outdated
    return [0, 1, 2, 3]


def step_env(action):
As there is no test file in this pull request nor any test function or class in the file machine_learning/q_learning.py, please provide doctest for the function step_env
Please provide return type hint for the function: step_env. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: action
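The doctest and type-hint requests for the environment helpers could be met along these lines. The grid size, goal cell, reward values, and the action-to-move mapping are all assumptions, since the diff only shows the function names, the start state (0, 0), the action list [0, 1, 2, 3], and the (next_state, reward, done) return shape.

```python
State = tuple[int, int]  # hypothetical alias

GRID_SIZE = 4  # assumed
GOAL: State = (3, 3)  # assumed
# Assumed action encoding: 0=up, 1=right, 2=down, 3=left, as (row, col) deltas.
MOVES: dict[int, State] = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

current_state: State = (0, 0)


def reset_env() -> State:
    """
    Put the agent back on the start cell and return it.

    >>> reset_env()
    (0, 0)
    """
    global current_state
    current_state = (0, 0)
    return current_state


def step_env(action: int) -> tuple[State, float, bool]:
    """
    Move the agent one cell, clamping at the grid border.

    >>> _ = reset_env()
    >>> step_env(1)
    ((0, 1), -1.0, False)
    >>> step_env(0)  # bumping into the top wall leaves the agent in place
    ((0, 1), -1.0, False)
    """
    global current_state
    d_row, d_col = MOVES[action]
    row = min(max(current_state[0] + d_row, 0), GRID_SIZE - 1)
    col = min(max(current_state[1] + d_col, 0), GRID_SIZE - 1)
    current_state = (row, col)
    done = current_state == GOAL
    reward = 10.0 if done else -1.0  # assumed reward scheme
    return current_state, reward, done
```

Each doctest resets the environment first, so the examples do not depend on the order in which doctest happens to run them.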
machine_learning/q_learning.py Outdated
    return next_state, reward, done


def run_q_learning():
As there is no test file in this pull request nor any test function or class in the file machine_learning/q_learning.py, please provide doctest for the function run_q_learning
Please provide return type hint for the function: run_q_learning. If the function does not return a value, please provide the type hint as: def function() -> None:
for more information, see https://pre-commit.ci

# Type alias for state
type State = tuple[int, int]

An error occurred while parsing the file: machine_learning/q_learning.py
Traceback (most recent call last):
  File "/opt/render/project/src/algorithms_keeper/parser/python_parser.py", line 146, in parse
    reports = lint_file(
              ^^^^^^^^^^
libcst._exceptions.ParserSyntaxError: Syntax Error @ 16:1.
parser error: error at 15:11: expected one of !=, %, &, (, *, **, +, ,, -, ., /, //, :, ;, <, <<, <=, =, ==, >, >=, >>, @, NEWLINE, [, ^, and, if, in, is, not, or, |

type State = tuple[int, int]
^
9397754 to f3594e6
The issue comes from the line type State = tuple[int, int], which uses the PEP 695 type alias syntax introduced in Python 3.12. It works locally on Python 3.12 or newer, but the algorithms-keeper bot parses the file with libcst (per the traceback, from a deployment under /opt/render/project), and that parser evidently does not accept the type statement, probably because it targets Python 3.11 or below. A parser without PEP 695 support does not recognise type as the start of an alias declaration, so it raises ParserSyntaxError: Syntax Error @ 16:1. To make the file compatible across environments, I tried replacing it with the old typing-style alias:
from typing import Tuple
State = Tuple[int, int]
but the Ruff check fails with this, so I had to revert it.
It still works with Python 3.12+.
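One middle ground that might satisfy both tools is a plain assignment alias built on the built-in tuple generic. This is a guess at the cause rather than a confirmed fix: the failing Ruff rule is not quoted in the thread, but Ruff's pyupgrade rules commonly flag typing.Tuple as deprecated, and a plain assignment contains no type statement for an older parser to reject. The helper function below is hypothetical and only shows the alias used in a signature.

```python
# Plain assignment alias: valid on Python 3.9+ and needs no typing import.
State = tuple[int, int]


def manhattan_distance(a: State, b: State) -> int:
    """Hypothetical helper, only here to show the alias in a signature."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
```

Whether Ruff accepts this form depends on the repository's configured rule set and target version, so it is worth running the pre-commit hooks locally before pushing.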
@cclauss Hey, could you please review this and let me know if any changes are needed?
…d grid world demo
Describe your change:
Checklist: