Expert-Guided ML for 2D car driving trained under intentionally poor demonstrations.
The goal is to use experts (most likely humans, although not necessarily) to train a neural network to drive a car without crashing, while providing only intentionally poor driving demonstrations.
This project was started as a possible final project for CS 5170 at Northeastern University in the Spring 2021 semester.
The PostgreSQL and Redis ports are left open, rather than merely exposed, in case these databases need to be accessed from other servers. This may be modified.
    make dev
Then visit http://localhost:8080/ to see the website running.
Software:
- Docker
Python3 libraries:
- matplotlib
- numpy
- shapely
All commands start from the root directory of this repository. Unless mentioned otherwise in this README, every Python script in reinforcement_learning describes all of its flags when run with --help.
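As a rough illustration of how a script can expose self-documenting flags, the sketch below uses Python's standard argparse module, which generates the --help output automatically from the flag definitions. The flag names mirror those used with create_circuit.py in this README, but the parser itself, the help strings, and the function name build_parser are assumptions made for illustration, not the scripts' actual definitions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with illustrative flags (hypothetical definitions)."""
    parser = argparse.ArgumentParser(description="Illustrative circuit tool")
    # Each help string below is what argparse prints for the flag under --help.
    parser.add_argument("--circuit", required=True, help="Path to the circuit JSON file")
    parser.add_argument("--output", required=True, help="Path where the result is written")
    parser.add_argument("--show", action="store_true", help="Display a plot of the result")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--circuit", "circuits/five.json", "--output", "out.json", "--show"]
)
print(args.circuit, args.output, args.show)
```

Passing --help to a parser built this way prints the usage line and every help string, then exits.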
- Generate the necessary matrices and data
Example:
    cd reinforcement_learning/
    python3 create_circuit.py --circuit circuits/five.json --output circuits/five_Q_matrix.json --show
To show all the options:
    cd reinforcement_learning/
    python3 create_circuit.py --help
- (Optional) Run an A* search in order to obtain a good baseline to compare against
Example:
    cd reinforcement_learning/
    python3 q_Astar_trainer.py --A-star-runs 50 --data circuits/five_Q_matrix.json --output demonstration_data/five_A_1.json
Note: A* as implemented here is not guaranteed to find an optimal path, since it uses the L1 norm (Manhattan distance) as its heuristic. However, it searches faster than breadth-first search while still producing good results.
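For readers unfamiliar with the technique, here is a minimal sketch of A* with a Manhattan-distance heuristic on a simple occupancy grid. It is not the code in q_Astar_trainer.py: the grid encoding (0 = free, 1 = blocked), the 4-connected unit-cost moves, and the names a_star and manhattan are all assumptions chosen for illustration. In this simplified setting the Manhattan heuristic never overestimates, so the returned path is shortest; the project's own state space differs, which is presumably why optimality is not guaranteed there.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def manhattan(a: Cell, b: Cell) -> int:
    """L1 distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Return a path from start to goal, or None if the goal is unreachable.

    Assumes grid[r][c] == 0 is free and 1 is blocked, with unit-cost moves
    up, down, left, and right (hypothetical encoding).
    """
    rows, cols = len(grid), len(grid[0])
    # Heap entries are (estimated total cost, cost so far, cell).
    open_heap = [(manhattan(start, goal), 0, start)]
    came_from: Dict[Cell, Cell] = {}
    best_cost: Dict[Cell, int] = {start: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost.get(current, float("inf")):
            continue  # Stale heap entry; a cheaper route was already found.
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = current
                    priority = new_cost + manhattan(nxt, goal)
                    heapq.heappush(open_heap, (priority, new_cost, nxt))
    return None
```

The heuristic steers expansion toward the goal, which is why A* typically visits fewer cells than breadth-first search on the same grid.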
- Set up the Docker containers
Modify the .env file to change the credentials.
Copy the generated Q matrix circuit (five_Q_matrix.json) from step 1 into server/main_node/circuits as five.json (this step is already complete in the repository):
    cat reinforcement_learning/circuits/five_Q_matrix.json > server/main_node/circuits/five.json
Note: sudo permission may be necessary.
    make deploy
Note: to bring the containers down:
    make teardown
- Train
Create an account by going to the appropriate URL and signing up, then click on Driving RL. Execute as many positive and negative intent demonstrations as needed. Here, only 20 of each were performed.
Enter the main_node container (sudo may be needed), combine all positive and negative intent data, and export the data:
    # Enter the container
    docker exec -it server_main_node_1 bash
    # Combine the data
    python3 data_combiner.py
    # Exit the container and export the data
    exit
    docker cp server_main_node_1:/DARLMID/data/combined_negative.json reinforcement_learning/demonstration_data/combined_negative.json
    docker cp server_main_node_1:/DARLMID/data/combined_positive.json reinforcement_learning/demonstration_data/combined_positive.json
- Q-learning
Note: reinforcement_learning/q_compare_1v1.py does not accept flags or arguments; changes must be made in the file itself.
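To give some intuition for the flags passed to q_learn.py in this step, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not the project's implementation: the toy corridor environment, the reward of 1 at the goal, the step cap, and the function name q_learn are assumptions for illustration. Only the parameter names are chosen to mirror the --epochs, --explore-probability, --learning-rate, and --discount-factor flags.

```python
import random
from typing import List

ACTIONS = (-1, +1)  # Move left or right along a toy corridor (hypothetical).


def q_learn(
    n_states: int = 6,
    epochs: int = 300,
    explore_probability: float = 0.15,
    learning_rate: float = 0.25,
    discount_factor: float = 0.3,
    seed: int = 0,
) -> List[List[float]]:
    """Train a Q-table on a corridor whose last cell is the goal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(epochs):
        state = 0
        for _ in range(100):  # Cap the episode length.
            if rng.random() < explore_probability:
                action = rng.randrange(len(ACTIONS))  # Explore.
            else:
                # Exploit, breaking ties at random so untrained states still move.
                best = max(q[state])
                action = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state = min(max(state + ACTIONS[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if next_state == goal else discount_factor * max(q[next_state])
            # Standard update: Q(s,a) += lr * (r + gamma * max Q(s',.) - Q(s,a)).
            q[state][action] += learning_rate * (reward + future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q
```

A higher exploration probability tries random actions more often, the learning rate controls how far each update moves the stored value, and the discount factor sets how much future reward counts relative to immediate reward.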
Run a simple Q-learning agent without any demonstrations.
Example:
    cd reinforcement_learning/
    python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
        --discount-factor 0.3 --data circuits/five_Q_matrix.json \
        --output results/five_output.json \
        --show
Optional: Run a Q-learning agent using the A* demonstrations from step 2:
Example:
    cd reinforcement_learning/
    python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
        --discount-factor 0.3 --data circuits/five_Q_matrix.json \
        --positive-demonstration demonstration_data/five_A_1.json \
        --output results/five_A_1_output.json \
        --show
    # Compare both models
    # Update q_compare_1v1.py first if needed
    vi q_compare_1v1.py
    python3 q_compare_1v1.py
Run a Q-learning agent using the positive demonstrations from step 4:
Example:
    cd reinforcement_learning/
    python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
        --discount-factor 0.3 --data circuits/five_Q_matrix.json \
        --positive-demonstration demonstration_data/combined_positive.json \
        --output results/five_positive_output.json \
        --show
    # Compare both models
    # Update q_compare_1v1.py first if needed
    vi q_compare_1v1.py
    python3 q_compare_1v1.py
Run a Q-learning agent using negative demonstrations from step 4:
Example:
    cd reinforcement_learning/
    python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
        --discount-factor 0.3 --data circuits/five_Q_matrix.json \
        --negative-demonstration demonstration_data/combined_negative.json \
        --output results/five_negative_output.json \
        --show
    # Compare both models
    # Update q_compare_1v1.py first if needed
    vi q_compare_1v1.py
    python3 q_compare_1v1.py
Run a Q-learning agent using both positive and negative demonstrations from step 4:
Example:
    cd reinforcement_learning/
    python3 q_learn.py --epochs 300 --explore-probability 0.15 --learning-rate 0.25 \
        --discount-factor 0.20 --data circuits/five_Q_matrix.json \
        --positive-demonstration demonstration_data/combined_positive.json \
        --negative-demonstration demonstration_data/combined_negative.json \
        --output results/five_positive_and_negative_output.json \
        --show
    # Compare both models
    # Update q_compare_1v1.py first if needed
    vi q_compare_1v1.py
    python3 q_compare_1v1.py
Parts of this project use software and images that are licensed under different conditions. An overview of these materials, licenses, and conditions is provided in the licenses subdirectory.
- https://docs.aiohttp.org/en/stable/deployment.html
- https://stackoverflow.com/questions/52569051/aiohttp-and-nginx-running-in-docker
- https://docs.gunicorn.org/en/stable/install.html
- https://docs.docker.com/storage/volumes/
- https://docs.nginx.com/nginx/admin-guide/web-server/serving-static-content/
- http://nginx.org/en/docs/beginners_guide.html#static
- https://mkyong.com/html/html-tutorial-hello-world/
- https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Flexible_Box_Layout/Aligning_Items_in_a_Flex_Container
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_images_trulli
- https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-to-use-custom-error-pages-on-ubuntu-14-04
- https://commons.wikimedia.org/wiki/File:Aft_(PSF).png
- https://hub.docker.com/_/postgres
- https://stackoverflow.com/questions/45128902/psycopg2-and-sql-injection-security
- https://docs.github.com/en/github/importing-your-projects-to-github/adding-an-existing-project-to-github-using-the-command-line
- https://pkgs.alpinelinux.org/package/edge/main/x86/postgresql-dev
- https://www.w3schools.com/css/css_howto.asp
- https://www.dummies.com/web-design-development/html5-and-css3/how-to-use-an-external-style-sheet-for-html5-and-css3-programming/
- https://www.w3schools.com/howto/tryit.asp?filename=tryhow_css_topnav
- https://stackoverflow.com/questions/8722163/how-to-assign-multiple-classes-to-an-html-container
- https://www.w3schools.com/colors/colors_names.asp
- https://www.w3schools.com/howto/howto_css_fixed_footer.asp
- https://stackoverflow.com/questions/45764517/how-to-return-redirect-response-from-aiohttp-web-server
- http://demos.aiohttp.org/en/latest/tutorial.html#middlewares
- https://www.w3schools.com/css/css3_gradients.asp
- https://stackoverflow.com/questions/29573489/nginx-failing-to-load-css-and-js-files-mime-type-error
- https://stackoverflow.com/questions/2242086/how-to-detect-the-screen-resolution-with-javascript
- https://stackoverflow.com/questions/15615552/get-div-height-with-plain-javascript
- https://stackoverflow.com/questions/19484544/set-height-of-div-to-height-of-another-div-through-css
- https://www.w3schools.com/js/js_functions.asp
- https://stackoverflow.com/questions/807878/how-to-make-javascript-execute-after-page-load
- https://stackoverflow.com/questions/34796085/how-to-stick-footer-to-bottom-not-fixed-even-with-scrolling/34796186
- https://stackoverflow.com/questions/19039628/how-to-calculate-height-of-viewable-area-i-e-window-height-minus-address-bo
- https://www.w3schools.com/jsref/event_onresize.asp
- https://freesvg.org/nemeth-flying-machine
- https://www.w3schools.com/css/tryit.asp?filename=trycss3_border-radius
- https://stackoverflow.com/questions/54845686/nginx-wont-serve-svg-files
- https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_table_test
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_form_submit
- https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/minlength
- https://stackoverflow.com/questions/1297449/change-image-size-with-javascript
- https://stackoverflow.com/questions/9686538/align-labels-in-form-next-to-input
- https://freesvg.org/international-space-station-vector-drawing
- https://www.w3schools.com/cssref/css_units.asp
- https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_form_submit
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length
- https://stackoverflow.com/questions/6199773/how-to-enable-disable-an-html-button-based-on-scenarios
- https://stackoverflow.com/questions/195951/how-can-i-change-an-elements-class-with-javascript
- https://stackoverflow.com/questions/3547035/javascript-getting-html-form-values
- https://stackoverflow.com/questions/32459646/removing-the-shadow-from-a-button
- https://stackoverflow.com/questions/15110484/javascript-how-to-append-div-in-begining-of-another-div
- https://developer.mozilla.org/en-US/docs/Web/API/ParentNode/prepend
- https://stackoverflow.com/questions/16584121/change-div-id-by-javascript
- https://stackoverflow.com/questions/596467/how-do-i-convert-a-float-number-to-a-whole-number-in-javascript
- https://stackoverflow.com/questions/11722400/programmatically-change-the-src-of-an-img-tag
- https://stackoverflow.com/questions/21727317/how-to-check-confirm-password-field-in-form-without-reloading-page
- https://stackoverflow.com/questions/39449739/aiohttp-how-to-retrieve-the-data-body-in-aiohttp-server-from-requests-get
- https://stackoverflow.com/questions/52246796/await-a-method-and-assign-a-variable-to-the-returned-value-with-asyncio
- https://stackoverflow.com/questions/46428889/keeping-pycache-out-of-my-repository-when-adding-committing-from-pythonany
- https://www.w3schools.com/css/tryit.asp?filename=trycss_table_align_center
- https://stackoverflow.com/questions/29775797/fetch-post-json-data
- https://github.com/ritua2/gib/blob/master/middle-layer/.env
- https://github.com/ritua2/gib/blob/master/middle-layer/docker-compose.yml
- https://hub.docker.com/_/redis
- https://www.psycopg.org/docs/module.html#psycopg2.connect
- https://www.postgresqltutorial.com/postgresql-create-table/
- https://stackoverflow.com/questions/50070877/postgres-psycopg2-create-table
- https://www.postgresql.org/docs/8.0/sql-createuser.html
- http://oliviertech.com/python/generate-SHA512-hash-from-a-String/
- https://stackoverflow.com/questions/4244896/dynamically-access-object-property-using-variable
- https://stackoverflow.com/questions/45018338/javascript-fetch-api-how-to-save-output-to-variable-as-an-object-not-the-prom/45018619
- https://tldrlegal.com/license/apache-license-2.0-(apache-2.0)
- http://www.apache.org/licenses/LICENSE-2.0.txt
- https://github.com/mozilla/bleach
- https://bleach.readthedocs.io/en/latest/clean.html
- https://github.com/aio-libs/aiohttp/blob/master/examples/web_cookies.py
- https://stackoverflow.com/questions/26745519/converting-dictionary-to-json
- https://github.com/js-cookie/js-cookie
- https://docs.aiohttp.org/en/stable/web_reference.html
- https://docs.python.org/3/library/sys.html
- https://docs.python.org/3/howto/argparse.html
- https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
- https://dillinger.io/
- https://stackoverflow.com/questions/9215658/plot-a-circle-with-pyplot
- https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/fill.html
- https://stackoverflow.com/questions/2849286/python-matplotlib-subplot-how-to-set-the-axis-range
- https://www.w3schools.com/python/ref_keyword_assert.asp
- https://stackoverflow.com/questions/26226816/argparse-making-required-flags
- https://shapely.readthedocs.io/en/stable/manual.html
- https://gis.stackexchange.com/questions/95670/creating-shapely-linestring-from-two-points
- https://docs.blender.org/manual/en/latest/getting_started/installing/linux.html
- https://www.w3schools.com/html/tryit.asp?filename=tryhtml_table
- https://commons.wikimedia.org/wiki/File:Car_in_Black_Rock_Desert.jpg
- https://smallbusiness.chron.com/crop-circle-out-picture-gimp-36366.html
- https://splidejs.com/getting-started/
- https://www.w3schools.com/tags/att_script_defer.asp
- https://splidejs.com/
- https://stackoverflow.com/questions/15121343/how-to-center-a-p-element-inside-a-div-container
- https://upload.wikimedia.org/wikipedia/commons/thumb/6/69/Storsj%C3%B6n_i_Vindelns_kommun.jpg/1280px-Storsj%C3%B6n_i_Vindelns_kommun.jpg
- https://web.dev/browser-level-image-lazy-loading/
- https://davidwalsh.name/lazyload-image-fade
- https://developer.mozilla.org/en-US/docs/Web/API/Element/removeAttribute
- Asked some of Carlos' friends for feedback on the front-end's look
- https://stackoverflow.com/questions/534839/how-to-create-a-guid-uuid-in-python
- https://aioredis.readthedocs.io/en/v1.3.0/examples.html
- https://aioredis.readthedocs.io/en/v1.3.0/mixins.html
- https://aioredis.readthedocs.io/en/v1.3.0/api_reference.html
- https://redis.io/commands/expire
- https://www.w3schools.com/tags/tryit.asp?filename=tryhtml5_script_async
- https://wiki.freecadweb.org/Topological_data_scripting
- https://json2html.com/
- https://json2html.com/examples/
- https://stackoverflow.com/questions/684672/how-do-i-loop-through-or-enumerate-a-javascript-object
- https://api.jquery.com/jQuery.isEmptyObject/
- https://code.jquery.com/
- https://www.quackit.com/html/howto/how_to_make_a_background_image_not_repeat.cfm
- https://stackoverflow.com/questions/1085801/get-selected-value-in-dropdown-list-using-javascript
- https://select2.org/getting-started/installation
- https://select2.org/getting-started/basic-usage
- https://www.w3schools.com/jsref/jsref_length_array.asp
- https://stackoverflow.com/questions/30650961/functional-way-to-iterate-over-range-es6-7
- https://stackoverflow.com/questions/10879045/how-to-set-opacity-in-parent-div-and-not-affect-in-child-div
- https://github.com/jonobr1/two.js/
- https://two.js.org/
- https://www.geeksforgeeks.org/python-os-path-isfile-method/
- https://www.w3schools.com/jsref/met_element_remove.asp
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Missing_colon_after_property_id
- https://jsonlint.com/
- https://www.w3schools.com/howto/tryit.asp?filename=tryhow_css_list_without_bullets
- https://www.w3schools.com/howto/howto_css_list_without_bullets.asp
- https://code.tutsplus.com/tutorials/drawing-with-twojs--net-32024
- jonobr1/two.js#144
- Previous and continuing coursework materials
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Default_parameters
- https://stackoverflow.com/questions/21227287/make-div-scrollable
- https://matplotlib.org/stable/gallery/color/named_colors.html
- https://en.wikiversity.org/wiki/Python_Programming/Classes
- https://en.wikipedia.org/wiki/Q-learning
- Previous Q-learning homework assignment, provided in CS 5100 (Northeastern University)
- https://docs.python.org/3/library/random.html
- Artificial Intelligence A Modern Approach Third Edition (Stuart Russell, Peter Norvig)
- https://stackoverflow.com/questions/927358/how-do-i-undo-the-most-recent-local-commits-in-git
- https://en.wikipedia.org/wiki/Taxicab_geometry
- https://docs.python.org/3/library/heapq.html
- https://wiki.python.org/moin/TimeComplexity
- https://stackoverflow.com/questions/33282368/plotting-a-2d-heatmap-with-matplotlib
- https://stackoverflow.com/questions/36343928/python-heatmap-plot-colorbar
- https://stackoverflow.com/questions/8396101/invert-image-displayed-by-imshow-in-matplotlib
- https://stackoverflow.com/questions/1527803/generating-random-whole-numbers-in-javascript-in-a-specific-range
- https://www.w3schools.com/js/js_classes.asp
- https://www.w3schools.com/js/js_comparisons.asp
- https://www.w3schools.com/jsref/tryit.asp?filename=tryjsref_concat
- https://stackoverflow.com/questions/3396754/onkeypress-vs-onkeyup-and-onkeydown
- https://stackoverflow.com/questions/24028225/addeventlistener-keypress-doesnt-register-key-presses
- https://css-tricks.com/snippets/javascript/javascript-keycodes/
- https://stackoverflow.com/questions/2647867/how-can-i-determine-if-a-variable-is-undefined-or-null
- https://stackoverflow.com/questions/31746182/docker-compose-wait-for-container-x-before-starting-y
- https://stackoverflow.com/questions/20895290/count-number-of-files-within-a-directory-in-linux