ppo-pytorch

Here are 2 public repositories matching this topic...

RuvenGuna94 / Dialogue-Summary-remove-toxic-text-PPO

Fine-tuning FLAN-T5 with PPO and PEFT to generate less toxic text summaries. This notebook uses Meta AI's hate speech reward model and RLHF techniques to improve safety.

nlp toxic-comment-classification hate-speech-detection toxicity-analysis ppo-pytorch dialogue-summarization generative-ai detoxification reward-model

Updated Jan 4, 2025
Jupyter Notebook

bantu-4879 / Atari_Games-Deep_Reinforcement_Learning

This repository hosts Jupyter notebooks that train agents on Atari games using a variety of deep reinforcement learning (RL) algorithms, such as Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Deep Q-Networks (DQN), Advantage Actor-Critic (A2C), and more.

deep-reinforcement-learning gymnasium atari-games dqn-pytorch ppo-pytorch stablebaselines3

Updated Jun 10, 2024
Jupyter Notebook
