CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)
code-generation dpo large-language-models reinforcement-learning-from-human-feedback llm-as-a-judge codeultrafeedback
Updated Jun 25, 2024 - Python