Top 23 Python fine-tuning Projects

LLaMA-Factory

1 8 64,310 9.7 Python

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Project mention: Llama-Factory: Unified, Efficient Fine-Tuning for 100 Open LLMs | news.ycombinator.com | 2025-09-18
Stream

getstream.io featured

Stream - Scalable APIs for Chat, Feeds, Moderation, & Video. Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure.
unsloth

2 29 49,610 9.9 Python

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Project mention: Unsloth – Train LLMs 2x faster with 70% less VRAM | news.ycombinator.com | 2025-12-10
llama_index

3 82 45,989 9.9 Python

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Project mention: How to Build a RAG Solution with Llama Index, ChromaDB, and Ollama | dev.to | 2025-11-04

Step 2: Set up LlamaIndex and Chroma DB
CosyVoice

4 1 18,177 9.2 Python

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Project mention: CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution | dev.to | 2025-12-15

git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git cd CosyVoice # If submodule cloning fails due to network issues git submodule update --init --recursive
OpenLLM

5 29 12,013 8.0 Python

Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

Project mention: Your 2025 Roadmap to Becoming an AI Engineer for Free for Vue.js Developers | dev.to | 2025-08-06

REST APIs to connect AI models to Vue.js apps (example 1, example 2).
ludwig

6 3 11,636 0.0 Python

Low-code framework for building custom LLMs, neural networks, and other AI models
h2o-llmstudio

7 13 4,757 7.2 Python

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
InfluxDB

www.influxdata.com featured

InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.
Kiln

8 15 4,485 10.0 Python

Easily build AI systems with Evals, RAG, Agents, fine-tuning, synthetic data, and more.

Project mention: DeepFabric – Generate High-Quality Synthetic Datasets at Scale | news.ycombinator.com | 2025-09-26
cognita

9 6 4,303 9.4 Python

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Project mention: Lists of open-source frameworks for building RAG applications | dev.to | 2025-01-02

Ideal For: Enterprises seeking a robust framework for large-scale AI applications. GitHub Repository
lorax

10 4 3,569 8.9 Python

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
OML-1.0-Fingerprinting

11 2 3,533 5.8 Python

OML 1.0 via Fingerprinting: Open, Monetizable, and Loyal AI

Project mention: OML 1.0 via Fingerprinting: Open, Monetizable, and Loyal AI | news.ycombinator.com | 2025-07-07
xTuring

12 31 2,664 7.6 Python

Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6
SimpleTuner

13 2 2,673 10.0 Python

A general fine-tuning kit geared toward image/video/audio diffusion models.

Project mention: FLUX.1 Kontext | news.ycombinator.com | 2025-05-29

Just use https://github.com/bghira/SimpleTuner
I was able to run this script to train a Lora myself without spending any time learning the underlying python libraries.
OneTrainer

14 3 2,649 9.5 Python

OneTrainer is a one-stop solution for all your stable diffusion training needs.
maestro

15 1 2,647 8.7 Python

streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL (by roboflow)
YiVal

16 2 2,114 9.6 Python

Your Automatic Prompt Engineering Assistant for GenAI Applications
dstack

17 22 1,982 9.8 Python

dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.

Project mention: Orchestrating GPUs in data centers and private clouds | news.ycombinator.com | 2025-02-18

Super excited to hear any feedback.
[1] https://github.com/dstackai/dstack/issues/2184
custom-diffusion

18 11 1,969 2.3 Python

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
DB-GPT-Hub

19 1 1,947 3.3 Python

A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
penzai

20 6 1,830 6.3 Python

A JAX research toolkit for building, editing, and visualizing neural networks.

Project mention: Defeating Nondeterminism in LLM Inference | news.ycombinator.com | 2025-09-10

I thought this was pretty well known (at least in the JAX/XLA world). I've hit this many times and got batch variance explained to me before: https://github.com/google-deepmind/penzai/issues/82 and
LongWriter

21 2 1,792 5.6 Python

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Curator

22 7 1,285 9.6 Python

Scalable data pre processing and curation toolkit for LLMs (by NVIDIA-NeMo)

Project mention: Curator: Scalable data pre processing and curation toolkit for LLMs | news.ycombinator.com | 2025-08-20
LLM-Adapters

23 2 1,217 7.3 Python

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

Python fine-tuning discussion

Python fine-tuning related posts

Unsloth – Train LLMs 2x faster with 70% less VRAM

1 project | news.ycombinator.com | 10 Dec 2025
DeepFabric – Generate High-Quality Synthetic Datasets at Scale

4 projects | news.ycombinator.com | 26 Sep 2025
Llama-Factory: Unified, Efficient Fine-Tuning for 100 Open LLMs

2 projects | news.ycombinator.com | 18 Sep 2025
Defeating Nondeterminism in LLM Inference

5 projects | news.ycombinator.com | 10 Sep 2025
Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git

2 projects | news.ycombinator.com | 28 Jul 2025
Qwen3-235B-A22B-Thinking-2507

2 projects | news.ycombinator.com | 25 Jul 2025
One Input, Multiple AI Minds: Meet the New MultiMindSDK LLM Router

1 project | dev.to | 11 Jul 2025
A note from our sponsor - Stream
getstream.io | 24 Dec 2025

Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure. Learn more →

Index

What are some of the best open-source fine-tuning projects in Python? This list will help you:

#	Project	Stars
1	LLaMA-Factory	64,310
2	unsloth	49,610
3	llama_index	45,989
4	CosyVoice	18,177
5	OpenLLM	12,013
6	ludwig	11,636
7	h2o-llmstudio	4,757
8	Kiln	4,485
9	cognita	4,303
10	lorax	3,569
11	OML-1.0-Fingerprinting	3,533
12	xTuring	2,664
13	SimpleTuner	2,673
14	OneTrainer	2,649
15	maestro	2,647
16	YiVal	2,114
17	dstack	1,982
18	custom-diffusion	1,969
19	DB-GPT-Hub	1,947
20	penzai	1,830
21	LongWriter	1,792
22	Curator	1,285
23	LLM-Adapters	1,217

Python fine-tuning

Top 23 Python fine-tuning Projects

Python fine-tuning discussion

Python fine-tuning related posts

Unsloth – Train LLMs 2x faster with 70% less VRAM

DeepFabric – Generate High-Quality Synthetic Datasets at Scale

Llama-Factory: Unified, Efficient Fine-Tuning for 100 Open LLMs

Defeating Nondeterminism in LLM Inference

Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git

Qwen3-235B-A22B-Thinking-2507

One Input, Multiple AI Minds: Meet the New MultiMindSDK LLM Router

Index

Did you know that Python is the 2nd most popular programming language based on number of references?

Did you know that Python is
the 2nd most popular programming language
based on number of references?