Top 23 Python Transformer Projects
-
transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Project mention: Run Big LLMs on Small GPUs: A Hands-On Guide to 4-bit Quantization and QLoRA | dev.to | 2025-11-27
Hugging Face Transformers: https://github.com/huggingface/transformers
- Project mention: Getting Started with Mooncake: Installation, Execution & Troubleshooting | dev.to | 2025-12-11
git clone -b v0.8.5 https://github.com/vllm-project/vllm.git --recursive
cd vllm
python use_existing_torch.py
-
nn
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
There certainly are issues on Linux as well. The Detectron2 library alone has several hundred issues related to incorrect versions of something: https://github.com/facebookresearch/detectron2/issues
The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.
- Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20
FishSpeech — Natural dialogue flow
- Project mention: A ranked list of machine learning Python libraries. Updated weekly | news.ycombinator.com | 2025-01-31
-
No, it's not Harmony; Z.ai has their own format, which they modified slightly for this release (by removing the required newlines from their previous format). You can see their tool call parsing code here: https://github.com/sgl-project/sglang/blob/34013d9d5a591e3c0...
-
RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
Project mention: Ask HN: Is anybody building an alternative transformer? | news.ycombinator.com | 2025-02-14
You can see all the development directly from them: https://github.com/BlinkDL/RWKV-LM
Last week version 7 was released and every time they make significant improvements.
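The RWKV-LM description above contrasts an RNN's constant-space state with a transformer's KV cache. The toy sketch below illustrates only that memory contrast. It is not RWKV's actual update rule, and the function names and the `decay` value are made up for illustration.

```python
from typing import List


def recurrent_state_sizes(tokens: List[float], decay: float = 0.9) -> List[int]:
    """Fold each token into one running state and record the state size per step.

    Hypothetical toy recurrence (NOT the RWKV formula): state = decay * state + x.
    """
    state = [0.0]  # fixed-size state: a single number in this toy
    sizes = []
    for x in tokens:
        state[0] = decay * state[0] + x  # the state is overwritten, never extended
        sizes.append(len(state))
    return sizes


def kv_cache_sizes(tokens: List[float]) -> List[int]:
    """Keep one cache entry per token seen so far and record the cache size per step."""
    cache: List[float] = []
    sizes = []
    for x in tokens:
        cache.append(x)  # attention-style decoding stores every past token
        sizes.append(len(cache))
    return sizes


tokens = [1.0, 2.0, 3.0, 4.0]
recurrent_sizes = recurrent_state_sizes(tokens)
cache_sizes = kv_cache_sizes(tokens)
print(recurrent_sizes)  # [1, 1, 1, 1]
print(cache_sizes)      # [1, 2, 3, 4]
```

The recurrent state stays the same size however many tokens are processed, while the cache grows by one entry per token.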
-
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
-
I was previously an SWE intern at AWS India (and was extended a return offer) on the Q for Business team, where I worked on integrating BidirectionalStreaming with TTS models to enable a low-latency voice mode in our application. I am also an active contributor to Pickle's Glass [ https://github.com/pickle-com/glass ] and HyprNote [ https://www.linkedin.com/company/hyprnote/ ].
My previous experience includes:
Experience: WorldQuant (Quant Research), AstraZeneca (ML engineer for drug discovery using LLMs), TransHumanity (AI engineering, solving traffic), Alma [ https://www.linkedin.com/company/tryalma/ ] (AI engineering to simplify the visa process for immigration), Univ. of Missouri (biomedical GraphRAG research).
Competitions: Top 0.016% in Amazon ML Challenge 2024 (12th/75,000 teams), ranked 11th/50,000 teams in Amazon HackOn 2024, and UC Berkeley AGENTX-2025.
Open Source: EleutherAI's LMEvaluationHarness [ https://github.com/EleutherAI/lm-evaluation-harness ], HuggingFace's nanoVLM [ https://github.com/huggingface/nanoVLM ] (integrating metrics and GRPO training techniques), Pickle's Glass (added search functionality, adding knowledge base support for enterprises), HyprNote [ https://github.com/fastrepl/hyprnote ] (removing silent responses and quantizing models for effective summarization).
I've attached all my profile links here.
-
Resource: TGI (Text Generation Inference)
-
petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
Project mention: Petals: Run large language models at home, BitTorrent‑style | news.ycombinator.com | 2025-05-27
-
PaddleSeg
Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
-
manga-image-translator
Translate manga/images: one-click translation of the text inside all kinds of images. https://cotrans.touhou.ai/ (no longer working)
-
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
-
- Project mention: Gradient Descent on Token Input Embeddings: A ModernBERT experiment | dev.to | 2025-06-23
ModernBERT-large was chosen because it is a relatively lightweight model with a strong visualization suite and a simplified attention mask (full cross-attention) that is easy to reason about. It would be interesting to see if the results in this post hold across other models.
Python Transformer related posts
-
Qwen3-Omni-Flash-2025-12-01: a next-generation native multimodal large model
-
Run Big LLMs on Small GPUs: A Hands-On Guide to 4-bit Quantization and QLoRA
-
Heretic: Automatic censorship removal for language models
-
Transformers Tutorial
-
Llama 4 Smells Bad
-
Complete Large Language Model (LLM) Learning Roadmap
-
Multilspy: Building a common LSP client handtuned for all Language servers
Index
What are some of the best open-source Transformer projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | transformers | 154,054 |
| 2 | vllm | 65,886 |
| 3 | nn | 64,821 |
| 4 | mmdetection | 32,017 |
| 5 | fish-speech | 24,379 |
| 6 | best-of-ml-python | 22,954 |
| 7 | sglang | 21,434 |
| 8 | faster-whisper | 19,503 |
| 9 | LaTeX-OCR | 16,037 |
| 10 | RWKV-LM | 14,228 |
| 11 | PaddleSpeech | 12,430 |
| 12 | lm-evaluation-harness | 10,952 |
| 13 | text-generation-inference | 10,710 |
| 14 | petals | 9,844 |
| 15 | mmsegmentation | 9,381 |
| 16 | PaddleSeg | 9,249 |
| 17 | manga-image-translator | 9,052 |
| 18 | LMFlow | 8,492 |
| 19 | jukebox | 8,035 |
| 20 | bertviz | 7,839 |
| 21 | GPT2-Chinese | 7,592 |
| 22 | BERT-pytorch | 6,507 |
| 23 | Informer2020 | 6,347 |