Python Transformer

Open-source Python projects categorized as Transformer

Top 23 Python Transformer Projects

Transformer
  1. transformers

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Project mention: Run Big LLMs on Small GPUs: A Hands-On Guide to 4-bit Quantization and QLoRA | dev.to | 2025-11-27

    Hugging Face Transformers: https://github.com/huggingface/transformers

  3. vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Project mention: Getting Started with Mooncake: Installation, Execution & Troubleshooting | dev.to | 2025-12-11

    git clone -b v0.8.5 https://github.com/vllm-project/vllm.git --recursive
    cd vllm
    python use_existing_torch.py

  4. nn

    🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

  5. mmdetection

    OpenMMLab Detection Toolbox and Benchmark

    Project mention: PYX: The next step in Python packaging | news.ycombinator.com | 2025-08-13

    There certainly are issues on Linux as well. The Detectron2 library alone has several hundred issues related to incorrect versions of something: https://github.com/facebookresearch/detectron2/issues

    The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.

  6. fish-speech

    SOTA Open Source TTS

    Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

    FishSpeech — Natural dialogue flow

  7. best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

    Project mention: A ranked list of machine learning Python libraries. Updated weekly | news.ycombinator.com | 2025-01-31
  8. sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Project mention: GLM-4.7: Advancing the Coding Capability | news.ycombinator.com | 2025-12-22

    No, it's not Harmony; Z.ai has their own format, which they modified slightly for this release (by removing the required newlines from their previous format). You can see their tool call parsing code here: https://github.com/sgl-project/sglang/blob/34013d9d5a591e3c0...

  10. faster-whisper

    Faster Whisper transcription with CTranslate2

  11. LaTeX-OCR

    pix2tex: Using a ViT to convert images of equations into LaTeX code.

  12. RWKV-LM

    RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.

    Project mention: Ask HN: Is anybody building an alternative transformer? | news.ycombinator.com | 2025-02-14

    You can see all the development directly from them: https://github.com/BlinkDL/RWKV-LM

    Last week version 7 was released and every time they make significant improvements.

  13. PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

  14. lm-evaluation-harness

    A framework for few-shot evaluation of language models.

    Project mention: Ask HN: Who is hiring? (September 2025) | news.ycombinator.com | 2025-09-01

    I am a prev:SWE Intern at AWS India(was extended return offer), in the Q for Business Team, where I work on integration of BidirectionalStreaming using TTS models to integrate low latency voice mode usage in our application; and an active contributor to Pickle's Glass[ https://github.com/pickle-com/glass ], and HyprNote[ https://www.linkedin.com/company/hyprnote/ ].

    My previous experiences include:

    Experience: WorldQuant (Quant Research), AstraZeneca (ML engineer for drug discovery using LLMs), TransHumanity(AI Engineering solving Traffic), Alma [ https://www.linkedin.com/company/tryalma/ ] (AI engineering to simplify the visa process for immigration), Univ. of Missouri (biomedical GraphRAG research).

    Competitions: Top 0.016% in Amazon ML Challenge 2024 (12th/75,000 teams) and ranked 11th/50,000 teams in Amazon HackOn 2024, UC Berkeley AGENTX-2025.

    Open Source: EleutherAI's LMEvaluationHarness[ https://github.com/EleutherAI/lm-evaluation-harness ], HuggingFace's nanoVLM( https://github.com/huggingface/nanoVLM )[integrating metrics and GRPO training techniques], Pickle's Glass[added search functionality, adding knowledge base support for enterprises] , HyprNote[ https://github.com/fastrepl/hyprnote ] [removing silent responses and quantizing models for effective summarization]

    I've attached all my profile links, hereby.

  15. text-generation-inference

    Large Language Model Text Generation Inference

    Project mention: Complete Large Language Model (LLM) Learning Roadmap | dev.to | 2025-04-11

    Resource: TGI (Text Generation Inference)

  16. petals

    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

    Project mention: Petals: Run large language models at home, BitTorrent‑style | news.ycombinator.com | 2025-05-27
  17. mmsegmentation

    OpenMMLab Semantic Segmentation Toolbox and Benchmark.

  18. PaddleSeg

    Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.

  19. manga-image-translator

    Translate manga/image: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/ (no longer working)

  20. LMFlow

    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

  21. jukebox

    Code for the paper "Jukebox: A Generative Model for Music"

  22. bertviz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    Project mention: Gradient Descent on Token Input Embeddings: A ModernBERT experiment | dev.to | 2025-06-23

    ModernBERT-large was chosen because it is a relatively lightweight model with a strong visualization suite and a simplified attention mask (full cross-attention) that is easy to reason about. It would be interesting to see if the results in this post hold across other models.

  23. GPT2-Chinese

    Chinese version of GPT2 training code, using BERT tokenizer.

  24. BERT-pytorch

    Google AI 2018 BERT pytorch implementation

  25. Informer2020

    The GitHub repository for the paper "Informer" accepted by AAAI 2021.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Transformer related posts

  • Qwen3-Omni-Flash-2025-12-01: a next-generation native multimodal large model

    3 projects | news.ycombinator.com | 10 Dec 2025
  • Run Big LLMs on Small GPUs: A Hands-On Guide to 4-bit Quantization and QLoRA

    6 projects | dev.to | 27 Nov 2025
  • Heretic: Automatic censorship removal for language models

    2 projects | news.ycombinator.com | 16 Nov 2025
  • Transformers Tutorial

    1 project | news.ycombinator.com | 17 Jul 2025
  • Llama 4 Smells Bad

    4 projects | news.ycombinator.com | 24 Apr 2025
  • Complete Large Language Model (LLM) Learning Roadmap

    14 projects | dev.to | 11 Apr 2025
  • Multilspy: Building a common LSP client handtuned for all Language servers

    4 projects | news.ycombinator.com | 16 Dec 2024

Index

What are some of the best open-source Transformer projects in Python? This list will help you:

#   Project                     Stars
1   transformers                154,054
2   vllm                        65,886
3   nn                          64,821
4   mmdetection                 32,017
5   fish-speech                 24,379
6   best-of-ml-python           22,954
7   sglang                      21,434
8   faster-whisper              19,503
9   LaTeX-OCR                   16,037
10  RWKV-LM                     14,228
11  PaddleSpeech                12,430
12  lm-evaluation-harness       10,952
13  text-generation-inference   10,710
14  petals                      9,844
15  mmsegmentation              9,381
16  PaddleSeg                   9,249
17  manga-image-translator      9,052
18  LMFlow                      8,492
19  jukebox                     8,035
20  bertviz                     7,839
21  GPT2-Chinese                7,592
22  BERT-pytorch                6,507
23  Informer2020                6,347
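
The ordering rule noted earlier (projects ranked by number of GitHub stars) can be sketched in a few lines of Python. This is a minimal illustration only, not the site's actual ranking code: the helper names `parse_rows` and `rank_by_stars` are hypothetical, and only the first five rows of the index are copied in as sample data.

```python
# Sample rows copied from the first five entries of the index above.
rows = """\
transformers 154,054
vllm 65,886
nn 64,821
mmdetection 32,017
fish-speech 24,379
"""


def parse_rows(text):
    """Turn 'name 1,234' lines into (name, stars) tuples (hypothetical helper)."""
    projects = []
    for line in text.strip().splitlines():
        # Split on the last space so the star count is always the final field.
        name, stars = line.rsplit(" ", 1)
        projects.append((name, int(stars.replace(",", ""))))
    return projects


def rank_by_stars(projects):
    """Order projects from most to fewest stars, numbering from 1 (hypothetical helper)."""
    ordered = sorted(projects, key=lambda p: p[1], reverse=True)
    return [(rank, name, stars) for rank, (name, stars) in enumerate(ordered, start=1)]


ranked = rank_by_stars(parse_rows(rows))
for rank, name, stars in ranked:
    print(f"{rank} {name} {stars:,}")
```

Because the sample rows are already in descending star order, the ranks produced here match the numbers in the index.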

