Top 23 Python Bert Projects
-
transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Project mention: Run Big LLMs on Small GPUs: A Hands-On Guide to 4-bit Quantization and QLoRA | dev.to | 2025-11-27
Hugging Face Transformers: https://github.com/huggingface/transformers
- Project mention: Gradient Descent on Token Input Embeddings: A ModernBERT experiment | dev.to | 2025-06-23
ModernBERT-large was chosen because it is a relatively lightweight model with a strong visualization suite and a simplified attention mask (full cross-attention) that is easy to reason about. It would be interesting to see whether the results in this post hold across other models.
-
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
-
beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Project mention: Gemini Embedding: Powering RAG and context engineering | news.ycombinator.com | 2025-07-31
It's always worth checking out the MTEB leaderboard: https://huggingface.co/spaces/mteb/leaderboard
There are some good open models there that have longer context limits and fewer dimensions.
The benchmarks are just a guide. It's best to build a test dataset with your own data. This is a good example of that: https://github.com/beir-cellar/beir/wiki/Load-your-custom-da...
Another benefit of having your own test dataset is that it can grow as your data grows, and you can quickly test new models to see how they perform with YOUR data.
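As a rough illustration of what "a test dataset with your own data" can look like, here is a minimal sketch in plain Python. It does not use the beir package; the function names (`recall_at_k`, `overlap_score`), the toy corpus, and the relevance labels are all hypothetical, and the token-overlap scorer is only a stand-in for whatever embedding model you want to compare.

```python
from typing import Callable, Dict, Set


def recall_at_k(
    queries: Dict[str, str],
    corpus: Dict[str, str],
    relevant: Dict[str, Set[str]],
    score: Callable[[str, str], float],
    k: int = 2,
) -> float:
    """Mean fraction of each query's relevant documents found in its top-k results."""
    total = 0.0
    for qid, query in queries.items():
        # Rank every document id by the model's score for this query, best first.
        ranked = sorted(corpus, key=lambda doc_id: score(query, corpus[doc_id]), reverse=True)
        hits = relevant[qid] & set(ranked[:k])
        total += len(hits) / len(relevant[qid])
    return total / len(queries)


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a real model: number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


# Hypothetical test set: documents, queries, and hand-labeled relevant document ids.
corpus = {
    "d1": "reset your account password from the login page",
    "d2": "export invoices as csv from the billing tab",
    "d3": "rotate api keys in the security settings",
}
queries = {"q1": "how do I reset my password", "q2": "where can I export invoices"}
relevant = {"q1": {"d1"}, "q2": {"d2"}}

result = recall_at_k(queries, corpus, relevant, overlap_score, k=1)
print(f"recall@1 = {result:.2f}")
```

Swapping `overlap_score` for a function that wraps a different model lets the same labeled queries be reused for each comparison, and new queries can be appended to the dictionaries as the data grows.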
-
SparK
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling" (by keyu-tian)
-
contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
-
Transformers4Rec
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
-
UForm
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
-
detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unitary.ai.
Open-source toxicity detection model (based on BERT) https://github.com/unitaryai/detoxify
Python Bert discussion
Python Bert related posts
-
Show HN: Haystack – Review pull requests like you wrote them yourself
-
The AI Ethics Toolkit for Developers
-
Building AI Agents with Haystack and Gaia Node: A Practical Guide
-
Building a Prompt-Based Crypto Trading Platform with RAG and Reddit Sentiment Analysis using Haystack
-
Show HN: A Medical Research Agent Built with BioMCP and Haystack
-
Show HN: An adaptive classifier that detects hallucinations in LLM/RAG outputs
-
Adaptive Classification for Automatic LLM Temperature Optimization
Index
What are some of the best open-source Bert projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | transformers | 154,054 |
| 2 | PaddleNLP | 12,879 |
| 3 | clip-as-service | 12,752 |
| 4 | bertviz | 7,839 |
| 5 | BERTopic | 7,257 |
| 6 | BERT-pytorch | 6,507 |
| 7 | awesome-pretrained-chinese-nlp-models | 5,485 |
| 8 | KeyBERT | 4,070 |
| 9 | Top2Vec | 3,094 |
| 10 | adapters | 2,791 |
| 11 | DeBERTa | 2,153 |
| 12 | AliceMind | 2,047 |
| 13 | beir | 2,023 |
| 14 | jiant | 1,666 |
| 15 | scibert | 1,655 |
| 16 | ModernBERT | 1,594 |
| 17 | SparK | 1,356 |
| 18 | contextualized-topic-models | 1,254 |
| 19 | BERT-NER | 1,245 |
| 20 | Transformers4Rec | 1,234 |
| 21 | UForm | 1,206 |
| 22 | detoxify | 1,157 |
| 23 | nncf | 1,110 |