Top 23 Python image-classification Projects

ultralytics

1 37 50,002 9.9 Python

Ultralytics YOLO 🚀

Project mention: Why DETRs are replacing YOLOs for real-time object detection | news.ycombinator.com | 2025-11-22

> The YOLO series is developed and maintained by Ultralytics. All YOLO code and weights are released under the AGPL-3.0 license.
The original author of YOLO and the Darknet framework [1] issued the code under pretty much every license you wish to use [2]. My preferred fork by AlexeyAB is under an equally permissive license [3].
Ultralytics then created their own model under the AGPL-3.0 license [4], which probably would never stand up in a court as they have the model from the likes of YOLOv3 in their source [5].
This entire article is flawed anyway, because they don't state which YOLOv11 model they are using or compare the accuracy. They appear to have just taken the pre-trained models and assumed it's apples-to-apples. They could have at least compared YOLO11n/s/m/l/x,
[1] https://pjreddie.com/darknet/yolo/
[2] https://github.com/pjreddie/darknet
[3] https://github.com/AlexeyAB/darknet
[4] https://github.com/ultralytics/ultralytics
[5] https://github.com/ultralytics/ultralytics/tree/main/ultraly...
pytorch-image-models

2 37 36,033 9.6 Python

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
vit-pytorch

3 11 24,698 8.0 Python

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Swin-Transformer

4 23 15,313 2.9 Python

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
pytorch-grad-cam

5 5 12,468 7.0 Python

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
fiftyone

6 32 10,160 10.0 Python

Refine high-quality datasets and visual AI models

Project mention: Launch HN: Enhanced Radar (YC W25) – A safety net for air traffic control | news.ycombinator.com | 2025-03-04

Are there already bird not a bird datasets?
Procedures for creating "bird on Multispectral plane radar and video" dataset(s):
Tag birds on the dashcam video with timecoded sensor data and a segmentation and annotation tool.
Pinch to zoom, auto-edge detect, classification probability, sensor status
voxel51/fiftyone does segmentation and annotation with video and possibly Multispectral data: https://github.com/voxel51/fiftyone
InternVL

7 1 9,625 7.7 Python

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Project mention: InternVL3: Unified Multimodal AI Training Outperforms Open-Source Rivals | dev.to | 2025-04-17

InternVL3 marks a significant advancement in the InternVL model series, implementing a native multimodal pre-training approach that fundamentally transforms how vision-language models learn. Unlike most leading multimodal large language models (MLLMs) that adapt text-only models to handle visual inputs through complex post-hoc alignment, InternVL3 jointly acquires multimodal and linguistic capabilities in a single unified pre-training stage.
gluon-cv

8 2 5,917 0.0 Python

Gluon CV Toolkit

Project mention: Gluon: a GPU programming language based on the same compiler stack as Triton | news.ycombinator.com | 2025-09-17
PaddleClas

9 2 5,768 6.9 Python

A treasure chest for visual classification and recognition powered by PaddlePaddle
mmpretrain

10 2 3,757 2.9 Python

OpenMMLab Pre-training Toolbox and Benchmark
hub

11 1 3,523 1.3 Python

A library for transfer learning by reusing parts of TensorFlow models. (by tensorflow)
catalyst

12 1 3,366 0.0 Python

Accelerated deep learning R&D (by catalyst-team)
autodistill

13 21 2,549 6.4 Python

Images to inference with no labeling (use foundation models to train supervised models).

Project mention: Segment Anything 3 | news.ycombinator.com | 2025-11-19

We (Roboflow) have had early access to this model for the past few weeks. It's really, really good. This feels like a seminal moment for computer vision. I think there's a real possibility this launch goes down in history as "the GPT Moment" for vision.
The two areas I think this model is going to be transformative in the immediate term are for rapid prototyping and distillation.
Two years ago we released autodistill[1], an open source framework that uses large foundation models to create training data for training small realtime models. I'm convinced the idea was right, but too early; there wasn't a big model good enough to be worth distilling from back then. SAM3 is finally that model (and will be available in Autodistill today).
We are also taking a big bet on SAM3 and have built it into Roboflow as an integral part of the entire build and deploy pipeline[2], including a brand new product called Rapid[3], which reimagines the computer vision pipeline in a SAM3 world. It feels really magical to go from an unlabeled video to a fine-tuned realtime segmentation model with minimal human intervention in just a few minutes (and we rushed the release of our new SOTA realtime segmentation model[4] last week because it's the perfect lightweight complement to the large & powerful SAM3).
We also have a playground[5] up where you can play with the model and compare it to other VLMs.
[1] https://github.com/autodistill/autodistill
[2] https://blog.roboflow.com/sam3/
[3] https://rapid.roboflow.com
[4] https://github.com/roboflow/rf-detr
[5] https://playground.roboflow.com
ailia-models

14 4 2,296 9.1 Python

The collection of pre-trained, state-of-the-art AI models for ailia SDK
efficientnet

15 9 2,097 0.0 Python

Implementation of EfficientNet model. Keras and TensorFlow Keras.
fastdup

16 19 1,802 7.3 Python

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
pytorch-toolbelt

17 1 1,562 6.3 Python

PyTorch extensions for fast R&D prototyping and Kaggle farming
Unsupervised-Classification

18 2 1,450 1.4 Python

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]
poolformer

19 3 1,356 3.0 Python

PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
private-detector

20 5 1,339 4.6 Python

Bumble's Private Detector - a pretrained model for detecting lewd images
involution

21 6 1,314 0.0 Python

[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
cleanvision

22 4 1,134 6.5 Python

Automatically find issues in image datasets and practice data-centric computer vision.
SimMIM

23 1 995 0.0 Python

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python image-classification related posts

Show HN: Energy-Efficient NAS via RBF Kernel Scoring (No GPU Training Needed)

1 project | news.ycombinator.com | 12 Jul 2025
Show HN: Local, automatic, image keywords, captions using metadata for storage

1 project | news.ycombinator.com | 1 Apr 2025
Alternatives to Cosine Similarity

1 project | news.ycombinator.com | 8 Oct 2024
I made a social media app

1 project | /r/webdev | 8 Dec 2023
Samsung expected to report 80% profit plunge as losses mount at chip business

3 projects | news.ycombinator.com | 10 Oct 2023
Is it easier to go from Pytorch to TF and Keras than the other way around?

1 project | /r/pytorch | 13 May 2023
Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows

2 projects | news.ycombinator.com | 9 Apr 2023

Index

What are some of the best open-source image-classification projects in Python? This list will help you:

#	Project	Stars
1	ultralytics	50,002
2	pytorch-image-models	36,033
3	vit-pytorch	24,698
4	Swin-Transformer	15,313
5	pytorch-grad-cam	12,468
6	fiftyone	10,160
7	InternVL	9,625
8	gluon-cv	5,917
9	PaddleClas	5,768
10	mmpretrain	3,757
11	hub	3,523
12	catalyst	3,366
13	autodistill	2,549
14	ailia-models	2,296
15	efficientnet	2,097
16	fastdup	1,802
17	pytorch-toolbelt	1,562
18	Unsupervised-Classification	1,450
19	poolformer	1,356
20	private-detector	1,339
21	involution	1,314
22	cleanvision	1,134
23	SimMIM	995

Did you know that Python is the 2nd most popular programming language based on number of references?
