Python image-classification

Open-source Python projects categorized as image-classification

Top 23 Python image-classification Projects

  1. ultralytics

    Ultralytics YOLO 🚀

    Project mention: Why DETRs are replacing YOLOs for real-time object detection | news.ycombinator.com | 2025-11-22

    > The YOLO series is developed and maintained by Ultralytics. All YOLO code and weights are released under the AGPL-3.0 license.

    The original author of YOLO and the Darknet framework [1] issued the code under pretty much every license you wish to use [2]. My preferred fork by AlexeyAB is under an equally permissive license [3].

    Ultralytics then created their own model under the AGPL-3.0 license [4], which probably would never stand up in a court as they have the model from the likes of YOLOv3 in their source [5].

    This entire article is flawed anyway, because they don't state which YOLOv11 model they are using or compare the accuracy. They appear to have just taken the pre-trained models and assumed it's apples-to-apples. They could have at least compared YOLO11n/s/m/l/x.

    [1] https://pjreddie.com/darknet/yolo/

    [2] https://github.com/pjreddie/darknet

    [3] https://github.com/AlexeyAB/darknet

    [4] https://github.com/ultralytics/ultralytics

    [5] https://github.com/ultralytics/ultralytics/tree/main/ultraly...

  3. pytorch-image-models

    The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

  4. vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

  5. Swin-Transformer

    This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

  6. pytorch-grad-cam

    Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

  7. fiftyone

    Refine high-quality datasets and visual AI models

    Project mention: Launch HN: Enhanced Radar (YC W25) – A safety net for air traffic control | news.ycombinator.com | 2025-03-04

    Are there already bird not a bird datasets?

    Procedures for creating "bird on Multispectral plane radar and video" dataset(s):

    Tag birds on the dashcam video with timecoded sensor data and a segmentation and annotation tool.

    Pinch to zoom, auto-edge detect, classification probability, sensor status

    voxel51/fiftyone does segmentation and annotation with video and possibly Multispectral data: https://github.com/voxel51/fiftyone

  8. InternVL

    [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

    Project mention: InternVL3: Unified Multimodal AI Training Outperforms Open-Source Rivals | dev.to | 2025-04-17

    InternVL3 marks a significant advancement in the InternVL model series, implementing a native multimodal pre-training approach that fundamentally transforms how vision-language models learn. Unlike most leading multimodal large language models (MLLMs) that adapt text-only models to handle visual inputs through complex post-hoc alignment, InternVL3 jointly acquires multimodal and linguistic capabilities in a single unified pre-training stage.

  10. gluon-cv

    Gluon CV Toolkit

    Project mention: Gluon: a GPU programming language based on the same compiler stack as Triton | news.ycombinator.com | 2025-09-17
  11. PaddleClas

    A treasure chest for visual classification and recognition powered by PaddlePaddle

  12. mmpretrain

    OpenMMLab Pre-training Toolbox and Benchmark

  13. hub

    A library for transfer learning by reusing parts of TensorFlow models. (by tensorflow)

  14. catalyst

    Accelerated deep learning R&D (by catalyst-team)

  15. autodistill

    Images to inference with no labeling (use foundation models to train supervised models).

    Project mention: Segment Anything 3 | news.ycombinator.com | 2025-11-19

    We (Roboflow) have had early access to this model for the past few weeks. It's really, really good. This feels like a seminal moment for computer vision. I think there's a real possibility this launch goes down in history as "the GPT Moment" for vision.

    The two areas I think this model is going to be transformative in the immediate term are for rapid prototyping and distillation.

    Two years ago we released autodistill[1], an open source framework that uses large foundation models to create training data for training small realtime models. I'm convinced the idea was right, but too early; there wasn't a big model good enough to be worth distilling from back then. SAM3 is finally that model (and will be available in Autodistill today).

    We are also taking a big bet on SAM3 and have built it into Roboflow as an integral part of the entire build and deploy pipeline[2], including a brand new product called Rapid[3], which reimagines the computer vision pipeline in a SAM3 world. It feels really magical to go from an unlabeled video to a fine-tuned realtime segmentation model with minimal human intervention in just a few minutes (and we rushed the release of our new SOTA realtime segmentation model[4] last week because it's the perfect lightweight complement to the large & powerful SAM3).

    We also have a playground[5] up where you can play with the model and compare it to other VLMs.

    [1] https://github.com/autodistill/autodistill

    [2] https://blog.roboflow.com/sam3/

    [3] https://rapid.roboflow.com

    [4] https://github.com/roboflow/rf-detr

    [5] https://playground.roboflow.com
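
The distillation idea described in the mention above (a large model labels raw data, then a small model is trained on those labels) can be sketched in a few lines. This is only a toy illustration under stated assumptions: it does not use the autodistill API or any real foundation model. `teacher_label`, `auto_label`, `train_student`, and `student_predict` are hypothetical names, and 2-D points stand in for image features.

```python
import random


def teacher_label(point):
    # Hypothetical stand-in for a large, slow foundation model:
    # it labels a 2-D feature point with a fixed rule.
    x, y = point
    return "bird" if x + y > 1.0 else "not_bird"


def auto_label(unlabeled, teacher):
    # Step 1: the teacher annotates data that no human has labeled.
    return [(p, teacher(p)) for p in unlabeled]


def train_student(labeled):
    # Step 2: fit a tiny nearest-centroid classifier on the teacher's labels.
    sums, counts = {}, {}
    for (x, y), label in labeled:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: (sx / counts[label], sy / counts[label])
        for label, (sx, sy) in sums.items()
    }


def student_predict(centroids, point):
    # Step 3: cheap inference, one squared distance per class.
    x, y = point
    return min(
        centroids,
        key=lambda label: (centroids[label][0] - x) ** 2
        + (centroids[label][1] - y) ** 2,
    )


rng = random.Random(0)  # fixed seed so the run is repeatable
unlabeled = [(rng.random(), rng.random()) for _ in range(500)]
centroids = train_student(auto_label(unlabeled, teacher_label))

# Measure how often the small student reproduces the teacher's decision.
held_out = [(rng.random(), rng.random()) for _ in range(200)]
agreement = sum(
    student_predict(centroids, p) == teacher_label(p) for p in held_out
) / len(held_out)
print(f"student agrees with teacher on {agreement:.0%} of held-out points")
```

In a real pipeline the teacher would be a foundation model and the student a trained detector or classifier; the sketch only shows the label-then-train flow and a simple agreement check.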

  16. ailia-models

    The collection of pre-trained, state-of-the-art AI models for ailia SDK

  17. efficientnet

    Implementation of EfficientNet model. Keras and TensorFlow Keras.

  18. fastdup

    fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

  19. pytorch-toolbelt

    PyTorch extensions for fast R&D prototyping and Kaggle farming

  20. Unsupervised-Classification

    SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]

  21. poolformer

    PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)

  22. private-detector

    Bumble's Private Detector - a pretrained model for detecting lewd images

  23. involution

    [CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

  24. cleanvision

    Automatically find issues in image datasets and practice data-centric computer vision.

  25. SimMIM

    This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python image-classification discussion

Python image-classification related posts

  • Show HN: Energy-Efficient NAS via RBF Kernel Scoring (No GPU Training Needed)

    1 project | news.ycombinator.com | 12 Jul 2025
  • Show HN: Local, automatic, image keywords, captions using metadata for storage

    1 project | news.ycombinator.com | 1 Apr 2025
  • Alternatives to Cosine Similarity

    1 project | news.ycombinator.com | 8 Oct 2024
  • I made a social media app

    1 project | /r/webdev | 8 Dec 2023
  • Samsung expected to report 80% profit plunge as losses mount at chip business

    3 projects | news.ycombinator.com | 10 Oct 2023
  • Is it easier to go from Pytorch to TF and Keras than the other way around?

    1 project | /r/pytorch | 13 May 2023
  • Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows

    2 projects | news.ycombinator.com | 9 Apr 2023

Index

What are some of the best open-source image-classification projects in Python? This list will help you:

 #  Project                       Stars
 1  ultralytics                  50,002
 2  pytorch-image-models         36,033
 3  vit-pytorch                  24,698
 4  Swin-Transformer             15,313
 5  pytorch-grad-cam             12,468
 6  fiftyone                     10,160
 7  InternVL                      9,625
 8  gluon-cv                      5,917
 9  PaddleClas                    5,768
10  mmpretrain                    3,757
11  hub                           3,523
12  catalyst                      3,366
13  autodistill                   2,549
14  ailia-models                  2,296
15  efficientnet                  2,097
16  fastdup                       1,802
17  pytorch-toolbelt              1,562
18  Unsupervised-Classification   1,450
19  poolformer                    1,356
20  private-detector              1,339
21  involution                    1,314
22  cleanvision                   1,134
23  SimMIM                          995
