Python Computer Vision

Open-source Python projects categorized as Computer Vision

Top 23 Python Computer Vision Projects

  1. Face Recognition

    The world's simplest facial recognition API for Python and the command line

    Project mention: Show HN: Real-time privacy protection for smart glasses | news.ycombinator.com | 2025-08-11

    Did you look at egoblur? It's a lot more effective at face detection than https://github.com/ageitgey/face_recognition. Granted, you'd have to do your own face matching to make exceptions.

  3. ultralytics

    Ultralytics YOLO 🚀

    Project mention: Why DETRs are replacing YOLOs for real-time object detection | news.ycombinator.com | 2025-11-22

    > The YOLO series is developed and maintained by Ultralytics. All YOLO code and weights are released under the AGPL-3.0 license.

    The original author of YOLO and the Darknet framework [1] issued the code under pretty much every license you wish to use [2]. My preferred fork by AlexeyAB is under an equally permissive license [3].

    Ultralytics then created their own model under the AGPL-3.0 license [4], which probably would never stand up in a court as they have the model from the likes of YOLOv3 in their source [5].

    This entire article is flawed anyway, because they don't state which YOLOv11 model they are using or compare the accuracy. They appear to have just taken the pre-trained models and assumed it's apples-to-apples. They could have at least compared YOLO11n/s/m/l/x,

    [1] https://pjreddie.com/darknet/yolo/

    [2] https://github.com/pjreddie/darknet

    [3] https://github.com/AlexeyAB/darknet

    [4] https://github.com/ultralytics/ultralytics

    [5] https://github.com/ultralytics/ultralytics/tree/main/ultraly...

  4. supervision

    We write your reusable computer vision tools. 💜

    Project mention: Show HN: Plug-and-play Python utils for any computer-vision pipeline | news.ycombinator.com | 2025-07-21
  5. EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

    Project mention: Using Docling’s OCR features with RapidOCR | dev.to | 2025-04-03
  6. d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

  7. pytorch-CycleGAN-and-pix2pix

    Image-to-Image Translation in PyTorch

  8. vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

  10. datasets

    🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

    Project mention: Training with Big Data on Any Cloud | dev.to | 2025-06-20

    Hugging Face Datasets -- the library that lets you download and manage datasets from the Hugging Face Hub, as well as being a convenient vendor-neutral interface for your own datasets.

  11. gaussian-splatting

    Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

    Project mention: Show HN: Blurry – Host, share, and embed Gaussian Splatting models | news.ycombinator.com | 2025-05-29

    Hey HN,

    I noticed that there are a few tools to create and edit 3D Gaussian Splatting[1] models, but not so many tools for robust hosting, sharing, and presentation of these models, so I built Blurry!

    Unlike other platforms, I’m specifically focusing on the best 3DGS in-browser viewer experience you can seamlessly embed on websites or Notion docs, with more places coming soon!

    Some of the potential use cases are professional training, product and rental space marketing, and construction business. But at this stage with Blurry, I’m specifically targeting people/businesses that already use 3D scanning to a certain degree but are lacking an easy-to-use and performant hosting platform for their 3DGS models.

    I’m shipping new features and improvements very quickly. Two big things on the current roadmap are first-person camera controls (especially for indoor splats) and support for really large models (possibly done by streaming the model as you move around).

    Would love your thoughts and feedback pls!

    [1] For those who don’t know, 3D Gaussian Splatting is a fairly new method to reconstruct a 3D model from real pictures and videos (https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/).

  12. vision

    Datasets, Transforms and Models specific to Computer Vision

    Project mention: The Speed of VITs and CNNs | news.ycombinator.com | 2025-05-04

    ConvNeXT's architecture contains an AdaptiveAvgPool2d layer: https://github.com/pytorch/vision/blob/5f03dc524bdb7529bb4f2...

    This means that you can split your image into tiles, process each tile individually, average the results, apply a final classification layer to the average and get exactly the same result. For reference, see the demonstration below.

    You could of course do exactly the same thing with a vision transformer instead of a convolutional neural network.

    That being said, architecture is wildly overemphasized in my opinion. Data is everything.

     import torch, torchvision.models
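
    The commenter's torchvision demonstration is cut off above and is not reconstructed here. As a minimal sketch of only the pooling step the comment relies on, the pure-Python snippet below checks that a global average over a feature map equals the average of per-tile averages when the tiles are equally sized and non-overlapping. The helper names and the toy 4x8 feature map are assumptions made for illustration, and the sketch does not account for how convolution layers behave at tile borders.

```python
from statistics import fmean


def global_avg_pool(feature_map):
    """Mean over all spatial positions of a 2-D feature map (list of rows)."""
    return fmean(value for row in feature_map for value in row)


def split_into_tiles(feature_map, tile_h, tile_w):
    """Split a 2-D map into non-overlapping, equally sized tiles."""
    height, width = len(feature_map), len(feature_map[0])
    if height % tile_h or width % tile_w:
        raise ValueError("tiles must divide the feature map evenly")
    return [
        [row[left:left + tile_w] for row in feature_map[top:top + tile_h]]
        for top in range(0, height, tile_h)
        for left in range(0, width, tile_w)
    ]


# Hypothetical single-channel feature map with values 0..31 (4 rows, 8 columns).
feature_map = [[float(r * 8 + c) for c in range(8)] for r in range(4)]

tiles = split_into_tiles(feature_map, tile_h=2, tile_w=4)

# Pool the whole map at once, then pool each tile and average the tile results.
pooled_whole = global_avg_pool(feature_map)
pooled_tiles = fmean(global_avg_pool(tile) for tile in tiles)

print(pooled_whole, pooled_tiles)
```

    The identity holds only because every tile covers the same number of positions; with unequal tiles the per-tile means would need to be weighted by tile area.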

  13. labelme

    Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

    Project mention: Convert LabelMe Annotations to YOLO Format with labelme-to-yolo | dev.to | 2024-12-29

    Convert LabelMe format into YOLOv7 format for instance segmentation.
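
    As a rough sketch of what such a conversion involves (not the labelme-to-yolo tool's actual code), the snippet below turns LabelMe polygon shapes into the commonly used YOLO segmentation label layout: one line per object, holding a class index followed by polygon coordinates normalized by image width and height. The function name, the class mapping, and the sample annotation are assumptions for illustration, and the exact layout expected by a given YOLOv7 training setup should be checked against that setup.

```python
import json


def labelme_to_yolo_seg(labelme_doc, class_ids):
    """Convert LabelMe polygon shapes to YOLO-style segmentation label lines."""
    width = labelme_doc["imageWidth"]
    height = labelme_doc["imageHeight"]
    lines = []
    for shape in labelme_doc["shapes"]:
        # Only polygons map directly onto a segmentation outline.
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        class_id = class_ids[shape["label"]]
        coords = []
        for x, y in shape["points"]:
            # Normalize pixel coordinates to the 0..1 range.
            coords.append(f"{x / width:.6f}")
            coords.append(f"{y / height:.6f}")
        lines.append(" ".join([str(class_id)] + coords))
    return lines


# Hypothetical LabelMe annotation: one polygon and one rectangle (skipped).
example = json.loads("""
{
  "imageWidth": 200,
  "imageHeight": 100,
  "shapes": [
    {"label": "cat", "shape_type": "polygon",
     "points": [[20, 10], [100, 10], [100, 50]]},
    {"label": "cat", "shape_type": "rectangle",
     "points": [[0, 0], [10, 10]]}
  ]
}
""")

yolo_lines = labelme_to_yolo_seg(example, class_ids={"cat": 0})
print("\n".join(yolo_lines))
```

    Each returned line would normally be written to a `.txt` file named after the image, one file per image.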

  14. facenet

    Face recognition using TensorFlow

  15. open_clip

    An open source implementation of CLIP.

    Project mention: Cross-Modal Embeddings: Bridging AI Modalities | dev.to | 2025-11-21

    OpenCLIP: Open Source Implementation

  16. fashion-mnist

    A MNIST-like fashion product database. Benchmark 👇

  17. pytorch-grad-cam

    Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

  18. ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

  19. segmentation_models.pytorch

    Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.

  20. nerfstudio

    A collaboration-friendly studio for NeRFs

  21. Kornia

    🐍 Geometric Computer Vision Library for Spatial AI

  22. fiftyone

    Refine high-quality datasets and visual AI models

    Project mention: Launch HN: Enhanced Radar (YC W25) – A safety net for air traffic control | news.ycombinator.com | 2025-03-04

    Are there already "bird / not a bird" datasets?

    Procedures for creating "bird on Multispectral plane radar and video" dataset(s):

    Tag birds on the dashcam video with timecoded sensor data and a segmentation and annotation tool.

    Pinch to zoom, auto-edge detect, classification probability, sensor status

    voxel51/fiftyone does segmentation and annotation with video and possibly Multispectral data: https://github.com/voxel51/fiftyone

  23. autogluon

    Fast and Accurate ML in 3 Lines of Code

    Project mention: Gluon: a GPU programming language based on the same compiler stack as Triton | news.ycombinator.com | 2025-09-17

    Amazon (+ Microsoft) already released a language for ML called gluon 8 years ago: https://aws.amazon.com/blogs/aws/introducing-gluon-a-new-lib...

    autogluon is popular as well: https://github.com/autogluon/autogluon

  24. U-2-Net

    The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."

  25. RobustVideoMatting

    Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Computer Vision related posts

  • Show HN: Dinotool – a foundation model vector embedding CLI

    1 project | news.ycombinator.com | 4 Dec 2025
  • Deep Dive: Building Real-Time Facial Emotion Detection on Raspberry Pi with YOLOv11

    4 projects | dev.to | 23 Nov 2025
  • Segment Anything 3

    2 projects | news.ycombinator.com | 19 Nov 2025
  • Comic Translate: Unleash the Power of GPT-4 for Flawless Global Comic Translation

    1 project | dev.to | 13 Nov 2025
  • Show HN: Open-Source LaTeX OCR, Alternative to Mathpix/SimpleTex

    1 project | news.ycombinator.com | 12 Nov 2025
  • Show HN: Automate Robot Data Quality Improvement

    1 project | news.ycombinator.com | 27 Oct 2025
  • A toolkit for improving the quality of your LeRobot datasets

    1 project | news.ycombinator.com | 14 Oct 2025

Index

What are some of the best open-source Computer Vision projects in Python? This list will help you:

#   Project                        Stars
1   Face Recognition              55,756
2   ultralytics                   50,002
3   supervision                   36,171
4   EasyOCR                       28,597
5   d2l-en                        27,350
6   pytorch-CycleGAN-and-pix2pix  24,804
7   vit-pytorch                   24,698
8   datasets                      21,001
9   gaussian-splatting            19,908
10  vision                        17,390
11  labelme                       15,398
12  facenet                       14,207
13  open_clip                     13,140
14  fashion-mnist                 12,555
15  pytorch-grad-cam              12,468
16  ludwig                        11,636
17  segmentation_models.pytorch   11,182
18  nerfstudio                    11,009
19  Kornia                        10,938
20  fiftyone                      10,160
21  autogluon                      9,716
22  U-2-Net                        9,572
23  RobustVideoMatting             9,112
