Python Statistics

Open-source Python projects categorized as Statistics

Top 23 Python Statistics Projects

  1. scikit-learn

    scikit-learn: machine learning in Python

    Project mention: The Gorman Paradox: Where Are All the AI-Generated Apps? | news.ycombinator.com | 2025-12-14

    Another conspicuous thing is the lack of vibe-coded PRs on mature open source projects. Maybe it's because these projects have erected policies limiting AI contributions, but given the high scores on SWEBench, you'd expect _something_ to come of it?

    And yet in real world use you get stuff like https://github.com/scikit-learn/scikit-learn/pull/32101

  3. ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

    Project mention: The DuckDB Local UI | news.ycombinator.com | 2025-03-12

    WhatTheDuck does SQL with duckdb-wasm IIRC

    Pygwalker does open-source descriptive statistics and charts from pandas dataframes: https://github.com/Kanaries/pygwalker

    ydata-profiling does Exploratory Data Analysis (EDA) with Pandas and Spark DataFrames and integrates with various apps: https://github.com/ydataai/ydata-profiling

  4. statsmodels

    Statsmodels: statistical modeling and econometrics in Python

  5. imbalanced-learn

    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

  6. boltons

    🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

  7. Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

    Project mention: Tautulli: Plex Media Server Observability | news.ycombinator.com | 2025-09-10
  8. statsforecast

    Lightning ⚡️ fast forecasting with statistical and econometric models.

    Project mention: This Week In Python | dev.to | 2025-03-21

    statsforecast – Forecasting with statistical and econometric models

  10. github-stats

    Better GitHub statistics images for your profile, with stats from private repos too

  11. eiten

    Statistical and Algorithmic Investing Strategies for Everyone

  12. sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

  13. maloja

    Self-hosted music scrobble database to create personal listening statistics and charts

  14. uncertainty-baselines

    High-quality implementations of standard and SOTA methods on a variety of tasks.

  15. causal-learn

    Causal Discovery in Python. It also includes (conditional) independence tests and score functions.

  16. pycm

    Multi-class confusion matrix library in Python

  17. geomstats

    Computations and statistics on manifolds with geometric structures.

  18. hierarchicalforecast

    Probabilistic Hierarchical forecasting 👑 with statistical and econometric methods.

  19. pytensor

    PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

  20. meteostat-python

    Access and analyze historical weather and climate data with Python.

  21. sportsipy

    A free sports API written for python

  22. popmon

    Monitor the stability of a Pandas or Spark dataframe ⚙︎

  23. pypinfo

    Easily view PyPI download statistics via Google's BigQuery.

  24. fitter

    Fit data to many distributions

  25. Contributions-Importer-For-Github

    This tool helps users to import contributions to GitHub from private git repositories, or from public repositories that are not hosted in GitHub.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Statistics related posts

  • Predicting Tomorrow's Tremors: A Machine Learning Approach to Earthquake Nowcasting in California

    3 projects | dev.to | 3 Jul 2025
  • Statsforecast: Fast Python forecasting with statistical and econometric models

    1 project | news.ycombinator.com | 20 Mar 2025
  • Tea Tasting: Python package for statistical analysis of A/B tests

    1 project | news.ycombinator.com | 24 Aug 2024
  • tea-tasting VS confidence - a user suggested alternative

    2 projects | 16 Aug 2024
  • The Truth About Linear Regression

    3 projects | news.ycombinator.com | 30 Jul 2024
  • Show HN: Aurora – Problem solving focused statistical and ML software toolkit

    1 project | news.ycombinator.com | 21 Jul 2024
  • How to Build a Logistic Regression Model: A Spam-filter Tutorial

    1 project | dev.to | 5 May 2024

Index

What are some of the best open-source Statistics projects in Python? This list will help you:

# Project Stars
1 scikit-learn 64,346
2 ydata-profiling 13,311
3 statsmodels 11,158
4 imbalanced-learn 7,070
5 boltons 6,808
6 Tautulli 6,245
7 statsforecast 4,632
8 github-stats 3,314
9 eiten 3,079
10 sweetviz 3,054
11 maloja 1,583
12 uncertainty-baselines 1,551
13 causal-learn 1,518
14 pycm 1,490
15 geomstats 1,433
16 hierarchicalforecast 720
17 pytensor 570
18 meteostat-python 546
19 sportsipy 536
20 popmon 509
21 pypinfo 440
22 fitter 408
23 Contributions-Importer-For-Github 378
