Python Exploration

Open-source Python projects categorized as Exploration

Top 4 Python Exploration Projects

  1. ydata-profiling

    One-line-of-code data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.

    Project mention: The DuckDB Local UI | news.ycombinator.com | 2025-03-12

    WhatTheDuck does SQL with duckdb-wasm IIRC

    Pygwalker does open-source descriptive statistics and charts from pandas dataframes: https://github.com/Kanaries/pygwalker

    ydata-profiling does Exploratory Data Analysis (EDA) with Pandas and Spark DataFrames and integrates with various apps: https://github.com/ydataai/ydata-profiling
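To make "data quality profiling" concrete, here is a minimal standard-library sketch of the kind of per-column summary such a tool reports (row count, missing values, distinct values, basic numeric statistics). It is an illustration only and is not the ydata-profiling API; the helper names `profile_column` and `profile_table` and the sample rows are hypothetical.

```python
from statistics import mean


def profile_column(values):
    """Summarize one column: row count, missing count, distinct count, numeric stats."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Treat the column as numeric only if every present value is an int or float.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if present and len(numeric) == len(present):
        summary.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
    return summary


def profile_table(rows):
    """Profile a list of dict rows column by column (hypothetical helper)."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Made-up sample data, for illustration only.
rows = [
    {"city": "Oslo", "temp_c": 4.0},
    {"city": "Lima", "temp_c": 19.0},
    {"city": "Oslo", "temp_c": None},
]
report = profile_table(rows)
for column, stats in report.items():
    print(column, stats)
```

A real profiling library adds much more on top of this (type inference, correlations, histograms, HTML reports), so treat the sketch as a mental model rather than a substitute.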

  2. sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

  3. HouseExpo

    HouseExpo: A Large-scale 2D Indoor Layout Dataset

  4. roam-prototype

    Explore a procedurally-generated 2D world and interact with your surroundings.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Exploration related posts

  • Data exploration is not dead

    1 project | news.ycombinator.com | 24 Jun 2023
  • Explore your data in a single line of code

    1 project | news.ycombinator.com | 24 Jun 2023
  • Which preprocessing steps to improve the performance of a naive bayes classifier

    1 project | /r/learnmachinelearning | 23 Jun 2023
  • Ydata-Profiling and Dask

    1 project | news.ycombinator.com | 19 May 2023
  • 🧠 ydata-profiling + Dask!

    1 project | /r/datascience | 19 May 2023
  • Open Source or free tools/scripts for Data Profiling to help understand data and it's quality?

    1 project | /r/BusinessIntelligence | 15 May 2023
  • Dataset Correlation question for a DDOS attack

    1 project | /r/datasets | 2 Mar 2023

Index

What are some of the best open-source Exploration projects in Python? This list will help you:

#  Project          Stars
1  ydata-profiling  13,311
2  sweetviz          3,054
3  HouseExpo           133
4  roam-prototype        1

Did you know that Python is the 2nd most popular programming language based on number of references?