Top 14 Jupyter Notebook data-engineering Projects
-
data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼
I’m a Software Engineer who used Airflow to complete the data-engineering-zoomcamp by Datatalks.club and build my Fitbit ETL pipeline.
-
hamilton
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
Project mention: Apache Hamilton (incubating): a Python library for DAGs of data transformations | news.ycombinator.com | 2025-08-03
-
pyspark-tutorial
PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like Spark Introduction, Spark Installation, Spark RDD Transformations and Actions, Spark DataFrame, Spark SQL, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites. (by coder2j)
-
uber-expenses-tracking
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes using technologies such as Apache Airflow, AWS Redshift and Power BI.
-
emerging-solutions-toolbox
The Emerging Solutions Toolbox is a collection of solutions created by Snowflake's Solution Innovation Team (SIT) that consists of demos, helpers, and frameworks to help you get the most out of Snowflake.
❄️ https://github.com/Snowflake-Labs/emerging-solutions-toolbox/tree/main/helper-prompt-template-runner
Jupyter Notebook data-engineering related posts
- The NoFluff Cheatsheet for the Airflow 3 Fundamentals
- A better way to search Hacker News using LLMs
- Apache Hamilton (incubating): a Python library for DAGs of data transformations
- Data Engineering Concepts: A project based introduction
- Peer Review 1: Analyzing Poland's Real Estate Market (Part 1)
- Study Notes 2.2.7: Managing Schedules and Backfills with BigQuery in Kestra
- Study Note DE Zoomcamp 1.2.4 - Dockerizing the Ingestion Script
Index
What are some of the best open-source data-engineering projects in Jupyter Notebook? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | Made-With-ML | 44,375 |
| 2 | data-engineering-zoomcamp | 33,949 |
| 3 | mlops-course | 3,222 |
| 4 | hamilton | 2,340 |
| 5 | Data-Engineering-Projects | 960 |
| 6 | practical-data-engineering | 719 |
| 7 | snowflake-demo-notebooks | 321 |
| 8 | pyspark-tutorial | 134 |
| 9 | uber-expenses-tracking | 121 |
| 10 | emerging-solutions-toolbox | 56 |
| 11 | 60-Days-of-Data-Science-and-ML | 27 |
| 12 | Data-Engineering-Portfolio | 14 |
| 13 | fenic-examples | 12 |
| 14 | data-engineering-nd | 9 |