Jupyter Notebook data-engineering

Open-source Jupyter Notebook projects categorized as data-engineering

Top 14 Jupyter Notebook data-engineering Projects

  1. Made-With-ML

    Learn how to design, develop, deploy and iterate on production-grade ML applications.

  2. data-engineering-zoomcamp

    Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

    Project mention: The NoFluff Cheatsheet for the Airflow 3 Fundamentals | dev.to | 2025-12-19

    I’m a Software Engineer who used Airflow to complete the data-engineering-zoomcamp by Datatalks.club and build my Fitbit ETL pipeline.

  3. mlops-course

    Learn how to design, develop, deploy and iterate on production-grade ML applications.

  4. hamilton

    Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

    Project mention: Apache Hamilton (incubating): a Python library for DAGs of data transformations | news.ycombinator.com | 2025-08-03

  5. Data-Engineering-Projects

    Personal Data Engineering Projects

  6. practical-data-engineering

    Practical Data Engineering: A Hands-On Real-Estate Project Guide

  7. snowflake-demo-notebooks

    Collection of Snowflake Notebook demos, tutorials, and examples

    Project mention: All Data and AI Weekly #182 - 24-March-2025 | dev.to | 2025-03-24

  8. pyspark-tutorial

    PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like Spark Introduction, Spark Installation, Spark RDD Transformations and Actions, Spark DataFrame, Spark SQL, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites. (by coder2j)

  9. uber-expenses-tracking

    The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes using technologies such as Apache Airflow, AWS Redshift and Power BI.

  10. emerging-solutions-toolbox

    The Emerging Solutions Toolbox is a collection of solutions created by Snowflake's Solution Innovation Team (SIT) that consists of demos, helpers, and frameworks to help you get the most out of Snowflake.

    Project mention: All Data and AI Weekly #190 - May 19, 2025 | dev.to | 2025-05-19

    ❄️ https://github.com/Snowflake-Labs/emerging-solutions-toolbox/tree/main/helper-prompt-template-runner

  11. 60-Days-of-Data-Science-and-ML

    60 Days of Data Science and ML

  12. Data-Engineering-Portfolio

    I'm learning how to build data pipelines to work with large datasets. (:

  13. fenic-examples

    A collection of example projects demonstrating the capabilities of Fenic

    Project mention: A better way to search Hacker News using LLMs | news.ycombinator.com | 2025-11-19

  14. data-engineering-nd

    Projects of the Udacity Data Engineering Nanodegree Program.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Jupyter Notebook data-engineering related posts

  • The NoFluff Cheatsheet for the Airflow 3 Fundamentals

    2 projects | dev.to | 19 Dec 2025
  • A better way to search Hacker News using LLMs

    1 project | news.ycombinator.com | 19 Nov 2025
  • Apache Hamilton (incubating): a Python library for DAGs of data transformations

    1 project | news.ycombinator.com | 3 Aug 2025
  • Data Engineering Concepts: A project based introduction

    2 projects | dev.to | 14 May 2025
  • Peer Review 1: Analyzing Poland's Real Estate Market (Part 1)

    2 projects | dev.to | 30 Apr 2025
  • Study Notes 2.2.7: Managing Schedules and Backfills with BigQuery in Kestra

    3 projects | dev.to | 4 Feb 2025
  • Study Note DE Zoomcamp 1.2.4 - Dockerizing the Ingestion Script

    1 project | dev.to | 4 Feb 2025

Index

What are some of the best open-source data-engineering projects in Jupyter Notebook? This list will help you:

 #  Project                          Stars
 1  Made-With-ML                     44,375
 2  data-engineering-zoomcamp        33,949
 3  mlops-course                      3,222
 4  hamilton                          2,340
 5  Data-Engineering-Projects           960
 6  practical-data-engineering          719
 7  snowflake-demo-notebooks            321
 8  pyspark-tutorial                    134
 9  uber-expenses-tracking              121
10  emerging-solutions-toolbox           56
11  60-Days-of-Data-Science-and-ML       27
12  Data-Engineering-Portfolio           14
13  fenic-examples                       12
14  data-engineering-nd                   9

Did you know that Jupyter Notebook is the 13th most popular programming language based on number of references?