delta-lake

🏂 A machine learning model that performs topic classification of news articles for media bias analysis. Final project for UC Berkeley MIDS 266 (Natural Language Processing)

machine-learning natural-language-processing plotly jupyter-notebook pandas databricks delta-lake all-the-news

Updated Dec 15, 2023
HTML

lamastex / spark-gdelt-examples

Example applications of GDELT mass media intelligence data

finance apache-spark time-series trends gas gdelt oil gdelt-data gdelt-events mass-media delta-lake persons-of-interest events-of-interest

Updated Jan 25, 2022
HTML

easonlai / databricks_delta_table_samples

A code sample repository demonstrating how to perform Databricks Delta table operations.

python pyspark delta databricks pyspark-notebook databricks-notebooks delta-lake deltalake

Updated May 24, 2022
HTML

lamastex / mep

Project MEP (Meme Evolution Programme): a terraformed, multi-language library for running statistical experiments on Twitter.

python r scala twitter twitter-api terraform experiments delta-lake twitter-schemas

Updated Oct 17, 2022
HTML

JosephNjiru / modern-water-data-platform

End-to-end, local-first water data platform built with PySpark and containerised with Docker. It implements a governed Medallion Lakehouse with automated data quality checks, DataHub for governance, and DataOps practices (CI/CD, testing).

python docker devops apache-spark dashboard etl s3 pyspark minio business-intelligence water-utility containerization dasboard data-governance delta-lake lakehouse medallion-architecture data-engineering-x

Updated Nov 15, 2025
HTML

AlbusDracoSam / Delta_Lake.io

Distributed Computing with Spark

spark delta-lake

Updated Dec 15, 2021
HTML

Igor-C-Assuncao / mvp-engenharia-dados-scania

Data engineering pipeline (Databricks Free Edition) for the SCANIA Component X Dataset: ingestion via Volumes, Delta Lake, and a Medallion architecture (Bronze→Silver→Gold), star-schema modeling, and dashboards/SQL for predictive maintenance.

etl pyspark data-engineering databricks data-quality spark-sql predictive-maintenance star-schema delta-lake medallion-architecture industry-4-0 bronze-silver-gold scania-dataset

Updated Dec 21, 2025
HTML

henry-richard7 / datacraft-framework

A framework that removes the dependency on Apache Spark by using delta-rs to create and manage Delta Lake tables. It follows the Medallion architecture.

ingestion delta-lake ingestion-framework polars

Updated Jun 19, 2025
HTML

tseringjsherpa / TPCH-ETL-pipeline-databricks

TPC-H ETL pipeline on Databricks with PySpark and Delta Lake for ingestion, transformation, analysis, and a BI-ready denormalized warehouse.

sql data-warehouse pyspark data-engineering data-modeling databricks tpc-h etl-pipeline delta-lake

Updated Dec 3, 2025
HTML

Here are 12 public repositories matching this topic...

tikal-fuseday / delta-architecture

adidas / lakehouse-engine-docs

lamastex / spark-trend-calculus-examples

cricksmaidiene / snowplough

lamastex / spark-gdelt-examples

easonlai / databricks_delta_table_samples

lamastex / mep

JosephNjiru / modern-water-data-platform

AlbusDracoSam / Delta_Lake.io

Igor-C-Assuncao / mvp-engenharia-dados-scania

henry-richard7 / datacraft-framework

tseringjsherpa / TPCH-ETL-pipeline-databricks
