A few projects related to Data Engineering, including Data Modeling, cloud infrastructure setup, Data Warehousing, and Data Lake development.
- Updated Aug 26, 2022 - Python
An end-to-end GoodReads data pipeline for building a Data Lake, Data Warehouse, and Analytics Platform.
One framework to develop, deploy and operate data workflows with Python and SQL.
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
Data Engineering Project with Hadoop HDFS and Kafka
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
Code examples showing flow deployment to various types of infrastructure
Let your pipe lines flow through the Python code in xonsh.
Agentic Data Integrator that helps you build production-ready data pipelines so you can connect to more systems, faster. You run it in your terminal as a workflow wizard.
Deploy a Prefect flow to serverless AWS Lambda function
Apache Spark Guide
An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apache Kafka and stored in a local Cassandra database.
Analysis of 311 Service Requests for New York City (from 2010 to 2023). Tech: Prefect Cloud, dbt Core, BigQuery, Compute Engine, Cloud Run, Artifact Registry, Terraform, Docker.
A fully serverless, event-driven data pipeline that ingests, enriches, validates, and visualizes real-time news data using AWS services. Designed for cost-efficient, scalable deployment using only free-tier AWS services.
End-to-end data engineering pipeline that uses various technologies to ingest real-time data.
ETL pipeline combined with supervised learning and grid search to classify text messages sent during a disaster event
Challenge for a job application: Data Scientist
An end-to-end data pipeline for building a Data Lake and supporting reports using Apache Spark.
Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database