Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Updated Dec 23, 2025 - Python
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
An orchestration platform for the development, production, and observation of data assets.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Preswald is a WASM packager for Python-based interactive data apps: bundle complete, complex data workflows, particularly visualizations, into single files that run entirely in-browser using Pyodide, DuckDB, Pandas, Plotly, Matplotlib, etc. Build dashboards, reports, and notebooks that run offline, load fast, and share like a document.
A system for agentic LLM-powered data processing and ETL
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Easy data preparation with the latest LLM-based operators and pipelines.
Concurrent Python made simple
This dbt package captures metadata, artifacts, and test results so you can detect anomalies, monitor data quality, and build metadata tables. It powers Elementary OSS and feeds the wider context layer used by Elementary Cloud’s full Data & AI Control Plane.
One framework to develop, deploy and operate data workflows with Python and SQL.
Work with your web service, database, and streaming schemas in a single format.
Easy-to-use cluster-compute software.
Relational data pipelines for the science lab
A System for Optimized Semantic Computation
Cloud-native, data onboarding architecture for Google Cloud Datasets
Infra for scalable and reliable AI agents
Data pipelines from re-usable components
A data pipeline that automates data warehouse ETL using custom Airflow operators that handle the extraction, transformation, validation, and loading of data from S3 -> Redshift -> S3
Conductor OSS SDK for Python programming language