An (experimental) ETL toolkit based on DuckDB and PRQL.
Pyper relies on YAML to describe workflows. We use pydantic to model how a workflow file should look. Here's an example that showcases a simple workflow:
    # myworkflow.yaml
    extract:
      provider: local
      uri: file:///mnt/ssd/projects/pyper/invoices.csv
      register: my_data_source
    transform:
      lang: prql
      backend: duckdb
      query: |
        from my_data_source
        filter billing_country == "USA"
        group [customer_id] (
          aggregate [
            sum total,
            count,
          ]
        )
    load:
      provider: local
      uri: file:///mnt/ssd/projects/pyper/invoices_usa.csv

Then, using Python:
    import pyper

    pyper.workflow('myworkflow.yaml').exec()
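
To give a feel for how pydantic can model a workflow file like the one above, here is a minimal sketch. It is not Pyper's actual schema: the class names (`Extract`, `Transform`, `Load`, `Workflow`) and the `register_as` attribute are assumptions made for illustration, and the workflow is written as a plain dict (what a YAML parser would return) so the example needs nothing beyond pydantic.

```python
from pydantic import BaseModel, Field, ValidationError


class Extract(BaseModel):
    provider: str
    uri: str
    # The YAML key is `register`; it is stored as `register_as` here
    # (hypothetical attribute name, mapped through a field alias).
    register_as: str = Field(alias="register")


class Transform(BaseModel):
    lang: str
    backend: str
    query: str


class Load(BaseModel):
    provider: str
    uri: str


class Workflow(BaseModel):
    # Hypothetical top-level model: one section per ETL stage.
    extract: Extract
    transform: Transform
    load: Load


# The example workflow as a plain dict, i.e. the parsed form of the YAML file.
raw = {
    "extract": {
        "provider": "local",
        "uri": "file:///mnt/ssd/projects/pyper/invoices.csv",
        "register": "my_data_source",
    },
    "transform": {
        "lang": "prql",
        "backend": "duckdb",
        "query": 'from my_data_source\nfilter billing_country == "USA"\n',
    },
    "load": {
        "provider": "local",
        "uri": "file:///mnt/ssd/projects/pyper/invoices_usa.csv",
    },
}

# Nested dicts are validated and converted into the nested models.
workflow = Workflow(**raw)
print(workflow.extract.register_as, workflow.transform.backend)

# A workflow that lacks a required section is rejected before anything runs.
rejected = False
try:
    Workflow(**{key: value for key, value in raw.items() if key != "load"})
except ValidationError:
    rejected = True
print("missing 'load' rejected:", rejected)
```

Modelling the file this way means a malformed workflow fails at load time with a validation error instead of partway through an extract or load step.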