Bulk Stash is a Docker-based rclone service that syncs or copies files between storage services. For example, you can copy files between remote storage services such as Amazon S3 and Google Cloud Storage, or from your laptop to remote storage.
A fully incremental model that transforms raw web event data generated by the Snowplow JavaScript tracker into a series of derived tables at varying levels of aggregation.
Built Apache Airflow DAGs to automate Yahoo Finance stock data ingestion, storage, and querying, then extended with a Python log analyzer to monitor execution errors. Demonstrates orchestration, scheduling, operator use, and pipeline monitoring.
Uses dbt as an orchestration tool to process a static file and join two data sources. This repository can serve as a template for creating a dbt pipeline with testing. See the two simple steps below for using the dbt pipeline to generate tables in BigQuery (GCP).
GCP-based Regulatory Reporting Lakehouse: Tier-1 Swiss Bank (Simulated Case Study). Documentation-only repo illustrating a cloud-native data lakehouse architecture for regulatory reporting on Google Cloud Platform (GCS + BigQuery + Dataflow + Composer). Includes ADRs, runbooks, and compliance data contracts.
Install and operations guide for running Scalyr Agent 2 as the SentinelOne Collector on Rocky Linux 9 (including air‑gapped scenarios), without requiring Docker.