Who this is for: platform/data engineers working on shipment visibility, cold-chain monitoring, alerting and dashboards.
What you’ll get: GPT12-X, a reproducible way to synthesize telemetry, inject incidents, and either publish to MQTT or save as NDJSON/CSV, so you can validate alert rules and dashboards quickly and safely.
Why simulate?
Real data is expensive, slow to obtain and hard to control. Synthetic data lets you:
- Reproduce delays/route deviations/temperature breaches/door events/etc. on demand
- Regression-test alert and dashboard changes in minutes
- Cover edge cases (GPS jamming/drift, sudden battery drop, humidity spikes)
- Exercise the whole pipeline: MQTT → streaming jobs → alerts/dashboards, or NDJSON/CSV → lakehouse/BI
Incident types we simulate
- DELAY — prolonged standstill (warehouse/traffic)
- ROUTE_DEVIATION — geofence/corridor deviation
- TEMP_EXCURSION — cold-chain breach
- SHOCK — handling impact
- DOOR_OPEN — unauthorized door event
- BATTERY_DRAIN — abnormal battery drop
- GPS_JAMMING — missing/invalid GNSS fix
- HUMIDITY_ANOMALY — humidity out of range
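As a rough illustration of what "injecting" one of these incidents could mean, the sketch below overlays an incident on a baseline record shaped like the unified data model. The function name `applyIncident` and every magnitude in it (temperature, shock value, battery drop, latitude offset) are assumptions made for this example, not GPT12-X defaults.

```javascript
// Hypothetical sketch: overlay one incident type on a baseline telemetry record.
// All magnitudes are illustrative assumptions, not values taken from GPT12-X.
function applyIncident(record, type) {
  // Copy the record so the baseline stays untouched.
  const r = { ...record, events: [...record.events, type] };
  switch (type) {
    case "DELAY":
      r.speed_kph = 0; // prolonged standstill
      break;
    case "ROUTE_DEVIATION":
      if (r.lat !== null) r.lat += 0.5; // push the fix off the corridor
      break;
    case "TEMP_EXCURSION":
      r.temp_c = 12; // above the 2–8 °C cold-chain band
      break;
    case "SHOCK":
      r.shock_g = 4.5; // handling impact
      break;
    case "DOOR_OPEN":
      r.door_open = true;
      break;
    case "BATTERY_DRAIN":
      if (r.battery_pct !== null) r.battery_pct = Math.max(0, r.battery_pct - 20);
      break;
    case "GPS_JAMMING":
      r.lat = null; // no GNSS fix: position and speed are unknown
      r.lon = null;
      r.speed_kph = null;
      break;
    case "HUMIDITY_ANOMALY":
      r.humidity = 98;
      break;
    default:
      throw new Error(`Unknown incident type: ${type}`);
  }
  return r;
}
```

Returning a copy rather than mutating the input keeps the clean baseline track available for comparison against the incident-laden one.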
Unified data model
    {
      device_id: string;
      shipment_id: string;
      ts: string;                 // ISO timestamp
      lat: number | null;
      lon: number | null;
      speed_kph: number | null;
      temp_c: number | null;
      humidity: number | null;
      shock_g: number | null;
      door_open: boolean | null;
      battery_pct: number | null;
      events: string[];           // e.g. ["TEMP_EXCURSION"]
      meta?: { route: string; step: number; }  // helpful for replay/debug
    }
Sample record
    {
      "device_id": "ELK-SIM-204913",
      "shipment_id": "SHP-1736389123456-0",
      "ts": "2025-09-08T06:02:00.000Z",
      "lat": 34.0643,
      "lon": -118.2519,
      "speed_kph": 42.7,
      "temp_c": 5.1,
      "humidity": 67,
      "shock_g": 0,
      "door_open": false,
      "battery_pct": 82,
      "events": ["DELAY"],
      "meta": { "route": "US-LA-CHI-NYC", "step": 2 }
    }
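A record like the one above could be serialized to the two file formats roughly as follows. This is a minimal sketch: the column order, the `|` separator for the `events` array, and the helper names are assumptions, since the source does not show how GPT12-X lays out its CSV.

```javascript
// Assumed CSV column order; mirrors the unified data model (meta is omitted).
const CSV_COLUMNS = [
  "device_id", "shipment_id", "ts", "lat", "lon", "speed_kph",
  "temp_c", "humidity", "shock_g", "door_open", "battery_pct", "events",
];

// NDJSON: one JSON object per line, newline-terminated.
function toNdjsonLine(record) {
  return JSON.stringify(record) + "\n";
}

// Quote a CSV field only when it contains a comma, quote, or line break.
function csvField(value) {
  if (value === null || value === undefined) return ""; // null becomes an empty cell
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Flatten a record to one CSV row; events are joined with "|" (an assumption).
function toCsvRow(record) {
  return CSV_COLUMNS.map((col) =>
    col === "events" ? csvField(record.events.join("|")) : csvField(record[col])
  ).join(",");
}

function csvHeader() {
  return CSV_COLUMNS.join(",");
}
```

Writing null readings as empty cells keeps "no value" distinct from a real zero, which matters for fields such as `shock_g`.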
GPT12-X at a glance
GPT12-X is a single Node.js CLI script that generates tracks along predefined routes and injects incidents at configurable rates, then:
- publishes live to MQTT (for streaming consumption), or
- writes NDJSON/CSV for offline analysis/replay.
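To make "generates tracks along predefined routes" concrete, here is a minimal sketch that interpolates linearly between waypoints and emits one sample per interval. The waypoint coordinates are approximate city centres chosen for illustration, the function names are hypothetical, and giving every leg equal weight is a simplification; the actual route definitions in GPT12-X are not shown in this post.

```javascript
// Hypothetical waypoints for the US-LA-CHI-NYC route (approximate city centres).
const ROUTE = [
  { lat: 34.05, lon: -118.24 }, // Los Angeles
  { lat: 41.88, lon: -87.63 },  // Chicago
  { lat: 40.71, lon: -74.01 },  // New York
];

// Position at fraction t (0..1) of the route, with each leg weighted equally.
function positionAt(route, t) {
  if (route.length < 2) throw new Error("A route needs at least two waypoints");
  const clamped = Math.min(1, Math.max(0, t));
  const legs = route.length - 1;
  const scaled = clamped * legs;
  const i = Math.min(legs - 1, Math.floor(scaled)); // index of the current leg
  const f = scaled - i;                             // progress within that leg
  const a = route[i];
  const b = route[i + 1];
  return { lat: a.lat + (b.lat - a.lat) * f, lon: a.lon + (b.lon - a.lon) * f };
}

// One sample per interval over the whole run, including start and end points.
function generateTrack(route, minutes, intervalSec, startMs) {
  const steps = Math.floor((minutes * 60) / intervalSec);
  const track = [];
  for (let step = 0; step <= steps; step++) {
    const pos = positionAt(route, steps === 0 ? 0 : step / steps);
    const ts = new Date(startMs + step * intervalSec * 1000).toISOString();
    track.push({ ts, lat: pos.lat, lon: pos.lon, step });
  }
  return track;
}
```

A call such as `generateTrack(ROUTE, 180, 60, Date.now())` corresponds to the CLI defaults of 180 minutes at a 60-second interval.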
Prereqs: Node.js ≥ 18
CLI options (quick reference)
| Option | Type | Default | Description |
|---|---|---|---|
| --route | string | US-LA-CHI-NYC | Predefined route (e.g. US-LA-CHI-NYC, CN-SZ-SH) |
| --minutes | number | 180 | Total duration (minutes) |
| --interval | number | 60 | Sampling / MQTT publish interval (s) |
| --shipments | number | 1 | Concurrent shipments |
| --coldchain | boolean | true | 2–8 °C baseline (if false → ambient) |
| --incident-rate | number | 0.18 | Incident intensity (per hour) |
| --mqtt | string | – | mqtt(s)://host:port |
| --topic | string | sim/telemetry | MQTT topic |
| --out | string | – | Output NDJSON path |
| --csv | string | – | Output CSV path |
| --username / --password | string | – | MQTT auth |
| --insecure | boolean | false | Allow self-signed cert (test only) |
Tips
- In production tests, keep a fixed --interval for a steady event cadence.
- Interpret --incident-rate as a per-hour average (Poisson-like): 0.25 ≈ one incident every 4 hours.
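One way to read a per-hour rate at a fixed sampling interval is to treat incident starts as a Poisson process, so the chance of at least one incident in a tick of Δt seconds is 1 − exp(−rate × Δt / 3600). The sketch below assumes that interpretation; the post only says "Poisson-like", and the function names are hypothetical.

```javascript
// Probability of at least one incident in a single tick, for a Poisson process
// with `ratePerHour` events per hour sampled every `intervalSec` seconds.
function tickProbability(ratePerHour, intervalSec) {
  return 1 - Math.exp((-ratePerHour * intervalSec) / 3600);
}

// Expected number of incidents over a run: rate × duration in hours.
function expectedIncidents(ratePerHour, minutes) {
  return ratePerHour * (minutes / 60);
}

// Indices of the ticks at which an incident starts. `rand` is injectable so a
// seeded generator can make runs reproducible; it defaults to Math.random.
function incidentTicks(ratePerHour, minutes, intervalSec, rand = Math.random) {
  const p = tickProbability(ratePerHour, intervalSec);
  const ticks = Math.floor((minutes * 60) / intervalSec);
  const hits = [];
  for (let i = 0; i < ticks; i++) {
    if (rand() < p) hits.push(i);
  }
  return hits;
}
```

With the default rate of 0.18 per hour and a 60-second interval, the per-tick probability is roughly 0.003, so a 3-hour run often contains no incident at all. Raising the rate or the duration is the simplest way to get dense test data.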
Quickstart (copy & run)
    # Two shipments, 3 hours, 60-second interval; write NDJSON
    node gpt12x-sim.js --minutes 180 --interval 60 --shipments 2 --out gpt12x.ndjson

    # Publish to local MQTT (topic sim/telemetry) for 60 minutes
    node gpt12x-sim.js --mqtt mqtt://localhost:1883 --topic sim/telemetry --minutes 60

    # Generate CSV on the CN South → East route
    node gpt12x-sim.js --route CN-SZ-SH --minutes 120 --csv gpt12x.csv
Use it to validate alerts & dashboards
Rules
- Temperature breach (TEMP_EXCURSION): trigger immediately; confirm notification within SLA; clear once the temperature has stayed in the safe range for N minutes.
- Route deviation: compare against geofence/corridor; require M consecutive deviations to escalate.
- Delay/stall: near-zero speed & minimal positional delta for T minutes.
- Door: open outside authorized stops triggers alert (combine with geofence).
- Battery: sudden drops or slope above threshold → warn; below lower bound → escalate.
- GPS quality: mark GPS_JAMMING/drift, trigger self-check and data-quality flags.
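Two of the rules above, sketched as small stateful evaluators that take one record at a time. This is an illustration under stated assumptions: the 2–8 °C band comes from the cold-chain baseline, while the function names, the "RAISE"/"CLEAR"/"ESCALATE" return values, and the default of 10 minutes for N are invented for the example.

```javascript
// Temperature rule: raise on the first out-of-range sample, clear only after
// readings have stayed in range for `clearMinutes` (the "N minutes" above).
function makeTempRule({ minC = 2, maxC = 8, clearMinutes = 10 } = {}) {
  let active = false;
  let inRangeSince = null; // ms timestamp of the first in-range sample after a breach
  return function evaluate(record) {
    if (record.temp_c === null) return null; // no reading: leave state unchanged
    const t = Date.parse(record.ts);
    const inRange = record.temp_c >= minC && record.temp_c <= maxC;
    if (!inRange) {
      inRangeSince = null; // any new breach restarts the clear timer
      if (!active) {
        active = true;
        return "RAISE";
      }
      return null;
    }
    if (active) {
      if (inRangeSince === null) inRangeSince = t;
      if (t - inRangeSince >= clearMinutes * 60000) {
        active = false;
        inRangeSince = null;
        return "CLEAR";
      }
    }
    return null;
  };
}

// Route-deviation rule: escalate once, when the streak of off-corridor samples
// reaches `m`. `isOffCorridor` is a caller-supplied geofence check.
function makeDeviationRule(m, isOffCorridor) {
  let streak = 0;
  return function evaluate(record) {
    streak = isOffCorridor(record) ? streak + 1 : 0;
    return streak === m ? "ESCALATE" : null;
  };
}
```

Feeding a generated track through these evaluators gives a quick regression check: injected TEMP_EXCURSION records should produce exactly one "RAISE" per excursion, followed by a "CLEAR" N minutes after recovery.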
Visualization
- Map polyline + incident bubbles; link with timeline brushing.
- Time series (temp/speed/battery) with colored anomalies.
- Incident histograms by type/time/route/device.
- Data-quality tiles: GNSS accuracy, gaps, deviation rate, incident coverage, etc.
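The incident histogram and a basic data-quality figure can be derived straight from an NDJSON file's contents. The helpers below are a small sketch with hypothetical names; they assume each line is one record in the unified data model and treat a null `lat` or `lon` as a missing GNSS fix.

```javascript
// Parse NDJSON text into records, skipping blank lines.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Count incidents by type across all records.
function incidentHistogram(records) {
  const counts = {};
  for (const record of records) {
    for (const ev of record.events) {
      counts[ev] = (counts[ev] || 0) + 1;
    }
  }
  return counts;
}

// Share of records without a usable GNSS fix (0 when there are no records).
function gnssGapRatio(records) {
  if (records.length === 0) return 0;
  const gaps = records.filter((r) => r.lat === null || r.lon === null).length;
  return gaps / records.length;
}
```

In practice the text would come from something like `fs.readFileSync("gpt12x.ndjson", "utf8")`; the resulting counts can back a bar chart by type, and the ratio can feed a data-quality tile.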
Pipelines
- Streaming: MQTT → Flink/Spark/Kafka Streams → alert service/metrics store (Influx/TSDB/ClickHouse).
- Batch: NDJSON/CSV → Lakehouse (Iceberg/Hudi/Delta) → BI/Notebook.
FAQ
Q: Synthetic ≠ real. How to close the gap?
A: Parameterize the generator with distributions from real devices (speed/dwell, temp drift, congestion windows, etc.).
Q: Can I mix synthetic with real?
A: Yes. Tag synthetic records (e.g., meta.synthetic=true) and load-test throughput/latency & alert false-negatives/positives.
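A sketch of both halves of that answer: tagging a record as synthetic, and scoring an alert rule against the injected incidents. It assumes the `events` array on synthetic records can serve as ground truth; `tagSynthetic`, `scoreAlerts`, and the `flagged` callback are hypothetical names for this example.

```javascript
// Mark a record as synthetic without mutating the input.
function tagSynthetic(record) {
  return { ...record, meta: { ...record.meta, synthetic: true } };
}

// Compare ground truth (the injected `events`) with what an alert rule flagged.
// `flagged` is a caller-supplied predicate: did the rule fire for this record?
function scoreAlerts(records, type, flagged) {
  const score = { truePositives: 0, falsePositives: 0, falseNegatives: 0 };
  for (const record of records) {
    const injected = record.events.includes(type);
    const alerted = flagged(record);
    if (injected && alerted) score.truePositives++;
    else if (!injected && alerted) score.falsePositives++;
    else if (injected && !alerted) score.falseNegatives++;
  }
  return score;
}
```

Real-device records carry no ground-truth labels, so scoring is only meaningful on the subset where `meta.synthetic` is true.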
Q: More realistic routes?
A: Import multi-segment Polyline/GeoJSON or use road-network APIs (OSRM/Valhalla/Mapbox Directions) and add congestion models.
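For the GeoJSON option, a LineString can be turned into waypoints with a few lines. This is a minimal sketch with a hypothetical function name; it handles only a bare LineString or a Feature wrapping one, and relies on GeoJSON positions being ordered longitude first, then latitude.

```javascript
// Convert a GeoJSON LineString (or a Feature wrapping one) to {lat, lon} waypoints.
// GeoJSON positions are [longitude, latitude], so the order is swapped here.
function waypointsFromGeoJson(geojson) {
  const geom = geojson.type === "Feature" ? geojson.geometry : geojson;
  if (!geom || geom.type !== "LineString") {
    throw new Error("Expected a LineString geometry");
  }
  return geom.coordinates.map(([lon, lat]) => ({ lat, lon }));
}
```

Routes returned by a directions service would need their geometry requested or decoded as GeoJSON before being passed to a helper like this.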
Extensions
- More sensors: light, CO₂, tilt, vibration spectrum
- Road-network & congestion modeling by POI/time-of-day
- Statistical control of intensity/duration (Poisson, Exponential, Gaussian mixture)
- Mixed fleets across routes/time-zones/holidays
- Blend with real devices for stress tests
License & Disclaimer
- Code and examples are provided for educational and testing use under the MIT license.
- TLV/payload examples are demonstrative, not any vendor’s production protocol/spec.
CTA
Want the full script (or a Git/Gist link), plus ready-made routes and incident-distribution templates?
Tell me where to host it and I’ll add it to this post (or as an Appendix).