feat: Add service map (beta) #1319
🦋 Changeset detected. Latest commit: 153ace9. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
PR Review: Service Map Feature
Critical Issues: ✅ No critical issues found.
E2E Test Results: ✅ All tests passed • 39 passed • 3 skipped • 304s
Branch updates: f93eb6c → ef624b4 → 4859a8e → 255b27e → 5e3e41e

    FROM ServerSpans
    LEFT JOIN ClientSpans
      ON ServerSpans.traceId = ClientSpans.traceId
      AND ServerSpans.parentSpanId = ClientSpans.spanId
Perf: not within the scope of this ticket. I'm concerned about the performance implications here, since the default schema doesn't have indexes on traceId or spanId.
Yes, this join in particular is expected to not be very performant. Sampling is included in this PR to attempt to minimize the issue, but there are additional proposed steps to improve performance in the future.
    navigateToTraceSearch({
      dateRange,
      source,
      where: `${source.serviceNameExpression} = '${serviceName}' AND ${source.spanKindExpression} IN ('Server', 'Consumer')`,
security: we should probably escape the serviceName here
Good idea, I've wrapped these with SqlString
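As a rough illustration of the concern raised in this thread, the sketch below shows how quoting the service name keeps a crafted value from breaking out of the string literal. `quoteSqlString` and `buildServiceFilter` are hypothetical helpers written for this example; they stand in for the SqlString wrapping mentioned above and are not the code from this PR or the `sqlstring` library's implementation. The column names passed in at the bottom are also assumptions.

```typescript
// Hypothetical stand-in for an escaping helper such as the SqlString wrapping
// mentioned in the reply. Escapes backslashes first, then single quotes.
function quoteSqlString(value: string): string {
  const escaped = value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return `'${escaped}'`;
}

// Builds the same filter as the snippet under review, but with the
// user-controlled service name passed through the quoting helper.
function buildServiceFilter(
  serviceNameExpression: string,
  spanKindExpression: string,
  serviceName: string,
): string {
  return (
    `${serviceNameExpression} = ${quoteSqlString(serviceName)} ` +
    `AND ${spanKindExpression} IN ('Server', 'Consumer')`
  );
}

// "ServiceName" and "SpanKind" are assumed column expressions for illustration.
const where = buildServiceFilter("ServiceName", "SpanKind", "checkout' OR 1=1 --");
console.log(where);
```

With the quote escaped, the whole input stays inside one string literal instead of terminating it early.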
    format: 'JSON',
    abort_signal: signal,
    clickhouse_settings: {
      max_execution_time: 60,
Should we specify join_algorithm? Maybe 'auto' for now.
Added join_algorithm: auto
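For reference, the settings object after this change might look roughly like the sketch below. Only `max_execution_time: 60` and `join_algorithm: 'auto'` come from this thread; the `ServiceMapSettings` type and the `buildServiceMapSettings` helper are hypothetical names introduced for illustration. As far as I understand the ClickHouse setting, `auto` starts with a hash join and may switch algorithms if memory limits are hit, but that is worth confirming against the ClickHouse documentation.

```typescript
// Hypothetical shape for the settings discussed in this thread.
interface ServiceMapSettings {
  max_execution_time: number; // seconds, as in the diff above
  join_algorithm: string;
}

// Hypothetical helper: returns the settings with a configurable join algorithm,
// defaulting to the 'auto' value agreed on in the review.
function buildServiceMapSettings(joinAlgorithm: string = "auto"): ServiceMapSettings {
  return {
    max_execution_time: 60,
    join_algorithm: joinAlgorithm,
  };
}

const settings = buildServiceMapSettings();
console.log(settings);
```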
Awesome feature. I’m really excited about this! 🎉

Closes HDX-2699
Summary
This PR adds a Service Map feature to HyperDX, based on (sampled) trace data.
Demo
Screen.Recording.2025-10-31.at.2.33.16.PM.mov
How the service map is constructed
The service map is created by querying client-server (or producer-consumer) relationships from a Trace source. Two spans have a client-server/producer-consumer relationship if (a) they have the same trace ID and (b) the server/consumer's parent span ID is equal to the client/producer's span ID. This is accomplished via a self-join on the Trace table (the query can be found in useServiceMap.ts).

To help keep this join performant, users can set a sampling level as low as 1% and up to 100%. Lower sampling levels result in fewer rows being joined, and thus a faster service map load. Sampling is done on cityHash64(TraceId) to ensure that a trace is either included in its entirety or not included at all.
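
The two ideas above (sampling whole traces by hashing the trace ID, then pairing server/consumer spans with their client/producer parents) can be sketched in memory as follows. This is an illustrative approximation, not the query in useServiceMap.ts: the `Span` shape, `hashTraceId`, `isTraceSampled`, and `buildServiceEdges` are hypothetical names, FNV-1a is used as a stand-in for cityHash64 to keep the sketch self-contained, and the `hash % 100 < samplePercent` predicate is one plausible form of the sampling condition.

```typescript
// Hypothetical span shape for illustration only.
interface Span {
  traceId: string;
  spanId: string;
  parentSpanId: string;
  serviceName: string;
  spanKind: "Client" | "Server" | "Producer" | "Consumer" | "Internal";
}

// FNV-1a (32-bit), a stand-in for cityHash64. Any deterministic hash of the
// trace ID gives the same property: all spans of a trace hash identically.
function hashTraceId(traceId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < traceId.length; i++) {
    hash ^= traceId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// A trace is kept or dropped as a unit because the decision depends only on its ID.
function isTraceSampled(traceId: string, samplePercent: number): boolean {
  return hashTraceId(traceId) % 100 < samplePercent;
}

// Pairs each Server/Consumer span with the Client/Producer span that has the
// same trace ID and whose span ID equals the server's parent span ID.
// Note: the query in the diff uses a LEFT JOIN, which also keeps unmatched
// server spans; this sketch counts matched pairs only.
function buildServiceEdges(spans: Span[], samplePercent: number): Map<string, number> {
  const sampled = spans.filter((s) => isTraceSampled(s.traceId, samplePercent));

  const callers = new Map<string, Span>();
  for (const s of sampled) {
    if (s.spanKind === "Client" || s.spanKind === "Producer") {
      callers.set(`${s.traceId}:${s.spanId}`, s);
    }
  }

  const edges = new Map<string, number>();
  for (const s of sampled) {
    if (s.spanKind !== "Server" && s.spanKind !== "Consumer") continue;
    const caller = callers.get(`${s.traceId}:${s.parentSpanId}`);
    if (!caller) continue;
    const key = `${caller.serviceName} -> ${s.serviceName}`;
    edges.set(key, (edges.get(key) ?? 0) + 1);
  }
  return edges;
}

const exampleSpans: Span[] = [
  { traceId: "t1", spanId: "a", parentSpanId: "", serviceName: "frontend", spanKind: "Client" },
  { traceId: "t1", spanId: "b", parentSpanId: "a", serviceName: "api", spanKind: "Server" },
  { traceId: "t1", spanId: "c", parentSpanId: "b", serviceName: "api", spanKind: "Internal" },
];
console.log(buildServiceEdges(exampleSpans, 100));
```

Because the sampling decision is a pure function of the trace ID, a client span and its matching server span are always either both present or both absent, so sampling should not produce dangling halves of an edge.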