Use X-Request-Id for correlation ID from trusted /internal API requests

Right now it's not possible to trace a request all the way through if Gitaly has to make internal API calls back to GitLab Rails. For example, with the new LFS smudge filter added to support gitlab#15079 (closed), we have this sequence of events:

sequenceDiagram
 autonumber
 Client->>+Workhorse: GET /group/project/-/archive/master.zip
 Workhorse->>+Rails: GET /group/project/-/archive/master.zip
 Rails->>+Workhorse: Gitlab-Workhorse-Send-Data git-archive
 Workhorse->>Gitaly: SendArchiveRequest
 Gitaly->>Git: git archive master
 Git->>Smudge: OID 12345
 Smudge->>+Workhorse: GET /api/v4/internal/lfs?oid=12345&gl_repository=project-1234
 Workhorse->>+Rails: GET /api/v4/internal/lfs?oid=12345&gl_repository=project-1234
 Rails->>+Workhorse: Gitlab-Workhorse-Send-Data send-url
 Workhorse->>Smudge: <LFS data>
 Smudge->>Git: <LFS data>
 Git->>Gitaly: <streamed data>
 Gitaly->>Workhorse: <streamed data>
 Workhorse->>Client: master.zip

At step 7, the LFS smudge filter makes an API call back to Rails. Even if the filter passes the original correlation ID along with that request, Workhorse generates a new correlation ID and forwards that one to Rails at step 8. As a result, every /api/v4/internal call is logged with its own unique correlation ID, when they should all share the ID of the original request.

The same issue happens when a Gitaly hook has to make an API call to check /api/v4/internal/allowed, but it's perhaps less pronounced because there are only one or two calls.

Some ideas to fix this:

  1. Add a trusted IP or hostname block that allows Workhorse to use X-Request-Id if it is available.
  2. Send a signed token that Workhorse can decode, and if it checks out, use the provided correlation ID.
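
To make the two ideas concrete, here is a rough Go sketch of what the decision logic might look like. It is not Workhorse or LabKit code: the function names, the CIDR list, the shared secret, and the choice of HMAC-SHA256 as the "signed token" are all assumptions made for illustration. It also assumes the peer address Workhorse sees is the real caller; behind a proxy that would need more thought.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net"
)

// isTrustedPeer reports whether remoteAddr ("host:port") falls inside one of
// the trusted CIDR blocks. Anything that fails to parse is treated as untrusted.
func isTrustedPeer(remoteAddr string, trustedCIDRs []string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false
	}
	for _, cidr := range trustedCIDRs {
		_, block, err := net.ParseCIDR(cidr)
		if err == nil && block.Contains(ip) {
			return true
		}
	}
	return false
}

// resolveCorrelationID sketches idea 1: reuse the incoming X-Request-Id value
// only when the peer is trusted; otherwise fall back to a freshly generated ID.
func resolveCorrelationID(remoteAddr, incomingID string, trustedCIDRs []string, generate func() string) string {
	if incomingID != "" && isTrustedPeer(remoteAddr, trustedCIDRs) {
		return incomingID
	}
	return generate()
}

// signCorrelationID sketches idea 2 (sender side): a hex-encoded HMAC-SHA256
// of the correlation ID, computed with a hypothetical shared secret.
func signCorrelationID(id string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(id))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyCorrelationID sketches idea 2 (receiver side): recompute the HMAC and
// compare it to the supplied signature in constant time.
func verifyCorrelationID(id, signature string, secret []byte) bool {
	given, err := hex.DecodeString(signature)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(id))
	return hmac.Equal(given, mac.Sum(nil))
}

func main() {
	generate := func() string { return "freshly-generated-id" } // stand-in for the real ID generator
	trusted := []string{"10.0.0.0/8", "127.0.0.1/32"}           // hypothetical trusted block

	fmt.Println(resolveCorrelationID("10.1.2.3:4000", "abc123", trusted, generate))    // abc123
	fmt.Println(resolveCorrelationID("203.0.113.9:4000", "abc123", trusted, generate)) // freshly-generated-id

	secret := []byte("hypothetical-shared-secret")
	sig := signCorrelationID("abc123", secret)
	fmt.Println(verifyCorrelationID("abc123", sig, secret))   // true
	fmt.Println(verifyCorrelationID("tampered", sig, secret)) // false
}
```

The first option is simpler to roll out but depends on network topology. The second works regardless of where the caller sits, at the cost of distributing a secret to whatever makes the internal API call.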

This probably ties into LabKit. What do you think, @andrewn @reprazent?