
Support AWS S3 multipart uploads via scoped temporary credentials

Status update (2023-07-17)

Problem(s)

  1. Uploading a large cache to AWS S3 can at times be slow or fail with errors such as context deadline exceeded (Client.Timeout exceeded while awaiting headers). In one customer escalation, the file to upload is ~600 MB (Slack thread).

  2. AWS S3 limits a single PUT request to 5 GB. If the cache archive is larger than that, the upload fails (see the sketch below).
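
These two problems are what multipart uploads address: the archive is split into parts that are uploaded (and retried) independently, and the 5 GB single-PUT ceiling no longer applies. Below is a minimal sketch of what this could look like from inside the job environment with scoped temporary credentials, using the AWS SDK for Go v2 upload manager. The bucket, key, and file path are placeholders; this is illustrative only, not the proposed implementation.

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	// Credentials are read from the environment; with scoped temporary
	// credentials these would be the STS-issued AWS_ACCESS_KEY_ID,
	// AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN injected into the job.
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder path for the cache archive produced by the runner.
	file, err := os.Open("cache.zip")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// The upload manager performs a multipart upload: it splits the file
	// into parts and uploads them concurrently, so the 5 GB single-PUT
	// limit no longer constrains the cache size.
	uploader := manager.NewUploader(s3.NewFromConfig(cfg), func(u *manager.Uploader) {
		u.PartSize = 64 * 1024 * 1024 // 64 MiB per part
		u.Concurrency = 5             // parts uploaded in parallel
	})

	_, err = uploader.Upload(context.TODO(), &s3.PutObjectInput{
		Bucket: aws.String("example-cache-bucket"), // placeholder bucket
		Key:    aws.String("project/cache.zip"),    // placeholder object key
		Body:   file,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("multipart upload completed")
}
```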

Other customer and user input

  • Why does gitlab-runner upload the cache object using its own HTTP client? The github.com/minio/minio-go/v6 package already provides a PresignedPutObject method and multipart upload support (see the sketch after this list).

  • When uploading a large cache file (tested with >1.5 GB), the runner generates an error:

    [screenshot of the runner error output]
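
For reference, here is roughly how the minio-go v6 call mentioned in the first comment is used; the endpoint, credentials, bucket, and object names are placeholders. Note that, as far as I understand, a presigned PUT URL authorizes a single PutObject request, so a true multipart upload would still need either one presigned URL per part or direct (scoped) credentials.

```go
package main

import (
	"log"
	"time"

	minio "github.com/minio/minio-go/v6"
)

func main() {
	// Placeholder endpoint and credentials; in the Runner these come from
	// the cache configuration on the Runner host, not from the job.
	client, err := minio.New("s3.amazonaws.com", "ACCESS_KEY", "SECRET_KEY", true)
	if err != nil {
		log.Fatal(err)
	}

	// Produce a URL that authorizes a single PUT of this object for one hour.
	// The job can then upload the archive with a plain HTTP PUT, without ever
	// seeing the S3 credentials.
	presignedURL, err := client.PresignedPutObject("example-cache-bucket", "project/cache.zip", time.Hour)
	if err != nil {
		log.Fatal(err)
	}
	log.Println("PUT the cache archive to:", presignedURL.String())
}
```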

Additional details

  • Today in the runner, we upload the cache as one big blob to a pre-signed upload URL. The pre-signed URL means we don't need to share S3 credentials with the job environment, but it is definitely less efficient than performing the upload with, for example, the AWS CLI.

  • context deadline exceeded (Client.Timeout exceeded while awaiting headers) comes from the Runner. When starting the cache upload request, we attach a context with a defined timeout, which defaults to 10 minutes. If the request is not handled within that time, the context is cancelled and you see that error. The Runner will then, by default, retry the upload operation two more times (sketched after this list).

  • Users can override this timeout with the CACHE_REQUEST_TIMEOUT variable; the default value is 10 minutes.
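
To make the two bullets above concrete, here is a rough sketch of the described mechanism: each upload attempt is bounded by a context timeout derived from CACHE_REQUEST_TIMEOUT (interpreted in minutes, defaulting to 10), and the PUT to the pre-signed URL is retried up to two more times on failure. This is an illustration of the behaviour, not the Runner's actual code.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"strconv"
	"time"
)

// uploadCache PUTs the archive at path to the pre-signed URL. Every attempt
// is bounded by timeout, and the upload is tried up to three times in total
// (one initial attempt plus two retries), mirroring the behaviour described
// in the bullets above.
func uploadCache(presignedURL, path string, timeout time.Duration) error {
	var lastErr error
	for attempt := 1; attempt <= 3; attempt++ {
		err := func() error {
			f, err := os.Open(path)
			if err != nil {
				return err
			}
			defer f.Close()

			info, err := f.Stat()
			if err != nil {
				return err
			}

			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()

			req, err := http.NewRequestWithContext(ctx, http.MethodPut, presignedURL, f)
			if err != nil {
				return err
			}
			req.ContentLength = info.Size() // S3 expects an explicit Content-Length

			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				// When the deadline elapses, this is where the
				// "context deadline exceeded" error surfaces.
				return err
			}
			defer resp.Body.Close()
			if resp.StatusCode >= 300 {
				return fmt.Errorf("unexpected status %s", resp.Status)
			}
			return nil
		}()
		if err == nil {
			return nil
		}
		lastErr = fmt.Errorf("attempt %d: %w", attempt, err)
	}
	return lastErr
}

func main() {
	// Default of 10 minutes, overridable via CACHE_REQUEST_TIMEOUT (minutes).
	timeout := 10 * time.Minute
	if v := os.Getenv("CACHE_REQUEST_TIMEOUT"); v != "" {
		if minutes, err := strconv.Atoi(v); err == nil {
			timeout = time.Duration(minutes) * time.Minute
		}
	}
	// Placeholder arguments: pre-signed URL and path to the cache archive.
	if err := uploadCache(os.Args[1], os.Args[2], timeout); err != nil {
		log.Fatal(err)
	}
}
```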

Proposal

{placeholder for solution proposal pending the work on the linked spike.}
