Skip to content

Conversation

@Benehiko
Copy link
Member

@Benehiko Benehiko commented Nov 29, 2024

After some digging I found that we don't actually explicitly return an error on the pullImage function here. Once the context is cancelled the response is closed since the API client uses http.NewRequestWithContext which then exits the jsonmessage.DisplayJSONMessagesToStream(responseBody, out, nil) without any errors.

The following API request to ContainerCreate then immediately fails since the context is closed, which then returns an error.

See this section of the code, it fails to find the image, then pulls the image, and continues to retry creating a container - on this second retry it only then returns with the context error.
https://github.com/docker/cli/blob/master/cli/command/container/create.go#L271-L289

Writing a test for this means that we are building into the test an assumption that we won't error on the pullImage function, but rather on the second call to ContainerCreate. I'll push what I have, but I honestly think that we will get tripped up by the pullImage function again since it's not context aware - actually more so the jsonmessage package it is using is actually the root cause found in Moby here. It's a bit of a hornets nest since it gets used in many places and by refactoring/improving this it could cause edge case bugs.

- What I did
Delay stripping the cancel from the function ctx so we can still cancel the createContainer function.

containerID, err := createContainer(ctx, dockerCli, containerCfg, &runOpts.createOptions)

- How I did it

- How to verify it
Unit test

- Description for the changelog

Fixed bug preventing image pulls from being cancelled during `docker run`.

- A picture of a cute animal (not mandatory but encouraged)

@codecov-commenter
Copy link

codecov-commenter commented Nov 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 58.54%. Comparing base (47a4f70) to head (530cf09).
Report is 4 commits behind head on 27.x.

Additional details and impacted files
@@ Coverage Diff @@ ## 27.x #5654 +/- ## ======================================= Coverage 58.54% 58.54% ======================================= Files 346 346 Lines 29335 29333 -2 ======================================= + Hits 17173 17174 +1  + Misses 11184 11182 -2  + Partials 978 977 -1 
@thaJeztah thaJeztah changed the title [27.x] fix: ctx on run image pull [27.x backport] fix: ctx on run image pull Nov 29, 2024
@thaJeztah thaJeztah added this to the 27.4.0 milestone Nov 29, 2024
@thaJeztah
Copy link
Member

not a blocker (but if you want to re-apply, let me know); we try to use the -x option when cherry-picking, so that a reference to the original commit is added in the commit message, e.g.;

git commit -s -S -x 30a73ff19c407eee75de489355154ce1f35231a8

The x option adds an extra line to the commit message, which can be useful to correlate changes and/or track-back origin;

(cherry picked from commit 30a73ff19c407eee75de489355154ce1f35231a8) 
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (but left a suggestion, let me know if you want to fix up the cherry-pick)

This patch fixes the context cancellation behaviour for the `runContainer` function, specifically the `createContainer` function introduced in this commit docker@991b130. It delays stripping the `cancel` from the context passed into the `runContainer` function so that the `createContainer` function can be cancelled gracefully by a SIGTERM/SIGINT. This is especially true when the requested image does not exist and `docker run` needs to `pull` the image before creating the container. Although this patch does gracefully cancel the `runContainer` function it does not address the root cause. Some functions in the call path are not context aware, such as `pullImage`. Future work would still be necessary to ensure a consistent behaviour in the CLI. Signed-off-by: Alano Terblanche <18033717+Benehiko@users.noreply.github.com> (cherry picked from commit 30a73ff) Signed-off-by: Alano Terblanche <18033717+Benehiko@users.noreply.github.com>
Signed-off-by: Alano Terblanche <18033717+Benehiko@users.noreply.github.com>
@Benehiko
Copy link
Member Author

@thaJeztah updated the cherry-pick with your suggestions

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

thanks! Let me bring this one in 👍

@thaJeztah thaJeztah merged commit 6fd4825 into docker:27.x Nov 29, 2024
87 checks passed
@Benehiko Benehiko deleted the fix-run-ctx-27.x branch November 29, 2024 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment