- Notifications
You must be signed in to change notification settings - Fork 1.2k
🐛 fix issue when webhook server refreshing cert #260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| /cc @anfernee |
| @mengqiy: GitHub didn't allow me to request PR reviews from the following users: anfernee. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
5b51c4f to e05117f Compare | /lgtm it doesn't fix #191 tho. |
| @anfernee: changing LGTM is restricted to assignees, and only kubernetes-sigs/controller-runtime repo collaborators may be assigned issues. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
droot left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good. will wait for the tests before lgtm.
e05117f to 5c464d1 Compare | /hold |
5c464d1 to 6c8993d Compare | Made some additional changes in pkg/webhook/server.go and added tests. |
| @@ -0,0 +1,148 @@ | |||
| /* | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test case to test the case where the server exceeds cert refresh interval? you can change defaultCertRefreshInterval to a smaller interview. I am worried that if you start a server on the same port immediately after shutdown, you will have port conflict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC
https://github.com/kubernetes-sigs/controller-runtime/pull/260/files#diff-8b0412ec4fd52af1419bd0fcd0f2e101R121 is doing what you ask. The server restarts multiple times and rotates the certs.
| case <-stop: | ||
| return nil | ||
| case e := <-errCh: | ||
| return e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally if you start server here, it should just work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO if there is an unexpected error, we should surface the error instead of keeping retrying.
| log.Info("server is shutting down to reload the certificates.") | ||
| err = srv.Shutdown(context.Background()) | ||
| shutdownHappend = true | ||
| err = s.httpServer.Shutdown(context.Background()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is Shutdown is async, meaning return of shutdown doesn't necessarily mean the server is already done. But return of ListenAndServeTLS guarantees it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shutdown gracefully shuts down the server without interrupting any active connections. Shutdown works by first closing all open listeners, then closing all idle connections, and then waiting indefinitely for connections to return to idle and then shut down
Per https://golang.org/pkg/net/http/#Server.Shutdown, it is graceful shutdown and synchronized call. Did I miss anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused of those 2 methods, you are right. pls ignore it.
| return err | ||
| } | ||
| timer = time.Tick(wait.Jitter(defaultCertRefreshInterval, 0.1)) | ||
| go serveFn() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might have port conflict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the question here is that shutting down the server will unbind the port, but if it will complete in time before the server tries to bind the same port next time.
I have been searching online for this question for quite a while. But I'm still not sure what the 100% correct thing to do here :/
It seems when the Listener is closed, it should have unbinded the port.
Some tests I added in this PR rotates the cert and reloads the server for multiple times. I haven't seen any issue about port conflict.
I'm open to suggestions if you have a better solution. If not, I will probably merge it as is :)
43a3f31 to acd82f6 Compare | } | ||
| | ||
| cg := &generator.SelfSignedCertGenerator{} | ||
| s.CertDir, err = ioutil.TempDir("/tmp", "controller-runtime-") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we doing cleanup of this temp dir ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
Added code to cleanup the temp dir.
https://github.com/kubernetes-sigs/controller-runtime/pull/260/files#diff-8b0412ec4fd52af1419bd0fcd0f2e101R159
acd82f6 to 09998a3 Compare | PTAL |
| /lgtm |
| @anfernee: changing LGTM is restricted to assignees, and only kubernetes-sigs/controller-runtime repo collaborators may be assigned issues. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
| [APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: mengqiy The full list of commands accepted by this bot can be found here. The pull request process is described here Needs approval from an approver in each of these files: Approvers can indicate their approval by writing |
fixes #191
fixes #192
Tests are on the way.