Skip to content

Conversation

@isaac-dasan
Copy link
Member

Signed-off-by: GitHub noreply@github.com

Reason for Change:
This is a perf fix. The particular error message is retries with faster backOffInterval than default error messages by the kubelet.

Issue Fixed:

Requirements:

  • uses [conventional commit messages]
  • includes documentation
  • adds unit tests
  • relevant PR labels added

Notes:

Signed-off-by: GitHub <noreply@github.com>
Copilot AI review requested due to automatic review settings July 1, 2025 21:44
@isaac-dasan isaac-dasan requested a review from a team as a code owner July 1, 2025 21:44
@isaac-dasan isaac-dasan requested a review from paulyufan2 July 1, 2025 21:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR aligns the mtpnc is not ready error message with the kubelet’s expected “network is not ready” string and adds a comment explaining the rationale.

  • Changed errMTPNCNotReady message to include “network is not ready” prefix
  • Added comment referencing kubelet retry logic
  • Existing unit tests and documentation adapters are included
Comments suppressed due to low confidence (3)

cns/middlewares/k8sSwiftV2.go:20

  • Referencing a specific commit hash can become outdated. Consider linking to stable Kubernetes documentation or describing the relevant code path more generally to improve long-term maintainability.
// In AKS, for kubelet to retry faster it is particularly looking for "network is not ready" error string. 

cns/middlewares/k8sSwiftV2.go:22

  • [nitpick] Consider renaming this error variable to errNetworkNotReady or similar to reflect the updated message and improve consistency between the variable name and its content.
errMTPNCNotReady = errors.New("network is not ready - mtpnc is not ready") 

cns/middlewares/k8sSwiftV2.go:22

  • [nitpick] Ensure that the kubelet recognizes the exact prefix it expects (e.g., "network is not ready"). If only that substring matters for retry behavior, you may simplify the error string to avoid potential mismatches.
errMTPNCNotReady = errors.New("network is not ready - mtpnc is not ready") 
@isaac-dasan isaac-dasan enabled auto-merge July 1, 2025 22:07
Copy link
Contributor

@QxBytes QxBytes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like from upstream code this will reduce the retry interval from ~10 seconds to ~1 second-- wondering if there are concerns if the cns never comes up or if this fast retry will exhaust resources (ex: x10 more fail logs if cns never comes up)? Was there a particular incident or reasoning for needing the 10 sec improvement?

@isaac-dasan
Copy link
Member Author

it looks like from upstream code this will reduce the retry interval from ~10 seconds to ~1 second-- wondering if there are concerns if the cns never comes up or if this fast retry will exhaust resources (ex: x10 more fail logs if cns never comes up)? Was there a particular incident or reasoning for needing the 10 sec improvement?

To clarify this doesn't happen when "CNS doesn't come up" instead this error message is thrown when MTPNC is not found by CNS. Yes this will make 10x more error logs in CNS and CNI, but this is the only way to address perf bottle neck related to swiftv2 podStartupLatency. It is not acceptable for customers to be waiting for a pod for > 30 secs , but today we are at > 50 secs. This fix will shave off 10 to 15 secs and we must take it.

This is in swiftv2 multitenant code path only which is exceedingly small number of clusters in AKS today, so not all CNI error logs will be noisy.

@isaac-dasan isaac-dasan disabled auto-merge July 1, 2025 22:41
@isaac-dasan
Copy link
Member Author

isaac-dasan commented Jul 2, 2025

/azp run Azure Container Networking PR

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).
Copy link
Contributor

@sheylatrudo sheylatrudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@isaac-dasan isaac-dasan enabled auto-merge July 3, 2025 16:16
@isaac-dasan isaac-dasan added this pull request to the merge queue Jul 3, 2025
Merged via the queue into master with commit f0a96a1 Jul 3, 2025
15 of 16 checks passed
@isaac-dasan isaac-dasan deleted the users/isaac/swiftv2/networkerrormsg branch July 3, 2025 20:44
isaac-dasan added a commit that referenced this pull request Jul 4, 2025
update network error msg to match kubelet expectations Signed-off-by: GitHub <noreply@github.com>
isaac-dasan added a commit that referenced this pull request Jul 11, 2025
update network error msg to match kubelet expectations Signed-off-by: GitHub <noreply@github.com>
github-merge-queue bot pushed a commit that referenced this pull request Jul 11, 2025
…found and not ready in CNS (#3775) * fix: update network error msg to match kubelet expectations (#3768) update network error msg to match kubelet expectations Signed-off-by: GitHub <noreply@github.com> * fix: update error message for MTPNC not found and not ready in CNS (#3776) squash commit to prevent github CLA issue Signed-off-by: GitHub <noreply@github.com> --------- Signed-off-by: GitHub <noreply@github.com>
NihaNallappagari pushed a commit to NihaNallappagari/azure-container-networking that referenced this pull request Sep 4, 2025
update network error msg to match kubelet expectations Signed-off-by: GitHub <noreply@github.com>
sivakami-projects pushed a commit that referenced this pull request Oct 23, 2025
update network error msg to match kubelet expectations Signed-off-by: GitHub <noreply@github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

6 participants