Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Jun 4, 2025

When configured for IPAMv2, CNS watches Pods to calculate IP demand for dynamic Podsubnet scenarios. However, CNS was overcounting the IP demand by including Pods in terminal states (Succeeded/Failed), even though these Pods no longer have network sandboxes and their IPs have been released.

Problem

The PodIPDemandListener function in cns/ipampool/v2/adapter.go was counting all Pods that matched the filter criteria (hostNetwork: false), including terminal Pods. This caused:

  • Overestimation of IP demand on the Node
  • Potential IP pool scaling issues
  • IPs being held unnecessarily when Jobs complete but Pods aren't garbage collected immediately

Solution

Modified PodIPDemandListener to filter out Pods in terminal phases (PodSucceeded and PodFailed) when calculating IP demand. According to the Kubernetes Pod lifecycle documentation, these terminal phases are non-retriable and indicate the Pod is dead.

Changes

  • Modified cns/ipampool/v2/adapter.go: Updated PodIPDemandListener to exclude terminal Pods from demand count
  • Added cns/ipampool/v2/adapter_test.go: Comprehensive unit tests covering various Pod phase scenarios

Testing

Added unit tests that verify:

  • Empty Pod lists return 0 demand
  • Running and Pending Pods are counted correctly
  • Terminal Pods (Succeeded/Failed) are excluded from demand
  • Mixed scenarios with both active and terminal Pods work correctly

Compatibility

This change is compatible with SwiftV2 multitenancy scenarios, as terminal Pods should not contribute to IP demand regardless of their tenancy configuration.

Fixes #3696.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • http://168.63.129.16:80/machine/
    • Triggering command: /usr/bin/python3 -u bin/WALinuxAgent-2.13.1.1-py3.9.egg -collect-logs (http block)

If you need me to access, download, or install something from one of these locations, you can either:


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Co-authored-by: rejain456 <155685406+rejain456@users.noreply.github.com>
Copilot AI changed the title [WIP] CNS overcounting Pod demand in IPAMv2 Fix CNS IP demand overcounting by filtering terminal Pods in IPAMv2 Jun 4, 2025
Copilot AI requested a review from rejain456 June 4, 2025 20:29
Copilot finished work on behalf of rejain456 June 4, 2025 20:29
@rejain456
Copy link
Contributor

@copilot, please fix the issue you noted for line: "for _, pod := range pods {" with the comment: "rangeValCopy: each iteration copies 1112 bytes (consider pointers or indexing) (gocritic)"

Co-authored-by: rejain456 <155685406+rejain456@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jun 4, 2025

@copilot, please fix the issue you noted for line: "for _, pod := range pods {" with the comment: "rangeValCopy: each iteration copies 1112 bytes (consider pointers or indexing) (gocritic)"

Fixed the gocritic rangeValCopy issue by changing the loop from for _, pod := range pods { to for i := range pods { to avoid copying 1112 bytes per Pod on each iteration. The change maintains identical functionality while improving performance. (commit 12d2834)

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • http://168.63.129.16:80/machine/
    • Triggering command: /usr/bin/python3 -u bin/WALinuxAgent-2.13.1.1-py3.9.egg -collect-logs (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of rejain456 June 4, 2025 22:11
Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jun 4, 2025

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • http://168.63.129.16:80/machine/
    • Triggering command: /usr/bin/python3 -u bin/WALinuxAgent-2.13.1.1-py3.9.egg -collect-logs (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of rbtr June 4, 2025 23:35
rbtr
rbtr previously approved these changes Jun 4, 2025
…ndexer Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jun 4, 2025

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • http://168.63.129.16:80/machine/
    • Triggering command: /usr/bin/python3 -u bin/WALinuxAgent-2.13.1.1-py3.9.egg -collect-logs (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of rbtr June 4, 2025 23:54
@rbtr rbtr marked this pull request as ready for review June 4, 2025 23:59
@rbtr rbtr requested a review from a team as a code owner June 4, 2025 23:59
@rbtr rbtr requested a review from timraymond June 4, 2025 23:59
@rbtr
Copy link
Collaborator

rbtr commented Jun 4, 2025

/azp run Azure Container Networking PR

@rbtr rbtr requested a review from Copilot June 4, 2025 23:59
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances IP demand calculation by excluding terminal Pods and updates the reconciler to index Pod phases.

  • Adds a new status.phase indexer in the Pod reconciler
  • Modifies PodIPDemandListener to filter out Succeeded/Failed Pods
  • Introduces unit tests covering active vs. terminal Pod scenarios

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
cns/kubecontroller/pod/reconciler.go Renamed notifier comment and added status.phase indexer
cns/ipampool/v2/adapter.go Updated PodIPDemandListener to count only non-terminal Pods
cns/ipampool/v2/adapter_test.go Added tests verifying exclusion of terminal Pods from demand counts
Comments suppressed due to low confidence (4)

cns/ipampool/v2/adapter.go:32

  • Add or update the function comment to describe that this listener now filters out terminal pod phases (Succeeded/Failed) before sending the count.
func PodIPDemandListener(ch chan<- int) func([]v1.Pod) { 

cns/ipampool/v2/adapter.go:37

  • [nitpick] Consider replacing the index variable i with a more descriptive name (e.g., idx) or use for _, pod := range pods for clarity.
for i := range pods { 

cns/ipampool/v2/adapter_test.go:32

  • [nitpick] The test name includes a pending pod; consider renaming to something like multiple active pods to better reflect the scenario.
name: "multiple running pods", 
Copy link
Collaborator

@rbtr rbtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linting is failing on the changed files and needs to be fixed:

cns/ipampool/v2/adapter.go:45: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/ipampool/v2/adapter_test.go:101: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/kubecontroller/pod/reconciler.go:83: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/ipampool/v2/adapter.go:46: File is not `gofumpt`-ed (gofumpt) cns/ipampool/v2/adapter_test.go:102: File is not `gofumpt`-ed (gofumpt) 

bump

Copilot finished work on behalf of rbtr June 5, 2025 19:14
Copy link
Collaborator

@rbtr rbtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linting is failing on the changed files and needs to be fixed:

cns/ipampool/v2/adapter.go:45: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/ipampool/v2/adapter_test.go:101: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/kubecontroller/pod/reconciler.go:83: File is not `gci`-ed with --skip-generated -s standard -s default (gci) cns/ipampool/v2/adapter.go:46: File is not `gofumpt`-ed (gofumpt) cns/ipampool/v2/adapter_test.go:102: File is not `gofumpt`-ed (gofumpt) 
Copilot finished work on behalf of rbtr June 5, 2025 19:43
@rbtr
Copy link
Collaborator

rbtr commented Jun 6, 2025

@copilot address my previous comment on linting

Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jun 6, 2025

@copilot address my previous comment on linting

Fixed all linting issues by removing trailing blank lines from the modified files. The gci and gofumpt formatting errors have been resolved. (commit 64a7ba1)

Copilot finished work on behalf of rbtr June 6, 2025 17:26
@rbtr
Copy link
Collaborator

rbtr commented Jun 6, 2025

/azp run Azure Container Networking PR

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).
@rbtr rbtr enabled auto-merge June 6, 2025 20:24
@rbtr rbtr added this pull request to the merge queue Jun 6, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 6, 2025
@rbtr rbtr added this pull request to the merge queue Jun 9, 2025
Merged via the queue into master with commit 4aa36ae Jun 9, 2025
15 of 16 checks passed
@rbtr rbtr deleted the copilot/fix-3696 branch June 9, 2025 18:31
sivakami-projects pushed a commit that referenced this pull request Oct 23, 2025
…3697) * Initial plan for issue * Fix CNS IP demand overcounting by filtering terminal Pods in IPAMv2 Co-authored-by: rejain456 <155685406+rejain456@users.noreply.github.com> * Fix gocritic rangeValCopy performance issue in PodIPDemandListener Co-authored-by: rejain456 <155685406+rejain456@users.noreply.github.com> * Implement server-side Pod filtering with indexer for status.phase Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com> * Revert to client-side filtering strategy while keeping status.phase indexer Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com> * Fix linting issues: remove trailing blank lines in modified files Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: rejain456 <155685406+rejain456@users.noreply.github.com> Co-authored-by: rbtr <2940321+rbtr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants