Connection limit for thick daemon #1458

juliusmh wants to merge 2 commits into k8snetworkplumbingwg:master from
Conversation
Signed-off-by: Julius Hinze <[email protected]>
db9bce5 to eb7efb0
SchSeba left a comment
Is it possible to add a test for this one?
Maybe set the number really low, just to be sure we get a retry from the CRI system to start the pod again?
Also, a general question: can you run pprof to see which part of the code is using the most memory? Potentially we can improve that area of the code.
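(For reference, a generic way to expose Go's heap profiler in a long-running daemon so `go tool pprof` can inspect memory; this sketch uses only the standard library's `net/http/pprof` and is not something multus necessarily wires up today.)

```go
package main

import (
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Expose the profiler on localhost only, then inspect the heap with:
	//   go tool pprof http://127.0.0.1:6060/debug/pprof/heap
	go http.ListenAndServe("127.0.0.1:6060", nil)

	select {} // stand-in for the daemon's real work
}
```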
Walkthrough

This PR implements connection limiting for the multus daemon's CNI server to control concurrent request handling during pod burst scenarios. It adds a configurable `connectionLimit` option to the daemon configuration.

Changes
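For context, a hypothetical `daemon-config.json` fragment showing where the new key sits. The `connectionLimit` key name comes from this PR's thick-daemonset template; the surrounding keys are illustrative only and may differ from an actual deployment.

```json
{
    "logLevel": "verbose",
    "socketDir": "/run/multus",
    "connectionLimit": 1
}
```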
Sequence Diagram

```mermaid
sequenceDiagram
participant Pod1 as Pod 1
participant Pod2 as Pod 2
participant Pod3 as Pod 3
participant LimitListener
participant Semaphore as Semaphore (capacity: 1)
participant Daemon as CNI Daemon
Pod1->>LimitListener: connect
LimitListener->>Semaphore: acquire()
Semaphore-->>LimitListener: acquired (slot 0 occupied)
LimitListener->>Daemon: Accept()
Daemon-->>Pod1: connection established
Pod2->>LimitListener: connect
LimitListener->>Semaphore: acquire()
Note over Semaphore: blocked (at capacity)
Pod3->>LimitListener: connect
LimitListener->>Semaphore: acquire()
Note over Semaphore: blocked (at capacity)
Pod1->>LimitListener: close connection
LimitListener->>Semaphore: release()
Semaphore-->>LimitListener: slot available
LimitListener->>Semaphore: acquire() [Pod2]
Semaphore-->>LimitListener: acquired
LimitListener->>Daemon: Accept() [Pod2]
Daemon-->>Pod2: connection established
```
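A self-contained sketch (not from the PR) of the accept-gating the diagram describes, using the same `netutil.LimitListener` helper the daemon vendors; the address, timings, and client loop are arbitrary.

```go
package main

import (
	"fmt"
	"net"
	"time"

	"golang.org/x/net/netutil"
)

func main() {
	inner, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	ln := netutil.LimitListener(inner, 1) // capacity 1, as in the diagram

	go func() {
		for {
			conn, err := ln.Accept() // blocks while one connection is open
			if err != nil {
				return
			}
			fmt.Println("accepted", conn.RemoteAddr())
			go func(c net.Conn) {
				time.Sleep(time.Second) // simulate CNI work
				c.Close()               // releases the slot, unblocking Accept
			}(conn)
		}
	}()

	// Three "pods" dial concurrently; they are accepted one at a time.
	for i := 0; i < 3; i++ {
		go net.Dial("tcp", ln.Addr().String())
	}
	time.Sleep(4 * time.Second)
}
```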
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for the comment @SchSeba, I added some tests but I'm not well versed in this repository.
Actionable comments posted: 0
🧹 Nitpick comments (6)
e2e/templates/multus-daemonset-thick.yml.j2 (1)
100-101: Clarify whether `connectionLimit: 1` should be global for all thick‑mode e2e runs

Hard‑coding `connectionLimit: 1` in the shared thick DaemonSet config means every thick‑mode e2e job now runs with single‑connection concurrency, not just the connection‑limit test. That's safe but might unnecessarily slow other tests and hides behavior under higher limits.

Consider either:

- Making the limit configurable (e.g., via a Jinja/env parameter) and defaulting to "no limit" for general tests while overriding to `1` only in the connection‑limit scenario, or
- Adding a brief comment in this ConfigMap explaining that `1` is intentionally low to exercise the limit behavior in e2e.

.github/workflows/kind-e2e.yml (1)
88-90: Scope the "Test connection limit" step to thick mode (optional)

Right now this step runs for every matrix entry, including non‑thick manifests where `connectionLimit` isn't configured. That's functionally fine but makes the step less clearly tied to the thick‑daemon feature and adds a bit of redundant test time.

Consider either:

- Adding an `if: ${{ matrix.multus-manifest == 'multus-daemonset-thick.yml' }}` guard (like the subdirectory chaining tests), or
- Renaming the step to reflect that it's a generic "many pods" sanity test for non‑thick runs, if you want the broader coverage.
pkg/server/types.go (1)
73-83: Document `ConnectionLimit` semantics in `ControllerNetConf`

The new `ConnectionLimit *int` field is a clean way to plumb the option through config, and using a pointer keeps it backward compatible. Right now, its behavior (nil or ≤0 means "no limit", positive values cap concurrent connections) is only implicit from the daemon logic.

Consider adding a short comment on the field or in the struct doc to spell this out for users of the API, e.g., "maximum concurrent CNI server connections; nil or ≤0 disables limiting".
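A sketch of what that doc comment could look like; the field name and pointer type come from the diff, while the json tag and placement within the struct are assumptions:

```go
type ControllerNetConf struct {
	// ... existing fields elided ...

	// ConnectionLimit caps the number of concurrent connections the CNI
	// server will accept. nil or a value <= 0 disables limiting.
	ConnectionLimit *int `json:"connectionLimit,omitempty"`
}
```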
cmd/multus-daemon/main.go (1)
32-33: Connection limit wiring looks good; consider validating non‑positive values

The listener wrapping with:

```go
if limit := daemonConfig.ConnectionLimit; limit != nil && *limit > 0 {
	logging.Debugf("connection limit: %d", *limit)
	l = netutil.LimitListener(l, *limit)
}
```

cleanly keeps existing behavior for legacy configs and only applies `LimitListener` when explicitly set to a positive value.

Two small follow‑ups to consider:

- Treat explicitly configured `0` or negative values as misconfiguration (log a warning or error) instead of silently behaving as "no limit", so bad config is visible.
- Ensure the user‑facing config docs mention that only positive values enable limiting and that nil/0 (depending on how you want to treat 0) results in no concurrency cap.
Also applies to: 171-174
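A sketch of the first follow‑up, reusing the names from the snippet above; `logging.Errorf` is assumed to exist alongside `Debugf`:

```go
if limit := daemonConfig.ConnectionLimit; limit != nil {
	if *limit > 0 {
		logging.Debugf("connection limit: %d", *limit)
		l = netutil.LimitListener(l, *limit)
	} else {
		// Surface the misconfiguration instead of silently running unlimited.
		logging.Errorf("ignoring non-positive connectionLimit %d; running without a connection cap", *limit)
	}
}
```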
e2e/test-connection-limit.sh (1)
1-10: Consider adding a trap for cleanup (and optionally stricter shell flags)

The script correctly exercises the "many pods" scenario and cleans up on the happy path. If `kubectl create` or `kubectl wait` fails, though, `set -o errexit` will exit before the delete runs, leaving resources around until the cluster is torn down.

Optionally, you could:

- Add a simple trap to always attempt cleanup:

```sh
trap 'kubectl delete -f yamls/many-pods.yml >/dev/null 2>&1 || true' EXIT
```

- And, if you want more robust scripting, add `set -o nounset -o pipefail` in line with other e2e scripts.
set -o nounset -o pipefailin line with other e2e scripts.Not required for correctness, but it tightens test hygiene.
e2e/templates/many-pods.yml.j2 (1)
1-23: Verify whether the test pod really needs `privileged: true`

This Deployment is only running `sleep`, so it may not need full `privileged` privileges. If there's no dependency on host networking features or special capabilities, consider tightening the securityContext (dropping `privileged: true` or replacing it with narrower capabilities) to keep the e2e fixtures as minimal‑privilege as possible.

If other tests or the CNI setup rely on this being privileged, keeping it as‑is is fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- `.github/workflows/kind-e2e.yml` (1 hunks)
- `cmd/multus-daemon/main.go` (2 hunks)
- `e2e/templates/many-pods.yml.j2` (1 hunks)
- `e2e/templates/multus-daemonset-thick.yml.j2` (1 hunks)
- `e2e/test-connection-limit.sh` (1 hunks)
- `pkg/server/types.go` (1 hunks)
- `vendor/golang.org/x/net/netutil/listen.go` (1 hunks)
- `vendor/modules.txt` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/multus-daemon/main.go (2)
pkg/logging/logging.go (1)
Debugf (126-128)
vendor/golang.org/x/net/netutil/listen.go (1)
LimitListener (16-22)
🔇 Additional comments (2)
vendor/modules.txt (1)
204-217: Vendored `golang.org/x/net/netutil` entry is consistent

The new `golang.org/x/net/netutil` line fits correctly under the existing `golang.org/x/net` module section and matches the added vendored package and import usage.

vendor/golang.org/x/net/netutil/listen.go (1)
1-87: Vendored `LimitListener` implementation looks standard

This `netutil.LimitListener` implementation matches the usual upstream pattern (semaphore‑based cap, done channel, wrapped Conn releasing on Close). Keeping it as a straight vendored copy is good for future upstream syncs; I wouldn't tweak behavior here unless you discover a specific bug and can upstream the fix.
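For readers unfamiliar with that pattern, a minimal self-contained sketch of the semaphore-based cap (simplified: it omits the vendored copy's done channel for listener shutdown, and `Limit` is a hypothetical name):

```go
package netlimit

import (
	"net"
	"sync"
)

// limitListener gates Accept with a buffered channel acting as a counting semaphore.
type limitListener struct {
	net.Listener
	sem chan struct{} // buffered to n; one slot per open connection
}

func Limit(l net.Listener, n int) net.Listener {
	return &limitListener{Listener: l, sem: make(chan struct{}, n)}
}

func (l *limitListener) Accept() (net.Conn, error) {
	l.sem <- struct{}{} // acquire; blocks once n connections are open
	c, err := l.Listener.Accept()
	if err != nil {
		<-l.sem // release the slot on Accept failure
		return nil, err
	}
	return &limitConn{Conn: c, release: func() { <-l.sem }}, nil
}

// limitConn releases its semaphore slot exactly once, on Close.
type limitConn struct {
	net.Conn
	once    sync.Once
	release func()
}

func (c *limitConn) Close() error {
	err := c.Conn.Close()
	c.once.Do(c.release)
	return err
}
```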
Closes: #1346
Summary by CodeRabbit

Release Notes

New Features
- Configurable connection limit for the thick daemon's CNI server, capping concurrent connections during pod bursts.

Tests
- New e2e test that launches many pods against a low connection limit to verify pods still come up.