
DAOS-3985 control: Add ControlInterface to server config#17367

Merged
mjmac merged 1 commit into master from mjmac/DAOS-3985
Feb 19, 2026


@mjmac
Contributor

@mjmac mjmac commented Jan 12, 2026

By default, the control plane server binds to 0.0.0.0, which
means that it is listening to all addresses on all interfaces.
In some cases, the admin may prefer to specify a single interface
to be used for control plane traffic.

When control_iface is set in daos_server.yml, the server will
use the lowest IPv4 address on that interface as both the listen
address and the address recorded in the management database. If
the prometheus listener is configured, it will also use the same
address used for the control interface.

Features: control
Signed-off-by: Michael MacDonald [email protected]

@github-actions

github-actions bot commented Jan 12, 2026

Ticket title is 'Allow Control Plane to bind to specific interfaces'
Status is 'In Progress'
Labels: 'triaged'
https://daosio.atlassian.net/browse/DAOS-3985

Copilot AI left a comment

Pull request overview

This pull request adds a control_interface configuration option to the DAOS server configuration, allowing administrators to bind the control plane listener to a specific network interface instead of the default behavior of listening on all interfaces (0.0.0.0). When configured, the server uses the first (lowest) IPv4 address from the specified interface for both the control plane listener and the Prometheus telemetry exporter.

Changes:

  • Added control_interface configuration parameter with validation and error handling
  • Modified control plane and Prometheus listener binding logic to use the configured interface
  • Comprehensive test coverage including unit tests and functional tests

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| utils/config/daos_server.yml | Added documentation for the new control_interface configuration parameter |
| src/control/server/config/server.go | Added ControlInterface field to Server struct and WithControlInterface method |
| src/control/server/config/server_test.go | Updated test to include the new configuration option |
| src/control/server/config/faults.go | Added fault definitions for invalid control interface and address mismatch scenarios |
| src/control/fault/code/codes.go | Added fault codes for control interface errors |
| src/control/server/server.go | Modified initNetwork to look up and validate the control interface |
| src/control/server/server_utils.go | Implemented core logic including getFirstIPv4Addr and updated getControlAddr and createListener |
| src/control/server/server_utils_test.go | Comprehensive unit tests for the new functionality |
| src/control/server/telemetry.go | Updated Prometheus exporter to use the configured bind address |
| src/control/lib/telemetry/promexp/httpd.go | Added BindAddress field to ExporterConfig for binding to specific addresses |
| src/tests/ftest/util/server_utils_params.py | Added control_interface parameter to test infrastructure |
| src/tests/ftest/control/daos_server_config.yaml | Added test cases for control interface validation |
| src/tests/ftest/control/daos_server_config.py | Added functional test for control interface configuration |


@daosbuild3
Collaborator

Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17367/2/display/redirect

@daosbuild3
Collaborator

Test stage Unit Test on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17367/2/display/redirect

@mjmac mjmac self-assigned this Jan 12, 2026
@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/4/execution/node/1318/log

@daosbuild3
Collaborator

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/5/execution/node/1318/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/5/execution/node/1308/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/6/execution/node/427/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/7/execution/node/1317/log

@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/11/execution/node/1328/log

@mjmac mjmac force-pushed the mjmac/DAOS-3985 branch 2 times, most recently from 0efe88a to bc4f884 February 11, 2026 20:38
By default, the control plane server binds to 0.0.0.0, which
means that it is listening to all addresses on all interfaces.
In some cases, the admin may prefer to specify a single interface
to be used for control plane traffic.

When control_iface is set in daos_server.yml, the server will
use the first IPv4 address on that interface as both the listen
address and the address recorded in the management database. If
the prometheus listener is configured, it will also use the first
address found for the control interface.

Features: control
Signed-off-by: Michael MacDonald <[email protected]>
@mjmac mjmac marked this pull request as ready for review February 13, 2026 16:28
@mjmac mjmac requested review from a team as code owners February 13, 2026 16:28
Contributor

@daltonbohning daltonbohning left a comment

ftest LGTM

Contributor

@knard38 knard38 left a comment

It sounds strange to mix IPv4 as the default interface with an interface name.
Why not use an IPv4 address instead of a control interface?

@tanabarr
Contributor

> It sounds strange to mix IPv4 as the default interface with an interface name. Why not use an IPv4 address instead of a control interface?

I guess it's so the administrator doesn't have to worry about the networking environment and how the hostname gets resolved. For a large number of hosts, it would be tedious to fetch the IP address for each interface on each host. Just my thoughts.

Contributor

@kjacque kjacque left a comment

Nice work. Just had a couple questions to further my understanding, none of them blocking either way.

@mjmac
Contributor Author

mjmac commented Feb 18, 2026

>> It sounds strange to mix IPv4 as the default interface with an interface name. Why not use an IPv4 address instead of a control interface?

> I guess it's so the administrator doesn't have to worry about the networking environment and how the hostname gets resolved. For a large number of hosts, it would be tedious to fetch the IP address for each interface on each host. Just my thoughts.

Yes, exactly. Setting an IPv4 address would require the admin to generate a custom server YAML for every host, which is completely counter to the project's configuration philosophy, and would be kind of a nightmare to manage at Aurora scale.

@knard38
Contributor

knard38 commented Feb 18, 2026

>>> It sounds strange to mix IPv4 as the default interface with an interface name. Why not use an IPv4 address instead of a control interface?

>> I guess it's so the administrator doesn't have to worry about the networking environment and how the hostname gets resolved. For a large number of hosts, it would be tedious to fetch the IP address for each interface on each host. Just my thoughts.

> Yes, exactly. Setting an IPv4 address would require the admin to generate a custom server YAML for every host, which is completely counter to the project's configuration philosophy, and would be kind of a nightmare to manage at Aurora scale.

Sounds reasonable. Thanks for the explanation.

@mjmac mjmac merged commit 91cd4c4 into master Feb 19, 2026
53 of 57 checks passed
@mjmac mjmac deleted the mjmac/DAOS-3985 branch February 19, 2026 01:38
@mchaarawi
Contributor

mchaarawi commented Feb 20, 2026

It sounds like this is causing some failures in the CI unit tests?
https://jenkins-3.daos.hpc.amslabs.hpecorp.net/blue/organizations/jenkins/daos-stack%2Fdaos/detail/master/599/tests

Or maybe not, since it passed in this PR. Maybe it's an intermittent problem, but I'm not sure whether it was seen before this PR.

@mchaarawi
Contributor

> It sounds like this is causing some failures in the CI unit tests? https://jenkins-3.daos.hpc.amslabs.hpecorp.net/blue/organizations/jenkins/daos-stack%2Fdaos/detail/master/599/tests

> Or maybe not, since it passed in this PR. Maybe it's an intermittent problem, but I'm not sure whether it was seen before this PR.

Sorry, please disregard. There is already a ticket for this issue, which seems to have been happening for a while now.


8 participants