DAOS-3985 control: Add ControlInterface to server config#17367
Ticket title is 'Allow Control Plane to bind to specific interfaces'
7b73d8a to e685b64
Pull request overview
This pull request adds a control_interface configuration option to the DAOS server configuration, allowing administrators to bind the control plane listener to a specific network interface instead of the default behavior of listening on all interfaces (0.0.0.0). When configured, the server uses the first (lowest) IPv4 address from the specified interface for both the control plane listener and the Prometheus telemetry exporter.
Changes:
- Added control_interface configuration parameter with validation and error handling
- Modified control plane and Prometheus listener binding logic to use the configured interface
- Comprehensive test coverage including unit tests and functional tests
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| utils/config/daos_server.yml | Added documentation for the new control_interface configuration parameter |
| src/control/server/config/server.go | Added ControlInterface field to Server struct and WithControlInterface method |
| src/control/server/config/server_test.go | Updated test to include the new configuration option |
| src/control/server/config/faults.go | Added fault definitions for invalid control interface and address mismatch scenarios |
| src/control/fault/code/codes.go | Added fault codes for control interface errors |
| src/control/server/server.go | Modified initNetwork to look up and validate the control interface |
| src/control/server/server_utils.go | Implemented core logic including getFirstIPv4Addr and updated getControlAddr and createListener |
| src/control/server/server_utils_test.go | Comprehensive unit tests for the new functionality |
| src/control/server/telemetry.go | Updated Prometheus exporter to use the configured bind address |
| src/control/lib/telemetry/promexp/httpd.go | Added BindAddress field to ExporterConfig for binding to specific addresses |
| src/tests/ftest/util/server_utils_params.py | Added control_interface parameter to test infrastructure |
| src/tests/ftest/control/daos_server_config.yaml | Added test cases for control interface validation |
| src/tests/ftest/control/daos_server_config.py | Added functional test for control interface configuration |
Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17367/2/display/redirect
e685b64 to 702ab86
Test stage Unit Test on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17367/2/display/redirect
702ab86 to 3f67fa8
3f67fa8 to f31edf1
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/4/execution/node/1318/log
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/5/execution/node/1318/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/5/execution/node/1308/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/6/execution/node/427/log
f31edf1 to 75de05d
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/7/execution/node/1317/log
75de05d to 37fdfc0
Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/8/execution/node/1011/log
Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/8/execution/node/1086/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17367/9/testReport/
Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/9/execution/node/1014/log
37fdfc0 to ad3e628
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17367/11/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17367/11/execution/node/1328/log
0efe88a to bc4f884
By default, the control plane server binds to 0.0.0.0, which means that it is listening to all addresses on all interfaces. In some cases, the admin may prefer to specify a single interface to be used for control plane traffic.
When control_iface is set in daos_server.yml, the server will use the first IPv4 address on that interface as both the listen address and the address recorded in the management database. If the prometheus listener is configured, it will also use the first address found for the control interface.
Features: control
Signed-off-by: Michael MacDonald <[email protected]>
bc4f884 to 2f0b9af
knard38
left a comment
Sounds strange to mix IPv4 as the default interface and an interface name.
Why not use an IPv4 address instead of a control interface?
I guess it's so the administrator doesn't have to worry about the networking environment and how the hostname gets resolved. For a large number of hosts, it would be tedious to fetch the IP addresses for each interface on each host. Just my thoughts.
kjacque
left a comment
Nice work. Just had a couple questions to further my understanding, none of them blocking either way.
Yes, exactly. Setting an IPv4 address would require the admin to generate a custom server YAML for every host, which is completely counter to the project's configuration philosophy and would be kind of a nightmare to manage at Aurora scale.
Sounds reasonable. Thanks for the explanation.
It sounds like this is causing some failures in the CI unit tests? Or maybe not, since it passed in this PR. It may be an intermittent problem, but I'm not sure whether it was seen before this PR.
Sorry, please disregard. There is already a ticket for this issue, which seems to have been happening for a while now.
By default, the control plane server binds to 0.0.0.0, which
means that it is listening to all addresses on all interfaces.
In some cases, the admin may prefer to specify a single interface
to be used for control plane traffic.
When control_iface is set in daos_server.yml, the server will
use the lowest IPv4 address on that interface as both the listen
address and the address recorded in the management database. If
the prometheus listener is configured, it will also use the same
address used for the control interface.
Features: control
Signed-off-by: Michael MacDonald [email protected]
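The binding behavior described in the commit message can be illustrated with a small sketch. It is not the PR's createListener or exporter code: the function listenAddr is hypothetical, and the addresses and port numbers are example values chosen for illustration. It assumes that an unset control interface leaves the bind address empty, in which case the listener falls back to 0.0.0.0, and that the control plane and Prometheus listeners share the same bind address on different ports.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listenAddr is a hypothetical helper that builds a host:port listen
// address. An empty bindAddr falls back to 0.0.0.0, meaning all
// addresses on all interfaces.
func listenAddr(bindAddr string, port int) string {
	if bindAddr == "" {
		bindAddr = "0.0.0.0"
	}
	return net.JoinHostPort(bindAddr, strconv.Itoa(port))
}

func main() {
	// No control interface configured: listen on all interfaces.
	fmt.Println(listenAddr("", 10001)) // prints 0.0.0.0:10001

	// Control interface configured: its selected IPv4 address is
	// reused for both the control plane and telemetry listeners.
	bind := "10.0.0.3"                  // example address only
	fmt.Println(listenAddr(bind, 10001)) // prints 10.0.0.3:10001
	fmt.Println(listenAddr(bind, 9191))  // prints 10.0.0.3:9191
}
```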