[HLD] Virtual NUT Testbed (vNUT) — KVM-based virtual NUT testing#22977
[HLD] Virtual NUT Testbed (vNUT) — KVM-based virtual NUT testing#22977
Conversation
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
banidoru
left a comment
There was a problem hiding this comment.
Good HLD overall — well-structured, clear deployment/teardown flow, and good use of mermaid diagrams. A few areas need attention:
- TG device metadata inconsistency: The lab devices CSV lists the TG as
IxiaChassis/ixia, but the actual container isdocker-ptf. This will confuse readers and may break inventory-driven logic. - Hardcoded credentials in example inventory: Should explicitly note these are placeholders and recommend using Ansible Vault.
- Missing resource requirements: No mention of host RAM/CPU/disk needed to run 3 DUT containers + 1 TG.
- Missing error handling/troubleshooting guidance: What happens if container launch fails, veth creation fails, or services don't come up?
- Security note needed:
--privilegedand Docker socket mount are significant security implications worth calling out.
banidoru
left a comment
There was a problem hiding this comment.
Overall a well-structured HLD. The document clearly covers architecture, deployment, and teardown. A few areas need attention: internal consistency between the deployment overview and implementation details, missing explanation of key parameters, and the limitations section could be more actionable. See inline comments.
banidoru
left a comment
There was a problem hiding this comment.
Good HLD overall — well-structured with clear sections covering architecture, deployment, and teardown. Several concerns already raised by prior reviewers are valid (Ixia metadata mismatch, tg_api_server relevance, hardcoded passwords, security implications). Additional observations below.
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
banidoru
left a comment
There was a problem hiding this comment.
All prior feedback has been thoroughly addressed. The document is now well-structured with proper security notes, resource requirements, configurable subnet documentation, two-phase container launch explanation, expanded limitations section, and complete CLI flag documentation. No new issues found. LGTM.
banidoru
left a comment
There was a problem hiding this comment.
Re-review of iteration 2: All previous feedback has been thoroughly addressed. Key improvements include:
- Security warning for
--privilegedand Docker socket mount - Resource requirements section added
tg_api_serverandauto_recover: 'True'string usage explained with inline comments- IxiaChassis/Force10-S6000 metadata documented as framework constraints with known limitations noted
- Credential handling updated to reference cSONiC credential resolution path
- Two-phase container networking explained (default bridge → br-mgmt for static IPs)
- Deployment Step 3 now consistent with Section 8's two-phase description
- Management subnet documented as configurable via
defaults/main.yml - veth naming (
vmprefix collision withvstestbeds) documented withvnprefix under consideration - Orphaned container detection guidance added
- All
run_tests.shflags (-m individual,-a False,-u) now documented - Limitations section expanded with test compatibility categories and performance notes
The document is well-structured and comprehensive. LGTM.
banidoru
left a comment
There was a problem hiding this comment.
Re-review (iteration 2): All prior feedback has been addressed. The document now includes security warnings, resource requirements, subnet configurability notes, comprehensive flag documentation, hash input format details, HwSku/IxiaChassis metadata explanations, and consistent deployment step descriptions. Well-structured HLD — LGTM.
banidoru
left a comment
There was a problem hiding this comment.
All reviewers approved on re-review. All prior feedback addressed. LGTM.
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
banidoru
left a comment
There was a problem hiding this comment.
Re-review (iteration 3) summary:
The document has improved significantly. All 20 previously resolved threads remain properly addressed — good work on the credential handling, security notes, resource requirements, naming documentation, and consistency fixes.
3 open threads remain unresolved:
- Topology placement (line 48) — The
nut-2tiersexample topology is still embedded in the Architecture section rather than moved to a dedicated example section as requested. - Manual container setup (line 73) — The manual
docker runapproach is still shown alongside the recommendedsetup-container.sh. Consider removing or collapsing the manual approach. - Example co-location (line 100) — The example topology details should be co-located with the testbed YAML definition section.
1 open thread resolved:
- Credential handling (line 125) — Now properly references
group_varscredential resolution instead of hardcoded passwords. ✅
No new issues found in this iteration. The document is well-structured and thorough. The remaining items are organizational suggestions from another reviewer.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
banidoru
left a comment
There was a problem hiding this comment.
Re-review (iteration 3) summary:
Previously resolved threads: 20/24 threads confirmed addressed and remain resolved. The doc has improved significantly — resource requirements, security notes, credential handling, and implementation details are all well-documented.
3 threads remain open (from maintainer feedback):
- Topology as example (line 48):
nut-2tiersdetails are still embedded in Section 2 (Architecture) rather than in a dedicated example section. - setup-container only (line 73): Manual
docker runapproach is still shown alongside the recommendedsetup-container.shpath. Consider removing or collapsing it. - Example co-location (line 100): Related to #1 — example topology should be moved near the YAML definition.
1 thread resolved in this iteration: Credentials section (line 125) now properly references group_vars without hardcoded passwords.
No new issues found in the latest diff. The document is well-structured and comprehensive. Addressing the 3 remaining maintainer comments about document organization would complete this PR.
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
banidoru
left a comment
There was a problem hiding this comment.
Iteration 3 re-review: all prior review threads are resolved. The HLD is comprehensive and well-structured — architecture, deployment, teardown, test execution, and limitations are clearly documented. The Ansible role and vnut_network.py module are clean with good idempotency handling. The doc now addresses previous feedback on resource requirements, credential handling, security notes, configurable subnets, hash naming, and the vm prefix collision risk. No new issues found. LGTM.
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
Addressed review feedback in commit f870571:
|
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Add high-level design document for the vNUT testbed feature at docs/testbed/README.testbed.VirtualNUT.md. Covers architecture, prerequisites, deployment, teardown, test execution, implementation details, and a complete nut-2tiers example with mermaid resource diagram showing exact VM names, bridge names, veth pairs, and interface mappings. Signed-off-by: r12f <r12f.code@gmail.com>
|
/azp run |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |
yxieca
left a comment
There was a problem hiding this comment.
AI agent on behalf of Ying. Reviewed; no issues found.
yxieca
left a comment
There was a problem hiding this comment.
AI agent on behalf of Ying. Reviewed; no issues found.
|
AI agent on behalf of Ying.\n- Avoid using test_ prefix for fixtures per review guidance. |
yxieca
left a comment
There was a problem hiding this comment.
AI agent on behalf of Ying. Reviewed; no issues found.
yxieca
left a comment
There was a problem hiding this comment.
AI agent on behalf of Ying. Reviewed; no issues found.
yxieca
left a comment
There was a problem hiding this comment.
AI agent on behalf of Ying. Reviewed; no issues found.
Description of PR
Summary:
Add High-Level Design document for the Virtual NUT Testbed (vNUT), which enables running sonic-mgmt NUT tests against KVM-based virtual SONiC instances without physical hardware.
The HLD covers architecture, prerequisites, testbed definition, deployment/teardown via
testbed-cli.sh, test execution, implementation details, and a completenut-2tiersexample.Type of change
Back port request
Approach
What is the motivation for this PR?
Provide design documentation for the vNUT testbed feature (implementation: #22976).
How did you do it?
Added
docs/testbed/README.testbed.VirtualNUT.mdcovering:br1management bridgeHow did you verify/test it?
All documented commands verified against working vNUT deployment:
Any platform specific information?
N/A
Documentation
This PR is the documentation itself. Implementation PR: #22976