SONiC switch High-scale IPv6 BGP test plan #16759
sm-xu wants to merge 24 commits into sonic-net:master
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines could not run because the pipeline triggers exclude this branch/path.
@@ -0,0 +1,32 @@
# Test Objective
This test aims to verify the scalability and stability of 256 BGP sessions and 10K IPv6 routes in a 2-tier network. It evaluates the DUT’s ability to establish and maintain BGP sessions, ensures proper route learning, and measures BGP update convergence time under various conditions.
The setup is 1 BGP session per port, so we are not limited to 256 BGP sessions.
A multi-tier network should still work.
This test aims to verify the scalability and stability of 256 BGP sessions and 10K IPv6 routes in a 2-tier network. It evaluates the DUT’s ability to establish and maintain BGP sessions, ensures proper route learning, and measures BGP update convergence time under various conditions.

# Test Setup

@@ -0,0 +1,32 @@
# Test Objective
Actually, it would be better to split the route scale into a separate test. For the BGP session test, we can use 4 x the number of ports as the route scale.

For the route scale test, we can use 40 x the number of ports, but in a separate test. In the route scale test, we need to check the latency very carefully.
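A minimal Python sketch of the scale targets suggested in this thread: 4 x the number of ports for the BGP session test and 40 x the number of ports for the separate route scale test. The function name and dictionary keys are hypothetical and are not part of sonic-mgmt.

```python
def route_scale_targets(num_ports: int) -> dict:
    """Return the route counts suggested in the review for a DUT with num_ports ports.

    Hypothetical helper: the multipliers (4 and 40) come from the review
    comments; the key names are illustrative only.
    """
    if num_ports <= 0:
        raise ValueError("num_ports must be a positive integer")
    return {
        # BGP session test: 4 routes per port
        "bgp_session_test_routes": 4 * num_ports,
        # Separate route scale test: 40 routes per port
        "route_scale_test_routes": 40 * num_ports,
    }
```

For the 256-port DUT used as the example in the test plan, this gives 1024 routes for the session test and 10240 routes for the route scale test.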
# Test Setup


1. The testbed consists of four IXIA traffic generators (synchronized using a time-sync metronome) and five SONiC switches, where the BT1 switch is the Device Under Test (DUT).
The test should work with any number of IXIAs.
5. The routing configuration of the BT0 switches should ensure that all data traffic go through the DUT.

# Test Steps
1. Assign a unique AS number to each of the five switches.
We should consider automating this using add-topo or deploy-mg.
All switches can share the same setup, and we can ask the IXIA to advertise the routes.
docs/testplan/BGP_IPv6_test.md
Outdated
2. Between each of the four neighboring switches and the DUT: Configure X/Y BGP sessions. Each BGP session should have a dedicated pair of Ethernet ports (one on the DUT and the other on the neighboring device) whose IPv6 addresses are on the same subnet. Set up the BGP neighbors, device neighbors, and port IPv6 addresses for each BGP session.

3. Monitor the BGP session establishment on the DUT using command “show ipv6 bgp summary”. Ensure all X BGP sessions are established without errors.
The command should be quoted with backticks (`).
docs/testplan/BGP_IPv6_test.md
Outdated
3. Monitor the BGP session establishment on the DUT using command “show ipv6 bgp summary”. Ensure all X BGP sessions are established without errors.

4. In each neighboring switch: Configure a vlan, assign 2500 IPv6 addresses with the specified prefix length and add all the Ethernet ports connected to IXIA to the vlan.
The 2500 should be the total number of ports x 10 (considering 1 port, 1 VLAN route).
docs/testplan/BGP_IPv6_test.md
Outdated
In the above example, the DUT has 256 logical Ethernet ports and is connected to 4 neighboring switches, we will establish 64 BGP sessions between each neighbor and the DUT.

## Test Steps
We should have 3 test cases:
- All sessions shutdown/enable time
- 1 session shutdown/enable time
- Fragmented failure links / Nexthop Group Member Scale Test

The 1st and 2nd cases are done. Working on the 3rd case...

Re-wrote the 3rd case.
docs/testplan/BGP_IPv6_test.md
Outdated
1. In one of the T0 switches, run `show ipv6 bgp network <ipv6>/<prefix>` and find the number of nexthops that can be used to reach `<ipv6>/<prefix>`.
2. Randomly pick half of the next hops and remove them. Run the show command again and record the convergence time.
3. Restore the removed nexthops and record the convergence time again.
4. Repeat this process and calculate the average convergence time of this scenario.
The metrics definition is missing here.
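One possible way to pin down the metric for this scenario, sketched in Python under the assumption that each remove/restore iteration yields one convergence time in seconds. The helper names are hypothetical, and how each sample is actually measured (for example, by polling `show ipv6 bgp network`) is left to the test plan.

```python
import random
from statistics import mean


def pick_nexthops_to_remove(nexthops, rng=None):
    """Randomly pick half of the next hops (rounded down) for removal.

    Hypothetical helper; pass a seeded random.Random for reproducible runs.
    """
    rng = rng or random.Random()
    candidates = list(nexthops)
    return rng.sample(candidates, len(candidates) // 2)


def average_convergence_time(samples_s):
    """Average the per-iteration convergence times, given in seconds."""
    if not samples_s:
        raise ValueError("at least one convergence sample is required")
    return mean(samples_s)
```

Seeding the generator makes the set of removed next hops repeatable across runs, which helps when comparing convergence numbers between builds.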
docs/testplan/BGP_IPv6_test.md
Outdated
## Test Steps

1. Assign a unique AS number to each of the five switches.
The ASN should come from the topology. We should mention this in the test setup section as part of the topology introduction.

Moved to the test setup section.
docs/testplan/BGP_IPv6_test.md
Outdated
1. Assign a unique AS number to each of the five switches.

2. Between each of the neighboring switches and the DUT: Configure X/Y BGP sessions. Each BGP session should have a dedicated pair of Ethernet ports (one on the DUT and the other on the neighboring device) whose IPv6 addresses are on the same subnet. Set up the BGP neighbors, device neighbors, and port IPv6 addresses for each BGP session.
With our current multi-tier topology setup, we should not need any special setup for this one.
However, in the test setup, we can mention the approach we use to stress the BGP sessions on a device.

E.g., with multiple T0s and a single T1 being used.

I revised both test plans. Please review.
docs/testplan/BGP_IPv6_test.md
Outdated
2. Between each of the neighboring switches and the DUT: Configure X/Y BGP sessions. Each BGP session should have a dedicated pair of Ethernet ports (one on the DUT and the other on the neighboring device) whose IPv6 addresses are on the same subnet. Set up the BGP neighbors, device neighbors, and port IPv6 addresses for each BGP session.

3. Monitor the BGP session establishment on the DUT using command `show ipv6 bgp summary`. Ensure all X BGP sessions are established without errors.
This step should be the first pretest step.
docs/testplan/BGP_IPv6_test.md
Outdated
3. Monitor the BGP session establishment on the DUT using command `show ipv6 bgp summary`. Ensure all X BGP sessions are established without errors.

4. In each neighboring switch: Configure a vlan, assign `10*X` IPv6 addresses with the specified prefix length and add all the Ethernet ports connected to IXIA to the vlan.
This step might need to be updated to have the IXIA advertise the routes.
docs/testplan/BGP_IPv6_test.md
Outdated
## Key Test Cases

### One BGP Session Flap
docs/testplan/BGP_IPv6_test.md
Outdated
### One BGP Session Flap

1. One session down and up: Shut down one interface on the DUT. Wait till all routes advertised by the impacted BGP session are removed.
Port shutdown:
Step 1: Start traffic.
Step 2: Shut down the port.
Step 3: Evaluate the data path reaction time.
Step 4: Evaluate the route update time.
Repeat in reverse for port startup.

We need to be precise about how to evaluate, which commands to run, etc.

Please check my revised version.
docs/testplan/BGP_IPv6_test.md
Outdated
### All BGP Sessions Down and Up Test

1. Stop the BGP container on the DUT. Wait till all BGP routes are removed.
Same as above. We need to improve this, and we also need to focus more on the data plane reaction time, since we have IXIA in our testbed.

Please check out my Teams message.
docs/testplan/BGP_IPv6_test.md
Outdated
| Metric Name | Example Value |
| ----------------------------------------------- | ------------------- |
| `METRIC_NAME_BGP_CONVERGENCE_PORT_RESTART` | 15 |
test.bgp_scale.one_port_down.route_convergence_time_ms
test.bgp_scale.one_port_down.dp_response_time_ms
test.bgp_scale.all_port_down.route_convergence_time_ms
....

Convergence time is measured in seconds, not milliseconds. Using a suffix like *_s might cause confusion about what it means.
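A small Python sketch of the naming scheme proposed above, with the unit kept as an explicit parameter because the unit suffix is still being discussed in this thread. The helper names are hypothetical; only the `test.bgp_scale.<scenario>.<metric>_<unit>` pattern is taken from the suggested names.

```python
def metric_name(scenario: str, metric: str, unit: str = "ms") -> str:
    """Build a dotted metric name following the pattern suggested in the review.

    Hypothetical helper; 'unit' is appended as a suffix so the unit is
    visible in the name itself.
    """
    return f"test.bgp_scale.{scenario}.{metric}_{unit}"


def seconds_to_ms(seconds: float) -> int:
    """Convert a measurement taken in seconds to whole milliseconds."""
    return round(seconds * 1000)
```

If the measurements stay in seconds, the same helper can be called with a different `unit` value, so the name and the reported value cannot drift apart.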
Close in favor of #19564