66 changes: 66 additions & 0 deletions docs/testplan/BGP_IPv6_test.md
@@ -0,0 +1,66 @@
# SONiC Switch High-Scale IPv6 BGP Test

## Table of Contents

- [Test Objective](#test-objective)
- [Test Setup](#test-setup)
- [Test Steps](#test-steps)

## Test Objective

This test verifies the scalability and stability of multiple BGP sessions on a SONiC switch. BGP sessions will be established between each Ethernet logical port of the DUT and its neighboring devices. The test evaluates the DUT’s ability to initiate and maintain BGP sessions, validates proper route learning, and measures BGP update convergence time under various conditions.

## Test Setup

This test is designed to be topology-independent: it does not assume or enforce a specific network layout. The only requirement is that the DUT is fully connected, so that it can carry full traffic load under stress. All logical Ethernet ports are used to establish BGP sessions. If the DUT has X logical Ethernet ports and is connected to Y neighboring switches, X/Y BGP sessions are established between each neighbor and the DUT.

![Test Setup](./example_layout.png)

In the above example, the DUT has 256 logical Ethernet ports and is connected to 4 neighboring switches, so 64 (256/4) BGP sessions are established between each neighbor and the DUT.
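The X/Y arithmetic from the setup description can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and rejecting port counts that do not divide evenly across neighbors is an assumption, since the test plan does not say how an uneven split would be handled.

```python
def sessions_per_neighbor(logical_ports: int, neighbors: int) -> int:
    """Return X/Y: the number of BGP sessions between the DUT and each neighbor."""
    if neighbors <= 0:
        raise ValueError("at least one neighbor is required")
    # Assumption: ports are spread evenly; the plan does not cover uneven splits.
    if logical_ports % neighbors != 0:
        raise ValueError("ports do not divide evenly across neighbors")
    return logical_ports // neighbors


# The example layout: 256 logical ports, 4 neighboring switches.
print(sessions_per_neighbor(256, 4))  # 64
```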

## Test Steps
Collaborator

we should have 3 test cases:

  1. All session shutdown/enable time
  2. 1 session shutdown/enable time
  3. Fragmented failure links / Nexthop Group Member Scale Test

Contributor Author

The 1st and 2nd cases are done. Working on the 3rd case...

Contributor Author

Re-wrote the 3rd case.


1. Assign a unique AS number to each of the five switches.
Collaborator

The ASN should come from the topology. We should mention this in the test setup section as part of the topology introduction.

Contributor Author

Moved to test setup section.


2. Between each of the neighboring switches and the DUT: Configure X/Y BGP sessions. Each BGP session should have a dedicated pair of Ethernet ports (one on the DUT and the other on the neighboring device) whose IPv6 addresses are on the same subnet. Set up the BGP neighbors, device neighbors, and port IPv6 addresses for each BGP session.
Collaborator

with our current setup of multi-tier topology, we should not have any special setup for this one.

however, in the test setup, we can mention the approach we use to stress the BGP session on a device.

Collaborator

e.g. with multiple T0 and a single T1 being used.

Contributor Author

I revised both test plans. Please review.


3. Monitor the BGP session establishment on the DUT using command `show ipv6 bgp summary`. Ensure all X BGP sessions are established without errors.
Collaborator

this step should be the first pretest step.

Contributor Author

Done


4. On each neighboring switch: Configure a VLAN, assign `10*X` IPv6 addresses with the specified prefix length, and add all the Ethernet ports connected to IXIA to the VLAN.
Collaborator

might need to update this step with ixia advertising the routes?

Contributor Author

Revised.


5. Monitor the BGP route learning on the DUT by running `show ipv6 route bgp`. Verify the DUT learns and installs all routes.
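The session check in step 3 could be automated along the following lines. This is a minimal sketch, not the test's actual implementation: it assumes the output of `show ipv6 bgp summary` has already been parsed into a hypothetical mapping from neighbor address to session state, and both the function name and the example addresses are made up for illustration.

```python
from typing import Dict, List


def check_sessions(neighbor_states: Dict[str, str], expected_count: int) -> List[str]:
    """Return a list of problems; an empty list means all sessions are up.

    neighbor_states is a hypothetical, already-parsed view of the BGP
    summary: neighbor address -> session state string.
    """
    problems = []
    if len(neighbor_states) != expected_count:
        problems.append(
            f"expected {expected_count} sessions, found {len(neighbor_states)}"
        )
    for neighbor, state in sorted(neighbor_states.items()):
        if state != "Established":
            problems.append(f"{neighbor} is {state}")
    return problems


# Example with two made-up neighbors, one of which is still connecting.
print(check_sessions({"fc00::2": "Established", "fc00::6": "Active"}, 2))
```

Returning the full list of problems rather than stopping at the first one makes a failed run easier to diagnose when X is large.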

## Key Test Cases

### One BGP Session Flap
Collaborator

### Case 1: ...

Contributor Author

Done


1. One session down and up: Shut down one interface on the DUT. Wait till all routes advertised by the impacted BGP session are removed.
Collaborator

Port shutdown
Step 1: Start traffic
Step 2: Shutdown port
Step 3: Evaluate data path reaction time
Step 4: Evaluate route update time

Repeat in reverse for port startup.

Collaborator

We need to be precise about how to evaluate: which command to run, etc.

Contributor Author

Please check my revised version.

2. Bring up the interface and measure the time for BGP session and route reestablishment.
3. Repeat this process and calculate the average update time of this scenario.
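The down/up cycle and averaging described in the steps above can be sketched with a polling helper. Everything here is an assumption for illustration: `bring_down`, `is_down`, `bring_up`, and `is_recovered` are hypothetical callables that a test would supply (for example, wrapping an interface shutdown and a route-table check), and the default timeout and polling interval are arbitrary.

```python
import time
from statistics import mean
from typing import Callable, List


def wait_until(condition: Callable[[], bool], timeout_s: float,
               interval_s: float = 0.5,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> float:
    """Poll condition() until it is true and return the elapsed seconds."""
    start = clock()
    while True:
        if condition():
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(f"condition not met within {timeout_s} s")
        sleep(interval_s)


def average_recovery_time(bring_down, is_down, bring_up, is_recovered,
                          iterations: int = 3, timeout_s: float = 300.0,
                          **poll_kwargs) -> float:
    """Run the down/up cycle several times; return the mean recovery time."""
    samples: List[float] = []
    for _ in range(iterations):
        bring_down()
        # Wait for the routes of the impacted session to be removed.
        wait_until(is_down, timeout_s, **poll_kwargs)
        bring_up()
        # Time from bring-up until sessions and routes are back.
        samples.append(wait_until(is_recovered, timeout_s, **poll_kwargs))
    return mean(samples)
```

Note that with polling, the measured time is only as precise as `interval_s`, so the interval should be small relative to the convergence times being compared.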

### All BGP Sessions Down and Up Test

1. Stop the BGP container on the DUT. Wait till all BGP routes are removed.
Collaborator

Same as above. This needs to be more precise, and we also need to focus more on the data plane reaction time, since we have IXIA in our testbed.

Contributor Author

Please check out my Teams message.

2. Bring up the BGP container and measure the time for BGP session and route reestablishment.
3. Repeat this process and calculate the average update time of this scenario.

### Nexthop Reduction and Restoration Test

1. In one of the T0 switches, run `show ipv6 bgp network <ipv6>/<prefix>` and find the number of nexthops that can be used to reach `<ipv6>/<prefix>`.
2. Randomly pick half of the nexthops and remove them. Run the show command again and record the convergence time.
3. Restore the removed nexthops and record the convergence time again.
4. Repeat this process and calculate the average convergence time of this scenario.
Collaborator

missing the metrics definition here.

Contributor Author

Added. Thank you!
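For step 2 of the nexthop reduction test, the random selection of half the nexthops could look like the sketch below. The function name is hypothetical, and accepting an optional seeded generator is an assumption added here so that a failing run can be reproduced; the test plan itself only says the nexthops are picked randomly.

```python
import random
from typing import List, Optional, Sequence


def pick_nexthops_to_remove(nexthops: Sequence[str],
                            rng: Optional[random.Random] = None) -> List[str]:
    """Pick half of the nexthops (rounded down) at random, without repeats."""
    if rng is None:
        rng = random.Random()
    return rng.sample(list(nexthops), len(nexthops) // 2)


# Example with four made-up nexthop addresses and a fixed seed.
hops = ["fc00::1", "fc00::5", "fc00::9", "fc00::d"]
print(pick_nexthops_to_remove(hops, random.Random(0)))
```

The remaining nexthops are simply those not in the returned list, which is what step 3 restores afterwards.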


## Metrics

Save the BGP convergence time to a database via the final metrics reporter interface provided by the SONiC team in the `test_reporting` folder. An example of how to use the interface is provided in the `telemetry` folder.

| Label | Example Value |
| ----------------------------------------------- | ------------------- |
| `METRIC_LABEL_DEVICE_ID` | switch-A |

| Metric Name | Example Value |
| ----------------------------------------------- | ------------------- |
| `METRIC_NAME_BGP_CONVERGENCE_PORT_RESTART` | 15 |
Collaborator

test.bgp_scale.one_port_down.route_convergence_time_ms
test.bgp_scale.one_port_down.dp_response_time_ms
test.bgp_scale.all_port_down.route_convergence_time_ms
....

Contributor Author

Convergence time is measured in seconds, not milliseconds. Using a suffix like *_s might cause confusion about what it means.

| `METRIC_NAME_BGP_CONVERGENCE_CONTAINER_RESTART` | 72 |
| `METRIC_NAME_BGP_CONVERGENCE_NEXTHOP_CHANGE` | 60 |
Binary file added docs/testplan/example_layout.png