[test plan] Test plan for BGP scale test #15702
Changes from 2 commits
bd9174d
9ec53f7
2b4603c
b191e21
f4dc3d9
dbb2477
16928f7
4cced59
4b491be
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,108 @@ | ||
| # BGP Scale Test Plan | ||
|
|
||
| - [Overview](#overview) | ||
| - [Scope](#scope) | ||
| - [Testbed](#testbed) | ||
| - [Setup configuration](#setup-configuration) | ||
| - [Test methodology](#test-methodology) | ||
| - [Test Cases](#test-cases) | ||
| - [BGP Sessions Flapping Test](#bgp-sessions-flapping-test) | ||
| - [Unisolation Test](#unisolation-test) | ||
| - [Nexthop Group Member Scale Test](#nexthop-group-member-scale-test) | ||
|
|
||
|
|
||
| ## Overview | ||
|
|
||
| This test plan verifies whether the control and data planes can handle the initialization and flapping of numerous BGP sessions that carry a large number of routes, and estimates the impact on them. | ||
|
|
||
|
|
||
| ## Scope | ||
|
|
||
|
|
||
| This test plan runs on any device running SONiC with a fully functioning configuration and numerous BGP peers (256 or 512). | ||
|
|
||
| This test plan is dedicated to IPv6. | ||
|
|
||
|
|
||
| This test plan shows whether any service crashes, whether hardware resources run out, and whether the device has acceptable performance and data/control plane availability. | ||
|
|
||
|
|
||
|
|
||
|
|
||
| ## Testbed | ||
|
|
||
|
|
||
| This test runs on testbeds with the topologies topo_t0-isolated-u254d2, topo_t0-isolated-u510d2, topo_t1-isolated-u2d254 and topo_t1-isolated-u2d510. | ||
|
|
||
| *Fig.1 topo_t0-isolated-u254d2* | ||
|
|
||
|  | ||
|
|
||
| *Fig.2 topo_t0-isolated-u510d2* | ||
|  | ||
|
|
||
| *Fig.3 topo_t1-isolated-u2d254* | ||
|  | ||
|
|
||
| *Fig.4 topo_t1-isolated-u2d510* | ||
|  | ||
|
|
||
|
|
||
| # Setup configuration | ||
| The count of routes from BGP peers is vital. We will leverage exabgp to advertise routes to all BGP peers, and those routes will finally be advertised to the device under test (DUT). | ||
|
|
||
| When the DUT is a T0, we use exabgp in two steps. First, we advertise 511 routes with prefix length 120 to all peer T1 devices to simulate downstream routes (VLAN IPv6 addresses of T0s). Second, we advertise 15 routes with prefix length 64 to all peer T1 devices to simulate upstream routes (aggregated IPv6 addresses of T0s' VLANs on T2s). The DUT T0 then receives those routes from its BGP peers. | ||
|
Collaborator
It might be better to say: for each neighbor, we will advertise 1k routes in total: 512 /120 and 512 /128.
Collaborator
We will skip the T2 ones here. They won't make a difference but can cause a lot of confusion.
Member
Author
Because we have 1 /120 and 1 /128 on the T0 DUT, I think the route counts are 511 /120 plus 511 /128, right?
Member
Author
Is it 511 or 512? |
||
|
|
||
| When the DUT is a T1, we won't mock any routes. | ||
|
|
||
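The route sets described for the T0 case can be sketched as follows. This is a minimal illustration, not the plan's actual tooling: the helper name `mock_routes` and both base prefixes (taken from the IPv6 documentation range) are assumptions, and the counts are the ones stated in the plan text, which the review thread is still discussing.

```python
import ipaddress


def mock_routes(base, new_prefix, count):
    """Return the first `count` subnets of length `new_prefix` inside `base`."""
    network = ipaddress.ip_network(base)
    # subnets() is a generator, so only `count` subnets are ever materialized.
    subnets = network.subnets(new_prefix=new_prefix)
    return [str(next(subnets)) for _ in range(count)]


# Hypothetical base prefixes; a real testbed would use its own address plan.
downstream = mock_routes("2001:db8:1::/64", 120, 511)  # simulated T0 VLAN routes
upstream = mock_routes("2001:db8:100::/56", 64, 15)    # simulated aggregated routes
```

Keeping the prefix length and count as parameters makes it easy to switch between 511 and 512 routes once the count is settled.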
|
|
||
|
|
||
|
|
||
| # Test methodology | ||
| To simulate system initialization, we shut down all ports before the test. | ||
|
|
||
|
|
||
| To simulate BGP session flapping on the DUT, we will shut down a port for a short while and then bring it back up. | ||
|
|
||
| To simulate BGP route flapping on the DUT, we will withdraw routes on the BGP peers via exabgp for a short while and then advertise them again. | ||
|
|
||
| To check that all expected routes are programmed into the ASIC, we will first check the route count. We will then keep sending packets for all routes and check that every expected nexthop in the same group receives packets for all routes. | ||
|
|
||
| To estimate data plane downtime, we will keep sending packets at a fixed interval and observe the packet drop count. | ||
|
|
||
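The downtime estimate described above reduces to simple arithmetic: each dropped packet accounts for roughly one send interval of outage. A minimal sketch, where the function name and parameters are assumptions for illustration:

```python
def estimate_downtime(sent, received, interval_s):
    """Approximate data plane downtime from the packet drop count.

    Assumes packets are sent at a fixed interval of `interval_s` seconds,
    so every lost packet stands for about one interval of downtime.
    """
    if received > sent:
        raise ValueError("received count cannot exceed sent count")
    return (sent - received) * interval_s
```

The resolution of this estimate is bounded by the send interval, so a shorter interval gives a finer downtime measurement at the cost of more traffic.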
|
|
||
| # Test Cases | ||
|
|
||
|
|
||
| ## BGP Sessions Flapping Test | ||
| ### Objective | ||
| When BGP sessions are flapping, make sure the control plane stays functional and the data plane has no downtime, or only acceptable downtime. | ||
| ### Steps | ||
| 1. Pick N random ports to shut down. | ||
| 1. Start sending packets for all routes at a fixed time interval to the remaining ports via ptf. | ||
| 1. Shut down the N ports and count the packets received on the ports. | ||
| 1. Wait until the N BGP sessions are down and the routes are stable. | ||
| 1. Stop sending packets. | ||
| 1. Estimate the data plane downtime. | ||
|
|
||
|
|
||
| ## Unisolation Test | ||
| ### Objective | ||
| In the worst-case scenario, verify that the control and data planes have acceptable convergence time. | ||
| ### Steps | ||
| 1. Shut down all ports on the device. | ||
| 1. Start sending packets for all routes at a fixed time interval to all ports via ptf. | ||
| 1. Bring all ports back up and count the packets received on the ports. | ||
| 1. Wait until the routes are stable. | ||
| 1. Stop sending packets. | ||
| 1. Estimate the control/data plane convergence time. | ||
|
|
||
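The "wait until the routes are stable" step and the convergence time estimate can be sketched as a polling loop. This is an illustrative assumption about how stability might be defined (the route count matching the expected value for several consecutive polls); `get_route_count` is a hypothetical callable that would query the DUT.

```python
import time


def wait_until_stable(get_route_count, expected, polls=3, interval_s=1.0,
                      timeout_s=300.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the route count equals `expected` for `polls` polls in a row.

    Returns the elapsed time in seconds, usable as a rough convergence time.
    Raises TimeoutError if stability is not reached within `timeout_s`.
    """
    start = clock()
    streak = 0
    while clock() - start < timeout_s:
        if get_route_count() == expected:
            streak += 1
            if streak >= polls:
                return clock() - start
        else:
            streak = 0  # any mismatch restarts the stability window
        sleep(interval_s)
    raise TimeoutError("routes did not stabilize within %.0f s" % timeout_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays. The returned time includes the stability window itself, so it slightly overestimates convergence.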
|
|
||
| ## Nexthop Group Member Scale Test | ||
| ### Objective | ||
| When routes on BGP peers are flapping, make sure DUT's control plane is functional and data plane has no downtime or acceptable downtime. | ||
|
|
||
| ### Steps | ||
| 1. Pick N random BGP peers whose routes will be manipulated. | ||
| 1. Pick a random half of the common routes, referred to below as RHoCRs. | ||
| #### Test Withdrawal | ||
| 1. Start sending packets for the RHoCRs at a fixed time interval to all ports via ptf. | ||
| 1. Withdraw the RHoCRs. | ||
| 1. Wait until the routes are stable. | ||
| 1. Stop sending packets. | ||
| 1. Estimate the data plane downtime. | ||
| #### Test Advertising | ||
| 1. Start sending packets for the RHoCRs at a fixed time interval to all ports via ptf. | ||
| 1. Advertise the RHoCRs. | ||
| 1. Wait until the routes are stable. | ||
| 1. Stop sending packets. | ||
| 1. Estimate the control/data plane convergence time. | ||
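The selection in the first two steps of this test case could look like the sketch below. The plan does not define "common routes"; here they are assumed to be the routes advertised by every selected peer. The helper name `pick_rhocrs` and the `routes_by_peer` mapping are hypothetical.

```python
import random


def pick_rhocrs(routes_by_peer, n_peers, seed=None):
    """Pick `n_peers` random peers and a random half of their common routes.

    `routes_by_peer` maps a peer name to the routes it advertises.
    Assumption: "common routes" are those advertised by all picked peers.
    """
    rng = random.Random(seed)  # a seed makes a failing run reproducible
    peers = rng.sample(sorted(routes_by_peer), n_peers)
    common = set.intersection(*(set(routes_by_peer[p]) for p in peers))
    rhocrs = rng.sample(sorted(common), len(common) // 2)
    return peers, rhocrs
```

Logging the seed alongside the test result would let a failure be replayed with the same peers and routes.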