Commit 40e10c4

Test plan for feature Generic Hash (#7524)
Test plan for the new Generic Hash feature. The HLD of Generic Hash: sonic-net/SONiC#1101. The test implementation is not included in this PR; a separate PR will be opened for it.
1 parent acc686e commit 40e10c4

1 file changed

Lines changed: 337 additions & 0 deletions

# Generic Hash Test Plan

## Related documents

| **Document Name** | **Link** |
|-------------------|----------|
| SONiC Generic Hash | [https://github.com/sonic-net/SONiC/blob/master/doc/hash/hash-design.md](https://github.com/sonic-net/SONiC/blob/master/doc/hash/hash-design.md) |

## 1. Overview
The hashing algorithm is used to make traffic-forwarding decisions for traffic exiting the switch.
It makes hashing decisions based on values in various packet fields, as well as on the hash seed value.
The packet fields used by the hashing algorithm vary with the configuration on the switch.

For ECMP, the hashing algorithm determines how incoming traffic is forwarded to the next-hop device.
For LAG, the hashing algorithm determines how traffic is placed onto the LAG member links to manage
bandwidth by evenly load-balancing traffic across the outgoing links.

Generic Hash is a feature that allows the user to configure which hash fields are used by the hashing algorithm, by providing a global switch hash configuration for ECMP and LAG.

The sonic-mgmt generic hash tests validate that the hash configurations can be applied successfully and that the hash behavior is as expected.

## 2. Requirements

### 2.1 The Generic Hash feature supports the following functionality:
1. Ethernet packet hashing configuration with inner/outer IP frames
2. Global switch hash configuration for ECMP and LAG
3. Warm/Fast reboot

### 2.2 This feature will support the following commands:

1. config: set switch hash global configuration
2. show: display switch hash global configuration or capabilities

### 2.3 This feature will provide error handling for the following situations:

#### 2.3.1 Frontend
**This feature will provide error handling for the following situations:**
1. Invalid parameter value
#### 2.3.2 Backend
**This feature will provide error handling for the following situations:**
1. Missing parameters
2. Invalid parameter value
3. Parameter removal
4. Configuration removal

## 3. Scope

The test verifies that the hash configuration can be added and updated through the generic hash feature, and that the ECMP and LAG hash behavior changes according to the generic hash configuration.

### 3.1 Scale / Performance

No scale or performance tests are planned.

### 3.2 CLI commands

#### 3.2.1 Config
The following command can be used to configure generic hash:
```
config
|--- switch-hash
     |--- global
          |--- ecmp-hash ARGS
          |--- lag-hash ARGS
```

Examples:
The following commands update the switch hash global configuration:
```
config switch-hash global ecmp-hash \
'DST_MAC' \
'SRC_MAC' \
'ETHERTYPE' \
'IP_PROTOCOL' \
'DST_IP' \
'SRC_IP' \
'L4_DST_PORT' \
'L4_SRC_PORT' \
'INNER_DST_MAC' \
'INNER_SRC_MAC' \
'INNER_ETHERTYPE' \
'INNER_IP_PROTOCOL' \
'INNER_DST_IP' \
'INNER_SRC_IP' \
'INNER_L4_DST_PORT' \
'INNER_L4_SRC_PORT'
```

```
config switch-hash global lag-hash \
'DST_MAC' \
'SRC_MAC' \
'ETHERTYPE' \
'IP_PROTOCOL' \
'DST_IP' \
'SRC_IP' \
'L4_DST_PORT' \
'L4_SRC_PORT' \
'INNER_DST_MAC' \
'INNER_SRC_MAC' \
'INNER_ETHERTYPE' \
'INNER_IP_PROTOCOL' \
'INNER_DST_IP' \
'INNER_SRC_IP' \
'INNER_L4_DST_PORT' \
'INNER_L4_SRC_PORT'
```
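
A test can generate these command lines from a list of fields instead of hard-coding them. The following is a minimal sketch: `build_hash_config_cmd` is a hypothetical helper name, and rejecting an empty field list is an assumption of the sketch rather than documented CLI behavior. Only the command syntax is taken from this plan.

```python
import shlex

# Hash types accepted by "config switch-hash global", per section 3.2.1.
HASH_TYPES = ("ecmp-hash", "lag-hash")


def build_hash_config_cmd(hash_type, fields):
    """Return the CLI string that sets the given hash field list.

    Hypothetical helper for illustration; not part of sonic-mgmt.
    """
    if hash_type not in HASH_TYPES:
        raise ValueError("unknown hash type: {}".format(hash_type))
    # Assumption of this sketch: an empty field list is treated as an error.
    if not fields:
        raise ValueError("at least one hash field is required")
    # shlex.quote leaves plain names such as DST_MAC unchanged.
    quoted = " ".join(shlex.quote(field) for field in fields)
    return "config switch-hash global {} {}".format(hash_type, quoted)
```

For example, `build_hash_config_cmd("lag-hash", ["DST_MAC", "SRC_MAC"])` yields a single-line equivalent of the multi-line commands shown above.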

#### 3.2.2 Show
The following commands show the switch hash global configuration and capabilities:
```
show
|--- switch-hash
     |--- global
     |--- capabilities
```

Example:
**The following command shows switch hash global configuration:**
```bash
root@sonic:/home/admin# show switch-hash global
ECMP HASH          LAG HASH
-----------------  -----------------
DST_MAC            DST_MAC
SRC_MAC            SRC_MAC
ETHERTYPE          ETHERTYPE
IP_PROTOCOL        IP_PROTOCOL
DST_IP             DST_IP
SRC_IP             SRC_IP
L4_DST_PORT        L4_DST_PORT
L4_SRC_PORT        L4_SRC_PORT
INNER_DST_MAC      INNER_DST_MAC
INNER_SRC_MAC      INNER_SRC_MAC
INNER_ETHERTYPE    INNER_ETHERTYPE
INNER_IP_PROTOCOL  INNER_IP_PROTOCOL
INNER_DST_IP       INNER_DST_IP
INNER_SRC_IP       INNER_SRC_IP
INNER_L4_DST_PORT  INNER_L4_DST_PORT
INNER_L4_SRC_PORT  INNER_L4_SRC_PORT
```

**The following command shows switch hash capabilities:**
```bash
root@sonic:/home/admin# show switch-hash capabilities
ECMP HASH          LAG HASH
-----------------  -----------------
IN_PORT            IN_PORT
DST_MAC            DST_MAC
SRC_MAC            SRC_MAC
ETHERTYPE          ETHERTYPE
VLAN_ID            VLAN_ID
IP_PROTOCOL        IP_PROTOCOL
DST_IP             DST_IP
SRC_IP             SRC_IP
L4_DST_PORT        L4_DST_PORT
L4_SRC_PORT        L4_SRC_PORT
INNER_DST_MAC      INNER_DST_MAC
INNER_SRC_MAC      INNER_SRC_MAC
INNER_ETHERTYPE    INNER_ETHERTYPE
INNER_IP_PROTOCOL  INNER_IP_PROTOCOL
INNER_DST_IP       INNER_DST_IP
INNER_SRC_IP       INNER_SRC_IP
INNER_L4_DST_PORT  INNER_L4_DST_PORT
INNER_L4_SRC_PORT  INNER_L4_SRC_PORT
```
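
To compare this output with an expected field list, a test first has to split the two columns. The sketch below is one possible parser, not the sonic-mgmt implementation. It assumes the table is column-aligned under a dashed separator row with the ECMP column first, as in the examples above; `parse_switch_hash_table` is a hypothetical name.

```python
import re


def parse_switch_hash_table(output):
    """Split 'show switch-hash' output into ECMP and LAG field lists.

    Assumption: the table is column-aligned under a dashed separator row,
    with the ECMP column first, as in the examples above.
    """
    lines = [ln for ln in output.splitlines() if ln.strip()]
    # The dashed row marks the start of the data and gives the column offsets.
    sep_idx = next(i for i, ln in enumerate(lines) if re.fullmatch(r"[-\s]+", ln))
    starts = [m.start() for m in re.finditer(r"-+", lines[sep_idx])]
    columns = [[] for _ in starts]
    for row in lines[sep_idx + 1:]:
        for idx, start in enumerate(starts):
            end = starts[idx + 1] if idx + 1 < len(starts) else len(row)
            cell = row[start:end].strip()
            if cell:
                columns[idx].append(cell)
    return {"ecmp": columns[0], "lag": columns[1]}
```

Slicing by the separator offsets, rather than splitting on whitespace, keeps the columns apart even when the ECMP and LAG lists have different lengths.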

### 3.3 DUT related configuration in config_db

```
{
    "SWITCH_HASH": {
        "GLOBAL": {
            "ecmp_hash": [
                "DST_MAC",
                "SRC_MAC",
                "ETHERTYPE",
                "IP_PROTOCOL",
                "DST_IP",
                "SRC_IP",
                "L4_DST_PORT",
                "L4_SRC_PORT",
                "INNER_DST_MAC",
                "INNER_SRC_MAC",
                "INNER_ETHERTYPE",
                "INNER_IP_PROTOCOL",
                "INNER_DST_IP",
                "INNER_SRC_IP",
                "INNER_L4_DST_PORT",
                "INNER_L4_SRC_PORT"
            ],
            "lag_hash": [
                "DST_MAC",
                "SRC_MAC",
                "ETHERTYPE",
                "IP_PROTOCOL",
                "DST_IP",
                "SRC_IP",
                "L4_DST_PORT",
                "L4_SRC_PORT",
                "INNER_DST_MAC",
                "INNER_SRC_MAC",
                "INNER_ETHERTYPE",
                "INNER_IP_PROTOCOL",
                "INNER_DST_IP",
                "INNER_SRC_IP",
                "INNER_L4_DST_PORT",
                "INNER_L4_SRC_PORT"
            ]
        }
    }
}
```
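
Before applying such a configuration, a test may want to confirm that every configured field is among the supported capabilities. The sketch below illustrates the idea against the `SWITCH_HASH` structure shown above. The function name and the shape of the `capabilities` argument are assumptions made for the example.

```python
import json


def unsupported_fields(config_db_text, capabilities):
    """Map each hash list to the configured fields absent from capabilities.

    capabilities is a hypothetical dict of the form
    {"ecmp_hash": [...], "lag_hash": [...]}.
    """
    global_cfg = json.loads(config_db_text)["SWITCH_HASH"]["GLOBAL"]
    result = {}
    for key in ("ecmp_hash", "lag_hash"):
        supported = set(capabilities.get(key, []))
        missing = [f for f in global_cfg.get(key, []) if f not in supported]
        if missing:
            result[key] = missing
    return result
```

An empty result means every configured field appears in the capability list for its hash type.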

### 3.4 Supported topology
The test should support t0 and t1 topologies.

## 4. Test cases

| **No.** | **Test Case** | **Test Purpose** |
|----------|-------------------|----------|
| 1 | test_hash_capability | Verify that "show switch-hash capabilities" returns the supported hash fields. |
| 2 | test_ecmp_hash | Verify the basic functionality of ECMP hash with a single hash field. |
| 3 | test_lag_hash | Verify the basic functionality of LAG hash with a single hash field. |
| 4 | test_ecmp_and_lag_hash | Verify the hash functionality with all ECMP and LAG hash fields configured. |
| 5 | test_nexthop_flap | Verify the ECMP hash functionality when a nexthop flaps. |
| 6 | test_lag_member_flap | Verify the LAG hash functionality when a LAG member flaps. |
| 7 | test_lag_member_remove_add | Verify the LAG hash functionality after a LAG member is removed from and added back to a portchannel. |
| 8 | test_reboot | Verify that the hash configuration is consistent before and after reload/reboot. |
| 9 | test_backend_error_messages | Verify that backend errors appear in syslog when the hash config is removed or updated with invalid values via redis-cli. |

### Notes:
1. The tested hash field in each test case is randomly selected from a pre-defined field list per ASIC type. Currently these fields are tested by default: 'IN_PORT', 'SRC_MAC', 'DST_MAC', 'ETHERTYPE', 'VLAN_ID', 'IP_PROTOCOL', 'SRC_IP', 'DST_IP', 'L4_SRC_PORT', 'L4_DST_PORT', 'INNER_SRC_IP', 'INNER_DST_IP'.
2. The DST_MAC, ETHERTYPE and VLAN_ID fields are only tested in the LAG hash test cases, because L2 traffic is needed to test these fields, and there is no ECMP hashing when traffic is forwarded in L2.
3. IPv4 and IPv6 are both covered, but the IP version (including the inner version when testing the inner fields) is randomly selected in each test case.
4. For the inner fields, three types of encapsulation are covered: IPinIP, VxLAN and NVGRE. For VxLAN packets, the default port 4789 and a custom port 13330 are covered.
5. For the reboot test, the reboot type is randomly selected from config reload, cold, warm and fast reboot.
6. The random selection of hash fields, IP versions, encapsulation types and reboot types can be controlled by pytest options. The user can set each option to 'random', 'all', or a specific value.

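The three option values described in note 6 can be resolved into a concrete list of items to test. The following is a minimal sketch of that resolution. The function name `resolve_option` and its arguments are hypothetical, and the actual pytest option names are not specified in this plan.

```python
import random


def resolve_option(value, choices, rng=random):
    """Expand an option value into the list of items to test.

    value is 'all', 'random', or one specific member of choices.
    resolve_option is a hypothetical helper for illustration.
    """
    choices = list(choices)
    if value == "all":
        return choices
    if value == "random":
        return [rng.choice(choices)]
    if value not in choices:
        raise ValueError("unsupported value: {}".format(value))
    return [value]
```

Passing a seeded `random.Random` instance as `rng` makes a 'random' run reproducible, which helps when a failure needs to be replayed.
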
### Test case #1 - test_hash_capability
1. Get the supported hash fields via the CLI command "show switch-hash capabilities".
2. Check that the fields are as expected.

### Test case #2 - test_ecmp_hash
1. The test uses the default links and routes in a t0/t1 testbed.
2. Randomly select a hash field and configure it in the ECMP hash list via the CLI command "config switch-hash global ecmp-hash".
3. Configure the LAG hash list to exclude the selected field, to verify that the LAG hash configuration does not affect the hash result.
4. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination via multiple nexthops.
5. Check that the traffic is balanced over the nexthops.
6. If the uplinks are portchannels with multiple members, check that the traffic is not balanced over the members.

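The balance checks in steps 5 and 6 need a concrete criterion. One possibility, sketched below, is to require every per-port packet count to lie within a relative tolerance of the mean. The 25% tolerance and the name `is_balanced` are assumptions for illustration, since this plan does not define a threshold.

```python
def is_balanced(counts, tolerance=0.25):
    """Return True if every count is within tolerance of the mean.

    counts maps a port (or nexthop) name to received packets.
    The default tolerance of 25% is an assumed value, not one from the plan.
    """
    if not counts:
        return False
    mean = sum(counts.values()) / len(counts)
    if mean == 0:
        # No traffic was received at all, so nothing can be called balanced.
        return False
    return all(abs(c - mean) <= tolerance * mean for c in counts.values())
```

Step 5 would then assert `is_balanced(nexthop_counts)`, while step 6 would assert `not is_balanced(member_counts)` for the members of a portchannel.
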
### Test case #3 - test_lag_hash
1. The test uses the default links and routes in a t0/t1 testbed, and only runs on setups that have multi-member portchannel uplinks.
2. Randomly select a hash field and configure it in the LAG hash list via the CLI command "config switch-hash global lag-hash".
3. Configure the ECMP hash list to exclude the selected field, to verify that the ECMP hash configuration does not affect the hash result.
4. If the hash field is DST_MAC, ETHERTYPE or VLAN_ID, perform steps 5-7; otherwise skip them.
5. Choose one downlink interface and one uplink interface, and remove all IPv4/IPv6 addresses from them.
6. Remove the downlink interface from the existing VLAN if the topology is t0.
7. For the DST_MAC and ETHERTYPE fields, add the chosen interfaces to the same VLAN; for the VLAN_ID field, add the interfaces to multiple VLANs.
8. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination via the portchannels.
9. Check that the traffic is forwarded through only one portchannel and is balanced over its members.

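Step 9 combines two conditions: exactly one portchannel carries traffic, and its members share that traffic evenly. The sketch below shows one way to express both. The function name, the data shapes and the 25% tolerance are assumptions, not values taken from this plan.

```python
def check_lag_hash(counts, portchannels, tolerance=0.25):
    """Check that one portchannel carries all traffic, balanced over members.

    counts: port name -> received packets.
    portchannels: portchannel name -> list of member port names.
    The tolerance is an assumed value; the plan does not define one.
    """
    active = [
        name for name, members in portchannels.items()
        if any(counts.get(port, 0) for port in members)
    ]
    if len(active) != 1:
        return False
    member_counts = [counts.get(port, 0) for port in portchannels[active[0]]]
    mean = sum(member_counts) / len(member_counts)
    return all(abs(c - mean) <= tolerance * mean for c in member_counts)
```
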
### Test case #4 - test_ecmp_and_lag_hash
1. The test uses the default links and routes in a t0/t1 testbed.
2. Configure all the supported hash fields for the ECMP and LAG hash.
3. Randomly select one hash field to test.
4. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination.
5. Check that the traffic is balanced over all the uplink physical ports.

### Test case #5 - test_nexthop_flap
1. The test uses the default links and routes in a t0/t1 testbed.
2. Configure all the supported hash fields for the ECMP and LAG hash.
3. Randomly select one hash field to test.
4. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination.
5. Check that the traffic is balanced over all the uplink ports.
6. Randomly shut down one nexthop interface.
7. Send the traffic again.
8. Check that the traffic is balanced over all remaining uplink ports with no packet loss.
9. Recover the interface, then shut it down and start it up 3 more times.
10. Send the traffic again.
11. Check that the traffic is balanced over all uplink ports with no packet loss.

### Test case #6 - test_lag_member_flap
1. The test uses the default links and routes in a t0/t1 testbed, and only runs on setups that have multi-member portchannel uplinks.
2. Configure all the supported hash fields for the ECMP and LAG hash.
3. Randomly select one hash field to test.
4. If the hash field is DST_MAC, ETHERTYPE or VLAN_ID, perform steps 5-7; otherwise skip them.
5. Choose one downlink interface and one uplink interface, and remove all IPv4/IPv6 addresses from them.
6. Remove the downlink interface from the existing VLAN if the topology is t0.
7. For the DST_MAC and ETHERTYPE fields, add the chosen interfaces to the same VLAN; for the VLAN_ID field, add the interfaces to multiple VLANs.
8. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination.
9. Check that the traffic is balanced over all the uplink ports.
10. Randomly shut down one member port in each uplink portchannel.
11. Send the traffic again.
12. Check that the traffic is balanced over all remaining uplink ports with no packet loss.
13. Recover the members, then shut them down and start them up 3 more times.
14. Send the traffic again.
15. Check that the traffic is balanced over all uplink ports with no packet loss.

### Test case #7 - test_lag_member_remove_add
1. The test uses the default links and routes in a t0/t1 testbed, and only runs on setups that have multi-member portchannel uplinks.
2. Configure all the supported hash fields for the ECMP and LAG hash.
3. Randomly select one hash field to test.
4. If the hash field is DST_MAC, ETHERTYPE or VLAN_ID, perform steps 5-7; otherwise skip them.
5. Choose one downlink interface and one uplink interface, and remove all IPv4/IPv6 addresses from them.
6. Remove the downlink interface from the existing VLAN if the topology is t0.
7. For the DST_MAC and ETHERTYPE fields, add the chosen interfaces to the same VLAN; for the VLAN_ID field, add the interfaces to multiple VLANs.
8. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination.
9. Check that the traffic is balanced over all the uplink ports.
10. Randomly remove one member port from each uplink portchannel.
11. Add the member ports back to the portchannels.
12. Send the traffic again.
13. Check that the traffic is balanced over all uplink ports with no packet loss.

### Test case #8 - test_reboot
1. The test uses the default links and routes in a t0/t1 testbed.
2. Configure all the supported hash fields for the ECMP and LAG hash.
3. Randomly select one hash field to test.
4. Randomly select a reboot type from config reload and fast/warm/cold reboot. For config reload or cold reboot, save the configuration before the reload/reboot.
5. Send traffic with changing values of the field under test from a downlink PTF port to an uplink destination.
6. Check that the traffic is balanced over all the uplink ports.
7. Do the reload/reboot.
8. After the reload/reboot, check the generic hash configuration via the CLI; it should be the same as before the reload/reboot.
9. Send the traffic again.
10. Check that the traffic is balanced over all the uplink ports.

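The comparison in step 8 can be reduced to a diff of the field lists captured before and after the reload/reboot. The sketch below is one way to report what changed. It assumes that field order within a list is not significant, and `hash_config_diff` is a hypothetical name.

```python
def hash_config_diff(before, after):
    """Return the fields added or removed per hash list across a reboot.

    before/after: dicts such as {"ecmp": [...], "lag": [...]}.
    Assumption: the order of fields within a list is not significant.
    """
    diff = {}
    for key in sorted(set(before) | set(after)):
        old = set(before.get(key, []))
        new = set(after.get(key, []))
        if old != new:
            diff[key] = {"removed": sorted(old - new), "added": sorted(new - old)}
    return diff
```

An empty dict means the configuration survived the reload/reboot unchanged. A non-empty result names the fields that differ, which makes an assertion failure easier to read.
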
### Test case #9 - test_backend_error_messages
1. Configure the ECMP and LAG hash via the CLI.
2. Remove the ECMP hash key via redis-cli.
3. Check that a warning is printed in the syslog.
4. Remove the LAG hash key via redis-cli.
5. Check that a warning is printed in the syslog.
6. Reconfigure the ECMP and LAG hash via the CLI.
7. Update the ECMP hash fields with an invalid value via redis-cli.
8. Check that a warning is printed in the syslog.
9. Update the LAG hash fields with an invalid value via redis-cli.
10. Check that a warning is printed in the syslog.
11. Reconfigure the ECMP and LAG hash via the CLI.
12. Remove the generic hash key via redis-cli.
13. Check that a warning is printed in the syslog.
