
[Lag 2] Allow lacp timing tests to retry limited times until succeeded #403

Merged
lguohan merged 4 commits into sonic-net:master from yxieca:lag_2
Jan 13, 2018

@yxieca yxieca commented Dec 27, 2017

  • [X ] Test case(new/improvement)

Approach

How did you do it?
The lag_2 test has a fairly high failure rate, and most failures occur in the LACP frame timing test. That test checks whether the DUT receives LACP frames at an interval consistent with the rate setting (fast: 1 second, normal: 30 seconds). The failures are almost always a failure to receive 2 packets at the specified interval.

There are too many reasons this could happen. I propose that as long as we receive a pair of frames that fits the expected interval within a reasonable number of retries (10), the test should pass.
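
The proposal above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `check_lacp_timing`, `interval_ok`, the `measure_interval` callable, and the 10% tolerance are all assumptions made for the example.

```python
def interval_ok(interval, expected, tolerance=0.1):
    """Return True if the measured interval is within a relative tolerance.

    The 10% default tolerance is an assumption for this sketch.
    """
    return abs(interval - expected) <= tolerance * expected


def check_lacp_timing(measure_interval, expected, max_retries=10, tolerance=0.1):
    """Pass as soon as one measured frame interval fits the expected rate.

    measure_interval is a hypothetical callable that returns the gap, in
    seconds, between two consecutive LACP frames.
    Returns the number of attempts used; raises after max_retries misses.
    """
    for attempt in range(1, max_retries + 1):
        if interval_ok(measure_interval(), expected, tolerance):
            return attempt
    raise AssertionError(
        "no LACP frame pair matched %.1fs within %d attempts"
        % (expected, max_retries))


# Two bad measurements followed by a good one for the fast (1 second) rate.
samples = iter([1.8, 0.4, 1.02])
attempts_used = check_lacp_timing(lambda: next(samples), expected=1.0)
```

In this usage the third measurement is the first one inside the tolerance, so the check stops there instead of failing on the first miss.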

How did you verify/test it?
Repeatedly executed the lag_2 test. Normally the test would fail within 10 repetitions; with the change, the test executed 70+ times.

The lacp timing test sometimes fails when checking lacp frame timing. Retry up to 10 times or
until a successful test is observed.

lguohan commented Dec 29, 2017

Can we get 10 samples and pick the median value? If the median value falls into the range, then we are good.

yxieca commented Jan 3, 2018

Sure, I modified the test to collect N (default N=3) packet intervals and test on the median interval.
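
For illustration, the median-of-N idea could look like the sketch below. It is not the test's actual code: `median_interval` and the plain list of arrival timestamps are assumptions for the example, and the median is computed by hand so the snippet does not depend on a particular Python version.

```python
def median_interval(timestamps):
    """Return the median gap between consecutive packet arrival times.

    timestamps is a list of arrival times in seconds, oldest first.
    N intervals require N + 1 packets.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    gaps = sorted(b - a for a, b in zip(timestamps, timestamps[1:]))
    mid = len(gaps) // 2
    if len(gaps) % 2:
        return gaps[mid]
    return (gaps[mid - 1] + gaps[mid]) / 2.0


# Four packets give N=3 intervals: 1.0, 1.5 and 1.0 seconds.
median = median_interval([0.0, 1.0, 2.5, 3.5])
```

Here the single late packet (the 1.5 second gap) does not affect the result, which is the point of testing the median rather than each interval.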

'''
self.dataplane = ptf.dataplane_instance

def getTimeInterval(self, masked_exp_pkt):

rename to getMedianInterval to reflect the actual function

@lguohan lguohan left a comment

:shipit:

self.assertTrue(rcv_pkt != None, "Failed to receive LACP packet\n")

# Get current packet timing
curr_pkt_time = round(float(curr_pkt_time), 2)

not sure if the round op is really needed or not, but anyway.

@lguohan lguohan merged commit ea8a6dd into sonic-net:master Jan 13, 2018
@yxieca yxieca deleted the lag_2 branch January 15, 2018 20:27
praveen-li pushed a commit to praveen-li/sonic-mgmt that referenced this pull request Jun 20, 2019
* msft_github/master:
  [PFCWD]: Add Add support for t0 toplogy and arista fanout testbed (sonic-net#424)
  add orchagent process check before each test and sanity after each test (sonic-net#431)
  Move mem_check.sh into helpers/ directory to conform with location of other helper scripts (sonic-net#435)
  fix tests for t0-116 topology (sonic-net#430)
  Updated labinfo file with Arista fanout sku (sonic-net#429)
  add missing snmp testcases to top level snmp.yml; fixed wrong test file for neighbour test; change arp test comment more readable (sonic-net#428)
  Fix sku-sensors-data for 7060 (sonic-net#427)
  Fix port alias mapping for Arista-7050QX-32S (sonic-net#426)
  [Lag 2] Allow lacp timing tests to retry limited times until succeeded (sonic-net#403)
  [hwsku]: Add Accton-AS7716-32X (sonic-net#405)
  add generate minigrah using testbed_name option (sonic-net#425)
  Update README.testbed.md
  Create README.testbed.Example.md
  Update README.testbed.Config.md
  add missing test case name (sonic-net#417)
  fix typo when assign dscp_mode for decap test (sonic-net#418)
  [repeat harness] rework repeat harness and introduce test case continuous reboot (sonic-net#416)
  allow upgrade sonic via onie or sonic-to-sonic upgrade method (sonic-net#413)
  [topology]: Fix t1-64-lag topology device link base indices (sonic-net#414)
  Run sudo command with log_analyzer (sonic-net#415)
  [sonic tests] use host time instead of ansible time (sonic-net#410)
  [upgrade_sonic]: Fix the hostname in the wait_for condition (sonic-net#411)
  [Test cases] clean up some test cases changes and enabling more tests (sonic-net#409)
  add restart_swss (sonic-net#412)
  [sonic test] introduce a repeat harness (sonic-net#402)
  Update README.test.md
  [bgp_speaker]: always clean up test environment after it finishes (sonic-net#406)
  [vlan]: add vlan test to test by testname (sonic-net#408)
  [bug]: convert the interface from unicode to string (sonic-net#407)
  [VLAN test] Add VLAN test (sonic-net#375)
  [vm]: change the default Front plane port to 4 instead of 8 (sonic-net#404)
  Update README.testbed.Setup.md
  [acl]: Adopt tests with acl-loader to t0 and t1-lag topo. (sonic-net#370)
  add connect_topo command for testbed-cli.sh (sonic-net#398)
  [FIB, BGP speaker] fix t0-116 topology source port list (sonic-net#396)
  [bgp_speaker]: Specify VLAN IP route in case LPM to other nexthop (sonic-net#394)
  [sensor]: Change sensor labels for Arista 7050 and 7260 (sonic-net#395)
  Delete Arista DPMs sensors data (sonic-net#393)
  [bug]: change maximum packet to 9114 to send in the mtu test (sonic-net#391)
sdszhang pushed a commit to sdszhang/sonic-mgmt that referenced this pull request Jun 30, 2025
…logging for easier debugging. (sonic-net#403)

### Description of PR

Summary:
The PR adds a retry for the SSH connection if it fails after reboot, along with logging for easier debugging.

### Type of change


- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
 - [ ] Skipped for non-supported platforms
- [x] Test case improvement

### Back port request
- [ ] 202012
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [X] 202411

### Approach
#### What is the motivation for this PR?

To improve debugging by adding logs and retries for SSH connection failures after a reboot.

#### How did you do it?

Adds a retry with a quick timeout and no expected "search_regex".
Adds logs to capture ping status.
Raises an exception if the first connection attempt fails.
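
One possible reading of these steps, as a rough sketch only: the referenced commit's code is not shown here, so `check_ssh_after_reboot`, `tcp_connect`, the timeouts, and the plain TCP probe (standing in for a check with no `search_regex`) are all assumptions, and ping logging is left out.

```python
import logging
import socket

logger = logging.getLogger(__name__)


def tcp_connect(host, port, timeout):
    """Open and close a TCP connection; raises OSError on failure."""
    socket.create_connection((host, port), timeout=timeout).close()


def check_ssh_after_reboot(host, port=22, timeout=300, retry_timeout=5,
                           connect=tcp_connect):
    """Probe the SSH port once; on failure, log, retry quickly, then raise.

    The quick retry exists only to record whether the port became
    reachable; the function still raises because the first attempt failed.
    """
    try:
        connect(host, port, timeout)
        return
    except OSError as err:
        first_error = err
        logger.warning("first SSH check on %s:%d failed: %s", host, port, err)

    try:
        connect(host, port, retry_timeout)
        logger.warning("quick retry on %s:%d succeeded", host, port)
    except OSError as err:
        logger.warning("quick retry on %s:%d also failed: %s", host, port, err)

    raise RuntimeError(
        "SSH did not come up on %s after reboot: %s" % (host, first_error))
```

Passing the `connect` callable as a parameter keeps the retry logic testable without a real device.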

#### How did you verify/test it?

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

Not a new test case.

### Documentation
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
…tomatically (sonic-net#19318)

#### Why I did it
src/sonic-linux-kernel
```
* 152aa10 - (HEAD -> 202305, origin/202305) Fix optoe's write_max when using native i2c driver (sonic-net#401) (14 hours ago) [Prince George]
* ac4a1db - [ci] Migrate to sonicbld1es agent pool. (sonic-net#397) (sonic-net#403) (19 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog