Skip to content

[crm] Error handler for 'ip neigh show' command#828

Merged
prsunny merged 1 commit intosonic-net:masterfrom
romankachur-mlnx:crm_get_neighbour
Mar 23, 2019
Merged

[crm] Error handler for 'ip neigh show' command#828
prsunny merged 1 commit intosonic-net:masterfrom
romankachur-mlnx:crm_get_neighbour

Conversation

@romankachur-mlnx
Copy link
Contributor

@romankachur-mlnx romankachur-mlnx commented Mar 15, 2019

Signed-off-by: Roman Kachur [email protected]

Description of PR

Summary: Make CRM test more informative when failed
Fixes # (issue)

Type of change

  • [] Bug fix
  • [] Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

How did you do it?

When Ansible performs ip -4 neigh show, it presumes that the returned value is always valid,
and manipulates on it without validation.
If the returned value is empty, it results in ambiguous error value like
"ERROR! list object has no element 0"

I added conditional fail operator, to make the failure more informative:

    - fail: msg="Get Next Hop IP failed. Neighbour not found"
      when: out.stdout == ""

How did you verify/test it?

Run CRM test with both: all neighbors are Reachable, and some neighbours are Failed.

Now when the command fails, it returns more informative error:
"msg": "Get Next Hop IP failed. Neighbor not found"

Instead of previous:
"ERROR! list object has no element 0"

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinks this will never execute because grep will fail in previous task if didn't find anything.
Could you add an error message for this case?
Or better to ignore grep failure in previous task and fail here.

Copy link
Contributor Author

@romankachur-mlnx romankachur-mlnx Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split Get Hext Hop IPv6 into two steps:
shell: ip -6 neigh show dev {{crm_intf}} nud reachable nud stale
shell: echo {{ out.stdout }} | grep -v fe80

Now fail: msg="Get Next Hop IP failed. Neighbour not found" definitely means empty grep result (and not ip -6 neigh show dev failure)

@romankachur-mlnx romankachur-mlnx force-pushed the crm_get_neighbour branch 4 times, most recently from a639a6d to 01dfa66 Compare March 15, 2019 17:21
@prsunny
Copy link
Contributor

prsunny commented Mar 20, 2019

@romankachur-mlnx , if we encounter this, it means the CRM test failed because there is no available neighbor while adding a route. However this is not exactly a test failure. I'm not sure if we must sleep and retry, since this would be a transient state that neighbor entry expired.

Copy link
Contributor

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the comment!

@romankachur-mlnx
Copy link
Contributor Author

romankachur-mlnx commented Mar 21, 2019

Please check the comment!

@prsunny This PR aims to make the failure more informative, i.e. to show the relevant reason of failure. And not to fix or workaround the issue.

@prsunny prsunny merged commit a15fd11 into sonic-net:master Mar 23, 2019
deerao02 pushed a commit to deerao02/sonic-mgmt that referenced this pull request Dec 18, 2025
…for IPv6 (sonic-net#828)

<!--
Please make sure you've read and understood our contributing guidelines;
https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md

Please provide following information to help code review process a bit easier:
-->
### Description of PR
<!--
- Please include a summary of the change and which issue is fixed.
- Please also include relevant motivation and context. Where should reviewer start? background context?
- List any dependencies that are required for this change.
-->

Summary:
Fix the following error on t0-isolated-d96u32s2 topo.
```
arp/test_arp_extended.py::test_proxy_arp[v6-str5-7060x6-moby-512-1-None]
-------------------------------- live log call ---------------------------------
03/10/2025 02:10:49 __init__.pytest_runtest_call L0040 ERROR | Traceback (most recent call last):
 File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 1788, in runtest
 self.ihook.pytest_pyfunc_call(pyfuncitem=self)
 File "/usr/local/lib/python3.8/dist-packages/pluggy/_hooks.py", line 513, in __call__
 return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
 File "/usr/local/lib/python3.8/dist-packages/pluggy/_manager.py", line 120, in _hookexec
 return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
 File "/usr/local/lib/python3.8/dist-packages/pluggy/_callers.py", line 139, in _multicall
 raise exception.with_traceback(exception.__traceback__)
 File "/usr/local/lib/python3.8/dist-packages/pluggy/_callers.py", line 103, in _multicall
 res = hook_impl.function(*args)
 File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 194, in pytest_pyfunc_call
 result = testfunction(**testargs)
 File "/var/src/sonic-mgmt_vms91-t0-7060x6-moby-512-1/tests/arp/test_arp_extended.py", line 98, in test_proxy_arp
 testutils.verify_packet(ptfadapter, expected_packet, ptf_intf_index, timeout=10)
 File "/usr/local/lib/python3.8/dist-packages/ptf/testutils.py", line 3250, in verify_packet
 test.fail(
 File "/usr/lib/python3.8/unittest/case.py", line 753, in fail
 raise self.failureException(msg)
AssertionError: Expected packet was not received on device 0, port 32.
```

When ndppd is starting up, it will read the whole ipv6 routes via `/proc/net/ipv6_route`. For topology with large ipv6 routes and next hops, this reading may take long time to finish.

### Type of change

<!--
- Fill x for your type of change.
- e.g.
- [x] Bug fix
-->

- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
 - [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [x] 202505

### Approach
#### What is the motivation for this PR?
Fix test_arp_extended.py::test_proxy_arp failure for large topo.

#### How did you do it?
Add an extra delay in test case by reading `/proc/net/ipv6_route` to match up with the behavior in ndppd.

#### How did you verify/test it?
physical testbed.

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation
<!--
(If it's a new feature, new test case)
Did you update documentation/Wiki relevant to your implementation?
Link to the wiki page?
-->
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
e438b0db6a8912b50f7acddf93d4dc2157f53ecf (HEAD -> 201911, origin/201911) Increase Syncd operation timeout from 1 min to 6 min. (sonic-net#828)
17974adb369111b44dd56837547806918ed4b1ed Update syncd_flex_counter.cpp (sonic-net#798)

Signed-off-by: Abhishek Dosi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants