Fix dhcpv6 testcase failures in Dual TOR Setups #11695

Merged
kevinskwang merged 7 commits into sonic-net:master from santoss3:dhcpv6_dual_tor_fix
Apr 1, 2024

@santoss3
Contributor

Description of PR

In the Dual TOR scenario, the "dhcpv6" testcase fails for the following reasons:

  1. In an Active-Active (AA) setup, packets from the server may land on either TOR due to ECMP, whereas the testcase expects them to land on the DUT port.
  • Fix: toggle the non-DUT ports so that the dhcpv6 request is always received on the DUT.
  2. The PTF DHCPCounterTest and DHCPTest toggle the interface to get the proper link-local address. This impacts link health in the Dual TOR scenario because ICMP heartbeats (HB) are missed while the interface is down.
  • Fix: skip the above step for Dual TOR testcase scenarios.
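
The two fixes above can be sketched roughly as follows. This is only an illustration of the idea, not the code merged in this PR: `plan_mux_states`, `refresh_link_local`, the TOR names and the interface names are all hypothetical, and the real change lives in the sonic-mgmt fixtures and PTF tests.

```python
from typing import Callable, Dict, List


def plan_mux_states(mux_ports: List[str], dut: str, peer: str) -> Dict[str, Dict[str, str]]:
    """Fix 1 (hypothetical helper): keep every mux port active on the DUT and
    standby on the peer TOR, so ECMP cannot deliver the request to the peer."""
    return {
        dut: {port: "active" for port in mux_ports},
        peer: {port: "standby" for port in mux_ports},
    }


def refresh_link_local(ptf_iface: str, is_dualtor: bool,
                       run: Callable[[str], None]) -> List[str]:
    """Fix 2 (hypothetical helper): flap the PTF interface to regenerate its
    link-local address, except on dual TOR, where the flap would interrupt
    the ICMP heartbeat."""
    if is_dualtor:
        # Skip the toggle entirely; no command is issued.
        return []
    commands = [f"ip link set {ptf_iface} down", f"ip link set {ptf_iface} up"]
    for command in commands:
        run(command)
    return commands
```

Passing the command runner in as `run` keeps the sketch free of side effects, so the dual TOR branch can be checked without touching a real interface.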

Summary:
Fixes # Failures in dhcp_relay/test_dhcpv6_relay.py in Dual Tor Setups

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205
  • 202305
  • 202311

Approach

What is the motivation for this PR?

Fix Failures in dhcp_relay/test_dhcpv6_relay.py in Dual Tor Setups

How did you do it?

Specified in PR Description

How did you verify/test it?

Verified that the dhcpv6 testcase passes after the code changes.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

ansible/roles/test/files/ptftests/py3/dhcpv6_relay_test.py:137:13: E265 block comment should start with '# '
tests/dhcp_relay/test_dhcpv6_relay.py:243:1: E302 expected 2 blank lines, found 1
tests/dhcp_relay/test_dhcpv6_relay.py:260:1: E302 expected 2 blank lines, found 1

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  3. Go to repository root folder
  4. Install the pre-commit hooks:
pre-commit install
  5. Use pre-commit to check staged file:
pre-commit
  6. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>
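
For reference, the two flake8 codes reported above are pure style rules: E265 wants a space after the `#` of a block comment, and E302 wants two blank lines before a top-level definition. A minimal compliant sketch follows; the function names are illustrative only and do not come from the files touched by this PR.

```python
def build_solicit_options():
    # A block comment starts with "# " (E265 flags "#comment").
    return {"msg_type": "SOLICIT"}


# Two blank lines separate top-level definitions (E302 flags a single one).
def build_request_options():
    return {"msg_type": "REQUEST"}
```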

bpar9 previously approved these changes Mar 1, 2024
@XuChen-MSFT
Contributor

After applying these changes and running a local test on the 202305 branch, I still observe the dhcpv6 failure. Please check the test log from Mar 3 and address it.


@santoss3
Contributor Author

santoss3 commented Mar 7, 2024

Hi Xu,
I apologize, there was an issue with the patch, which I addressed yesterday.
Can you please try again with the latest diff?
Thanks

Collaborator

@bpar9 bpar9 left a comment

LGTM

@bpar9
Collaborator

bpar9 commented Mar 20, 2024

@santoss3 , please check/rerun the sonic-mgmt failures in the workflow

@bpar9
Collaborator

bpar9 commented Mar 20, 2024

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@XuChen-MSFT
Contributor

@santoss3 I tried the latest commit and still hit a failure and an error:

=========================== short test summary info ============================
SKIPPED [1] dhcp_relay/test_dhcpv6_relay.py:347: skip the link flap testcase on dual tor testbeds
SKIPPED [1] dhcp_relay/test_dhcpv6_relay.py:394: skip the uplinks down testcase on dual tor testbeds
ERROR dhcp_relay/test_dhcpv6_relay.py::test_dhcpv6_relay_counter - Failed: Pr...
FAILED dhcp_relay/test_dhcpv6_relay.py::test_dhcpv6_relay_counter - Assertion...
FAILED dhcp_relay/test_dhcpv6_relay.py::test_dhcp_relay_default - tests.commo...
==== 2 failed, 1 passed, 2 skipped, 1 warning, 1 error in 299.45s (0:04:59) ====

@santoss3
Contributor Author

Hi @XuChen-MSFT ,
I re-ran the testcase in an Active-Active setup with SONiC.azure_cisco_202305.9841-dirty-20240309.111714 and it is passing:
dhcp_relay/test_dhcpv6_relay.py::test_interface_binding PASSED [ 20%]
dhcp_relay/test_dhcpv6_relay.py::test_dhcpv6_relay_counter PASSED [ 40%]
dhcp_relay/test_dhcpv6_relay.py::test_dhcp_relay_default PASSED [ 60%]
dhcp_relay/test_dhcpv6_relay.py::test_dhcp_relay_after_link_flap SKIPPED (skip the link flap testcase on dual tor testbeds) [ 80%]
dhcp_relay/test_dhcpv6_relay.py::test_dhcp_relay_start_with_uplinks_down SKIPPED (skip the uplinks down testcase on dual tor testbeds) [100%]

================================================================================= warnings summary =================================================================================
../../usr/local/lib/python3.8/dist-packages/_yaml/__init__.py:18
/usr/local/lib/python3.8/dist-packages/_yaml/__init__.py:18: DeprecationWarning: The _yaml extension module is now located at yaml._yaml and its location is subject to change. To use the LibYAML-based parser and emitter, import from yaml: from yaml import CLoader as Loader, CDumper as Dumper.
warnings.warn(

-- Docs: How to capture warnings — pytest documentation
--------------------------------------------------------- generated xml file: /data/tests/logs/tr_2024-03-21-11-12-59.xml ----------------------------------------------------------
============================================================================= short test summary info ==============================================================================
SKIPPED [1] dhcp_relay/test_dhcpv6_relay.py:368: skip the link flap testcase on dual tor testbeds
SKIPPED [1] dhcp_relay/test_dhcpv6_relay.py:415: skip the uplinks down testcase on dual tor testbeds
=============================================================== 3 passed, 2 skipped, 1 warning in 327.74s (0:05:27) ================================================================
INFO:root:Can not get Allure report URL. Please check logs

Can you please confirm that you are running this in an AA topology and that the following fix is present in the SONiC image?
sonic-net/sonic-buildimage@60dc4d2

You can verify this by checking whether the dhcp6relay process in the dhcp_relay docker is running with the "-u Loopback0" argument:
cisco@m64-tor-0-yy41:~$ docker exec -it dhcp_relay bash
root@m64-tor-0-yy41:/# ps -ax
PID TTY STAT TIME COMMAND
1 pts/0 Ss+ 0:01 /usr/bin/python3 /usr/local/bin/supervisord
10 pts/0 Sl 0:00 python3 /usr/bin/supervisor-proc-exit-listener --container-name dhcp_relay
13 pts/0 Sl 0:00 /usr/sbin/rsyslogd -n -iNONE
182 pts/0 S 0:00 /usr/sbin/dhcp6relay -u Loopback0
183 pts/0 Sl 0:00 /usr/sbin/dhcrelay -d -m discard -a %h:%p %P --name-alias-map-file /tmp/port-name-alias-map.txt -id Vlan1000 -U Loopback0 -dt -iu PortChannel101 -iu Po
187 pts/0 Sl 0:00 /usr/sbin/dhcpmon -id Vlan1000 -u Loopback0 -iu PortChannel101 -iu PortChannel102 -iu PortChannel103 -iu PortChannel104 -im eth0
725 pts/1 Ss 0:00 bash
731 pts/1 R+ 0:00 ps -ax
root@m64-tor-0-yy41:/#
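
The manual check above could also be scripted. The sketch below scans captured `ps -ax` output for a dhcp6relay process started with `-u Loopback0`; the function name and the idea of feeding it captured output are assumptions for illustration and are not part of this PR.

```python
import os


def dhcp6relay_uses_interface(ps_output: str, interface: str = "Loopback0") -> bool:
    """Return True if a dhcp6relay process was started with '-u <interface>'."""
    for line in ps_output.splitlines():
        fields = line.split()
        for index, token in enumerate(fields):
            # Match the executable by name, whatever its install path.
            if os.path.basename(token) == "dhcp6relay":
                args = fields[index + 1:]
                # Look for '-u' immediately followed by the interface name.
                if ("-u", interface) in zip(args, args[1:]):
                    return True
    return False


# Sample lines copied from the ps output shown above.
SAMPLE_PS_OUTPUT = """\
182 pts/0 S 0:00 /usr/sbin/dhcp6relay -u Loopback0
187 pts/0 Sl 0:00 /usr/sbin/dhcpmon -id Vlan1000 -u Loopback0 -iu PortChannel101
"""

if __name__ == "__main__":
    print(dhcp6relay_uses_interface(SAMPLE_PS_OUTPUT))
```

Matching on the executable name keeps the dhcpmon line, which also carries `-u Loopback0`, from producing a false positive.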

Thanks & Regards,
Santosh

@kevinskwang kevinskwang merged commit 5e1015c into sonic-net:master Apr 1, 2024
Collaborator

@lolyu lolyu left a comment

LGTM

@lolyu
Collaborator

lolyu commented Apr 2, 2024

Hi @StormLiangMS, please cherry-pick this into 202305, thanks!

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Apr 2, 2024
* ACL testcases in A/A setup need to ALLOW the keepalives to ensure both TORs don't go in STANDBY mode.

* reverting_old_pr_fix

* dhcpv6_dual_tor_fix

* pre-commit checks

* Added reference to right fixture
@mssonicbld
Collaborator

Cherry-pick PR to 202305: #12276

@mssonicbld
Collaborator

Cherry-pick PR to 202311: #12277

mssonicbld pushed a commit that referenced this pull request Apr 2, 2024
* ACL testcases in A/A setup need to ALLOW the keepalives to ensure both TORs don't go in STANDBY mode.

* reverting_old_pr_fix

* dhcpv6_dual_tor_fix

* pre-commit checks

* Added reference to right fixture
lolyu added a commit to lolyu/sonic-mgmt that referenced this pull request Apr 9, 2024
wangxin pushed a commit that referenced this pull request Apr 22, 2024
…12309)

What is the motivation for this PR?
dhcpv6 relay failure in dualtor-aa was fixed by #11695
Hence remove skip condition

How did you do it?
Remove skip test_dhcpv6_relay in dualtor-aa

How did you verify/test it?
Run test
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Apr 22, 2024
…#12046)" (sonic-net#12309)

What is the motivation for this PR?
dhcpv6 relay failure in dualtor-aa was fixed by sonic-net#11695
Hence remove skip condition

How did you do it?
Remove skip test_dhcpv6_relay in dualtor-aa

How did you verify/test it?
Run test
mssonicbld pushed a commit that referenced this pull request Apr 22, 2024
…12309)

What is the motivation for this PR?
dhcpv6 relay failure in dualtor-aa was fixed by #11695
Hence remove skip condition

How did you do it?
Remove skip test_dhcpv6_relay in dualtor-aa

How did you verify/test it?
Run test