Fix arp/test_unknown_mac by disabling arp_update #13410

Merged
StormLiangMS merged 3 commits into sonic-net:master from justin-wong-ce:master
Aug 14, 2024
@justin-wong-ce
Contributor

@justin-wong-ce justin-wong-ce commented Jun 21, 2024

Description of PR

The DUT periodically (roughly every 5 minutes) sends an IPv6 echo request to the PTF container. When the PTF container replies, the reply populates the DUT's fdb table. This breaks the test, which expects the fdb table to be empty.

The first version of this PR temporarily disabled IPv6 on the PTF container during the test; that approach was replaced during review.
Temporarily disabling arp_update on the DUT during the test eliminates the problem.
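
To illustrate the mechanism described above, here is a minimal Python model of a learning bridge. It is not SONiC code: `FdbTable`, `receive_frame`, the MAC address, and the port name are hypothetical and chosen only for the illustration. It shows that a flushed table is repopulated as soon as a frame with a new source MAC arrives, which is what the echo reply does.

```python
class FdbTable:
    """Toy forwarding database: maps a source MAC to the port it was seen on."""

    def __init__(self):
        self.entries = {}

    def receive_frame(self, src_mac, ingress_port):
        # A learning bridge records the source MAC of every frame it receives.
        self.entries[src_mac.lower()] = ingress_port

    def flush(self):
        # Comparable in spirit to clearing the fdb before the test starts.
        self.entries.clear()


fdb = FdbTable()
fdb.flush()
assert not fdb.entries  # the test's precondition: nothing learned

# A reply from the PTF side (for example, an ICMPv6 echo reply) carries the
# PTF interface's MAC as its source address, so the bridge learns it.
fdb.receive_frame("00:11:22:33:44:55", "Ethernet4")
assert fdb.entries == {"00:11:22:33:44:55": "Ethernet4"}  # precondition broken
```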

Summary:
Fixes #

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

The test was flaky. Monitoring tcpdump on the PTF container alongside the DUT's fdb table shows that IPv6 echo traffic causes MAC addresses to be learned on the DUT even though the fdb table had been flushed.

How did you do it?

Disable arp_update on the DUT during the test. (The initial revision disabled IPv6 instead.)

How did you verify/test it?

Disabling IPv6 allows the test to pass consistently. If IPv6 is re-enabled during the test and packets are sent, the test fails immediately.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/arp/test_unknown_mac.py:45:1: E302 expected 2 blank lines, found 1
tests/arp/test_unknown_mac.py:47:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:54:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:55:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:56:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:58:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:60:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:61:4: E111 indentation is not a multiple of 4
tests/arp/test_unknown_mac.py:63:1: E302 expected 2 blank lines, found 1

flake8...............................................(no files to check)Skipped
...
[truncated extra lines, please run pre-commit locally to view full check results]

To run the pre-commit checks locally, you can follow the steps below:

  1. Ensure that the default python is python3. In the sonic-mgmt docker container, the default python is python2. You can run
    the check by activating the python3 virtual environment in the sonic-mgmt docker container or outside of the sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
    sudo pip install pre-commit
  3. Go to the repository root folder.
  4. Install the pre-commit hooks:
    pre-commit install
  5. Use pre-commit to check staged files:
    pre-commit
  6. Alternatively, you can check committed files using:
    pre-commit run --from-ref <commit_id> --to-ref <commit_id>

Collaborator

@lolyu lolyu left a comment

Can we simply ignore the fdb entries from the IPv6 echo replies?

@justin-wong-ce
Contributor Author

Can we simply ignore the fdb entries from the IPv6 echo replies?

I don't believe there is a simple way to just ignore it.
The echo reply would cause the DUT to learn all the mac addresses used in the test (even the "fake" ones that are generated).
There is also no telltale sign from show mac that will indicate that a particular entry is from the echo reply.

Even if there were a way to identify fdb entries learned from IPv6 echo replies, I don't believe filtering them or adding logic to check for them would be a good idea, because the goal of the test is to verify the behaviour when the MAC address is unknown. The test is most accurate and functionally correct when the DUT genuinely does not know the MAC at all.

@justin-wong-ce
Contributor Author

Can we simply ignore the fdb entries from the IPv6 echo replies?

Actually, will change it so that we disable IPv6 on the DUT side instead of disabling it on the PTF container. Disabling on PTF may cause some previously populated items to be lost.

@StormLiangMS
Collaborator

@lolyu could you help to review?

@mssonicbld
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/arp/test_unknown_mac.py:44:1: E302 expected 2 blank lines, found 1
tests/arp/test_unknown_mac.py:71:1: E302 expected 2 blank lines, found 1

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

@justin-wong-ce
Contributor Author

Changed to disable IPv6 on the DUT instead of on the PTF container. This method of disabling IPv6 is the same as the one used here:
https://github.com/sonic-net/sonic-mgmt/blob/master/tests/qos/qos_sai_base.py#L1815-L1830

Collaborator

@lolyu lolyu left a comment

Maybe we can disable the arp refresh by disabling the arp_update:

# docker exec -it swss supervisorctl stop arp_update
arp_update: stopped
# docker exec -it swss supervisorctl start arp_update
arp_update: started
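
One way to wrap these two commands in a test is a context manager that always restarts the daemon. This is a minimal sketch, not the code merged in this PR: `arp_update_stopped` and the `run_shell` callable are hypothetical, with `run_shell` standing in for whatever executes a command string on the DUT. The sketch uses `-t` rather than `-it`, on the assumption that the command runs without an interactive terminal.

```python
from contextlib import contextmanager

STOP_CMD = "docker exec -t swss supervisorctl stop arp_update"
START_CMD = "docker exec -t swss supervisorctl start arp_update"


@contextmanager
def arp_update_stopped(run_shell):
    """Stop arp_update for the duration of the block, then start it again.

    run_shell is any callable that executes a command string on the DUT
    (hypothetical; supply whatever your test framework provides).
    """
    run_shell(STOP_CMD)
    try:
        yield
    finally:
        # Runs even if the body raises, so arp_update is not left stopped.
        run_shell(START_CMD)
```

A test body would then run inside `with arp_update_stopped(run_shell): ...`, so the start command is issued whether the checks pass or fail.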

@justin-wong-ce
Contributor Author

justin-wong-ce commented Jul 12, 2024

Maybe we can disable the arp refresh by disabling the arp_update:

# docker exec -it swss supervisorctl stop arp_update
arp_update: stopped
# docker exec -it swss supervisorctl start arp_update
arp_update: started

Will use this method instead, thanks.

The DUT will occasionally (~5 mins) send an echo request over IPv6 to the PTF container (from `arp_update`), and this causes the fdb table to be populated, which is bad because the test expects the fdb table to be empty.

Temporarily disabling arp_update during the test will eliminate the problem.
@justin-wong-ce
Contributor Author

Changed the fix method to use the arp_update command instead of disabling ipv6

@mssonicbld
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/arp/test_unknown_mac.py:44:1: E302 expected 2 blank lines, found 1
tests/arp/test_unknown_mac.py:53:121: E501 line too long (121 > 120 characters)
tests/arp/test_unknown_mac.py:57:121: E501 line too long (122 > 120 characters)
tests/arp/test_unknown_mac.py:59:1: E302 expected 2 blank lines, found 1

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

@justin-wong-ce justin-wong-ce changed the title from "Fix arp/test_unknown_mac by disabling ipv6" to "Fix arp/test_unknown_mac by disabling arp_update" Jul 15, 2024
@ZhaohuiS
Contributor

@lolyu could you please review it again?

rand_selected_dut(AnsibleHost) : dut instance
"""
duthost = rand_selected_dut
assert duthost.shell("docker exec -t swss supervisorctl stop arp_update")['stdout_lines'][0] \
Collaborator

If stopping arp_update fails here, LINE#59 will not be executed. We had better restart arp_update in this case.

Contributor Author

That is the intention - fail fast if any other test somehow disabled arp_update, or if it is disabled unintentionally.

Though if you prefer to continue the test, I am open to changing it so that arp_update is re-enabled. Please let me know your decision, thanks.

Collaborator

Let's restart it

Contributor Author

Changed it so that the test tolerates arp_update already being off, and arp_update is restarted when the test is done.
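
The behaviour settled on above could look roughly like the following pytest-style generator. This is a sketch under stated assumptions, not the merged code: `disable_arp_update` and `run_shell` are hypothetical names, `run_shell` is assumed to return the command's output lines without raising on a non-zero exit code, and the only output string relied on is the `arp_update: stopped` line shown earlier in this thread.

```python
import logging

logger = logging.getLogger(__name__)

STOP_CMD = "docker exec -t swss supervisorctl stop arp_update"
START_CMD = "docker exec -t swss supervisorctl start arp_update"


def disable_arp_update(run_shell):
    """Generator usable as the body of a pytest fixture.

    Assumes run_shell(cmd) returns a list of output lines and does not raise.
    """
    lines = run_shell(STOP_CMD)
    first = lines[0] if lines else ""
    if first.strip() != "arp_update: stopped":
        # Do not fail: arp_update may already have been off. Just record it.
        logger.warning("unexpected output stopping arp_update: %r", first)
    yield
    # Teardown: start arp_update whether or not the stop reported success.
    run_shell(START_CMD)
```

In a real test this would be decorated with `@pytest.fixture` and `run_shell` replaced by a call on the DUT host object; pytest runs the code after `yield` during teardown even when the test itself fails.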

Collaborator

@lolyu lolyu left a comment

LGTM

Collaborator

@StormLiangMS StormLiangMS left a comment

LGTM

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Aug 14, 2024
What is the motivation for this PR?
The test was flaky. Monitoring tcpdump on the PTF container alongside the DUT's fdb table shows that IPv6 echo traffic causes MAC addresses to be learned on the DUT even though the fdb table had been flushed.

How did you do it?
Disable IPv6.

How did you verify/test it?
Disabling IPv6 allows the test to pass consistently. If IPv6 is re-enabled during the test and packets are sent, the test fails immediately.

Any platform specific information?
@mssonicbld
Collaborator

Cherry-pick PR to 202405: #14121

mssonicbld pushed a commit that referenced this pull request Aug 14, 2024
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Aug 22, 2024
@mssonicbld
Collaborator

Cherry-pick PR to 202311: #14213

mssonicbld pushed a commit that referenced this pull request Aug 22, 2024
arista-hpandya pushed a commit to arista-hpandya/sonic-mgmt that referenced this pull request Oct 2, 2024
vikshaw-Nokia pushed a commit to vikshaw-Nokia/sonic-mgmt that referenced this pull request Oct 23, 2024