Skip to content

[Nvidia] [DualTor] Skip dscp 2 and 6 in test case test_tunnel_decap_dscp_to_queue_mapping#10419

Merged
bingwang-ms merged 1 commit intosonic-net:masterfrom
congh-nvidia:dscp_remapping
Jan 16, 2024
Merged

[Nvidia] [DualTor] Skip dscp 2 and 6 in test case test_tunnel_decap_dscp_to_queue_mapping#10419
bingwang-ms merged 1 commit intosonic-net:masterfrom
congh-nvidia:dscp_remapping

Conversation

@congh-nvidia
Copy link
Copy Markdown
Contributor

Description of PR

Summary:
For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
The dscp 2 and 6 are reserved only for the outer dscp after the remapping and encapsulation.
We have another similar issue fixed by PR #9625.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205
  • 202305

Approach

What is the motivation for this PR?

Fix the test failure of test_tunnel_decap_dscp_to_queue_mapping on Nvidia platforms.

How did you do it?

Skip the test for dscp 2 and 6

How did you verify/test it?

Run the test case on 4600c dualtor testbed, passed

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
@bingwang-ms
Copy link
Copy Markdown
Collaborator

In current design, DSCP 2/6 is not used at the inner layer in dualtor setup. So the skip makes sense. But can you clarify what's the outer DSCP value for inner DSCP 2/6? And which queue is the traffic mapped to at the active side?

@congh-nvidia
Copy link
Copy Markdown
Contributor Author

Hi @bingwang-ms, according the mapping configuration, the test will generate outer dscp 0 for inner dscp 2 and 6, and send the encapsulated packets to the uplink port of the active tor. The queue will be checked based on the inner dscp 2 and 6:

            pytest_assert(check_queue_counter(rand_selected_dut, [dualtor_meta['selected_port']],
                                              tunnel_qos_maps['inner_dscp_to_queue_map'][inner_dscp], PKT_NUM),
                          "The queue counter for DSCP {} Queue {} is not as expected"
                          .format(inner_dscp, tunnel_qos_maps['inner_dscp_to_queue_map'][inner_dscp]))

https://github.com/sonic-net/sonic-mgmt/blob/42d66da79440cfb8db0d12576921b77fdabacfb4/tests/qos/test_tunnel_qos_remap.py#L236C47-L236C89
In the test, the queues for inner dscp 2 and 6 are considered as 2 and 6 on Nvidia platforms(which is correct according to mapping configuration but not expected to happen in real world).
However, on Nvidia platforms, the TC and queue are decided by the outer dscp(which is 0) for the bounced back packets on the active tor, which will cause the test to fail.
I can fix the mapping in the test to pass it on Nvidia platforms, but I think it will be meaningless and confusing since we don't really expect tunnel packets with inner dscp 2 or 6 on the active tor.

@bingwang-ms
Copy link
Copy Markdown
Collaborator

Hi @congh-nvidia . Thanks for the clarification. Please correct me if I missed anything

  • Standby side
    Packet with Inner DSCP 2/6 will be encapsulated with outer DSCP 0 following DSCP_TO_TC_MAP (DSCP 2/6 -> TC 1) and TC_TO_DSCP_MAP (TC 1 -> DSCP 0)

  • Active side
    Packet with Inner DSCP 2/6 and outer DSCP 0 are being decapsulated and then mapped to TC 1 (following DSCP_TO_TC_MAP|AZRE_TUNNEL) and then mapped to Queue 0 (following TC_TO_QUEUE_MAP).

So the expectation should be the traffic with Inner DSCP 2/6 being mapped to Queue 0 at the active side. Is there something wrong with the expectation?

@congh-nvidia
Copy link
Copy Markdown
Contributor Author

congh-nvidia commented Dec 5, 2023

Hi @bingwang-ms , my former explanation is a little confusing, please see my reply below:

  • Standby side

Packet with Inner DSCP 2/6 will be encapsulated with outer DSCP 0 following DSCP_TO_TC_MAP (DSCP 2/6 -> TC 1) and TC_TO_DSCP_MAP (TC 1 -> DSCP 0)

[CH] On Nvidia platform, firstly this is considered as invalid use case. And if the switch receives packets with inner dscp 2/6, they will be mapped to outer dscp 2 and 6 during the encap on standby tor.
This is because the mapping configurations are different on Nvidia platforms when dscp remapping is enabled: https://github.com/sonic-net/sonic-mgmt/blob/master/tests/qos/files/tunnel_qos_map_nvidia.json, and we use the
"AZURE_UPLINK" in "DSCP_TO_TC_MAP" for this mapping.

  • Active side

Packet with Inner DSCP 2/6 and outer DSCP 0 are being decapsulated and then mapped to TC 1 (following DSCP_TO_TC_MAP|AZRE_TUNNEL) and then mapped to Queue 0 (following TC_TO_QUEUE_MAP).

[CH] In the test, the packet is crafted as 0/2 and 0/6, and it is expected to map it to TC after the decapsulation based on the inner dscp. However, on Nvidia platforms, the expected outer dscp for inner dscp 2/6 are also 2/6, and we do the dscp to TC mapping based on the outer dscp instead of inner dscp. Which means, in the test, because the inner dscps are 2 and 6, according to the logic for Nvidia platforms (added by me in PR #8317, added the tunnel_qos_map_nvidia.json for Nvidia platforms), it will use TC 2 and 6 to map the queues, not TC1. (Without PR #8317, TC1 will be used)
Actually, the change in PR#8317 was for the valid inner dscps, but still we need to handle the invalid 2/6.

In summary, the difference of the mapping configuration and the fact that we do the decap mapping based on outer dscp instead of inner dscp cause the unexpected results in inner dscp 2 and 6. For other dscps, PR#8317 makes sure we get the correct expectations.
For inner dscp 2 and 6, since they are invalid use cases, we suggest skip them in the current tests. But if you feel still it's necessary to cover them, we can further adjust the logic for Nvidia platform.

@stephenxs please point out if there are any mistakes in my explanation, thanks.

@bingwang-ms
Copy link
Copy Markdown
Collaborator

@congh-nvidia Thanks! I see the gap here. The DSCP_TO_TC_MAP is different on Mellanox.

On Mellanox

 "TC_TO_DSCP_MAP": {
        "AZURE_TUNNEL": {
            "0": "8",
            "1": "0",
            "2": "2",
            "3": "2",
            "4": "6",
            "5": "46",
            "6": "6",
            "7": "48",
            "8": "33"
        }
    },

Other vendor

"TC_TO_DSCP_MAP": {
        "AZURE_TUNNEL": {
            "0": "8",
            "1": "0",
            "2": "0",
            "3": "2",
            "4": "6",
            "5": "46",
            "6": "0",
            "7": "48",
            "8": "33"
        }
    },

@stephenxs Do you still remember why we mapped TC 2 to DSCP 2, and TC 6 to DSCP 6?

@congh-nvidia
Copy link
Copy Markdown
Contributor Author

congh-nvidia commented Dec 12, 2023

Hi @stephenxs, could you help answer Bing's question?
And @bingwang-ms , we have another PR which is opened basically for the same reason as this one:
#10419

@stephenxs
Copy link
Copy Markdown
Contributor

Hi @stephenxs, could you help answer Bing's question? And @bingwang-ms , we have another PR which is opened basically for the same reason as this one: #10419

Yes.
@bingwang-ms
This is to enable DSCP rewrite logic for the encapsulation packet.

@bingwang-ms bingwang-ms merged commit 95edeb9 into sonic-net:master Jan 16, 2024
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 17, 2024
…nic-net#10419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 17, 2024
…nic-net#10419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 17, 2024
…nic-net#10419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202311: #11301

@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202205: #11302

@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202305: #11303

mssonicbld pushed a commit that referenced this pull request Jan 17, 2024
…0419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
mssonicbld pushed a commit that referenced this pull request Jan 17, 2024
…0419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
mssonicbld pushed a commit that referenced this pull request Jan 22, 2024
…0419)

For Nvidia platforms, the inner dscp 2 and 6 are considered invalid use cases, skip the test.
@congh-nvidia congh-nvidia deleted the dscp_remapping branch October 17, 2024 02:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants