Support for not using all ports on hardware DUT for testing#1744

Closed
sanmalho-git wants to merge 4 commits into sonic-net:master from sanmalho-git:fanout

@sanmalho-git
Contributor

Description of PR

Summary:
Support for not using all ports on hardware DUT for testing and thus not connecting all the ports to a fanout switch.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

Currently, all ports on the DUT must be in use and connected to a fanout.
However, there is a need to run tests where not all ports are in use, specifically when dealing with

  • new hardware boxes
  • boxes with a front panel port used for in-band management
  • boxes with lots of ports and multiple ASICs, where not every port on every ASIC needs to be covered
  • chassis as a DUT, where the number of ports can be in hundreds
  • higher bandwidth ports like 400G
    • hard to go from 400G down to 1/10G

Also, the majority of the basic functional testing can be done without testing all the ports on the DUT.

How did you do it?

To support the above in orchestration, the following changes were made:

  • dut_fp_ports is changed to be a dictionary instead of a list, and
  • vlan_base is added as an optional ansible variable passed into testbed-cli.sh.

In the current orchestration in add-topo, dut_fp_ports is a list of NICs on the testbed server corresponding to the ports on the DUT. In the topology file, the 'host_interfaces' and 'vlans' for VMs are defined as offsets from the 'vlan_base'. When we 'bind' the topology (using the vm_topology module), it uses this offset as the index into dut_fp_ports to get the NIC on the testbed server that corresponds to the DUT's port. So, when dut_fp_ports has fewer elements than the offsets defined in the topology, the lookup fails with an index out-of-bounds exception.

By changing dut_fp_ports to be a dictionary, with the key being this offset from 'vlan_base' and the value being the actual NIC on the testbed server corresponding to the DUT's port, we avoid the above issue. The 'vlan_base' defaults to 0 (for backward compatibility) and can be specified as an extra ansible arg (using the -e option). If it is not specified (has value 0), then the key is the same as the index in the original dut_fp_ports list.
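The difference between the two lookups can be sketched in a few lines of Python. This is only an illustration of the failure mode, not the vm_topology code: the function names and the two-port data (offsets 3 and 51 on trunk port eno2) are assumptions made for the example.

```python
# Hypothetical illustration; none of these names come from the vm_topology module.
topology_offsets = [3, 51]  # offsets from vlan_base referenced by the topology file

# Old form: a list indexed by offset. Only two ports are cabled, so it has two entries.
dut_fp_ports_list = ["eno2.103", "eno2.151"]

# New form: a dictionary keyed by the offset itself.
dut_fp_ports_dict = {"3": "eno2.103", "51": "eno2.151"}


def bind_with_list(offsets, ports):
    """Use each offset as a list index; fails when the list is shorter than the offset."""
    return [ports[offset] for offset in offsets]


def bind_with_dict(offsets, ports):
    """Use each offset as a dictionary key; works for any subset of ports."""
    return [ports[str(offset)] for offset in offsets]


try:
    bind_with_list(topology_offsets, dut_fp_ports_list)
    list_lookup_failed = False
except IndexError:
    # Offset 3 is already past the end of a two-element list.
    list_lookup_failed = True

bound_nics = bind_with_dict(topology_offsets, dut_fp_ports_dict)
```

With the list, the first offset already raises IndexError; with the dictionary, both offsets resolve to their NICs.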

For example, consider a topology with only 2 ports (3 and 51) on the DUT connected to the fanout, with fanout vlans 103 and 151 respectively,
a 'vlan_base' of 100, and the trunk port on the testbed server being 'eno2'. Then dut_fp_ports would look like:

        {
          '3':  'eno2.103',
          '51': 'eno2.151'
        }

For a topology where we have 32 ports connected to the fanout with vlans 100 - 131 on trunk port eno2, and vlan_base is not specified, the resulting dut_fp_ports would be

     {
       '0':  'eno2.100',
       '1':  'eno2.101',
       '2':  'eno2.102',
        .
        .
       '31': 'eno2.131'
     }
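The keying rule described above can be written down as a short Python sketch. The function name build_dut_fp_ports and its parameters are hypothetical and are not the code in this PR; the sketch assumes the key is the list index when vlan_base is 0 and the vlan id minus vlan_base otherwise.

```python
def build_dut_fp_ports(trunk_port, vlan_ids, vlan_base=0):
    """Return a dict mapping topology offsets to vlan sub-interfaces on the trunk port.

    Assumption: with vlan_base == 0 the key is the position in vlan_ids (the
    backward-compatible behaviour); otherwise the key is vlan_id - vlan_base.
    """
    ports = {}
    for index, vlan_id in enumerate(vlan_ids):
        key = index if vlan_base == 0 else vlan_id - vlan_base
        ports[str(key)] = "{}.{}".format(trunk_port, vlan_id)
    return ports


# Two cabled ports, vlan_base given as an extra arg (-e vlan_base=100).
two_port = build_dut_fp_ports("eno2", [103, 151], vlan_base=100)

# 32 cabled ports starting at vlan 100, vlan_base not specified.
full = build_dut_fp_ports("eno2", range(100, 132))
```

Here two_port has keys '3' and '51', and full has keys '0' through '31'.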

How did you verify/test it?

  • Ran add/remove-topo against KVM SONiC DUT and ceos containers against t0 topology and validated BGP and ptf connectivity
    ./testbed-cli.sh -t vtestbed.csv -m veos.vtb -k ceos add-topo vms-kvm-t0 password.txt 
    ./testbed-cli.sh -t vtestbed.csv -m veos.vtb deploy-mg vms-kvm-t0 lab password.txt
    ansible-playbook -i lab -l vlab-01 test_sonic.yml -e testbed_name=vms-kvm-t0 -e testcase_name=fdb -e testbed_file=vtestbed.csv 
    ./testbed-cli.sh -t vtestbed.csv -m veos.vtb -k ceos remove-topo vms-kvm-t0 password.txt
  • Ran add/remove-topo against a SONiC DUT with all ports connected to a fanout against t0 topology and validated BGP and ptf connectivity
    ./testbed-cli.sh -t testbed.csv -m veos.vtb -k ceos add-topo vms-s6100-t0 password.txt
    ./testbed-cli.sh -t testbed.csv -m veos.vtb deploy-mg vms-s6100-t0 lab password.txt
    ./testbed-cli.sh -t testbed.csv -m veos.vtb -k ceos remove-topo vms-s6100-t0 password.txt
  • Ran add/remove-topo against a SONiC DUT with only 2 ports connected to the fanout against a
    topology defining only 1 VM and 1 'host_interfaces' entry and validated BGP and ptf connectivity
    ./testbed-cli.sh -t testbed.csv -m veos.vtb -k ceos add-topo vms-s6100-2p password.txt -e vlan_base=100
    ./testbed-cli.sh -t testbed.csv -m veos.vtb deploy-mg vms-s6100-2p lab password.txt
    ./testbed-cli.sh -t testbed.csv -m veos.vtb -k ceos remove-topo vms-s6100-2p password.txt -e vlan_base=100

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@ghost

ghost commented Jun 8, 2020

CLA assistant check
All CLA requirements met.

Had extra spaces and indented with 2 white spaces instead of 4
@sanmalho-git sanmalho-git requested a review from yxieca June 11, 2020 21:44
Collaborator

@yxieca yxieca left a comment

Approved with 2 very cosmetic issues.

@sanmalho-git
Contributor Author

The proposed solution works if we have a one-to-one mapping between a leaf fanout switch and a DUT. But if we want to share the leaf fanout amongst multiple DUTs, then this restricts which ports on the DUTs can be connected to the fanout: they have to be unique and contiguous across the multiple DUTs.

Working on an implementation that would use the connection graph to create the dut_fp_ports dictionary. The connection graph 'device_port_vlans' has the DUT port and fanout vlan info. For example:
    "device_port_vlans": {
        "Ethernet19": {
            "mode": "Access",
            "vlanids": "114",
            "vlanlist": [
                114
            ]
        },
        "Ethernet9": {
            "mode": "Access",
            "vlanids": "113",
            "vlanlist": [
                113
            ]
        }
    },

This along with the device hwsku can help get the right key mapping for the dut_fp_ports dictionary.
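A rough sketch of that idea in Python follows. It is not the planned implementation: the function name, the port_index_by_name mapping (standing in for whatever the hwsku provides), the index values in it, and the trunk port name eno2 are all assumptions made for illustration.

```python
# Sample taken from the connection graph output above.
device_port_vlans = {
    "Ethernet19": {"mode": "Access", "vlanids": "114", "vlanlist": [114]},
    "Ethernet9": {"mode": "Access", "vlanids": "113", "vlanlist": [113]},
}

# Hypothetical: a port-name to front-panel-index map that would be derived from the hwsku.
port_index_by_name = {"Ethernet9": 9, "Ethernet19": 19}


def dut_fp_ports_from_graph(port_vlans, port_index, trunk_port):
    """Build dut_fp_ports keyed by DUT port index, using the fanout vlan of each port."""
    ports = {}
    for port_name, info in port_vlans.items():
        if info["mode"] != "Access" or not info["vlanlist"]:
            continue  # this sketch only handles access ports with a single vlan
        vlan_id = info["vlanlist"][0]
        ports[str(port_index[port_name])] = "{}.{}".format(trunk_port, vlan_id)
    return ports


dut_fp_ports = dut_fp_ports_from_graph(device_port_vlans, port_index_by_name, "eno2")
```

Because the key comes from the DUT port rather than from the fanout vlan number, the ports of several DUTs sharing one fanout would not need to be contiguous.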

@wangxin
Collaborator

wangxin commented Jun 24, 2020

retest vsimage please

@jleveque
Contributor

jleveque commented Jul 2, 2020

@sanmalho-git: The newly-added platform API tests (specifically the SFP tests) run under the assumption that transceivers are connected to all ports on the device. With this change, we will also need to update those tests to understand which ports should have transceivers connected. Could you please also provide an example of what a test should do to determine whether a port is expected to be connected?

@sanmalho-git
Contributor Author

@jleveque: We only list the ports that are connected in the connection graph (e.g. lab_connection_graph.xml). In our test, we use the 'conn_graphs' ansible module to get these connections and expect a transceiver to be present only on those ports that are in the connection graph.

eg. sonic_lab_links.csv:

et_6448m_52x-r0_pizza3,,,Ethernet3,ixr7220D1pizza3,,,Ethernet3,1000,103,Access
et_6448m_52x-r0_pizza3,,,Ethernet51,ixr7220D1pizza3,,,Ethernet51,100000,151,Access

One other challenge for us right now is that we have a SONiC box with 1G copper ports as well. We are trying to figure out how to identify ports that are present in the connection graph but have no transceiver (e.g. Ethernet3 in the sample sonic_lab_links.csv above). Currently, we are using the 'bandwidth' to determine this: if it is 1000, then we expect no transceiver. Our SONiC box has 48 copper 1G ports and 4 10G SFP ports. But this won't work for 1G fiber ports.
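As a minimal sketch of that check, the following Python parses the two sample rows directly instead of going through the 'conn_graphs' module. The function name is hypothetical, and the column positions (port name in the 4th field, bandwidth in the 9th) are assumed from the sample rows as shown here, not from a documented file format.

```python
import csv
import io

SAMPLE_LINKS = """\
et_6448m_52x-r0_pizza3,,,Ethernet3,ixr7220D1pizza3,,,Ethernet3,1000,103,Access
et_6448m_52x-r0_pizza3,,,Ethernet51,ixr7220D1pizza3,,,Ethernet51,100000,151,Access
"""

PORT_FIELD = 3       # assumed position of the port name
BANDWIDTH_FIELD = 8  # assumed position of the bandwidth


def expected_ports(links_csv, copper_bandwidth=1000):
    """Return (connected ports, ports expected to hold a transceiver).

    Heuristic from the comment above: a port listed with the copper bandwidth
    (1000) is assumed to have no transceiver. This misclassifies 1G fiber ports.
    """
    connected, with_transceiver = [], []
    for row in csv.reader(io.StringIO(links_csv)):
        if not row:
            continue  # skip blank lines
        port, bandwidth = row[PORT_FIELD], int(row[BANDWIDTH_FIELD])
        connected.append(port)
        if bandwidth != copper_bandwidth:
            with_transceiver.append(port)
    return connected, with_transceiver


connected, with_transceiver = expected_ports(SAMPLE_LINKS)
```

A test could then skip transceiver checks for any port not in with_transceiver.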

@jleveque
Contributor

jleveque commented Jul 2, 2020

One other challenge for us right now is that we have SONiC box with 1G copper ports as well. We are trying to figure out how we would be able to distinguish those as ports present in connection graph, but still no transceiver is present on them (eg. Ethernet3 from above sample sonic_lab_links.csv). Currently, we are using the 'bandwidth' to determine this. If it is 1000, then we expect no transceiver. Our SONiC box has 48 copper 1G port, and 4 10G SFP ports. But, this won't work for 1G Fiber ports.

In this case, we may want to consider adding a "transceiver type" and/or "cable type" to the connection graph. It would also be beneficial for my SFP tests, in order to know which type of transceiver to expect when testing.

@yxieca
Collaborator

yxieca commented Jul 29, 2020

@sanmalho-git can you address the merge conflict?

@sanmalho-git
Contributor Author

This approach doesn't work well for us, so we are closing this pull request. Will open another PR with a better solution.

@sanmalho-git sanmalho-git deleted the fanout branch January 6, 2021 17:47
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
…#8355)

To include following changes:

* d84a8cc 2021-08-05 | [fast-reboot] revert the change of disabling counter polling before fast-reboot (sonic-net#1744) (HEAD -> 202012, github/202012) [Ying Xie]
* e900bc5 2021-08-04 | Add script null_route_helper (sonic-net#1718) [bingwang-ms]
* 85f14e1 2021-08-02 | disk_check updates: (sonic-net#1736) [Renuka Manavalan]
* d68ac1c 2021-05-27 | [console][show] Force refresh all lines status during show line (sonic-net#1641) [Blueve]
* a0e417f 2021-04-25 | [console] Display success message after line cleared (sonic-net#1579) [Blueve]
* 0c6bb27 2021-04-07 | [console] Include Flow Control status in show line result (sonic-net#1549) [Blueve]
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
8b149a3 Load the  database global_db only once for show cli  (sonic-net#1712)
cd0e560 [config][interface][speed] Fixed the config interface speed in multiasic issue (sonic-net#1739)
b595ba6 [fast-reboot] revert the change of disabling counter polling before fast-reboot (sonic-net#1744)
8518820 [minigraph] Donot enable PFC watchdog for MgmtTsToR (sonic-net#1734)
2213774 [CLI][show][bgp] Fix the show ip bgp network command (sonic-net#1733)
3526507 [configlet] Python3 compatible syntax for extracting a key from the dict (sonic-net#1721)
5b56b97 [sonic_installer] don't print errors when installing an image not supporting app ext (sonic-net#1719)
a581955 [LLDP] Fix lldpshow script to enable display multiple MAC addresses on the same remote physical interface (sonic-net#1657)

4 participants