Converge cEOSLab peer containers via VRFs #22171
Merged
yxieca merged 4 commits into sonic-net:master on Feb 12, 2026
Converging the peer switches into the fewest possible cEOSLab containers reduces the resources required to run large numbers of peers. The basic premises behind convergence are as follows:

Approach:

cEOSLab peers running in docker containers may be converged into a smaller number of host peers. The SONiC-facing configuration of each BGP peer is separated in routing and bridging by VRFs. The PTF-facing configuration of each BGP peer is separated within each VRF by VLAN tagging, which allows a single backplane interface to be used on each host cEOSLab container. Each VRF includes a number of interfaces facing either the SONiC DUT or the backplane. Changes should be as transparent to the SONiC DUT as possible.

At testbed setup time, the ansible topology file for the testbed is modified to include new metadata specific to the multi-VRF configuration, and the VMs list is trimmed to include only those containers which will host multiple BGP peerings, separated by VRF. The new metadata includes mappings between host containers and VRFs, backplane VLAN mappings, and BGP session parameters.

VLAN tag 2000 is used as the starting value for all VLANs between the test infrastructure PTF container interfaces and the cEOSLab device interfaces. The IPv4 and IPv6 addresses used to connect the cEOSLab peer and the infrastructure PTF container are generated so that the backplane connections are clearer, unique, and easier to implement. In general, backplane L3 addresses used by the cEOSLab peer end in even numbers, and those used by the PTF container end in odd numbers. All addresses generated for backplane connections start with the value 100 (0x64) in the least-significant octet or hextet (depending on the address family). The address changes are mapped and stored in the new multi-VRF metadata in the ansible topology file.
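The backplane numbering rules above can be illustrated with a small sketch. This is not the generation code from the PR: the function name `backplane_plan`, the prefixes `10.10.246.0` and `fc0a::`, and the choice of one VLAN and one address pair per converged peer are assumptions made for illustration. Only the starting VLAN tag (2000), the starting value 100 (0x64), and the even/odd split come from the description.

```python
import ipaddress

VLAN_BASE = 2000  # starting VLAN tag, as stated in the PR description
ADDR_BASE = 100   # 0x64, starting value of the last octet/hextet


def backplane_plan(peer_index, v4_prefix="10.10.246.0", v6_prefix="fc0a::"):
    """Sketch of a per-peer backplane allocation.

    The prefixes and the one-VLAN-per-peer layout are hypothetical;
    the real mapping lives in the multi-VRF topology metadata.
    """
    if peer_index < 0:
        raise ValueError("peer_index must be non-negative")
    offset = ADDR_BASE + 2 * peer_index  # always even: cEOSLab side
    if offset + 1 > 255:
        raise ValueError("peer_index does not fit in one IPv4 octet in this sketch")
    v4 = ipaddress.IPv4Address(v4_prefix)
    v6 = ipaddress.IPv6Address(v6_prefix)
    return {
        "vlan": VLAN_BASE + peer_index,
        "ceos_v4": str(v4 + offset),      # even last octet
        "ptf_v4": str(v4 + offset + 1),   # odd last octet
        "ceos_v6": str(v6 + offset),      # even last hextet
        "ptf_v6": str(v6 + offset + 1),   # odd last hextet
    }


if __name__ == "__main__":
    for index in range(3):
        print(backplane_plan(index))
```

Under these assumptions the first peer gets VLAN 2000 with the cEOSLab side on `.100` / `::64` and the PTF side on `.101` / `::65`, and each further peer advances the VLAN by one and the addresses by two.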
Multiple BGP features, such as local-as and next-hop-peer, are used to aid route resolution. This is necessary to keep the SONiC DUT as multi-VRF-agnostic as possible.

Enabling multi-VRF mode:

Multi-VRF mode is enabled by setting the attribute use_converged_peers: true in the testbed definition found in sonic-mgmt/ansible/testbed.yaml. This file is read by the TestbedProcessing.py script, which sets global variables indicating to other ansible tasks and libraries that the testbed is to be started in multi-VRF mode. In addition, the value of max_fp_nums must be adjusted so that each cEOSLab docker container has enough resources to run all the new BGP sessions in each VRF. This can be done dynamically; for the full-scale topologies, however, the maximum supported by cEOSLab, 127, must be used.

Known limitations:

cEOSLab instances do not allow the creation of interfaces with interface IDs greater than 127 when interfaces are laid out unidimensionally. The use of multiple VRFs has not been tested in conjunction with asynchronous ansible tasks.

Test library changes:

Test libraries needed to be made aware of the new underlying structure of cEOSLab containers, VRFs, and BGP adjacencies. In many cases this was done by reference to the testbed topology YAML passed into library functions. In other cases, most notably the BGP libraries, the nbrhosts fixture was adjusted to include multi-VRF-specific metadata which callers can use to navigate the relationship between containers, VRFs, and BGP neighborship.

Signed-off-by: Will Rideout <[email protected]>
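As a rough illustration of the enable flag and the 127-interface limit, the sketch below checks a testbed entry for use_converged_peers and estimates how many host containers a topology would need. The helper names, the example entry, and the assumption that each host container reserves one interface for the backplane out of max_fp_nums are hypothetical; the actual logic in TestbedProcessing.py and the ansible libraries may differ.

```python
import math

CEOS_MAX_INTERFACE_ID = 127  # limit stated in the PR description


def multi_vrf_enabled(testbed_entry):
    """Return True when the entry sets use_converged_peers: true."""
    return testbed_entry.get("use_converged_peers", False) is True


def min_host_containers(dut_facing_links,
                        max_fp_nums=CEOS_MAX_INTERFACE_ID,
                        backplane_links=1):
    """Estimate the number of host containers needed.

    Assumes (hypothetically) that max_fp_nums bounds all interfaces of a
    container and that backplane_links of them are used for the backplane.
    """
    if not 1 <= max_fp_nums <= CEOS_MAX_INTERFACE_ID:
        raise ValueError("max_fp_nums must be between 1 and 127")
    usable = max_fp_nums - backplane_links
    if usable < 1:
        raise ValueError("no interfaces left for DUT-facing links")
    return math.ceil(dut_facing_links / usable)


if __name__ == "__main__":
    # Hypothetical testbed entry, already parsed from YAML into a dict.
    entry = {"conf-name": "example-testbed", "use_converged_peers": True}
    if multi_vrf_enabled(entry):
        print(min_host_containers(dut_facing_links=256))
```

The point of the estimate is only that the interface limit, not the number of BGP peers, determines how far peers can be converged in large topologies.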
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Resolve all pylint issues. Signed-off-by: Will Rideout <[email protected]>
yutongzhang-microsoft
approved these changes
Feb 9, 2026
Contributor
Author
@r12f this looks ready to go, just needs final say
r12f
approved these changes
Feb 10, 2026
Collaborator
Signed off, since we have already tried it and our latest nightly run shows the same pass rate as the regular topo.
Collaborator
@wrideout-arista PR conflicts with 202511 branch
Contributor
Author
I will manually cast.
yxieca
pushed a commit
that referenced
this pull request
Feb 13, 2026
What is the motivation for this PR?
In PR #22171, support for multi-VRF testbeds was introduced. Although multi-VRF testbeds require a different definition file format than standard testbeds, the required files can be derived from existing definitions through a conversion process. This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.
How did you do it?
This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.
How did you verify/test it?
We have already tested it in our internal repo.
Signed-off-by: Yutong Zhang <[email protected]>
Contributor
@wrideout-arista I tried renumber and found a bug, please help review bugfix PR #22423
anilal-amd
pushed a commit
to anilal-amd/anilal-forked-sonic-mgmt
that referenced
this pull request
Feb 19, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]> Signed-off-by: Zhuohui Tan <[email protected]>
anilal-amd
pushed a commit
to anilal-amd/anilal-forked-sonic-mgmt
that referenced
this pull request
Feb 19, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: Zhuohui Tan <[email protected]>
mssonicbld
pushed a commit
to mssonicbld/sonic-mgmt
that referenced
this pull request
Feb 20, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: mssonicbld <[email protected]>
mssonicbld
pushed a commit
that referenced
this pull request
Feb 20, 2026
Integrates the multi-VRF testbed conversion script into the CI pipeline, following PR #22171. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: mssonicbld <[email protected]>
aronovic
pushed a commit
to aronovic/sonic-mgmt
that referenced
this pull request
Mar 3, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]> Signed-off-by: Mihut Aronovici <[email protected]>
aronovic
pushed a commit
to aronovic/sonic-mgmt
that referenced
this pull request
Mar 3, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: Mihut Aronovici <[email protected]>
ravaliyel
pushed a commit
to ravaliyel/sonic-mgmt
that referenced
this pull request
Mar 12, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]> Signed-off-by: Ravali Yeluri (WIPRO LIMITED) <[email protected]>
ravaliyel
pushed a commit
to ravaliyel/sonic-mgmt
that referenced
this pull request
Mar 12, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: Ravali Yeluri (WIPRO LIMITED) <[email protected]>
abhishek-nexthop
pushed a commit
to nexthop-ai/sonic-mgmt
that referenced
this pull request
Mar 17, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]> Signed-off-by: Abhishek <[email protected]>
abhishek-nexthop
pushed a commit
to nexthop-ai/sonic-mgmt
that referenced
this pull request
Mar 17, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: Abhishek <[email protected]>
vrajeshe
pushed a commit
to vrajeshe/sonic-mgmt
that referenced
this pull request
Mar 23, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: Venkata Gouri Rajesh Etla <[email protected]>
ravaliyel
pushed a commit
to ravaliyel/sonic-mgmt
that referenced
this pull request
Mar 27, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]>
ravaliyel
pushed a commit
to ravaliyel/sonic-mgmt
that referenced
this pull request
Mar 27, 2026
…t#22399) Integrates the multi-VRF testbed conversion script into the CI pipeline. Signed-off-by: Yutong Zhang <[email protected]>
selldinesh
pushed a commit
to selldinesh/sonic-mgmt
that referenced
this pull request
Apr 1, 2026
* Converge cEOSLab peer containers via VRFs Signed-off-by: Will Rideout <[email protected]> Signed-off-by: selldinesh <[email protected]>
selldinesh
pushed a commit
to selldinesh/sonic-mgmt
that referenced
this pull request
Apr 1, 2026
…t#22399)
What is the motivation for this PR?
In PR sonic-net#22171, support for multi-VRF testbeds was introduced. Although multi-VRF testbeds require a different definition file format than standard testbeds, the required files can be derived from existing definitions through a conversion process. This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.
How did you do it?
This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.
How did you verify/test it?
We have already tested it in our internal repo.
Signed-off-by: Yutong Zhang <[email protected]> Signed-off-by: selldinesh <[email protected]>
yutongzhang-microsoft
added a commit
to yutongzhang-microsoft/sonic-mgmt
that referenced
this pull request
Apr 1, 2026
- Enabling Multi-VRF Mode: remove Approach #1/#2 split; manual convergence is now correctly described as a fallback when TestbedProcessing.py is not used, not an alternative approach
- Test Library Changes: rewrite based on actual code changes in PR sonic-net#22171:
  - ceos_topo_converger.py (new)
  - TestbedProcessing.py: use_converged_peers flag handling
  - topo_facts.py: get_vm_list/get_vlans for multi-VRF
  - testbed_vm_info.py: VRF-to-host mapping
  - nbrhosts fixture: multi_vrf_data dict structure
  - bgp_helpers.py: VRF-aware route checks and port resolution
  - bgp/conftest.py: VRF-scoped graceful-restart config
Co-authored-by: Copilot <[email protected]>
12 tasks
yutongzhang-microsoft
added a commit
to yutongzhang-microsoft/sonic-mgmt
that referenced
this pull request
Apr 1, 2026
- Enabling Multi-VRF Mode: remove Approach #1/#2 split; manual convergence is now correctly described as a fallback when TestbedProcessing.py is not used, not an alternative approach
- Test Library Changes: rewrite based on actual code changes in PR sonic-net#22171:
  - ceos_topo_converger.py (new)
  - TestbedProcessing.py: use_converged_peers flag handling
  - topo_facts.py: get_vm_list/get_vlans for multi-VRF
  - testbed_vm_info.py: VRF-to-host mapping
  - nbrhosts fixture: multi_vrf_data dict structure
  - bgp_helpers.py: VRF-aware route checks and port resolution
  - bgp/conftest.py: VRF-scoped graceful-restart config
Co-authored-by: Copilot <[email protected]> Signed-off-by: Yutong Zhang <[email protected]>
Description of PR
Converging the peer switches into the fewest possible cEOSLab containers reduces the overall resources required to run large numbers of peers.
Type of change
Back port request
Approach
cEOSLab peers in docker containers may be converged into a smaller number of host peers. The SONiC-facing configuration of each BGP peer may be separated in routing and bridging via the use of VRFs. The PTF-facing configuration of each BGP peer may be separated within each VRF via VLAN tagging, enabling the use of a single backplane interface on each host cEOSLab container. Each VRF includes a number of interfaces either facing the SONiC DUT or the backplane. Changes should be as transparent to the SONiC DUT as possible. At the time of testbed setup, the ansible topology file for the testbed is modified to include new metadata specific to multi-vrf configuration, and the VMs list is trimmed to only include those containers which will host multiple BGP peerings, separated by VRF. The new metadata includes mappings between host containers and VRFs, backplane VLAN mappings, and BGP session parameters.
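The trimming of the VMs list can be pictured with a minimal Python sketch. The function name `converge_peers`, the `peers_per_host` parameter, the choice of the first peer in each group as the host container, and the shape of the returned mapping are all assumptions made for illustration; the actual convergence script and its metadata layout may differ.

```python
def converge_peers(vm_names, peers_per_host):
    """Group peer VM names onto host containers.

    Hypothetical helper: the first peer of each group acts as the host
    container, and every peer in the group (including the host itself)
    would be given its own VRF on that container.
    """
    if peers_per_host < 1:
        raise ValueError("peers_per_host must be at least 1")
    hosts = {}
    for start in range(0, len(vm_names), peers_per_host):
        group = vm_names[start:start + peers_per_host]
        hosts[group[0]] = list(group)
    return hosts


# Example peer names are illustrative only.
vms = ["ARISTA01T1", "ARISTA02T1", "ARISTA03T1", "ARISTA04T1", "ARISTA05T1"]
mapping = converge_peers(vms, peers_per_host=2)
trimmed_vm_list = list(mapping)  # only the host containers remain
```

With five peers and two peers per host, three containers remain in the trimmed list, and the mapping records which original peers each container now hosts.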
VLAN tag 2000 is used as the starting value for all VLANs between the test infrastructure PTF container interfaces and cEOSLab device interfaces.
The IPv4 and IPv6 addresses used to connect the cEOSLab peer and the infrastructure PTF container are generated so that the backplane connections are clearer, unique, and easier to implement. In general, backplane L3 addresses used by the cEOSLab peer end in even numbers, and those used by the PTF container end in odd numbers. All addresses generated for use in backplane connections start with the value 100 (0x64) in the least-significant octet or hextet, depending on the address family. The address changes are mapped and stored in the new multi-VRF metadata in the ansible topology file.
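The VLAN and address conventions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the PR's implementation: the function name `backplane_plan`, the example prefixes `10.10.246.0/24` and `fc0a::/64`, the one-VLAN-per-VRF allocation, and the consecutive even/odd pairing are all hypothetical. Only the starting VLAN tag (2000), the starting host value (100, 0x64), and the even/odd split come from the description.

```python
import ipaddress

BASE_VLAN = 2000  # starting VLAN tag, from the description
BASE_HOST = 100   # 0x64, starting value of the last octet/hextet


def backplane_plan(vrf_names, v4_net="10.10.246.0/24", v6_net="fc0a::/64"):
    """Build a hypothetical per-VRF backplane VLAN and address plan.

    Assumes one VLAN per VRF and consecutive address pairs:
    even host value for the cEOSLab side, odd for the PTF side.
    """
    net4 = ipaddress.ip_network(v4_net)
    net6 = ipaddress.ip_network(v6_net)
    plan = {}
    for index, vrf in enumerate(vrf_names):
        ceos_host = BASE_HOST + 2 * index  # even: cEOSLab peer
        ptf_host = ceos_host + 1           # odd: PTF container
        plan[vrf] = {
            "vlan": BASE_VLAN + index,
            "ceos_v4": str(net4[ceos_host]),
            "ptf_v4": str(net4[ptf_host]),
            "ceos_v6": str(net6[ceos_host]),
            "ptf_v6": str(net6[ptf_host]),
        }
    return plan


plan = backplane_plan(["vrf_a", "vrf_b"])
```

In this sketch the first VRF gets VLAN 2000 with `10.10.246.100` on the cEOSLab side and `10.10.246.101` on the PTF side. A /24 prefix runs out of host values after a few dozen VRFs, at which point indexing the network raises `IndexError`.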
Multiple BGP features, such as local-as and next-hop-peer, are used to aid in the resolution of routes. This is necessary to keep the SONiC DUT as multi-VRF-agnostic as possible.
Enabling multi-VRF mode
Multi-VRF mode may be enabled by setting the attribute use_converged_peers: true in the testbed definition found in sonic-mgmt/ansible/testbed.yaml.
This file is read by the TestbedProcessing.py script, which sets global variables indicating to other ansible tasks and libraries that the testbed is to be started in multi-VRF mode.
If the use of TestbedProcessing.py is not desired, the convergence script may be invoked manually, thus:
In the above example, the convergence script is run with the same input and output file, which causes the topology file to be overwritten in sonic-mgmt/ansible/vars. A different output file may be provided via argument.
In addition, the value of max_fp_nums must be adjusted so that each cEOSLab docker container has enough resources to run all the new BGP sessions in each VRF. This can be done dynamically; however, for the full-scale topologies the maximum supported by cEOSLab, 127, must be used.
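A dynamic adjustment could look roughly like the Python sketch below. The function name `required_max_fp_nums` and the sizing rule (DUT-facing links per VRF multiplied by the number of VRFs on a container) are assumptions for illustration only; the description does not state how max_fp_nums is derived. Only the limit of 127 is taken from it.

```python
CEOSLAB_MAX_INTERFACES = 127  # maximum supported by cEOSLab, per the description


def required_max_fp_nums(links_per_vrf, vrf_count):
    """Return a hypothetical max_fp_nums value for one host container.

    Assumed sizing rule: each VRF needs `links_per_vrf` front-panel
    interfaces, so a container hosting `vrf_count` VRFs needs their product.
    """
    needed = links_per_vrf * vrf_count
    if needed > CEOSLAB_MAX_INTERFACES:
        raise ValueError(
            f"{needed} interfaces needed, but cEOSLab supports at most "
            f"{CEOSLAB_MAX_INTERFACES}; host fewer VRFs on this container"
        )
    return needed


small_topology = required_max_fp_nums(links_per_vrf=2, vrf_count=8)
```

Failing early when the requirement exceeds 127 keeps the limit visible at testbed setup time rather than during interface creation.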
Known limitations
cEOSLab instances do not allow for the creation of interfaces with interface-IDs greater than 127 when interfaces are laid out unidimensionally.
The use of multiple VRFs has not been tested in conjunction with asynchronous ansible tasks.
Test library changes
Test libraries needed to be made aware of the new underlying structure of cEOSLab containers, VRFs, and BGP adjacencies. In many cases this was done by reference to the testbed topology YAML passed into library functions. In other cases, most notably BGP libraries, the nbrhosts fixture was adjusted to include multi-VRF-specific metadata which callers could leverage to navigate the relationship between containers, VRFs, and BGP neighborship.
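As a rough illustration of how a caller might navigate that relationship, the Python sketch below resolves a neighbor name to its host container and VRF. The dictionary layout shown here (host container name mapping to VRF name mapping to neighbor name), the VRF names, and the helper `find_host_and_vrf` are hypothetical; the real multi-VRF metadata exposed through nbrhosts may be structured differently.

```python
def find_host_and_vrf(multi_vrf_data, neighbor):
    """Return (host_container, vrf) for a converged neighbor name.

    `multi_vrf_data` is assumed to look like:
        {host_container: {vrf_name: neighbor_name, ...}, ...}
    """
    for host, vrfs in multi_vrf_data.items():
        for vrf, peer in vrfs.items():
            if peer == neighbor:
                return host, vrf
    raise KeyError(f"neighbor {neighbor!r} not found in multi-VRF metadata")


# Illustrative metadata only; names are not taken from the PR.
example_data = {
    "ARISTA01T1": {"vrf_a": "ARISTA01T1", "vrf_b": "ARISTA02T1"},
    "ARISTA03T1": {"vrf_c": "ARISTA03T1"},
}
host, vrf = find_host_and_vrf(example_data, "ARISTA02T1")
```

A test that previously addressed a neighbor directly would, under this assumption, first resolve the container and VRF and then issue its BGP checks in that VRF on that container.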