Converge cEOSLab peer containers via VRFs #22171

Merged
yxieca merged 4 commits into sonic-net:master from wrideout-arista:master_multi_vrf
Feb 12, 2026

Conversation

@wrideout-arista
Contributor

@wrideout-arista wrideout-arista commented Jan 29, 2026

Description of PR

Converging the peer switches into the fewest possible cEOSLab containers reduces the overall resources required to run large numbers of peers.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
  • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

cEOSLab peers in docker containers may be converged into a smaller number of host peers. The SONiC-facing configuration of each BGP peer may be separated in routing and bridging via the use of VRFs. The PTF-facing configuration of each BGP peer may be separated within each VRF via VLAN tagging, enabling the use of a single backplane interface on each host cEOSLab container. Each VRF includes a number of interfaces facing either the SONiC DUT or the backplane. Changes should be as transparent to the SONiC DUT as possible. At testbed setup time, the ansible topology file for the testbed is modified to include new metadata specific to the multi-VRF configuration, and the VMs list is trimmed to include only those containers which will host multiple BGP peerings, separated by VRF. The new metadata includes mappings between host containers and VRFs, backplane VLAN mappings, and BGP session parameters.
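
As a rough illustration of the trimming step described above, the following sketch packs a list of peer names into as few host containers as possible and gives each original peer its own VRF. The function name, the VRF naming scheme, the peer names, and the per-host capacity are all assumptions made for illustration; none of them is taken from ceos_topo_converger.

```python
from collections import OrderedDict


def pack_peers_into_hosts(peer_names, peers_per_host):
    """Map each host container to the VRFs (one per original peer) it carries.

    Hypothetical helper: the first peer of each group is kept as the host
    container, and every peer in the group gets a VRF named after it.
    """
    if peers_per_host < 1:
        raise ValueError("peers_per_host must be at least 1")

    hosts = OrderedDict()
    for start in range(0, len(peer_names), peers_per_host):
        group = peer_names[start:start + peers_per_host]
        host = group[0]  # the entry that survives in the trimmed VMs list
        hosts[host] = {"VRF-" + peer: peer for peer in group}
    return hosts


# Illustrative peer names only.
peers = ["ARISTA01T0", "ARISTA02T0", "ARISTA03T0", "ARISTA04T0", "ARISTA05T0"]
converged = pack_peers_into_hosts(peers, peers_per_host=2)
```

With five peers and two peers per host, three host containers remain, and the last one carries a single VRF.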

VLAN tag 2000 is used as the starting value for all VLANs between the test infrastructure PTF container interfaces and cEOSLab device interfaces.
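
A minimal sketch of how backplane VLAN tags might be handed out from that starting value. The description only states the starting tag; the one-VLAN-per-VRF, strictly sequential allocation shown here, along with the function and VRF names, is an assumption.

```python
BACKPLANE_VLAN_BASE = 2000  # starting tag stated in the description
MAX_VLAN_ID = 4094          # highest usable 802.1Q VLAN ID


def assign_backplane_vlans(vrf_names, base=BACKPLANE_VLAN_BASE):
    """Give each VRF its own backplane VLAN, counting up from `base`.

    Assumption: one VLAN per VRF, allocated sequentially.
    """
    mapping = {}
    for offset, vrf in enumerate(vrf_names):
        vlan = base + offset
        if vlan > MAX_VLAN_ID:
            raise ValueError("ran out of VLAN IDs at VRF %s" % vrf)
        mapping[vrf] = vlan
    return mapping


# Illustrative VRF names only.
vlans = assign_backplane_vlans(["VRF-A", "VRF-B", "VRF-C"])
```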

The IPv4 and IPv6 addresses used to connect the cEOSLab peer and the infrastructure PTF container are generated so that the backplane connections are clearer, unique, and easier to implement. In general, backplane L3 addresses used by the cEOSLab peer end in even numbers, and those used by the PTF container end in odd numbers. All addresses generated for use in backplane connections start with the value 100 (0x64) in the least-significant octet or hextet (depending on the address family). The address changes are mapped and stored in the new multi-VRF metadata in the ansible topology file.
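
The even/odd convention can be sketched with Python's standard ipaddress module. The base prefixes below, the helper name, and the assumption that consecutive links take consecutive address pairs within one prefix are illustrative only; the actual generation logic lives in the convergence script and may differ.

```python
import ipaddress

FIRST_HOST_VALUE = 100  # 0x64, per the description


def backplane_pair(network, index):
    """Return (ceos_addr, ptf_addr) for the index-th backplane link.

    The cEOSLab side gets the even address and the PTF side the next odd one.
    Assumes the network address itself is even-aligned (true for the
    hypothetical prefixes used below).
    """
    net = ipaddress.ip_network(network)
    ceos = net.network_address + FIRST_HOST_VALUE + 2 * index
    ptf = ceos + 1
    if ptf not in net:
        raise ValueError("index %d does not fit in %s" % (index, net))
    return ceos, ptf


# Hypothetical base prefixes, chosen only for the example.
v4_ceos, v4_ptf = backplane_pair("10.10.246.0/24", 0)
v6_ceos, v6_ptf = backplane_pair("fc0a::/64", 1)
```

The first IPv4 pair ends in .100 and .101; the second IPv6 pair ends in ::66 and ::67 (102 and 103 in decimal).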

Multiple BGP features, such as local-as and next-hop-peer, are used to aid in the resolution of routes. This is necessary to keep the SONiC DUT as multi-VRF-agnostic as possible.

Enabling multi-VRF mode

Multi-VRF mode may be enabled by setting the attribute use_converged_peers: true in the testbed definition found in sonic-mgmt/ansible/testbed.yaml, as below.

AzDevOps@8b88d4457891:/data/ansible$ cat testbed.yaml
---
- auto_recover: true
  comment: Tests Arista Arista-7050CX3-32S-C6S104
  conf-name: ardut
  dut:
  - ld600
  group-name: ardut
  inv_name: lab
  netns_mgmt_ip: 10.250.32.3/20
  ptf: ptf
  ptf_extra_mgmt_ip: ''
  ptf_image_name: docker-ptf
  ptf_ip: 10.243.72.20/21
  ptf_ipv6: fdfd:5c41:712d:d043:4c94:4dff:fefb:ae85/64
  server: server_1
  topo: m1-108
  use_converged_peers: true
  vm_base: VM0100

This file is read by the TestbedProcessing.py script, which sets global variables indicating to other ansible tasks and libraries that the testbed is to be started in multi-VRF mode.
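
For readers who want to check the flag themselves, here is a small sketch. It assumes testbed.yaml has already been parsed into a list of dictionaries (for example with a YAML loader); the helper name and the second sample entry are hypothetical, and this is not the logic used by the processing script itself.

```python
def uses_converged_peers(testbed_entries, conf_name):
    """Return True if the named testbed opts in to multi-VRF mode."""
    for entry in testbed_entries:
        if entry.get("conf-name") == conf_name:
            # A missing attribute is treated as "disabled" in this sketch.
            return bool(entry.get("use_converged_peers", False))
    raise KeyError("no testbed named %r" % conf_name)


# Trimmed-down entries; only the keys needed for the check are shown.
testbeds = [
    {"conf-name": "ardut", "topo": "m1-108", "use_converged_peers": True},
    {"conf-name": "other", "topo": "t0"},  # hypothetical second testbed
]
ardut_converged = uses_converged_peers(testbeds, "ardut")
other_converged = uses_converged_peers(testbeds, "other")
```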

If the use of TestbedProcessing.py is not desired, the convergence script may be invoked manually, thus:

sonic-mgmt/ansible $ python3
Python 3.9.21 (main, Aug 19 2025, 00:00:00)
[GCC 11.5.0 20240719 (Red Hat 11.5.0-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ceos_topo_converger import converge_testbed
>>> converge_testbed("vars/topo_t1-isolated-d448u15-lag.yml", "vars/topo_t1-isolated-d448u15-lag.yml")
>>>

In the above example, the convergence script is run with the same input and output file, which causes the topology file to be overwritten in sonic-mgmt/ansible/vars. A different output file may be provided via argument.

In addition, the value of max_fp_nums must be adjusted such that each cEOSLab docker container has enough resources to run all the new BGP sessions in each VRF. This can be done dynamically; however, for the full-scale topologies the maximum supported by cEOSLab, 127, must be used.
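
One way to picture the dynamic adjustment is the sketch below, which caps the per-container value at 127 and derives how many host containers that implies. The helper name is hypothetical, and the sketch assumes one front-panel interface per DUT-facing port and ignores the backplane interface; the real sizing rules may differ.

```python
import math

CEOS_MAX_FP_NUMS = 127  # upper bound quoted in the description


def plan_front_panel_ports(total_dut_facing_ports):
    """Return (max_fp_nums, host_count) for a number of DUT-facing ports.

    Assumption: one front-panel interface per DUT-facing port, spread as
    densely as possible across host containers.
    """
    if total_dut_facing_ports < 1:
        raise ValueError("need at least one DUT-facing port")
    max_fp_nums = min(total_dut_facing_ports, CEOS_MAX_FP_NUMS)
    host_count = math.ceil(total_dut_facing_ports / max_fp_nums)
    return max_fp_nums, host_count


small = plan_front_panel_ports(64)
large = plan_front_panel_ports(448)
```

A 64-port topology fits in one container with max_fp_nums set to 64, while 448 ports hit the 127 cap and need four containers under these assumptions.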

Known limitations

cEOSLab instances do not allow the creation of interfaces with interface IDs greater than 127 when interfaces are laid out unidimensionally.

The use of multiple VRFs has not been tested in conjunction with asynchronous ansible tasks.

Test library changes

Test libraries needed to be made aware of the new underlying structure of cEOSLab containers, VRFs, and BGP adjacencies. In many cases this was done by reference to the testbed topology YAML passed into library functions. In other cases, most notably BGP libraries, the nbrhosts fixture was adjusted to include multi-VRF-specific metadata which callers could leverage to navigate the relationship between containers, VRFs, and BGP neighborship.
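
To make the container/VRF/neighbor relationship concrete, here is a sketch of the kind of lookup a test helper might perform. The metadata layout (host container mapped to VRF mapped to original neighbor name) and every name in it are assumptions for illustration; the actual keys exposed through nbrhosts are not shown in this description.

```python
def resolve_neighbor(multi_vrf_map, neighbor_name):
    """Return (host_container, vrf) for an original BGP neighbor name.

    `multi_vrf_map` uses a hypothetical layout:
    {host_container: {vrf_name: original_neighbor_name}}
    """
    for host, vrfs in multi_vrf_map.items():
        for vrf, original_peer in vrfs.items():
            if original_peer == neighbor_name:
                return host, vrf
    raise KeyError(neighbor_name)


# Hypothetical metadata for two host containers.
metadata = {
    "ARISTA01T0": {"VRF-ARISTA01T0": "ARISTA01T0", "VRF-ARISTA02T0": "ARISTA02T0"},
    "ARISTA03T0": {"VRF-ARISTA03T0": "ARISTA03T0"},
}
location = resolve_neighbor(metadata, "ARISTA02T0")
```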

Signed-off-by: Will Rideout <[email protected]>
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@github-actions github-actions bot requested review from r12f, sdszhang and wangxin January 29, 2026 16:05
Resolve all pylint issues.

Signed-off-by: Will Rideout <[email protected]>
@wrideout-arista wrideout-arista mentioned this pull request Feb 5, 2026
5 tasks
@wrideout-arista
Contributor Author

@r12f this looks ready to go, just needs final say

@r12f
Collaborator

r12f commented Feb 10, 2026

signed off since we have tried it already, and our latest nightly run pass rate suggests the same result as regular topo.

@r12f r12f added the "Request for 202511 branch" label Feb 10, 2026
@yxieca yxieca merged commit 49ca889 into sonic-net:master Feb 12, 2026
23 checks passed
@wrideout-arista wrideout-arista deleted the master_multi_vrf branch February 12, 2026 13:42
@mssonicbld
Collaborator

@wrideout-arista PR conflicts with 202511 branch

@wrideout-arista
Contributor Author

I will manually cast.

@wrideout-arista
Contributor Author

Cast to 202511 in #22395.

@r12f for vis.

yxieca pushed a commit that referenced this pull request Feb 13, 2026
What is the motivation for this PR?
In PR #22171, support for multi-VRF testbeds was introduced. Although multi-VRF testbeds require a different definition file format than standard testbeds, the required files can be derived from existing definitions through a conversion process.

This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.

How did you do it?
This PR integrates the conversion script into the CI pipeline, allowing the transformation to be performed automatically as part of the workflow, without manual intervention.

How did you verify/test it?
We have already tested it in our internal repo.

Signed-off-by: Yutong Zhang <[email protected]>
@qiluo-msft
Contributor

@wrideout-arista I tried renumber and found a bug, please help review bugfix PR #22423

anilal-amd pushed a commit to anilal-amd/anilal-forked-sonic-mgmt that referenced this pull request Feb 19, 2026
* Converge cEOSLab peer containers via VRFs

Signed-off-by: Will Rideout <[email protected]>
Signed-off-by: Zhuohui Tan <[email protected]>
anilal-amd pushed a commit to anilal-amd/anilal-forked-sonic-mgmt that referenced this pull request Feb 19, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: Zhuohui Tan <[email protected]>
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Feb 20, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: mssonicbld <[email protected]>
mssonicbld pushed a commit that referenced this pull request Feb 20, 2026
Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: mssonicbld <[email protected]>
aronovic pushed a commit to aronovic/sonic-mgmt that referenced this pull request Mar 3, 2026
* Converge cEOSLab peer containers via VRFs

Signed-off-by: Will Rideout <[email protected]>
Signed-off-by: Mihut Aronovici <[email protected]>
aronovic pushed a commit to aronovic/sonic-mgmt that referenced this pull request Mar 3, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: Mihut Aronovici <[email protected]>
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request Mar 12, 2026
* Converge cEOSLab peer containers via VRFs

Signed-off-by: Will Rideout <[email protected]>
Signed-off-by: Ravali Yeluri (WIPRO LIMITED) <[email protected]>
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request Mar 12, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: Ravali Yeluri (WIPRO LIMITED) <[email protected]>
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Mar 17, 2026
* Converge cEOSLab peer containers via VRFs

Signed-off-by: Will Rideout <[email protected]>
Signed-off-by: Abhishek <[email protected]>
abhishek-nexthop pushed a commit to nexthop-ai/sonic-mgmt that referenced this pull request Mar 17, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: Abhishek <[email protected]>
vrajeshe pushed a commit to vrajeshe/sonic-mgmt that referenced this pull request Mar 23, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: Venkata Gouri Rajesh Etla <[email protected]>
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request Mar 27, 2026
* Converge cEOSLab peer containers via VRFs

Converging the total number of peer switches into the fewest possible
number of cEOSLab containers reduces the overall resources required to
run large numbers of peers. The basic premises behind convergence are
as follows:

Approach:

cEOSLab peers in docker containers may be converged into a smaller
number of host peers.  The SONiC-facing configuration of each BGP peer
may be separated in routing and bridging via the use of VRFs.  The
PTF-facing configuration of each BGP peer may be separated within each
VRF via VLAN tagging, enabling the use of a single backplane interface
on each host cEOSLab container.  Each VRF includes a number of
interfaces either facing the SONiC DUT or the backplane.  Changes should
be as transparent to the SONiC DUT as possible.  At the time of testbed
setup, the ansible topology file for the testbed is modified to include
new metadata specific to multi-VRF configuration, and the VMs list is
trimmed to only include those containers which will host multiple BGP
peerings, separated by VRF. The new metadata includes mappings between
host containers and VRFs, backplane VLAN mappings, and BGP session
parameters.

VLAN tag 2000 is used as the starting value for all VLANs between the
test infrastructure PTF container interfaces and cEOSLab device
interfaces.
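
As an illustration of the VLAN numbering above, the following minimal sketch hands out one backplane VLAN per VRF, starting at 2000. The helper name, the VRF names in the usage note, and the sequential one-VLAN-per-VRF rule are assumptions made for illustration; only the starting value 2000 comes from this change.

```python
BACKPLANE_VLAN_BASE = 2000  # starting VLAN tag stated in the description
MAX_VLAN_ID = 4094          # highest usable 802.1Q VLAN ID


def allocate_backplane_vlans(vrf_names):
    """Map each VRF name to a backplane VLAN ID (hypothetical helper).

    Assumes one VLAN per VRF, handed out sequentially from 2000.
    """
    vlans = {}
    for offset, vrf in enumerate(vrf_names):
        vlan_id = BACKPLANE_VLAN_BASE + offset
        if vlan_id > MAX_VLAN_ID:
            raise ValueError("ran out of VLAN IDs at VRF %r" % vrf)
        vlans[vrf] = vlan_id
    return vlans
```

Calling `allocate_backplane_vlans(["ARISTA01T1", "ARISTA02T1"])` would map the first name to 2000 and the second to 2001 under these assumptions.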

The IPv4 and IPv6 addresses used to connect the cEOSLab peer and
infrastructure PTF container are generated in order to make the
backplane connections clearer, distinct, and easier to implement. In
general, backplane L3 addresses used by the cEOSLab peer end in even
numbers, and those used by the PTF container end in odd numbers. All
addresses generated for use in backplane connections start with the
value 100 (0x64) in the least-significant octet or hextet (depending on
the family of the address). The address changes are mapped and stored in
the new multi-VRF metadata in the ansible topology file.
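
The even/odd addressing rule can be sketched roughly as follows. This is not the converger's actual code: the function name, the base prefixes in the usage note, and the choice to give each link two consecutive values are assumptions. Only the starting value 100 (0x64) and the even/odd split come from the description.

```python
import ipaddress

BACKPLANE_START = 100  # 0x64, stated starting value of the last octet/hextet


def backplane_pair(base, index):
    """Return (ceos_addr, ptf_addr) for the index-th backplane link.

    `base` is a hypothetical network address whose last octet or hextet is
    zero. The cEOSLab side gets an even final value and the PTF side gets
    the next odd value.
    """
    base_addr = ipaddress.ip_address(base)
    ceos = base_addr + BACKPLANE_START + 2 * index
    ptf = ceos + 1
    return ceos, ptf
```

With the illustrative base `10.10.246.0`, link 0 yields `10.10.246.100` for the cEOSLab side and `10.10.246.101` for the PTF side. The same function accepts an IPv6 base such as `fc0a::`, where the first pair ends in `64` and `65` (hexadecimal).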

Multiple BGP features, such as local-as and next-hop-peer, are used to
aid in the resolution of routes. This is necessary to keep the SONiC DUT
as multi-VRF-agnostic as possible.

Enabling multi-VRF mode:

Multi-VRF mode may be enabled by setting the attribute
use_converged_peers: true in the testbed definition found in
sonic-mgmt/ansible/testbed.yaml. This file is read by the
TestbedProcessing.py script, which sets global variables indicating to
other ansible tasks and libraries that the testbed is to be started in
multi-VRF mode.
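
A minimal sketch of how a consumer might test for the flag, assuming the testbed entry has already been loaded from testbed.yaml into a dict. The helper name and the example entry are hypothetical, and treating a missing key as "disabled" is an assumption rather than documented behaviour; only the use_converged_peers attribute comes from this change.

```python
def wants_converged_peers(testbed_entry):
    """Return True when a parsed testbed definition enables multi-VRF mode.

    `testbed_entry` is assumed to be the dict for one entry of testbed.yaml.
    A missing key is treated as False (an assumption).
    """
    return testbed_entry.get("use_converged_peers", False) is True


# Hypothetical testbed entry; only use_converged_peers comes from this change.
example_entry = {
    "conf-name": "vms-example-t1",
    "topo": "t1",
    "use_converged_peers": True,
}
```

Here `wants_converged_peers(example_entry)` returns `True`, while an entry without the attribute returns `False`.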

In addition, the value of max_fp_nums must be adjusted such that each
cEOSLab docker container has enough resources to run all the new BGP
sessions in each VRF. This can be done dynamically; however, for the
full-scale topologies the maximum supported by cEOSLab, 127, must be
used.
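
One way the dynamic sizing could look is sketched below. The sizing rule (summing the DUT-facing interfaces of every VRF hosted by a container) and the helper name are assumptions for illustration; the only figure taken from this change is the cEOSLab ceiling of 127.

```python
CEOSLAB_MAX_FP_NUM = 127  # upper bound stated in the description


def pick_max_fp_num(dut_facing_ports_per_vrf):
    """Choose a max_fp_nums value for one converged container (hypothetical).

    `dut_facing_ports_per_vrf` lists how many DUT-facing interfaces each VRF
    hosted by the container needs. Summing them is an assumed sizing rule.
    """
    needed = sum(dut_facing_ports_per_vrf)
    if needed > CEOSLAB_MAX_FP_NUM:
        raise ValueError(
            "container needs %d interfaces, cEOSLab allows at most %d"
            % (needed, CEOSLAB_MAX_FP_NUM)
        )
    return needed
```

Raising an error instead of silently clamping makes an oversubscribed container visible at setup time, which matches the interface-ID limitation listed under known limitations.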

Known limitations:

cEOSLab instances do not allow for the creation of interfaces with
interface-IDs greater than 127 when interfaces are laid out
unidimensionally.

The use of multiple VRFs has not been tested in conjunction with
asynchronous ansible tasks.

Test library changes:

Test libraries needed to be made aware of the new underlying structure
of cEOSLab containers, VRFs, and BGP adjacencies.  In many cases this
was done by reference to the testbed topology YAML passed into library
functions.  In other cases, most notably BGP libraries, the nbrhosts
fixture was adjusted to include multi-VRF-specific metadata which
callers could leverage to navigate the relationship between containers,
VRFs, and BGP neighborship.

Signed-off-by: Will Rideout <[email protected]>
ravaliyel pushed a commit to ravaliyel/sonic-mgmt that referenced this pull request Mar 27, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
selldinesh pushed a commit to selldinesh/sonic-mgmt that referenced this pull request Apr 1, 2026
* Converge cEOSLab peer containers via VRFs

Signed-off-by: Will Rideout <[email protected]>
Signed-off-by: selldinesh <[email protected]>
selldinesh pushed a commit to selldinesh/sonic-mgmt that referenced this pull request Apr 1, 2026
…t#22399)

Signed-off-by: Yutong Zhang <[email protected]>
Signed-off-by: selldinesh <[email protected]>
yutongzhang-microsoft added a commit to yutongzhang-microsoft/sonic-mgmt that referenced this pull request Apr 1, 2026
- Enabling Multi-VRF Mode: remove Approach #1/#2 split; manual convergence
  is now correctly described as a fallback when TestbedProcessing.py is not
  used, not an alternative approach
- Test Library Changes: rewrite based on actual code changes in PR sonic-net#22171:
  - ceos_topo_converger.py (new)
  - TestbedProcessing.py: use_converged_peers flag handling
  - topo_facts.py: get_vm_list/get_vlans for multi-VRF
  - testbed_vm_info.py: VRF-to-host mapping
  - nbrhosts fixture: multi_vrf_data dict structure
  - bgp_helpers.py: VRF-aware route checks and port resolution
  - bgp/conftest.py: VRF-scoped graceful-restart config

Co-authored-by: Copilot <[email protected]>
yutongzhang-microsoft added a commit to yutongzhang-microsoft/sonic-mgmt that referenced this pull request Apr 1, 2026
- Enabling Multi-VRF Mode: remove Approach #1/#2 split; manual convergence
  is now correctly described as a fallback when TestbedProcessing.py is not
  used, not an alternative approach
- Test Library Changes: rewrite based on actual code changes in PR sonic-net#22171:
  - ceos_topo_converger.py (new)
  - TestbedProcessing.py: use_converged_peers flag handling
  - topo_facts.py: get_vm_list/get_vlans for multi-VRF
  - testbed_vm_info.py: VRF-to-host mapping
  - nbrhosts fixture: multi_vrf_data dict structure
  - bgp_helpers.py: VRF-aware route checks and port resolution
  - bgp/conftest.py: VRF-scoped graceful-restart config

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Yutong Zhang <[email protected]>