Skip to content

Optimize vm_topology to reduce time required for add-topo#4156

Merged
wangxin merged 1 commit intosonic-net:masterfrom
wangxin:optimize-vm-topology-pr
Sep 2, 2021
Merged

Optimize vm_topology to reduce time required for add-topo#4156
wangxin merged 1 commit intosonic-net:masterfrom
wangxin:optimize-vm-topology-pr

Conversation

@wangxin
Copy link
Collaborator

@wangxin wangxin commented Sep 1, 2021

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911

Approach

What is the motivation for this PR?

Currently, to add a topology using tool 'testbed-cli.sh add-topo', most of the time is spent
in calling the customized vm_topology module to bind interfaces and bridges for PTF docker
and neighbor VMs. Before apply any operation on an interface, usually we need to firstly
check existence of the interface in host or in docker. The vm_topology module runs command
"ifconfig -a" and parse the output to find out all the interfaces. Then based on this
information to check existence of interface. On a test server with multiple testbeds, there
could be hundreds of interfaces and bridges in the output of "ifconfig -a". Even worse,
checking existence of interface is a frequent operation. So, it could take more than 20
minutes to run the vm_topology module.

How did you do it?

This change mainly improved the way of checking existence of interface in the vm_topology
module. Instead of run "ifconfig -a" every time, the new way is to run "ifconfig <intf_name>"
which is much quicker. With this change, the vm_topology needs around 100 seconds for
t1-lag topo and around 40 seconds for t0 topology.

Other improvements made in this PR:

  • Add timestamp to log generated by vm_topology.
  • The ceos_network module is also improved in similar way.
  • Skip exporting the PTF docker if vm_type is not "vsonic".
  • Remove the bridges for CEOS during 'testbed-cli.sh remove-topo' if vm_type is "ceos".

How did you verify/test it?

Tested add-topo/remove-topo for VS setup and physical setup. Covered:

  • t0 and t1-lag topology
  • VEOS and CEOS neighbor

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Currently, to add a topology using tool 'testbed-cli.sh add-topo', most of the time is spent
in calling the customized vm_topology module to bind interfaces and bridges for PTF docker
and neighbor VMs. Before apply any operation on an interface, usually we need to firstly
check existence of the interface in host or in docker. The vm_topology module runs command
"ifocnfig -a" and parse the output to find out all the interfaces. Then based on this
information to check existence of interface. On a test server with multiple testbeds, there
could be hundereds of interfaces and bridges in the output of "ifconfig -a". Even worse,
checking existence of interface is a frequent operation. So, it could take more than 20
minutes to run the vm_topology module.

This change mainly improved the way of checking existence of interface in the vm_topology
module. Instead of run "ifconfig -a" every time, the new way is to run "ifconfig <intf_name>"
which is much quicker. With this change, the vm_topology needs around 100 seconds for
t1-lag topo and around 40 seconds for t0 topology.

Other improvements made in this PR:
* Add timestamp to log generated by vm_topology.
* The ceos_network module is also improved in similar way.
* Skip exporting the PTF docker if vm_type is not "vsonic".
* Remove the bridges for CEOS during 'testbed-cli.sh remove-topo' if vm_type is "ceos".

Signed-off-by: Xin Wang <[email protected]>
@wangxin wangxin requested a review from lguohan September 1, 2021 05:08
@wangxin wangxin requested a review from a team as a code owner September 1, 2021 05:08
@lgtm-com
Copy link

lgtm-com bot commented Sep 1, 2021

This pull request introduces 4 alerts and fixes 5 when merging e05d46f into 762f366 - view on LGTM.com

new alerts:

  • 4 for Except block handles 'BaseException'

fixed alerts:

  • 2 for Module is imported more than once
  • 2 for Unused import
  • 1 for Unused local variable

@wangxin wangxin merged commit f049fa4 into sonic-net:master Sep 2, 2021
@wangxin wangxin deleted the optimize-vm-topology-pr branch September 9, 2021 08:07
wangxin added a commit to wangxin/sonic-mgmt that referenced this pull request Sep 10, 2021
PR sonic-net#4156 replaced variable VM_hosts with VM_targets when call the
vm_topology module. With this change, the vm_topology will create
bridges for VMs used by current testbed instead of create bridges
for all VMs available on the test server.
However, the change missed the ptf topology scenario. In case of
ptf topology, the VM_targets ansible varialbe is not defined.

The fix is to assign default value `[]` to VM_targets when it is not
defined.

Signed-off-by: Xin Wang <[email protected]>
wangxin added a commit that referenced this pull request Sep 12, 2021
What is the motivation for this PR?
PR #4156 replaced variable VM_hosts with VM_targets when call the
vm_topology module. With this change, the vm_topology module will
create bridges for VMs used by current testbed instead of create bridges
for all VMs available on the test server.

However, the change missed the ptf topology scenario. In case of
ptf topology, the VM_targets ansible variable is not defined. Calling
the vm_topology module will fail.

How did you do it?
The fix is to assign default value [] to VM_targets when it is not defined.

How did you verify/test it?
Test add-topo/remove-topo for ptf and t0 topologies.

Signed-off-by: Xin Wang <[email protected]>
wangxin added a commit to wangxin/sonic-mgmt that referenced this pull request Sep 16, 2021
PR sonic-net#4156 optimized the performance of the vm_topology module. However, the retry
logci of running a command is also removed. On some testbeds, command like "ifconfig -a"
may fail ocassionally and need to retry. Without the retry logic, add-topo and
restart-ptf may fail ocassionally.

This change added back the retry logic of running commands in the vm_topology module and
should can increase robustness of calling the vm_topology module.
The ceos_network module is also enhanced in the similiar way.

Signed-off-by: Xin Wang <[email protected]>
lguohan pushed a commit that referenced this pull request Sep 17, 2021
PR #4156 optimized the performance of the vm_topology module. However, the retry
logci of running a command is also removed. On some testbeds, command like "ifconfig -a"
may fail ocassionally and need to retry. Without the retry logic, add-topo and
restart-ptf may fail ocassionally.

This change added back the retry logic of running commands in the vm_topology module and
should can increase robustness of calling the vm_topology module.
The ceos_network module is also enhanced in the similiar way.

Signed-off-by: Xin Wang <[email protected]>
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…4156)

Currently, to add a topology using tool 'testbed-cli.sh add-topo', most of the time is spent
in calling the customized vm_topology module to bind interfaces and bridges for PTF docker
and neighbor VMs. Before apply any operation on an interface, usually we need to firstly
check existence of the interface in host or in docker. The vm_topology module runs command
"ifconfig -a" and parse the output to find out all the interfaces. Then based on this
information to check existence of interface. On a test server with multiple testbeds, there
could be hundreds of interfaces and bridges in the output of "ifconfig -a". Even worse,
checking existence of interface is a frequent operation. So, it could take more than 20
minutes to run the vm_topology module.

This change mainly improved the way of checking existence of interface in the vm_topology
module. Instead of run "ifconfig -a" every time, the new way is to run "ifconfig <intf_name>"
which is much quicker. With this change, the vm_topology needs around 100 seconds for
t1-lag topo and around 40 seconds for t0 topology.

Other improvements made in this PR:
* Add timestamp to log generated by vm_topology.
* The ceos_network module is also improved in similar way.
* Skip exporting the PTF docker if vm_type is not "vsonic".
* Remove the bridges for CEOS during 'testbed-cli.sh remove-topo' if vm_type is "ceos".

Signed-off-by: Xin Wang <[email protected]>
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…ic-net#4229)

What is the motivation for this PR?
PR sonic-net#4156 replaced variable VM_hosts with VM_targets when call the
vm_topology module. With this change, the vm_topology module will
create bridges for VMs used by current testbed instead of create bridges
for all VMs available on the test server.

However, the change missed the ptf topology scenario. In case of
ptf topology, the VM_targets ansible variable is not defined. Calling
the vm_topology module will fail.

How did you do it?
The fix is to assign default value [] to VM_targets when it is not defined.

How did you verify/test it?
Test add-topo/remove-topo for ptf and t0 topologies.

Signed-off-by: Xin Wang <[email protected]>
vmittal-msft pushed a commit to vmittal-msft/sonic-mgmt that referenced this pull request Sep 28, 2021
…net#4289)

PR sonic-net#4156 optimized the performance of the vm_topology module. However, the retry
logci of running a command is also removed. On some testbeds, command like "ifconfig -a"
may fail ocassionally and need to retry. Without the retry logic, add-topo and
restart-ptf may fail ocassionally.

This change added back the retry logic of running commands in the vm_topology module and
should can increase robustness of calling the vm_topology module.
The ceos_network module is also enhanced in the similiar way.

Signed-off-by: Xin Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants