[FIB] Updated packets to send per next hop group when check balancing #2585

Merged
wangxin merged 2 commits into sonic-net:master from SavchukRomanLv:fib_check_balancing_updates on Dec 1, 2020

Conversation

@SavchukRomanLv
Signed-off-by: Roman Savchuk [email protected]

Description of PR

Summary:
When the fib test case runs on the t0 topology, it fails with a check_balancing error. Investigation showed that BGP sessions flap because PTF generates too many packets while testing each IP range, which affects the TCP sessions on the VM side. The t0 topology has 4 next-hop groups (port channels); the t1 topology has 16.

Type of change

  • [+ ] Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

Make the fib test case pass on the t0 topology.

How did you do it?

The test case itself sends 10000 packets every time, for all topologies.
To account for the specifics of each topology, I decided to use a weighting factor based on the topology type, taking the per-group load on the t1 topology as the reference. The new BALANCING_TEST_TIMES = 625 (10000/16). For the balancing check, BALANCING_TEST_TIMES is multiplied by the number of next-hop groups in the topology.

A small delay (0.01 sec) was also added when sending balancing traffic so that the TCP sessions are not overloaded.
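
The scaling described above can be illustrated with a minimal sketch. Only BALANCING_TEST_TIMES and the group counts (4 for t0, 16 for t1) come from this description; the dictionary and the function name are hypothetical and are not taken from the test code.

```python
# Per-group packet budget from the PR description: 10000 packets / 16 groups on t1.
BALANCING_TEST_TIMES = 625

# Next-hop group counts quoted in the description (hypothetical lookup table).
NEXT_HOP_GROUPS = {"t0": 4, "t1": 16}


def balancing_packet_count(topo_type):
    """Return the number of packets to send for the balancing check.

    Hypothetical helper: multiplies the per-group budget by the number of
    next-hop groups of the given topology type.
    """
    return BALANCING_TEST_TIMES * NEXT_HOP_GROUPS[topo_type]


if __name__ == "__main__":
    for topo in ("t0", "t1"):
        print(topo, balancing_packet_count(topo))
```

Under these assumptions t1 keeps its original load of 10000 packets, while t0 drops to 2500.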

How did you verify/test it?

Ran the fib test case on the t0 topology.

Any platform specific information?

SONiC Software Version: SONiC.master.35-dirty-20201112.031542
Distribution: Debian 10.6
Kernel: 4.19.0-9-2-amd64
Build commit: 6c362a0
Build date: Thu Nov 12 11:49:55 UTC 2020
Platform: x86_64-accton_wedge100bf_32x-r0
HwSKU: montara
ASIC: barefoot

Supported testbed topology if it's a new test case?

Documentation

Roman Savchuk added 2 commits November 24, 2020 16:13
Signed-off-by: Roman Savchuk <[email protected]>
Signed-off-by: Roman Savchuk <[email protected]>
@SavchukRomanLv
Author

Pls review @wangxin @yxieca

    if ip_range.length() > 2:
        self.check_ip_route(src_port, ip_range.get_random_ip(), exp_port_list, ipv4)

    time.sleep(0.01)
Collaborator

Why is this delay needed?

If this delay is really needed, should it be added to the loop in the function check_ip_ranges instead?

    def check_ip_ranges(self, ipv4=True):
        if ipv4:
            ip_ranges = self.fib.ipv4_ranges()
        else:
            ip_ranges = self.fib.ipv6_ranges()

        for ip_range in ip_ranges:
            if ip_range.get_first_ip() in self.fib:
                next_hop = self.fib[ip_range.get_first_ip()]
                self.check_ip_range(ip_range, next_hop, ipv4)
                time.sleep(0.01)

Author

Yes, we need this tiny sleep to avoid flapping the BGP sessions: the test case receives ~6k routes per neighbor and sends 3 packets per route group. The delay was added to avoid overloading the sessions while sending traffic.

I believe we do not need such a delay in check_ip_ranges; I've re-run fib a dozen times and the test case passed without any changes to check_ip_ranges.
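
To illustrate the pacing discussed in this thread, here is a rough sketch, not the code from fib_test.py. It assumes each IP range is a non-empty list of addresses and that `probe` is a caller-supplied send function; both are hypothetical stand-ins. Only the 0.01 s delay and the pattern of up to three probes per range come from the discussion. The real test picks a random address for the third probe; the sketch uses the middle one to stay deterministic.

```python
import time

RANGE_DELAY_SEC = 0.01  # delay value quoted in the PR


def probe_ranges(ip_ranges, probe, delay=RANGE_DELAY_SEC, sleep=time.sleep):
    """Probe each range, then pause once per range.

    ip_ranges: iterable of non-empty address lists (hypothetical layout).
    probe:     callable invoked with one address per probe (hypothetical).
    Returns the total number of probes sent.
    """
    sent = 0
    for addrs in ip_ranges:
        targets = [addrs[0]]                          # first address
        if len(addrs) > 1:
            targets.append(addrs[-1])                 # last address
        if len(addrs) > 2:
            targets.append(addrs[len(addrs) // 2])    # middle address
        for addr in targets:
            probe(addr)
            sent += 1
        sleep(delay)  # one pause per range, not per packet
    return sent
```

With one pause per range, the added run time grows linearly with the number of ranges (N ranges add at least N × 0.01 s).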

@wangxin wangxin merged commit 19a520b into sonic-net:master Dec 1, 2020