[services] make snmp.timer work again and delay telemetry.service #3742
lguohan merged 6 commits into sonic-net:master
Signed-off-by: Stepan Blyschak <[email protected]>
swss.service but start with delay on boot Signed-off-by: Stepan Blyschak <[email protected]>
…enabled Signed-off-by: Stepan Blyschak <[email protected]>
Signed-off-by: Stepan Blyschak <[email protected]>
RestartSec=30

[Install]
WantedBy=multi-user.target swss.service
Removing these 2 lines means that restarting swss will no longer restart snmp. You need to add snmp as a dependent of swss, similar to https://github.com/Azure/sonic-buildimage/blob/master/files/scripts/swss.sh#L5. But you need to add logic to ignore this dependency if the uptime is less than 210 seconds.
I see that you did it with snmp.timer below.
I am wondering whether we should use systemd to manage these dependencies, which are beyond the scope of systemd?
If we ignore the snmp dependency in swss.sh when uptime is less than 210 sec, then who will start snmp? It looks like we would still need snmp.timer, so the complexity would not decrease, and we would have 210 sec hard-coded in two places.
If the desired behavior can be achieved with systemd, why do it in scripts?
- What I did
Delay CPU-intensive services at boot.
- How I did it
Made snmp.timer work and added telemetry.timer.
This alone is not enough, because it breaks the existing snmp dependency on swss.
In this solution snmp.timer is wanted by swss.service. Because the OnBootSec timer expires only once, it will not trigger snmp.service on later swss restarts, so I added the line "OnUnitActiveSec=0 sec", which starts snmp.service based on the last time it was active. On boot only OnBootSec expires; on swss start/restart only the second timer expires, immediately, and triggers snmp.service.
However, snmp.service would not stay stopped after "systemctl stop snmp", because the second timer always expires as soon as snmp.service becomes inactive.
This conflict is handled by systemd if we add a "Conflicts=" line to both snmp.service and snmp.timer.
So during boot:
  - snmp does not start by default
  - swss starts and starts snmp timer
  - OnUnitActiveSec=0 does not expire, since snmp has not been active yet
  - OnBootSec expires and starts snmp service, and snmp timer gets stopped
During "systemctl restart swss":
  - snmp stops because of Requisite on swss
  - stopping snmp unblocks snmp timer from running
  - swss starts and starts snmp timer
  - OnUnitActiveSec=0 expires immediately and starts snmp, which stops snmp timer
During "systemctl stop snmp":
  - stopping snmp service unblocks snmp timer, but no one starts the timer, so snmp is not started by "OnUnitActiveSec=0"
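The behavior described above could be expressed roughly as the following pair of unit fragments. This is only a sketch, not the files from this PR: the description text, the 210-second boot delay (taken from the review thread) and the After= ordering are assumptions, while the OnBootSec, OnUnitActiveSec, Conflicts=, Requisite= and WantedBy= settings follow the description.

```ini
# --- snmp.timer (sketch; values are assumptions, not the PR's actual file) ---
[Unit]
Description=Delay snmp.service start at boot
# Starting snmp.service stops this timer, and starting the timer stops the service.
Conflicts=snmp.service

[Timer]
# Elapses once, relative to boot. 210 sec is the figure from the review thread.
OnBootSec=210 sec
# Elapses relative to the last activation of snmp.service, so after a swss
# restart it is already due and starts snmp.service right away.
OnUnitActiveSec=0 sec
Unit=snmp.service

[Install]
# The timer is started whenever swss.service starts.
WantedBy=swss.service

# --- snmp.service (only the dependency-related part) ---
[Unit]
# snmp is stopped when swss stops or restarts, and is never started without it.
Requisite=swss.service
# Assumed ordering; not stated in the description.
After=swss.service
Conflicts=snmp.timer
```

With Conflicts= on both sides, a manual "systemctl stop snmp" leaves both units stopped, because nothing starts the timer again until swss.service is started the next time.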
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)