Added code to start a new service to configure TSA as soon as Linecard comes up, start a timer to configure TSB and configure TSB when the timer expires #206
gechiang merged 3 commits into Azure:202205 from saksarav-nokia:saksarav-nokia-tsa-tsb
files/scripts/startup_tsa_tsb.py
Outdated
| from threading import Timer | ||
| import os.path | ||
|
|
||
| def getPlatform(): |
Use this function instead of defining a new one:
https://github.com/sonic-net/sonic-buildimage/blob/04d83c05aef84df531e2cb0c55c24dcd637f7e83/src/sonic-py-common/sonic_py_common/device_info.py#L89
Reused the existing function
files/scripts/startup_tsa_tsb.py
Outdated
| platform = (subprocess.check_output(['sonic-cfggen', '-d', '-v', platform_key.replace('"',"'")]).strip()).decode() | ||
| return platform | ||
|
|
||
| def getNumAsics(): |
Reused the existing function
files/scripts/startup_tsa_tsb.py
Outdated
| return line.split('=')[1].strip() | ||
| return 0 | ||
|
|
||
| def getTsbTimerInterval(): |
nit: please use snake case for function names.
Modified the function names to use snake case
files/scripts/startup_tsa_tsb.py
Outdated
| def getTsaConfig(asic_ns): | ||
| tsa_config = 'BGP_DEVICE_GLOBAL.STATE.tsa_enabled' | ||
| tsa_ena = (getSonicConfig(asic_ns, tsa_config)).decode() | ||
| print('{}: {} - CONFIG_DB.{} : {}'.format(__file__, asic_ns, tsa_config, tsa_ena)) |
Can we please use the logger instead of print statements so that the messages appear in syslog?
Changed the print to logger.log_info
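As a rough illustration of the change requested above, the sketch below sends an informational message to syslog instead of printing it. The PR itself uses `logger.log_info`; this sketch swaps in Python's standard `syslog` module so that it runs without SONiC packages. The identifier `startup_tsa_tsb` and the helper names `format_tsa_message` and `log_info` are assumptions made for illustration, not the code from the PR.

```python
import syslog

# Assumed identifier for illustration; it tags every syslog line from the script.
SYSLOG_IDENTIFIER = "startup_tsa_tsb"


def format_tsa_message(asic_ns, key, value):
    """Build the message text that the original print statement produced."""
    return "{} - CONFIG_DB.{} : {}".format(asic_ns, key, value)


def log_info(message):
    """Send an informational message to syslog (stand-in for logger.log_info)."""
    syslog.openlog(SYSLOG_IDENTIFIER)
    syslog.syslog(syslog.LOG_INFO, message)
    syslog.closelog()


if __name__ == "__main__":
    log_info(format_tsa_message("asic0", "BGP_DEVICE_GLOBAL.STATE.tsa_enabled", "true"))
```

Unlike `print`, whose output ends up wherever stdout is connected, a syslog call is tagged with the identifier and a severity, which makes the messages easier to filter.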
files/scripts/startup_tsa_tsb.py
Outdated
| def config_tsa(): | ||
| tsa_ena = get_tsa_status() | ||
| if tsa_ena == True: | ||
| print("{}: Configuring TSA".format(__file__)) |
nit: please check the indentation in the file. It does not seem consistent
Fixed the indentation
files/scripts/startup_tsa_tsb.py
Outdated
| return line.split('=')[1].strip() | ||
| return 0 | ||
|
|
||
| def getSonicConfig(ns, config_name): |
Will this work for single-ASIC linecards?
Modified the functions to work for both single-ASIC and multi-ASIC linecards.
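One way to handle both cases is to add the namespace argument to the `sonic-cfggen` command only when an ASIC namespace is given. The sketch below only builds the argument list and does not run anything. The helper name `build_cfggen_cmd`, the use of the `-n` namespace option, and the `asic0` namespace name are assumptions made for illustration, not the code merged in this PR.

```python
def build_cfggen_cmd(config_key, asic_ns=None):
    """Return the sonic-cfggen argument list for reading one CONFIG_DB value.

    asic_ns is None (or empty) on a single-ASIC linecard, and a namespace
    name such as 'asic0' (assumed naming) on a multi-ASIC linecard.
    """
    cmd = ['sonic-cfggen', '-d']
    if asic_ns:
        # Assumption: '-n' selects the per-ASIC namespace database.
        cmd += ['-n', asic_ns]
    cmd += ['-v', config_key]
    return cmd


if __name__ == "__main__":
    key = 'BGP_DEVICE_GLOBAL.STATE.tsa_enabled'
    print(build_cfggen_cmd(key))           # single-ASIC: no namespace argument
    print(build_cfggen_cmd(key, 'asic0'))  # multi-ASIC: per-namespace lookup
```

The returned list can be passed to `subprocess.check_output`, as the original `getPlatform()` snippet does.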
| ConditionPathExists=!/etc/sonic/chassisdb.conf | ||
|
|
||
| [Service] | ||
| Environment="STARTED_BY_TSA_TSB_SERVICE=1" |
After the service is stopped, do we need to reset STARTED_BY_TSA_TSB_SERVICE?
Reset the env variable
@arlakshm, addressed all your comments. Please check.
@saksarav-nokia Why is this PR raised against this MSFT repo instead of the public repo at https://github.com/sonic-net/sonic-buildimage?
@gechiang, it was suggested by MSFT in the meeting.
| Description= STARTUP TSA-TSB SERVICE | ||
| Requires=updategraph.service database.service | ||
| After=updategraph.service database.service | ||
| ConditionPathExists=!/etc/sonic/chassisdb.conf |
The updategraph dependency ensures that the tsa-tsb service starts after the config is loaded, and the chassisdb.conf file check ensures that the service starts only on the IMM.
| [Unit] | ||
| Description= STARTUP TSA-TSB SERVICE | ||
| Requires=updategraph.service database.service | ||
| After=updategraph.service database.service |
Does this make sure the service runs before the swss/syncd/teamd/bgp services come up?
There is no dependency on these services right now. Ensuring that this service starts before bgp should be sufficient, right?
Added "Before=bgp.service"
| Environment="STARTED_BY_TSA_TSB_SERVICE=1" | ||
| ExecStart=/usr/bin/python3 -u /usr/local/bin/startup_tsa_tsb.py start | ||
| ExecStop=/usr/bin/python3 -u /usr/local/bin/startup_tsa_tsb.py stop | ||
| RemainAfterExit=yes |
What is the behavior of this service in case of a config reload or an swss docker restart?
A config reload or an swss restart doesn't restart the tsa service.
As discussed in the Chassis meeting today, it seems that this PR is supposed to be raised in this MSFT repo for the 202205 branch. Once it is approved, merged, and well tested, this feature is also required in public master, at which time please port it to the https://github.com/sonic-net/sonic-buildimage master branch.
…es up, start a timer to configure TSB and configure TSB when the timer expires Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
1) Reused the existing Python functions
2) Modified the function names to use snake case
3) Changed the print to logger.log_info
4) Fixed the indentation
5) Modified it to work for both single and multi asic
6) Reset the env variable
7) When user issues TSB when the service is running, stop the service
Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
@gechiang, I moved the PR to 202205.
Please replace IMM with LC in the PR title so it's more generic.
Changed the title.
files/scripts/startup_tsa_tsb.py
Outdated
| return tsa_ena | ||
|
|
||
| def config_tsb(): | ||
| logger.log_info("startup_tsa_tsb: Configuring TSB") |
Logging at lines 70 and 80 could be made consistent.
| return | ||
|
|
||
| def stop_tsa_tsb(): | ||
| reset_env_variables() |
If the user entered the TSA/TSB command manually, shouldn't we explicitly stop or cancel the timer thread that we started earlier in start_tsb_timer()? Or does systemd clean it up during exit?
Here we only reset the env variables.
I will set the daemon flag for the Timer thread so that the thread exits when the main thread is killed by the service stop.
@judyjoseph, I tested with prctl: the threads started by the main thread are stopped when the service is stopped and the main Python script is killed with SIGTERM. So we don't need to explicitly stop the timer.
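For readers unfamiliar with the daemon flag discussed in this thread, here is a minimal sketch of a daemon `threading.Timer`. The function name `start_tsb_timer` comes from the review comment, but its signature and body here are assumptions for illustration, not the code in the PR.

```python
import threading


def start_tsb_timer(interval_sec, on_expire):
    """Start a timer that calls on_expire after interval_sec seconds.

    The timer thread is marked as a daemon, so it does not keep the
    interpreter alive once the main thread has finished.
    """
    timer = threading.Timer(interval_sec, on_expire)
    timer.daemon = True
    timer.start()
    return timer


if __name__ == "__main__":
    expired = threading.Event()
    # Hypothetical short interval; on expiry the real script would configure TSB.
    start_tsb_timer(0.1, expired.set)
    expired.wait(timeout=2)
    print("timer expired:", expired.is_set())
```

If an explicit stop were wanted, calling `cancel()` on the returned timer prevents the callback from running, provided the timer has not fired yet.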
Signed-off-by: saksarav <sakthivadivu.saravanaraj@nokia.com>
cc @anamehra for viz.
Why I did it
Traffic loss is seen when the chassis or a linecard is rebooted.
How I did it
How to verify it
Which release branch to backport (provide reason below if selected)
Description for the changelog