[action] [PR:21830] Fix TSA-TSB race condition on multi-asic platforms #710

Merged
mssonicbld merged 1 commit into Azure:202405 from mssonicbld:cherry/msft-202405/21830
Feb 26, 2025

Conversation

@mssonicbld
Collaborator

Why I did it

Fixes sonic-net/sonic-buildimage#21816

Work item tracking
  • Microsoft ADO 31499777:

How I did it

The startup_tsa_tsb service now sets the STATE_DB ALL_SERVICE_STATUS|tsa_tsb_service flag first and only then configures TSA.
In addition, when tsa_ena is False (genuinely, or because of the race condition), we explicitly call TSA again to ensure that all ASICs go into the TSA state.
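
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual startup_tsa_tsb code: `FakeStateDB`, `apply_tsa`, `start_tsa_tsb`, the `running`/`OK` field and value, and the way `tsa_ena` is passed in are all assumptions made for the example.

```python
# Illustrative sketch only. FakeStateDB, apply_tsa and start_tsa_tsb are
# hypothetical stand-ins, not the real startup_tsa_tsb implementation.

SERVICE_KEY = "ALL_SERVICE_STATUS|tsa_tsb_service"


class FakeStateDB:
    """In-memory stand-in for STATE_DB (assumption for this example)."""

    def __init__(self):
        self.tables = {}

    def set(self, key, field, value):
        self.tables.setdefault(key, {})[field] = value

    def get(self, key, field):
        return self.tables.get(key, {}).get(field)


def apply_tsa(asic_states):
    """Hypothetical TSA call: put every ASIC into the TSA state."""
    for asic in asic_states:
        asic_states[asic] = "TSA"


def start_tsa_tsb(state_db, asic_states, tsa_ena, log):
    # Step 1: publish the service flag before touching TSA, so that anything
    # checking the flag already sees the service as running.
    state_db.set(SERVICE_KEY, "running", "OK")  # field/value are assumed
    log.append("flag_set")

    # Step 2: only then configure TSA.
    apply_tsa(asic_states)
    log.append("tsa_applied")

    # Step 3: if tsa_ena is False (genuinely or because of the race),
    # call TSA again so that no ASIC is left outside the TSA state.
    if not tsa_ena:
        apply_tsa(asic_states)
        log.append("tsa_reapplied")


db = FakeStateDB()
asics = {"asic0": "NORMAL", "asic1": "NORMAL"}
events = []
start_tsa_tsb(db, asics, tsa_ena=False, log=events)
```

Calling TSA a second time is harmless in this sketch because `apply_tsa` is idempotent; the real TSA command is assumed to behave the same way.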

How to verify it

Reboot the multi-asic linecard and validate that all ASICs are in the TSA state and that the TSA-TSB timer is running.
config_reload

Tested the following scenarios:

  1. reboot multi-asic linecard
  2. config reload
  3. execute TSA while the service is running
  4. TSA, config save and then config_reload
  5. execute TSB while the service is running
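
The check applied after each of these scenarios could be expressed as a small helper. This is a hedged sketch: the per-ASIC state mapping and the `timer_running` flag are assumed to have been collected by hand or by a test harness, and the function names are hypothetical.

```python
# Illustrative sketch only: the inputs are assumed to be gathered elsewhere
# (for example from per-ASIC status output); nothing here queries a device.

def asics_not_in_tsa(asic_states):
    """Return the sorted names of ASICs whose state is not 'TSA'."""
    return sorted(name for name, state in asic_states.items() if state != "TSA")


def linecard_ok(asic_states, timer_running):
    """True only if every ASIC is in TSA and the TSA-TSB timer is running."""
    return not asics_not_in_tsa(asic_states) and timer_running


# One ASIC left in the normal state, as in the race described in this PR.
after_race = {"asic0": "TSA", "asic1": "NORMAL"}
# All ASICs in TSA, as expected with the fix applied.
after_fix = {"asic0": "TSA", "asic1": "TSA"}
```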

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

20240532.08

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator Author

Original PR: sonic-net/sonic-buildimage#21830

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld mssonicbld merged commit 94eecda into Azure:202405 Feb 26, 2025
7 of 9 checks passed
mssonicbld added a commit that referenced this pull request Nov 25, 2025
…test HEAD automatically (#1846)

#### Why I did it
src/sonic-platform-daemons
```
* 1f7fd1b - (HEAD -> 202503, origin/202503) Merge pull request #57 from mihirpat1/sff_mgr_cmis_fix_202503 (23 hours ago) [Arvindsrinivasan Lakshmi Narasimhan]
* d26e663 - [202503] [sff-mgr] Disable SFF manager support for all CMIS transceivers (#710) (5 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
mssonicbld added a commit to mssonicbld/sonic-buildimage-msft that referenced this pull request Nov 28, 2025
…D automatically (#24554)

#### Why I did it
src/sonic-platform-daemons
```
* 3ba2423 - (HEAD -> 202505, origin/202505) [202505] [sff-mgr] Disable SFF manager support for all CMIS transceivers (Azure#710) (Azure#712) (3 days ago) [mihirpat1]
* f9e15e3 - [202505] Re-arm timer for CMIS Datapath pre-init check (Azure#707) (Azure#708) (9 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog
liushilongbuaa pushed a commit that referenced this pull request Mar 25, 2026
…D automatically (#24583)

#### Why I did it
src/sonic-platform-daemons
```
* 99a4594 - (HEAD -> master, origin/master, origin/HEAD) Enable sff_mgr for all non-cmis txvrs (#715) (2 days ago) [Ariz Zubair]
* 6ce0b7c - Xcvrd Refactor 3/13: Breakup task_worker into separate functions - 1 (#701) (6 days ago) [Bobby McGonigle]
* 10b787c - [SmartSwitch] Add graceful shutdown and startup handling in platform daemons (#703) (6 days ago) [Vasundhara Volam]
* 4a6a012 - [sff-mgr] Disable SFF manager support for all CMIS transceivers (#710) (8 days ago) [mihirpat1]
```
#### How I did it
#### How to verify it
#### Description for the changelog