[teamd] Enable/disable container auto-restart based on configuration#4053

Closed
yozhao101 wants to merge 7 commits into sonic-net:master from yozhao101:restart_knob

@yozhao101
Contributor

@yozhao101 yozhao101 commented Jan 23, 2020

- What I did
Each docker container already has an auto-restart feature: if a critical process exits abnormally
or crashes, the event is captured and the corresponding container is restarted. This PR adds a
knob/switch for that feature so that developers can turn it on or off dynamically while testing
new docker images.

- How I did it
We create a table in the database container that stores the current state of the auto-restart
feature for each container. Initially, the feature is enabled. When the event listener receives
an event showing that a critical process exited, it reads the state from the database container
and decides whether to restart the container. The user can use the existing interface (TBD) to
switch this state between enabled and disabled.
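
The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the table is modeled as a plain dictionary instead of a real database read, and the names `should_restart` and `auto_restart` are assumptions made for the example. Treating a missing row as enabled is also an assumption, based on the feature being enabled initially.

```python
def should_restart(table, container_name, critical_process_exited):
    """Decide whether a container should be restarted.

    `table` is a hypothetical in-memory stand-in for the per-container
    state table; real code would query the database container instead.
    """
    if not critical_process_exited:
        # Nothing to do unless a critical process actually exited.
        return False
    row = table.get(container_name, {})
    # Assumption: a missing row or field means the feature is still enabled.
    return row.get("auto_restart", "enabled") == "enabled"


if __name__ == "__main__":
    table = {"teamd": {"auto_restart": "enabled"}}
    print(should_restart(table, "teamd", critical_process_exited=True))
```

Reading the state at event time, instead of caching it at startup, is what lets a developer flip the knob without restarting the listener.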

- How to verify it
I manually created a table called Container_Feature in the database container. In this table, each
container has its own state row; for example, the initial auto-restart state for teamd is
'enabled'.
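
For illustration only, the manually created table could be modeled in Python like this. The table name Container_Feature, the teamd row, and the enabled/disabled states come from the description; the field name `auto_restart` and the helper `set_auto_restart` are hypothetical, and a real test would write to the database container and not to a dictionary.

```python
VALID_STATES = ("enabled", "disabled")

# Hypothetical in-memory model of the Container_Feature table:
# one row per container, holding its auto-restart state.
container_feature = {
    "teamd": {"auto_restart": "enabled"},
}


def set_auto_restart(table, container_name, state):
    """Set the auto-restart state for one container and return the new state."""
    if state not in VALID_STATES:
        raise ValueError("state must be one of %s" % (VALID_STATES,))
    table.setdefault(container_name, {})["auto_restart"] = state
    return table[container_name]["auto_restart"]


if __name__ == "__main__":
    # Flip teamd from its initial 'enabled' state to 'disabled'.
    print(set_auto_restart(container_feature, "teamd", "disabled"))
```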

The listener reads the auto-restart flag from the database and then decides whether to restart
the container.

Signed-off-by: Yong Zhao <[email protected]>
@yozhao101 yozhao101 requested a review from jleveque January 23, 2020 01:21
@jleveque
Contributor

@yozhao101: After the requested changes, I'm satisfied with this approach. Do you intend to extend this PR to all containers? If you think it would create a PR that is very large and difficult to review, you can also create separate PRs for each container.

@yozhao101
Contributor Author

@jleveque Instead of creating one large PR, I will submit a separate PR for each container so
that they are easier to review.

@yozhao101 yozhao101 changed the title from "[Service] Add a knob for auto-restart feature used by different containers." to "[Service] Add a knob for auto-restart feature used by different containers (Teamd)." Jan 23, 2020
@yozhao101 yozhao101 changed the title from "[Service] Add a knob for auto-restart feature used by different containers (Teamd)." to "[Service] Add a knob for auto-restart feature used by Teamd." Jan 23, 2020
@jleveque jleveque changed the title from "[Service] Add a knob for auto-restart feature used by Teamd." to "[teamd] Enable/disable container auto-restart based on configuration" Jan 23, 2020
@jleveque
Contributor

FYI, I also tweaked the PR title.

@yozhao101
Contributor Author

@jleveque Good title. Thanks, Joe!

@yozhao101 yozhao101 closed this Jan 27, 2020
mssonicbld added a commit that referenced this pull request Jan 10, 2026
…lly (#24979)

#### Why I did it
src/sonic-swss
```
* a63ba5da - (HEAD -> master, origin/master, origin/HEAD) [LPO] Added support for serdes Tx/Rx polarity settings (#4053) (4 hours ago) [Prince George]
* bb691a5d - Merge pull request #3899 from rkavitha-hcl/bulk_neighbor (20 hours ago) [StephenWangGoogle]
|\ 
| * c123c44c - Bulk support in neighbor manager. (28 hours ago) [mint570]
|/ 
* 505c4b7b - [HFT]: keep STATE_DB session stream_status in sync with profile stream_state (#4107) (34 hours ago) [Ze Gan]
* eed7900c - [countersyncd]: Add benchmark suite for countersyncd and optimize otel actor (#4016) (34 hours ago) [Janet Cui]
* 2de94bb9 - Merge pull request #3868 from rkavitha-hcl/neighbor_changes (2 days ago) [StephenWangGoogle]
|\ 
| * 3f839f64 - Rename neiighor methods (3 days ago) [kishanps]
|/ 
* 34da384f - Merge pull request #3862 from rkavitha-hcl/acl_actions (3 days ago) [StephenWangGoogle]
|\ 
| * b17e712d - Merge branch 'master' into acl_actions (3 days ago) [StephenWangGoogle]
| |\ 
| |/ 
|/| 
* | 0e233d18 - [gearsyncd,macsec]: Deterministic MACsec backend selection for gearbox ports (#3926) (3 days ago) [rajshekhar-nexthop]
* | c6b0c3e6 - [portsorch] DOM config change causes interface link to flap (#4056) (3 days ago) [mihirpat1]
| * f4e65c92 - Merge branch 'master' into acl_actions (4 days ago) [rkavitha-hcl]
| |\ 
| |/ 
|/| 
* | 7297d14c - Merge pull request #3894 from divyagayathri-hcl/acl_vrf_id_oid (4 days ago) [StephenWangGoogle]
|\ \ 
| * \ d7fc9535 - Merge branch 'master' into acl_vrf_id_oid (4 days ago) [StephenWangGoogle]
| |\ \ 
| |/ / 
|/| | 
* | | 1244a7df - [orchagent] support single ASIC VOQ Fixed-System (#4054) (4 days ago) [saravanan sellappa]
* | | c330c450 - [countersyncd]: Fix netlink fd leakage and deadlock issue (#4043) (4 days ago) [Ze Gan]
* | | ff585509 - Refactor OtelActor and message modules by removing unused code and improving clarity (#4072) (4 days ago) [Ze Gan]
* | | 68f081ae - [hftorch]: Handle exception of HFT instead of exit (#3887) (4 days ago) [Ze Gan]
* | | 4b81a3b8 - [hft]: Fix TAM type capability enable list (#4075) (4 days ago) [Ze Gan]
 / / 
* / 93ea8ef0 - [P4Orch] Update ACL VRF ID to use oid instead of u16. (5 weeks ago) [mint570]
 / 
* af78f1c0 - [P4Orch] Enable ACL Action SET_ACL_META_DATA and packet action COPY_CANCEL and DENY support (4 days ago) [mint570]
```
#### How I did it
#### How to verify it
#### Description for the changelog
jasonbridges pushed a commit to jasonbridges/sonic-buildimage that referenced this pull request Jan 22, 2026
…lly (sonic-net#24979)

FengPan-Frank pushed a commit to FengPan-Frank/sonic-buildimage that referenced this pull request Mar 6, 2026
…lly (sonic-net#24979)


Signed-off-by: Feng Pan <[email protected]>
dprital pushed a commit that referenced this pull request Mar 19, 2026
…lly (#24979)


Signed-off-by: dprital <[email protected]>