Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly #3534

Merged
lguohan merged 2 commits into sonic-net:master from jleveque:restart_swss_syncd_crash
Nov 9, 2019

@jleveque
Contributor

- What I did

Restart SwSS, syncd and dependent services if a critical process in the syncd container exits unexpectedly

- How I did it

Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
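The docker-wait-any source is not shown in this description. As a rough illustration of the idea only (not the code from this PR), the sketch below starts one thread per container, has each thread block on `docker wait <container>`, and returns as soon as any one of them finishes. The names `wait_for_any` and `wait_fn` are hypothetical, and the injectable `wait_fn` parameter is an assumption added so the logic can be exercised without Docker.

```python
import subprocess
import threading


def wait_for_any(names, wait_fn=None):
    """Block until any one of the named containers exits; return its name.

    Hypothetical helper for illustration; not the PR's docker-wait-any code.
    """
    if wait_fn is None:
        # `docker wait <container>` blocks until the container stops.
        def wait_fn(name):
            subprocess.run(
                ["docker", "wait", name],
                check=False,
                stdout=subprocess.DEVNULL,
            )

    first_exited = []          # holds the name of the first container to exit
    lock = threading.Lock()    # protects first_exited
    done = threading.Event()   # set once any waiter returns

    def worker(name):
        wait_fn(name)
        with lock:
            if not first_exited:
                first_exited.append(name)
        done.set()

    # Daemon threads so the remaining waiters do not keep the process alive.
    for name in names:
        threading.Thread(target=worker, args=(name,), daemon=True).start()

    done.wait()
    return first_exited[0]
```

Under these assumptions, a service script could call `wait_for_any(["swss", "syncd"])` and exit when it returns, so that the exit of either container brings the service down and lets the service manager restart it.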

- How to verify it

Run sudo pkill -11 <critical_process_in_syncd_container> and observe that the syncd service exits, swss and all dependent services exit, and then all of those services start back up.

@lguohan
Collaborator

lguohan commented Nov 9, 2019

retest broadcom please

@lguohan lguohan merged commit 85b0de3 into sonic-net:master Nov 9, 2019
@jleveque jleveque deleted the restart_swss_syncd_crash branch November 9, 2019 20:45
zhenggen-xu pushed a commit to zhenggen-xu/sonic-buildimage that referenced this pull request Jan 10, 2020
…cal process in syncd container exits unexpectedly (sonic-net#3534)
mssonicbld added a commit that referenced this pull request Jun 7, 2025
…lly (#22884)

#### Why I did it
src/sonic-swss
```
* db7d939 - (HEAD -> master, origin/master, origin/HEAD) Remove cache for high volume DASH objects (#3534) (4 hours ago) [Lawrence Lee]
* 80294d5 - add parseBoolList in request parser (#3675) (9 hours ago) [Jing Zhang]
* a7607be - [fpmsyncd]Fixing fpmsyncd to handle routes without protocol (#3657) (11 hours ago) [Sudharsan Dhamal Gopalarathnam]
* 8067dee - [muxorch] Catch error when checking active state of missing neighbor (#3674) (11 hours ago) [Nikola Dancejic]
* d073fc7 - [vstest]: Skip flaky icmp echo tests (#3687) (12 hours ago) [prabhataravind]
```
#### How I did it
#### How to verify it
#### Description for the changelog