[Dockers]: Manage all Docker containers with Supervisord #573

Merged
jleveque merged 17 commits into sonic-net:master from jleveque:supervisorize_dockers
May 8, 2017

jleveque (Contributor) commented May 6, 2017

This PR serves a number of purposes:

  1. Consolidates the config.sh and start.sh scripts into a single script (start.sh).
  2. Solves issue #435 ([dockers]: "docker stop" command fails on dockers with shell entrypoints): all dockers now run supervisord as their ENTRYPOINT.
    • Note that while supervisord has a "priority" system, it has no mechanism to start processes in a guaranteed order (i.e., it will not block until a process has started). I worked around this by using start.sh as a bootstrap script in dockers that start multiple processes. The script starts each process that is to be managed by supervisord using supervisorctl start, which blocks until it completes. Each supervisor.conf file is configured to autostart only start.sh. All processes started via supervisorctl in start.sh are also listed in supervisor.conf, but set not to autostart. This lets them be started by start.sh yet still be monitored by supervisord (a process must be started using supervisorctl in start.sh to be monitored).
  3. All stdout/stderr output from processes managed by supervisord is now sent to syslog instead of to their own files. This not only consolidates all output in syslog, it also saves disk space.
  4. Supervisord log messages are also sent to syslog. Unfortunately, supervisord has no native ability to log to syslog, so I accomplished this by using rsyslogd to monitor the supervisor log. This also allowed me to shrink the max supervisor log size (1MB) and the number of rotated copies (2), as the supervisor logs are now redundant. We could further reduce these values in the future if we see a need.
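
The bootstrap pattern described in item 2 can be sketched roughly as follows. This is not the actual start.sh from this PR: the program names (daemon_a, daemon_b), the `start_in_order` helper, and the `SUPERVISORCTL` override variable are hypothetical and only illustrate the idea of starting programs one at a time through supervisorctl.

```shell
#!/bin/sh
# Sketch of a start.sh-style bootstrap script (all program names are hypothetical).
#
# Assumed matching supervisor.conf layout, following the description above:
#   [program:start.sh]  autostart=true   ; the only program supervisord launches itself
#   [program:daemon_a]  autostart=false  ; started below via supervisorctl
#   [program:daemon_b]  autostart=false
#
# SUPERVISORCTL is an illustrative override (e.g. for a dry run with "echo");
# it defaults to the real supervisorctl CLI.
SUPERVISORCTL="${SUPERVISORCTL:-supervisorctl}"

# Start each named program in the order given, stopping at the first failure.
# "supervisorctl start" returns only after the start attempt has finished,
# which provides the ordering that supervisord priorities alone do not.
# Whether a failed start yields a non-zero exit status depends on the
# supervisorctl version in use.
start_in_order() {
    for prog in "$@"; do
        "$SUPERVISORCTL" start "$prog" || return 1
    done
}
```

A container's bootstrap script could then end with a call such as `start_in_order daemon_a daemon_b`. Setting `SUPERVISORCTL=echo` beforehand prints the commands instead of running them.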

These changes were made to all dockers currently in use except docker-fpm-frr, as I have no device to test it on (I will create an issue once this PR is merged). Other dockers not updated are deprecated and should be removed in the near future (see issue #572).

  • Also removed unused smartmontools package from docker-platform-monitor

Joe LeVeque added 17 commits May 4, 2017 06:50
 - Also now installing supervisor package in docker-base so that
   supervisor is available in all docker containers and can be used as a
   standard, so removed supervisor installation from dockers built on
   top of docker-base.
 - Capped supervisor log max filesize to 1MB and max of 2 rotated files.
   We shouldn't need more because rsyslog should be constantly
   monitoring changes to these files and appending them to syslog.
 - Also some unrelated minor cleanup in docker-ptf Dockerfile

trap clean_up SIGTERM SIGKILL

service syncd start
Collaborator

syncd still managed by service not supervisord?

Contributor Author

syncd has a complex init script. I don't have much experience with syncd operations and testing methods, so I figured this task would be best left to someone with more experience. Init script functionality just needs to be ported into the syncd.sh script I created.

lguohan (Collaborator) commented May 8, 2017

:shipit:

jleveque merged commit 8f34839 into sonic-net:master on May 8, 2017
jleveque deleted the supervisorize_dockers branch on May 8, 2017 22:43
madhanmellanox pushed a commit to madhanmellanox/sonic-buildimage that referenced this pull request Mar 23, 2020
* Allow decap tunnel configuration without "src_ip"

Signed-off-by: Stepan Blyschak <[email protected]>

* Remove overlay loopback interface when removing tunnel

After tunnel removal there was still overlay loopback
interface in ASIC DB.

Signed-off-by: Stepan Blyschak <[email protected]>

* Add more test cases to test_tunnel.py

Separate symmetric and decap tunnel test
Add checks for tunnel removal

Signed-off-by: Stepan Blyschak <[email protected]>
judyjoseph added a commit that referenced this pull request Mar 27, 2020
The following changes are included in this submodule advance

54b2510 [syncd] Use correct VID when GET will fail to obtain object type (#577)
59b0430 [syncd] Unlock vendor api lock if enabling diag shell (#571)
910d45e [vs] Add more logs when setting MTU on port (#576)
c0d9947 [vs] Fix setting correct port mtu value (#573)
lguohan pushed a commit that referenced this pull request Apr 8, 2020
* f4d9398 2020-04-07 | [vs] Set mto only on tap device (#592) [Kamil Cudnik]
* 0ad13f5 2020-04-07 | [lgtm]: add lgtm static analysis configuration (#589) [lguohan]
* c961260 2020-04-07 | add swss-common-{inc,lib} to specify the prefix of swss-common library (#590) [lguohan]
* 2d68abc 2020-04-06 | [syncd] Load correct global context id (#588) [Kamil Cudnik]
* cd82389 2020-04-06 | Return correct error code when port is in use (#565) [Vasant Patil]
* 2189c2f 2020-04-02 | [syncd] Pass correct switch RID when staring diag shell (#587) [Kamil Cudnik]
* 91792db 2020-04-01 | [syncd] Fix crash during stats polling (#586) [Vitaliy Senchyshyn]
* d13521e 2020-04-01 | [meta] Flush fdb entries after flush api success (#581) [Kamil Cudnik]
* 54b2510 2020-03-17 | [syncd] Use correct VID when GET will fail to obtain object type (#577) [Kamil Cudnik]
* 59b0430 2020-03-16 | [syncd] Unlock vendor api lock if enabling diag shell (#571) [Kamil Cudnik]
* 910d45e 2020-03-16 | [vs] Add more logs when setting MTU on port (#576) [Kamil Cudnik]
* c0d9947 2020-03-13 | [vs] Fix setting correct port mtu value (#573) [Kamil Cudnik]
dmytroxshevchuk pushed a commit to dmytroxshevchuk/sonic-buildimage that referenced this pull request Aug 31, 2020
stephenxs added a commit to stephenxs/sonic-buildimage that referenced this pull request Jan 18, 2022
d00a25bb [ci] refer 202106 branch resources rather than master branch. (sonic-net#573)

Signed-off-by: Stephen Sun <[email protected]>
liat-grozovik pushed a commit that referenced this pull request Jan 18, 2022
d00a25bb [ci] refer 202106 branch resources rather than master branch. (#573)

Signed-off-by: Stephen Sun <[email protected]>
mssonicbld added a commit that referenced this pull request Dec 20, 2024
…D automatically (#21203)

#### Why I did it
src/sonic-platform-daemons
```
* c61323f - (HEAD -> master, origin/master, origin/HEAD) [chassisd] Address the chassisd crash issue and add UT for it (#573) (2 days ago) [Marty Y. Lok]
* 0d79916 - [chassis][psud] Move the PSU parent information generation to the loop run function from the initialization function (#576) (2 days ago) [Jianquan Ye]
```
#### How I did it
#### How to verify it
#### Description for the changelog
mssonicbld added a commit that referenced this pull request Dec 22, 2024
…D automatically (#21252)

#### Why I did it
src/sonic-platform-daemons
```
* 291d3bb - (HEAD -> 202405, origin/202405) Take non-CMIS xcvrs out of lpmode in SFF Manager (#565) (22 hours ago) [Peter Bailey]
* ef80d32 - [chassisd] Address the chassisd crash issue and add UT for it (#573) (31 hours ago) [Marty Y. Lok]
```
#### How I did it
#### How to verify it
#### Description for the changelog
VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025
…D automatically (sonic-net#21203)

#### Why I did it
src/sonic-platform-daemons
```
* c61323f - (HEAD -> master, origin/master, origin/HEAD) [chassisd] Address the chassisd crash issue and add UT for it (sonic-net#573) (2 days ago) [Marty Y. Lok]
* 0d79916 - [chassis][psud] Move the PSU parent information generation to the loop run function from the initialization function (sonic-net#576) (2 days ago) [Jianquan Ye]
```
#### How I did it
#### How to verify it
#### Description for the changelog
yuazhe pushed a commit to yuazhe/sonic-buildimage that referenced this pull request Feb 12, 2025
[202412] Code sync sonic-net/sonic-buildimage:202411 => 202412
mssonicbld added a commit that referenced this pull request Jul 7, 2025
… automatically (#23214)

#### Why I did it
src/sonic-platform-common
```
* 7c70958 - (HEAD -> 202411, origin/202411) Adding is_copper api to cmis (#547) (#573) (4 days ago) [ravil-nexthop]
```
#### How I did it
#### How to verify it
#### Description for the changelog
bobby-nexthop pushed a commit to bobby-nexthop/sonic-buildimage that referenced this pull request Aug 1, 2025
…net#573)

Description
On the Nokia platform, the slot name of the Supervisor is the string "A" instead of a number. Converting it with "int" could cause a backtrace. We should use the slot value for any checks without conversion. This fixes sonic-net#21131

Motivation and Context
Modify _get_module_info not to convert "slot" to a string value. Also modify the code not to convert the slot value to an int for any checks; it now uses the value returned by get_slot() directly. Also add the unit test test_moduleupdater_check_slot_string() to validate it.

How Has This Been Tested?
Tested on 202405 branch


Signed-off-by: mlok <[email protected]>
bobby-nexthop pushed a commit to bobby-nexthop/sonic-buildimage that referenced this pull request Aug 1, 2025
…evice is in detaching mode (sonic-net#546)

* Skip logging the warning, if device is in detaching mode

* Add detach_info table and unittests

* Fix unit tests

* Increase code coverage

* Remove unused header import

* Fix dict get values

* Increase code coverage

* Increase test coverage

* [SmartSwitch] Extend implementation of the DPU chassis daemon. (sonic-net#563)

* Addition of DPU Chassis for thermalctld (sonic-net#564)

* [stormond] Added new dynamic field 'last_sync_time' to STATE_DB (sonic-net#535)

* Added new dynamic field 'last_sync_time' that shows when STORAGE_INFO for disk was last synced to STATE_DB

* Moved 'start' message to actual starting point of the daemon

* Added functions for formatted and epoch time for user friendly time display

* Made changes per prgeor review comments

* Pivot to SysLogger for all logging

* Increased log level so that they are seen in syslogs

* Code coverage improvement

* [lag_id] Add lagid to free_list when LC absent for 30 minutes (sonic-net#542)

When LC is absent for 30 minutes, the database cleanup kicks in. When LagId is released, it needs to be appended to the SYSTEM_LAG_IDS_FREE_LIST

This PR works with the following 2 PRs:
sonic-net/sonic-swss#3303
sonic-net#20369

Signed-off-by: mlok <[email protected]>

* Fixed bug in chassisd causing incorrect number of ASICs in CHASSIS_STATE_DB (sonic-net#560)

Fixed the bug in chassisd due to which incorrect number of ASICs were being pushed to CHASSIS_STATE_DB.

* thermalctld: Add support for fans on non-CPU modules (sonic-net#555)

* thermalctld: Add support for fans on non-CPU modules

* Add module fan to unit tests

* Advanced Azure pipeline to Bookworm (sonic-net#572)

Description
This PR advances the azure pipeline on sonic_platform_daemons from bullseye to bookworm. This fixes the issue where sonic-platform-daemons azp is having some issues due to upgrade to bookworm. See Pipelines - Run 20241210.8 logs for details.

* Take non-CMIS xcvrs out of lpmode in SFF Manager (sonic-net#565)

Description
Fix non-CMIS transceivers in down state by bringing them out of low power mode in the SFF Manager Task.
This is intended to work together with the change in sonic-net#20886.

Motivation and Context
Non-CMIS transceivers were not functioning correctly when put into Low Power mode. So XCVRD now brings them out of lpmode.

How Has This Been Tested?
Loaded an image containing this change alongside the change from sonic-net#20886 on an Arista chassis containing a Clearwater2 linecard.
Verified that without this image some interfaces were in a down state but with the image all interfaces came up as expected.

* Added SmartSwitch support in chassisd and enabling chassisd  (sonic-net#467)

Added SmartSwitch support in chassisd and enabling chassisd

* [chassis][psud] Move the PSU parent information generation to the loop run function from the initialization function (sonic-net#576)

Description
Move the PSU parent information generation to the loop run function from the initialization function

Motivation and Context
Fixes sonic-net#575

How Has This Been Tested?
Tested on a Cisco chassis: PHYSICAL_ENTITY_INFO|PSU * can be re-inserted after a thermalctld restart.
Also monitored the state DB for memory usage for hours; works well.

* [chassisd] Address the chassisd crash issue and add UT for it (sonic-net#573)

Description
On the Nokia platform, the slot name of the Supervisor is the string "A" instead of a number. Converting it with "int" could cause a backtrace. We should use the slot value for any checks without conversion. This fixes sonic-net#21131

Motivation and Context
Modify _get_module_info not to convert "slot" to a string value. Also modify the code not to convert the slot value to an int for any checks; it now uses the value returned by get_slot() directly. Also add the unit test test_moduleupdater_check_slot_string() to validate it.

How Has This Been Tested?
Tested on 202405 branch


Signed-off-by: mlok <[email protected]>

* Fix a comment

---------

Signed-off-by: mlok <[email protected]>
Co-authored-by: Oleksandr Ivantsiv <[email protected]>
Co-authored-by: Gagan Punathil Ellath <[email protected]>
Co-authored-by: Ashwin Srinivasan <[email protected]>
Co-authored-by: Marty Y. Lok <[email protected]>
Co-authored-by: Vivek Verma <[email protected]>
Co-authored-by: Patrick MacArthur <[email protected]>
Co-authored-by: Peter Bailey <[email protected]>
Co-authored-by: rameshraghupathy <[email protected]>
Co-authored-by: Jianquan Ye <[email protected]>
tshalvi pushed a commit to tshalvi/sonic-buildimage that referenced this pull request Aug 25, 2025
…est HEAD automatically (sonic-net#1362)

#### Why I did it
src/sonic-platform-common
```
* e3e1d76 - (HEAD -> 202412, origin/202412) Adding is_copper api to cmis (sonic-net#547) (sonic-net#573) (sonic-net#100) (33 hours ago) [Riff]
* 1485079 - Merge remote-tracking branch 'base/202411' into code-sync-202412 (2 days ago) [r12f]
* 7c70958 - Adding is_copper api to cmis (sonic-net#547) (sonic-net#573) (13 days ago) [ravil-nexthop]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vmittal-msft pushed a commit to vmittal-msft/sonic-buildimage that referenced this pull request Oct 20, 2025
…est HEAD automatically (sonic-net#1370)

#### Why I did it
src/sonic-platform-common
```
* 1f222d6 - (HEAD -> 202503, origin/202503) Merge pull request sonic-net#101 from mssonicbld/sonicbld/202503-merge (23 hours ago) [mssonicbld]
* 28243d5 - Merge branch '202412' of https://github.com/Azure/sonic-platform-common.msft into 202503 (23 hours ago) [Sonic Automation]
* e3e1d76 - (origin/202412) Adding is_copper api to cmis (sonic-net#547) (sonic-net#573) (sonic-net#100) (2 days ago) [Riff]
* 1485079 - Merge remote-tracking branch 'base/202411' into code-sync-202412 (3 days ago) [r12f]
* 7c70958 - Adding is_copper api to cmis (sonic-net#547) (sonic-net#573) (2 weeks ago) [ravil-nexthop]
```
#### How I did it
#### How to verify it
#### Description for the changelog