Fix rsyslogd memory growth in syncd swss containers over long term #25874
yxieca merged 1 commit into sonic-net:master from
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This PR addresses rsyslogd memory growth in the syncd and swss containers by reducing PID churn that was causing rsyslog's imuxsock ratelimiter to accumulate entries for short-lived senders. Two strategies are applied: suppressing unnecessary output from phc_ctl in phcsync.sh, and anchoring syslog messages to a stable PID ($$) in syncd_common.sh and swss.sh.
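The PID churn described above can be observed without touching syslog at all. The following is a minimal sketch, not code from this PR: `child_pid` and `parent_pid` are hypothetical helpers, and `sh -c` merely stands in for a one-shot `logger` invocation. It shows that each spawned process reports its own PID, while `$$` keeps reporting the PID of the script's shell, even inside a command substitution.

```shell
# Hypothetical helper: PID of a freshly spawned child process,
# standing in for a one-shot `logger` invocation.
child_pid() {
    sh -c 'echo $$'
}

# Hypothetical helper: PID of the script's own shell.
# `$$` does not change inside subshells or command substitutions.
parent_pid() {
    echo "$$"
}

first_child=$(child_pid)
second_child=$(child_pid)
first_parent=$(parent_pid)
second_parent=$(parent_pid)

echo "child PIDs:  $first_child $second_child"
echo "parent PIDs: $first_parent $second_parent"
```

A sender identified by the child PID looks new on every call, whereas a sender identified by `$$` looks the same for the lifetime of the script.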
Changes:
- `phcsync.sh` now runs `phc_ctl` with `-q -Q` flags and redirects stdout to `/dev/null` to suppress normal output, with explicit error logging on non-zero exit.
- `syncd_common.sh` and `swss.sh` `debug()` functions use `logger --id=$$` to emit all messages under the parent shell's PID, preventing a new ratelimiter entry per `logger` invocation.
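The quiet `phc_ctl` call with error logging might look roughly like the sketch below. This is an illustration under assumptions, not the contents of `phcsync.sh`: the function name `sync_phc_once`, the `PHC_DEVICE` variable, the `/dev/ptp0` path, the `set` subcommand, and the `phcsync` tag are all chosen here for the example.

```shell
# Hypothetical device path, used only for illustration.
PHC_DEVICE="${PHC_DEVICE:-/dev/ptp0}"

# Run one quiet phc_ctl call and log only when it fails.
sync_phc_once() {
    phc_rc=0
    # -q -Q suppress phc_ctl's normal output; stdout is discarded,
    # stderr is left untouched so genuine errors remain visible.
    phc_ctl -q -Q "$PHC_DEVICE" set >/dev/null || phc_rc=$?
    if [ "$phc_rc" -ne 0 ]; then
        # Log under the script's own PID rather than logger's PID.
        logger --id=$$ -t phcsync -- "phc_ctl exited with status $phc_rc"
    fi
    return "$phc_rc"
}
```

On the success path nothing is written to syslog, so a loop that calls this once per second produces no log traffic unless something goes wrong.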
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| platform/mellanox/docker-syncd-mlnx/phcsync.sh | Adds -q -Q flags to silence normal phc_ctl output; redirects only stdout to /dev/null, removing the previous 2>/dev/null stderr suppression |
| files/scripts/syncd_common.sh | Adds --id=$$ to logger in the debug() function to anchor all log messages to the parent shell's PID |
| files/scripts/swss.sh | Same --id=$$ fix as syncd_common.sh for the debug() function in the swss service script |
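For the `debug()` change, a minimal sketch of a PID-anchored logging helper is given below. It is an approximation rather than the function body from `syncd_common.sh` or `swss.sh`: the `LOG_TAG` variable and its value are assumptions made for this example. It relies on the util-linux `logger` option `--id[=id]`, which logs the given ID in place of the PID of the `logger` process.

```shell
# Hypothetical tag; the real scripts may use a different tag or none.
LOG_TAG="swss-example"

# Emit one syslog message under the calling script's PID.
debug() {
    # --id=$$ reports this shell's PID, so every call from the same
    # script appears to come from one stable sender.
    # `--` ends option parsing so messages starting with '-' are safe.
    logger --id=$$ -t "$LOG_TAG" -- "$1"
}
```

A call such as `debug "Starting service..."` then carries the same PID on every invocation within one script run.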
Signed-off-by: Hemanth Kumar Tirupati <[email protected]>
a72dfb4 to 2ef93c5
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
yxieca left a comment
Looks good. The PID-stable logger and quiet phc_ctl usage match the known pattern to reduce imuxsock ratelimiter growth. AI agent on behalf of Ying.
Cherry-pick PR to 202511: #26298
Why I did it
Work item tracking
How I did it
- `phc_ctl -q -Q ... >/dev/null 2>&1`
- `logger -i "$$" -- "$1"` in syncd_common.sh and swss.sh. This reduces per-call sender churn during script execution phases (start/wait/stop).

syncd
Currently, syncd emits the following log every second, and each message creates a new ratelimiter context in rsyslogd because it arrives under a new PID:
logger commands
Before
After
How to verify it
Which release branch to backport (provide reason below if selected)