Wait till CHASSIS_APP_DB PING is successful, host_name and asic_name are valid in CONFIG_DB before starting chassis-db-cleanup #17962
Conversation
@judyjoseph for viz
@judyjoseph @arlakshm @abdosi, please review this PR
We ran the complete oc with this fix and the error "Unable to connect to redis: Cannot assign requested address" is not seen. Also we didn't see the orchagent crash
/AzurePipelines run
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.
8b72589 to 08c773a
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
judyjoseph
left a comment
Change looks good to me
@saksarav-nokia, I am trying to find an alternative solution here, as adding the hostname-config.service dependency will affect all platforms. I checked this script; can we add a specific check in it so that it changes the /etc/hosts file only if HOSTNAME changes? That should help our case, and we would not need to add the hostname-config.service dependency.
@judyjoseph, I think that will also fix the issue. I will test it out and update the PR.
d36977c to 283d1ff
…re valid in CONIFG_DB before starting chassis-db-cleanup Signed-off-by: saksarav <[email protected]>
Signed-off-by: saksarav <[email protected]>
Signed-off-by: saksarav <[email protected]>
@judyjoseph, I addressed your comments, verified the changes, and confirmed that the issue is not seen with the current changes. Please review.
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
MSFT ADO: 27704026
…re valid in CONFIG_DB before starting chassis-db-cleanup (sonic-net#17962) This PR fixes issue sonic-net#17945. We noticed that chassis DB cleanup is sometimes skipped when the CHASSIS_APP_DB PING fails. Also, if host_name and asic_name have not yet been written to CONFIG_DB, empty strings could be passed to the CHASSIS_APP_DB EVAL commands. The hostname-config.service service is restarted whenever config-reload or load-minigraph is done, and it renames the /etc/hosts file to update it with the new file. This interferes with [email protected]: when the swss.sh script accesses CHASSIS_APP_DB while the /etc/hosts file is renamed, the error "Unable to connect to redis: Cannot assign requested address" is seen and the CHASSIS_APP_DB EVAL command fails. As a result, the chassis DB entries are not cleaned up, which causes an orchagent crash on remote LCs. --------- Signed-off-by: saksarav <[email protected]>
Cherry-pick PR to 202305: #18756
@yxieca, who can review or approve this PR for 202311?
Why I did it
This PR fixes issue #17945.
We noticed that chassis DB cleanup is sometimes skipped when the CHASSIS_APP_DB PING fails. Also, if host_name and asic_name have not yet been written to CONFIG_DB, empty strings could be passed to the CHASSIS_APP_DB EVAL commands.
The hostname-config.service service is restarted whenever config-reload or load-minigraph is done, and it renames the /etc/hosts file to update it with the new file. This interferes with [email protected]: when the swss.sh script accesses CHASSIS_APP_DB while the /etc/hosts file is renamed, the error "Unable to connect to redis: Cannot assign requested address" is seen and the CHASSIS_APP_DB EVAL command fails. As a result, the chassis DB entries are not cleaned up, which causes an orchagent crash on remote LCs.
Work item tracking
How I did it
Wait till the CHASSIS_APP_DB PING is successful before checking for entries in the CHASSIS_APP_DB table. Also wait till host_name and asic_name are valid in CONFIG_DB.
Modified [email protected] to start after hostname-config.service
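The wait described above can be pictured as a bounded retry loop. The following is only a minimal Python sketch of that idea, not the actual swss.sh change (which is a shell script). The function name `wait_until_ready`, the `ping` and `read_names` callbacks, and the retry count and delay are all hypothetical placeholders chosen for illustration.

```python
import time
from typing import Callable, Optional, Tuple


def wait_until_ready(
    ping: Callable[[], bool],
    read_names: Callable[[], Tuple[str, str]],
    retries: int = 30,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Tuple[str, str]]:
    """Return (host_name, asic_name) once the DB answers PING and both
    names are non-empty, or None if the retry budget runs out.

    `ping` stands in for a CHASSIS_APP_DB PING and `read_names` for a
    CONFIG_DB lookup; both are assumptions, injected so the loop can be
    exercised without a real Redis instance.
    """
    for attempt in range(retries):
        try:
            if ping():
                host_name, asic_name = read_names()
                # Empty strings must never reach the cleanup step.
                if host_name and asic_name:
                    return host_name, asic_name
        except OSError:
            # A transient connection failure (for example while /etc/hosts
            # is being replaced) is treated like a failed PING.
            pass
        if attempt < retries - 1:
            sleep(delay_s)
    return None
```

A caller would start the chassis DB cleanup only when the result is not None, so the cleanup is never attempted with an unreachable database or with empty host_name and asic_name values.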
How to verify it
Ran a script that performed config reload and load-minigraph 200 times and verified that the chassis DB cleanup is done every time and that the orchagent crash is not seen.