Add workflow to detect stuck submodule update PRs by hdwhdw · Pull Request #590 · sonic-net/sonic-gnmi

hdwhdw · 2026-02-25T17:25:16Z

Why I did it

The mssonicbld bot automatically creates submodule update PRs in sonic-buildimage, but these PRs can sit stuck for days or weeks without anyone noticing, for example because of CI failures or merge conflicts. This delays the rollout of merged fixes to the build.

For example, sonic-net/sonic-buildimage#25285 has been open since Jan 31 (over a month), blocking multiple merged commits from reaching the build.

How I did it

Added a GitHub Actions workflow (.github/workflows/check-submodule-update.yml) that:

- Runs on weekdays at 08:00 UTC (and supports manual trigger)
- Searches for open mssonicbld submodule update PRs in sonic-buildimage
- If any PR has been open longer than a configurable threshold (default 96 hours), files an issue in sonic-gnmi with the submodule-stuck label
- Avoids duplicate issues by checking for existing open issues referencing the same PR
- Only runs on the sonic-net organization (skipped on forks)

Assignees and threshold are read from .github/submodule-watchers.json so the policy can be updated without modifying the workflow.
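The detection logic described above can be sketched in Go to make the threshold and de-duplication rules concrete. This is only an illustration, not the workflow's actual implementation: the `WatcherConfig` and `PullRequest` types, the helper names, and the exact JSON key `assignees` are assumptions (only `threshold_hours` and the 96-hour default are stated in this PR), and the issue bodies are passed in as plain strings rather than fetched from GitHub.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// WatcherConfig mirrors the two settings said to live in
// .github/submodule-watchers.json. The "assignees" key name is assumed.
type WatcherConfig struct {
	Assignees      []string `json:"assignees"`
	ThresholdHours int      `json:"threshold_hours"`
}

// PullRequest is a hypothetical, trimmed-down view of a PR search result.
type PullRequest struct {
	Number    int
	CreatedAt time.Time
}

// loadConfig parses the config and falls back to the 96-hour default
// when threshold_hours is missing or not positive.
func loadConfig(raw []byte) (WatcherConfig, error) {
	cfg := WatcherConfig{ThresholdHours: 96}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return cfg, err
	}
	if cfg.ThresholdHours <= 0 {
		cfg.ThresholdHours = 96
	}
	return cfg, nil
}

// referenced reports whether any open issue body mentions ref.
// Plain substring matching is a simplification: a longer PR number that
// starts with the same digits would also match.
func referenced(openIssueBodies []string, ref string) bool {
	for _, body := range openIssueBodies {
		if strings.Contains(body, ref) {
			return true
		}
	}
	return false
}

// stuckPRs returns the PRs that have been open longer than the threshold
// and are not yet referenced by an open issue.
func stuckPRs(prs []PullRequest, openIssueBodies []string, cfg WatcherConfig, now time.Time) []PullRequest {
	threshold := time.Duration(cfg.ThresholdHours) * time.Hour
	var stuck []PullRequest
	for _, pr := range prs {
		if now.Sub(pr.CreatedAt) <= threshold {
			continue // still within the allowed window
		}
		ref := fmt.Sprintf("sonic-net/sonic-buildimage#%d", pr.Number)
		if referenced(openIssueBodies, ref) {
			continue // an open issue already tracks this PR
		}
		stuck = append(stuck, pr)
	}
	return stuck
}

func main() {
	// Example config; "example-user" is a placeholder assignee.
	cfg, err := loadConfig([]byte(`{"assignees": ["example-user"], "threshold_hours": 96}`))
	if err != nil {
		panic(err)
	}
	now := time.Date(2026, 3, 5, 8, 0, 0, 0, time.UTC)
	prs := []PullRequest{
		{Number: 25285, CreatedAt: time.Date(2026, 1, 31, 0, 0, 0, 0, time.UTC)},
		{Number: 99999, CreatedAt: now.Add(-24 * time.Hour)}, // hypothetical recent PR
	}
	for _, pr := range stuckPRs(prs, nil, cfg, now) {
		fmt.Printf("stuck: #%d, assign to %v\n", pr.Number, cfg.Assignees)
	}
}
```

Keeping the threshold and assignees in data rather than in the workflow file means the staleness check itself never has to change when the policy does.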

How to verify it

1. Trigger the workflow manually via workflow_dispatch.
2. Verify that it detects the currently stuck sonic-buildimage PRs (#25285, #25432) and files issues in sonic-gnmi.

Description for the changelog

Add scheduled workflow to detect and alert on stuck sonic-gnmi submodule update PRs in sonic-buildimage.

mssonicbld · 2026-02-25T17:25:24Z

/azp run

azure-pipelines · 2026-02-25T17:25:35Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot

Pull request overview

This PR adds automation to detect and file issues for “stuck” sonic-buildimage submodule-update PRs, and also changes Redis client initialization behavior in sonic_data_client to skip databases whose address/socket cannot be resolved (instead of failing fast).

Changes:

- Add a scheduled GitHub Actions workflow that searches for long-open submodule update PRs in sonic-buildimage and files an issue in sonic-gnmi.
- Update Redis client initialization (TCP test path and default unix-socket path) to log warnings and continue when DB address/socket lookup fails.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `sonic_data_client/db_client.go` | Changes init/TCP-client setup to “skip on error” when DB socket/address lookup fails. |
| `.github/workflows/check-submodule-update.yml` | Introduces scheduled workflow to detect and report stuck submodule-update PRs. |

Copilot · 2026-02-25T17:29:35Z

sonic_data_client/db_client.go

+					log.Warningf("Skipping %s in namespace %s: %v", dbName, dbNamespace, err)
+					continue


Changing GetDbTcpAddr failure handling from return err to continue can leave Target2RedisDb[namespace][dbName] unset. Much of db_client.go later assumes the redis client is non-nil (e.g., TableData2Msi reads it without an ok check) and will panic if a skipped DB is later used (including via V2R mappings that bypass the presence check in populateDbtablePath). Consider either keeping fail-fast behavior for required DBs, or consistently guarding all Target2RedisDb[...] lookups (returning a clear error) so skipping cannot lead to a nil dereference at runtime.

Suggested change

-					log.Warningf("Skipping %s in namespace %s: %v", dbName, dbNamespace, err)
-					continue
+					log.Warningf("Failed to get TCP address for %s in namespace %s: %v", dbName, dbNamespace, err)
+					return err

Copilot · 2026-02-25T17:29:36Z

sonic_data_client/db_client.go

 				addr, err := sdcfg.GetDbSock(dbName, dbNamespace)
 				if err != nil {
-					log.Errorf("init error:  %v", err)
-					return
+					log.Warningf("Skipping %s in namespace %s: %v", dbName, dbNamespace, err)
+					continue


With the new continue behavior on GetDbSock errors, init() may partially populate Target2RedisDb and leave some DB clients missing. Since several call sites in this file index into Target2RedisDb without checking presence, this can turn a config/init problem into a later nil-pointer panic. Recommend either failing initialization for mandatory DBs or adding presence checks (and error returns) at all Target2RedisDb access points to make skipped DBs safe.
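One way to picture the "presence check" option is a small accessor that returns an error instead of a nil client. The sketch below is hypothetical and self-contained: `redisClient`, `target2RedisDb`, and `getRedisClient` are stand-ins invented for illustration, modeled only on the `Target2RedisDb[namespace][dbName]` shape mentioned in the review, and the namespace and socket path are placeholders.

```go
package main

import "fmt"

// redisClient is a stand-in for the real Redis client type (hypothetical).
type redisClient struct{ addr string }

// target2RedisDb mimics the namespace -> dbName -> client layout implied by
// Target2RedisDb[namespace][dbName]; it is not the real variable.
var target2RedisDb = map[string]map[string]*redisClient{}

// getRedisClient is a hypothetical guarded accessor: a DB that was skipped
// during init yields a clear error rather than a nil pointer.
func getRedisClient(namespace, dbName string) (*redisClient, error) {
	dbs, ok := target2RedisDb[namespace]
	if !ok {
		return nil, fmt.Errorf("no redis clients for namespace %q", namespace)
	}
	client, ok := dbs[dbName]
	if !ok || client == nil {
		return nil, fmt.Errorf("redis client for %s in namespace %q was not initialized", dbName, namespace)
	}
	return client, nil
}

func main() {
	// Simulate a partial init: CONFIG_DB was set up, STATE_DB was skipped.
	target2RedisDb["asic0"] = map[string]*redisClient{
		"CONFIG_DB": {addr: "/example/redis.sock"}, // placeholder socket path
	}

	for _, db := range []string{"CONFIG_DB", "STATE_DB"} {
		client, err := getRedisClient("asic0", db)
		if err != nil {
			fmt.Println("lookup failed:", err)
			continue
		}
		fmt.Println("using client at", client.addr)
	}
}
```

Routing every lookup through one accessor like this would make a skipped DB surface as an ordinary error at the call site, at the cost of touching each place that currently indexes the map directly.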

Copilot · 2026-02-25T17:29:36Z

.github/workflows/check-submodule-update.yml

+  issues: write
+
+jobs:
+  check:


This workflow hard-codes actions that only make sense in the upstream repo (searching sonic-net/sonic-buildimage and creating issues in sonic-net/sonic-gnmi). Other workflows in this repo guard against running on forks (e.g., .github/workflows/semgrep.yml uses if: github.repository_owner == 'sonic-net'). Add a similar if: guard to this job to prevent failures/noisy runs when triggered from forks or non-sonic-net repos.

Suggested change

-  check:
+  check:
+    if: github.repository_owner == 'sonic-net'

Add a scheduled GitHub Actions workflow that checks sonic-buildimage for sonic-gnmi submodule update PRs that have been open longer than 96 hours. When found, it files an issue in sonic-gnmi with details about the stuck PR to ensure visibility and prompt investigation. Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

mssonicbld · 2026-02-25T17:32:57Z

/azp run

azure-pipelines · 2026-02-25T17:33:08Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-02-25T17:42:58Z

/azp run

azure-pipelines · 2026-02-25T17:43:08Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-03-04T17:32:04Z

/azp run

azure-pipelines · 2026-03-04T17:32:14Z

Azure Pipelines successfully started running 1 pipeline(s).

mssonicbld · 2026-03-05T18:07:46Z

/azp run

azure-pipelines · 2026-03-05T18:07:57Z

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI review requested due to automatic review settings February 25, 2026 17:25

Copilot started reviewing on behalf of hdwhdw February 25, 2026 17:26

Copilot AI reviewed Feb 25, 2026

hdwhdw force-pushed the check-submodule-update branch from e1d151d to e6f1c7c February 25, 2026 17:32

Add fork guard to skip workflow on non-sonic-net repos

2fad583

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

hdwhdw closed this Mar 3, 2026

hdwhdw reopened this Mar 4, 2026

Read assignees and threshold from config file

accc787

Load assignees and threshold_hours from .github/submodule-watchers.json so the policy can be updated without modifying the workflow itself. Signed-off-by: Dawei Huang <daweihuang@microsoft.com>

hdwhdw changed the title ~~Check submodule update~~ Add workflow to detect stuck submodule update PRs Mar 5, 2026

hdwhdw requested review from prsunny, vaibhavhd and zbud-msft March 5, 2026 18:13
