fix(ci): Add retry logic for pulling registry:2 in prepare-docker-buildx #8236
The opensearch e2e CI job was failing spuriously because `docker run --rm -d -p 5000:5000 --name registry registry:2` timed out when pulling the `registry:2` image from Docker Hub due to intermittent network issues.

The fix adds retry logic (up to 3 attempts with a 15s sleep between each) for pulling `registry:2` before starting the container. It also:
- Redirects stderr from `docker inspect registry` to suppress noise
- Skips the sleep on the last retry attempt
- Adds explicit failure detection if all pull attempts fail

Fixes: https://github.com/jaegertracing/jaeger/actions/runs/23439845443/job/68188774639
Co-authored-by: yurishkuro <[email protected]>
Agent-Logs-Url: https://github.com/jaegertracing/jaeger/sessions/b43c3538-1fea-453a-9768-60927028af7b
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #8236 +/- ##
=======================================
Coverage 95.63% 95.63%
=======================================
Files 319 319
Lines 16795 16795
=======================================
Hits 16062 16062
Misses 579 579
Partials 154 154
Pull request overview
Improves CI reliability for jobs that rely on the local Docker registry by adding retry logic when pulling registry:2 during prepare-docker-buildx, reducing spurious failures due to transient Docker Hub timeouts.
Changes:
- Adds a retry loop of up to 3 attempts for `docker pull registry:2`, with a 15s backoff between failed attempts.
- Verifies the `registry:2` image is present via `docker image inspect` and fails fast with a clearer error if not.
- Suppresses expected `docker inspect` stderr noise when the registry container does not exist.
  @echo "::group:: prepare-docker-buildx"
  docker buildx inspect jaeger-build > /dev/null || docker buildx create --use --name=jaeger-build --buildkitd-flags="--allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host" --driver-opt="network=host"
- docker inspect registry > /dev/null || docker run --rm -d -p 5000:5000 --name registry registry:2
+ docker inspect registry > /dev/null 2>&1 || \
`docker inspect registry` succeeds for both containers and images, and it also returns success for a stopped container. That can cause `prepare-docker-buildx` to skip starting the local registry even though no registry container is running. Consider switching to `docker container inspect` (or `docker ps` / `docker inspect -f '{{.State.Running}}'`) and ensuring the container is running (start/recreate if stopped).
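One possible shape for that check, as a minimal POSIX shell sketch rather than the change made in this PR. The function names `container_running` and `ensure_registry` and the `DOCKER` override variable are hypothetical, introduced only for illustration; the sketch assumes that removing a stopped leftover container and recreating it is acceptable.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the PR). DOCKER can be overridden,
# e.g. to point at a wrapper; it defaults to the real docker CLI.
DOCKER="${DOCKER:-docker}"

# Succeeds only if a container with the given name exists AND is running.
# 'docker container inspect' ignores images, unlike plain 'docker inspect'.
container_running() {
    state=$("$DOCKER" container inspect -f '{{.State.Running}}' "$1" 2>/dev/null) || return 1
    [ "$state" = "true" ]
}

# Starts the local registry unless one is already running.
ensure_registry() {
    if container_running registry; then
        return 0
    fi
    # Assumption: a stopped container named "registry" can be discarded.
    # Errors are ignored because usually there is nothing to remove.
    "$DOCKER" rm -f registry >/dev/null 2>&1 || true
    "$DOCKER" run --rm -d -p 5000:5000 --name registry registry:2
}
```

Calling `ensure_registry` from the recipe would replace the bare `docker inspect registry` guard. The pull retry would still need to run before `docker run`.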
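The `opensearch 3.x e2e` CI job fails spuriously when `docker run` attempts to pull `registry:2` from Docker Hub and the connection times out (`Client.Timeout exceeded while awaiting headers`), causing `prepare-docker-buildx` to exit with code 125.

Changes

`scripts/makefiles/Docker.mk`, `prepare-docker-buildx` target:
- Retry `docker pull registry:2` up to 3 times with a 15s pause between failures (no unnecessary sleep on the final attempt)
- Check the image with `docker image inspect` and abort with a clear error message if all pull attempts failed
- Redirect stderr from `docker inspect registry` to suppress expected "No such object" noise

```makefile
docker inspect registry > /dev/null 2>&1 || \
{ for i in 1 2 3; do \
    docker pull registry:2 && break; \
    echo "Attempt $$i/3 to pull registry:2 failed"; \
    [ "$$i" -lt 3 ] && sleep 15; \
  done; \
  docker image inspect registry:2 > /dev/null 2>&1 \
    || { echo "ERROR: Failed to pull registry:2 after 3 attempts"; exit 1; }; \
  docker run --rm -d -p 5000:5000 --name registry registry:2; }
```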
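The same retry pattern could also be factored into a reusable shell function, which makes it easier to exercise without Docker. This is a sketch under assumptions, not code from the PR: the name `retry` and its argument order (attempts, delay in seconds, then the command) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        echo "Attempt $i/$attempts failed: $*" >&2
        # Skip the pause after the final attempt, as the recipe does.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

A call such as `retry 3 15 docker pull registry:2 || { echo "ERROR: Failed to pull registry:2 after 3 attempts"; exit 1; }` would mirror the recipe. One difference: the recipe confirms the result with `docker image inspect`, which also passes when the image was already cached locally, whereas this helper only reports the exit status of the command itself.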
scripts/makefiles/Docker.mk—prepare-docker-buildxtarget:docker pull registry:2up to 3 times with a 15s pause between failures (no unnecessary sleep on the final attempt)docker image inspectand abort with a clear error message if all pull attempts faileddocker inspect registryto suppress expected "No such object" noisedocker inspect registry > /dev/null 2>&1 || \ { for i in 1 2 3; do \ docker pull registry:2 && break; \ echo "Attempt $$i/3 to pull registry:2 failed"; \ [ "$$i" -lt 3 ] && sleep 15; \ done; \ docker image inspect registry:2 > /dev/null 2>&1 \ || { echo "ERROR: Failed to pull registry:2 after 3 attempts"; exit 1; }; \ docker run --rm -d -p 5000:5000 --name registry registry:2; }💬 Send tasks to Copilot coding agent from Slack and Teams to turn conversations into code. Copilot posts an update in your thread when it's finished.