ansible: support pulling ceos_image from docker registry#22794
ansible: support pulling ceos_image from docker registry#22794wangxinbot merged 2 commits intosonic-net:masterfrom
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Add support for an optional docker registry as an additional source for ceos_image, before falling back to the existing download-and-build path. When a registry is configured, containers always reference the registry- prefixed image name so no locally-built alias is created. Changes in ansible/group_vars/vm_host/ceos.yml: - Add commented-out placeholder variables ceos_registry, ceos_registry_username, and ceos_registry_password with instructions. Username/password are optional and only needed when the registry requires authentication. Changes in ansible/roles/vm_set/tasks/add_ceos_list.yml: - When ceos_registry is defined, check if the registry-prefixed image is already cached locally. If found, use it directly without re-tagging. - If not cached, login to the registry (only when credentials are provided) and attempt to pull the image. - ceos_image_found is registry-aware: when ceos_registry is defined, a plain local ceos_image is not counted as found, ensuring the registry path is always taken. - Only fall back to the download-ceos_image_orig / build-ceos_image path when no registry is configured or the registry pull failed. Changes in ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml: - After loading ceos vars, check whether the registry image is available locally and set ceos_effective_image accordingly. - All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments without creating a local alias. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]>
abdb30a to
8f859a6
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
StormLiangMS
left a comment
There was a problem hiding this comment.
Review
Overall clean PR — the approach of optional registry support with download-and-build fallback is sound. A few concerns:
Issues
1. Credentials in plain text: ceos_registry_password is stored as a plain-text variable in group_vars. This could get committed to the repo or inventory. Consider using Ansible Vault, environment variables, or at minimum adding a comment warning users not to commit credentials in plain text.
2. ceos_effective_image undefined in ceos_ensure_reachable.yml: This file references {{ ceos_effective_image }} but only ceos.yml sets it via set_fact. If ceos_ensure_reachable.yml is invoked independently (not through the main ceos.yml flow), ceos_effective_image will be undefined. It needs the same include_vars + docker_image_info + set_fact logic, or the variable needs to be set at a higher scope.
3. ceos_image_found is a string, not a boolean: The set_fact with Jinja2 {{ ... }} produces a string "True"/"False", not a Python boolean. The when: not ceos_image_found check at the bottom might not behave as expected — in Ansible, the string "False" is truthy. Should use:
ceos_image_found: "{{ ... | bool }}"4. No docker logout: After docker_login, there's no corresponding cleanup to remove credentials from ~/.docker/config.json on the host.
5. Registry pull failure is silently ignored: ignore_errors: yes on the pull task means a registry misconfiguration (wrong URL, bad credentials) silently falls through to the download-and-build path. Consider adding a debug message when the pull fails, so users know why the fallback was triggered.
Minor
- The
whencondition at the bottom could be simplified —ceos_registry_pull_result is defined and ceos_registry_pull_result is failedcould use| default(false)pattern. - Good that
pull: nois kept on the container create tasks.
Fix three issues raised in code review: Issue sonic-net#2 - ceos_effective_image undefined in ceos_ensure_reachable.yml: Add a defensive set_fact at the top of ceos_ensure_reachable.yml that sets ceos_effective_image to ceos_image as a fallback, guarded by 'when: ceos_effective_image is not defined'. In the normal flow the variable is already set by the caller (ceos.yml in the eos role) so this task is a no-op. It prevents an undefined variable error if the file is ever invoked independently. Issue sonic-net#3 - ceos_image_found stored as string instead of boolean: The set_fact used a >- block scalar which causes Jinja2 to produce the string 'True'/'False'. In Ansible the string 'False' is truthy, so 'when: not ceos_image_found' would never enter the prepare block when the image is absent. Fixed by collapsing to a single quoted expression and appending '| bool' to ensure Python boolean storage. Issue sonic-net#5 - registry pull failure silently ignored: Added a debug task immediately after the pull task that emits a WARNING message when ceos_registry_pull_result is failed. Users can now see in the Ansible output why the fallback to download-and-build was triggered. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Thanks @StormLiangMS for the thorough review! Here is a response to each issue: Issue #1 - Credentials in plain text: Acknowledged. In our automation environment, secrets are downloaded at runtime from a vault as a JSON file, so Issue #2 - Issue #3 - Issue #4 - No docker logout: Intentionally left as-is. The registry credentials are scoped to the testbed VM host and managed by our automation, so leaving the session open is acceptable in this environment. Issue #5 - Registry pull failure silently ignored: Fixed. Added a Issues #2, #3, and #5 are addressed in the follow-up commit |
StormLiangMS
left a comment
There was a problem hiding this comment.
Thanks for addressing the review feedback.
Addressed ✅:
- #2:
ceos_effective_imagefallback inceos_ensure_reachable.yml— good defensive approach - #3:
| boolfilter onceos_image_found— correct fix - #5: Debug warning on registry pull failure — helpful for troubleshooting
Remaining (minor, non-blocking):
- #1 (credentials): Acceptable since they're commented-out placeholders. Maybe add a note like
# WARNING: Do not commit real credentials here. Use Ansible Vault or environment variables. - #4 (docker logout): Low risk since the credentials persist only on the VM host, but worth a follow-up if security-sensitive registries are used.
LGTM overall. The code is clean, well-commented, and the fallback logic is solid.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: selldinesh <[email protected]>
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: Abhishek <[email protected]>
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: Venkata Gouri Rajesh Etla <[email protected]>
|
@wangxin PR conflicts with 202505 branch |
|
@wangxin PR conflicts with 202411 branch |
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]> Signed-off-by: mssonicbld <[email protected]>
|
Cherry-pick PR to 202511: #23252 |
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]>
…2794) What is the motivation for this PR? In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building ceos_image locally or re-tagging a pulled image with a local name would trigger security alerts. This PR allows ceos_image to be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created. How did you do it? Three files are changed: ansible/group_vars/vm_host/ceos.yml Added three commented-out placeholder variables: ceos_registry, ceos_registry_username, ceos_registry_password. When ceos_registry is uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries. ansible/roles/vm_set/tasks/add_ceos_list.yml When ceos_registry is defined, check whether <registry>/<ceos_image> is already cached locally (a prior pull). If found, it is used directly — no docker tag is performed, so no local alias is created. If not cached and ceos_registry is defined, optionally log in (only when credentials are provided) and docker pull the image. Only falls back to the existing download-ceos_image_orig-and-build path when registry is not configured or the pull fails. ceos_image_found logic is registry-aware: when ceos_registry is defined, a plain local ceos_image does not count as "found", ensuring the registry path is always taken. ansible/roles/eos/tasks/ceos.yml and ceos_ensure_reachable.yml After loading group_vars/vm_host/ceos.yml, a docker_image_info check determines whether the registry image is available locally. ceos_effective_image is set to <registry>/<ceos_image> when the registry image is present, or plain ceos_image otherwise. All docker_container tasks use ceos_effective_image instead of ceos_image, so containers always reference the registry image in registry-enabled environments. How did you verify/test it? Tested "add-topo" by specifying ceos_registry. The ceos image from the registry is used successfully. Tested "add-topo" without specifying ceos_registry. The existing logic of using a local ceos image or preparing ceos image from ceos_image_orig works as expected. Signed-off-by: Xin Wang <[email protected]> Co-authored-by: Copilot <[email protected]>
Description of PR
Summary:
Add support for an optional docker registry (e.g. ACR, MCR) as an image source for
ceos_image, so that deployments with strict registry-only policies (e.g. Microsoft internal) can pull the image directly from a registry without ever creating a locally-built image.Type of change
Back port request
Approach
What is the motivation for this PR?
In environments that enforce registry-only container image policies (e.g. Microsoft requiring all images to originate from ACR or MCR), the previous approach of building
ceos_imagelocally or re-tagging a pulled image with a local name would trigger security alerts. This PR allowsceos_imageto be sourced directly from a user-configured registry, so containers always reference the registry-prefixed image name and no locally-built alias is created.How did you do it?
Three files are changed:
ansible/group_vars/vm_host/ceos.ymlceos_registry,ceos_registry_username,ceos_registry_password.ceos_registryis uncommented and set, the new registry logic is activated. Credentials are optional and only needed for authenticated registries.ansible/roles/vm_set/tasks/add_ceos_list.ymlceos_registryis defined, check whether<registry>/<ceos_image>is already cached locally (a prior pull). If found, it is used directly — nodocker tagis performed, so no local alias is created.ceos_registryis defined, optionally log in (only when credentials are provided) anddocker pullthe image.ceos_image_orig-and-build path when registry is not configured or the pull fails.ceos_image_foundlogic is registry-aware: whenceos_registryis defined, a plain localceos_imagedoes not count as "found", ensuring the registry path is always taken.ansible/roles/eos/tasks/ceos.ymlandceos_ensure_reachable.ymlgroup_vars/vm_host/ceos.yml, adocker_image_infocheck determines whether the registry image is available locally.ceos_effective_imageis set to<registry>/<ceos_image>when the registry image is present, or plainceos_imageotherwise.docker_containertasks useceos_effective_imageinstead ofceos_image, so containers always reference the registry image in registry-enabled environments.How did you verify/test it?
ceos_registry. The ceos image from the registry is used successfully.ceos_registry. The existing logic of using a local ceos image or preparing ceos image fromceos_image_origworks as expected.Any platform specific information?
N/A
Supported testbed topology if it's a new test case?
N/A
Documentation
No documentation update needed. The new variables are self-documented with inline comments in
ansible/group_vars/vm_host/ceos.yml.