Conversation

@gmargaritis (Contributor) commented May 5, 2023

Part of #373 along with #4259

Added --detach and --quiet / -q flags to stack deploy. Setting --detach=false waits until all of the services have converged. Shows progress bars for each individual task, unless --quiet / -q is specified.

I used WaitOnService under the hood, so the behavior is in line with service update and service create when using the --detach flag.
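
For example (a usage sketch; the stack and compose file names are placeholders):

# Deploy and block until every service in the stack converges,
# showing per-task progress bars:
docker stack deploy --compose-file compose.yaml --detach=false mystack

# Same, but suppress the progress output:
docker stack deploy --compose-file compose.yaml --detach=false --quiet mystack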

@gmargaritis force-pushed the 373-support-detach-in-docker-stack-deploy branch from 7f455cd to 93ef689 on May 5, 2023 14:00
@neersighted (Member)

PTAL @corhere @dperny

@gmargaritis (Contributor, Author)

Hey @thaJeztah and @neersighted

I understand you're super busy, but wanted to see if there's anything I could do on my part to help move this forward. It will be a great improvement for our workflow, as we'll be able to deploy with confidence! Thanks!

@marden commented Aug 22, 2023

Waiting for this too

@dperny (Contributor) commented Sep 1, 2023

This looks fine to me, matches behavior in docker service create.

-	return errors.Wrapf(err, "failed to create service %s", name)
 	response, err := apiClient.ServiceCreate(ctx, serviceSpec, createOpts)
 	if err != nil {
+		return errors.Wrapf(err, "Failed to create service %s", name)
@thaJeztah (Member)

These errors should probably remain lowercase (the Go convention is to use lowercase error strings). (If needed, we can still add some prefix when printing.)
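
A tiny illustration of that convention (a fragment, not this PR's code):

// Go convention: error strings start lowercase and carry no trailing
// punctuation, because they usually appear wrapped inside longer messages:
//
//   failed to deploy stack: failed to create service web: ...   <- reads well
//   failed to deploy stack: Failed to create service web: ...   <- jarring
return errors.Wrapf(err, "failed to create service %s", name)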

@gmargaritis (Contributor, Author) commented Sep 14, 2023

Reverted this change in 0dcc6f0!

if !detach {
	for _, serviceID := range serviceIDs {
		if err := servicecli.WaitOnService(ctx, dockerCli, serviceID, quiet); err != nil {
			errs = append(errs, fmt.Sprintf("%s: %v", serviceID, err))
@thaJeztah (Member)

I haven't tried it to see how it'd look, and it's probably fine for a follow-up, but we may be able to use the new errors.Join here and create a multi-error 🤔
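
Something like this (a runnable sketch; wait is a hypothetical stand-in for servicecli.WaitOnService):

package main

import (
	"errors"
	"fmt"
)

// wait is a hypothetical stand-in for servicecli.WaitOnService.
func wait(serviceID string) error {
	return errors.New("did not converge")
}

func main() {
	// Collect one error per failed service, then fold them into a single
	// multi-error; errors.Join (Go 1.20+) returns nil when errs is empty.
	var errs []error
	for _, serviceID := range []string{"svc-a", "svc-b"} {
		if err := wait(serviceID); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", serviceID, err))
		}
	}
	fmt.Println(errors.Join(errs...))
}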

@gmargaritis (Contributor, Author) commented Sep 14, 2023

@thaJeztah 👋 I have updated the code according to the suggestion by using errors.Join() in 0dcc6f0.

Given the opportunity, I replaced errors.Errorf and errors.Wrapf with fmt.Errorf in this file, to avoid having to alias the errors import.
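
The replacement is mechanical (a before/after fragment, assuming errors was github.com/pkg/errors):

// Before: github.com/pkg/errors
return errors.Wrapf(err, "failed to create service %s", name)

// After: standard library; %w keeps the chain intact for errors.Is / errors.As.
// Note that fmt.Errorf never returns nil, unlike errors.Wrapf with a nil err,
// so it should only be called inside an `if err != nil` branch.
return fmt.Errorf("failed to create service %s: %w", name, err)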

@gmargaritis force-pushed the 373-support-detach-in-docker-stack-deploy branch from 93ef689 to 0dcc6f0 on September 14, 2023 10:33
@gmargaritis (Contributor, Author)

Hey @thaJeztah and @neersighted 👋

Is there anything else I can do to make sure that this one moves forward?

 }
 
-func deployServices(ctx context.Context, dockerCli command.Cli, services map[string]swarm.ServiceSpec, namespace convert.Namespace, sendAuth bool, resolveImage string) error {
+func deployServices(ctx context.Context, dockerCli command.Cli, services map[string]swarm.ServiceSpec, namespace convert.Namespace, sendAuth bool, resolveImage string, detach bool, quiet bool) error {
@thaJeztah (Member)

Not critical (as it's a non-exported function), but the list of arguments is getting really long, and consecutive arguments of the same type make for an awkward API.

Starting to wonder if we should instead pass a struct, or perhaps we could even pass the options.Deploy struct, as we already have that (I haven't checked if it's in the proper package for that).
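
Something along these lines (a sketch; deployOpts is a hypothetical name, not necessarily the existing options.Deploy type):

// Bundling the scalar flags into a struct avoids the long run of
// same-typed parameters and names them explicitly at the call site.
type deployOpts struct {
	sendAuth     bool
	resolveImage string
	detach       bool
	quiet        bool
}

func deployServices(ctx context.Context, dockerCli command.Cli, services map[string]swarm.ServiceSpec, namespace convert.Namespace, opts deployOpts) error {
	// ...
	return nil
}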

@gmargaritis (Contributor, Author)

Hey @thaJeztah 👋,

Passing options.Deploy into deployServices was increasing the cyclomatic complexity. I've decided to keep deployServices as it was and call waitOnServices one level higher. This way the logic is clearer and the cyclomatic complexity stays the same.

https://github.com/gmargaritis/cli/blob/e22a17dc65a56dd1f6b5ba01e0f440049c04482b/cli/command/stack/swarm/deploy_composefile.go#L65-L74

Comment on lines +45 to +46
flags.BoolVarP(&opts.Detach, "detach", "d", true, "Exit immediately instead of waiting for the stack services to converge")
flags.BoolVarP(&opts.Quiet, "quiet", "q", false, "Suppress progress output")
@thaJeztah (Member)

It looks like this PR (currently, at least) keeps the old behavior (docker stack deploy being non-interactive / detached by default).

If the plan is to eventually make "interactive" (--detach=false) the default, I also went looking at how we handled the transition for docker service create; moby/moby@21ec12b (moby/moby#31144)

For that we chose to print a warning; wondering if (and if so: when? / which release?) we want to do the same for docker stack; inform users that we plan to switch to non-detach as the default, with the option to return to the old (--detach[=true]) behavior.
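
A sketch of what that warning could look like, modeled on the service create transition (the exact wording and placement in this PR may differ):

// Warn only when the user did not pass --detach explicitly; pflag's
// Changed reports whether the flag was set on the command line.
if !flags.Changed("detach") {
	_, _ = fmt.Fprintln(dockerCli.Err(),
		"Since --detach=false was not specified, tasks will be created in the background.\n"+
			"In a future release, --detach=false will become the default.")
}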

@gmargaritis (Contributor, Author)

I've added a message just like in moby/moby@21ec12b (moby/moby#31144), as you said, indicating that we plan to make non-detach the default.

@gmargaritis force-pushed the 373-support-detach-in-docker-stack-deploy branch from 0dcc6f0 to e22a17d on November 30, 2023 08:49
@gmargaritis force-pushed the 373-support-detach-in-docker-stack-deploy branch from e22a17d to 73a60fc on January 24, 2024 13:12
@gmargaritis (Contributor, Author)

Hey @thaJeztah @neersighted 👋

I've implemented all the suggested fixes and rebased the PR!

Is there anything else that I can do to move this one forward?

@codecov-commenter commented Feb 8, 2024

Codecov Report

Merging #4258 (bc35d72) into master (9e2615b) will increase coverage by 1.65%.
Report is 30 commits behind head on master.
The diff coverage is 17.39%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4258      +/-   ##
==========================================
+ Coverage   59.58%   61.24%   +1.65%     
==========================================
  Files         287      287              
  Lines       24722    20063    -4659     
==========================================
- Hits        14731    12287    -2444     
+ Misses       9106     6884    -2222     
- Partials      885      892       +7     

@gmargaritis force-pushed the 373-support-detach-in-docker-stack-deploy branch from 7f23cad to bc35d72 on February 8, 2024 15:10
@gmargaritis (Contributor, Author) commented Feb 9, 2024

@thaJeztah It seems we have to re-run the CI, since it failed because of an unrelated error.

Let me know if there's anything left to be done here ✌️

@thaJeztah (Member)

I gave it a kick.

For posterity; this was the error. Probably flaky, or some connection issue on GitHub Actions (not sure if I've seen that specific one before, so cc @crazy-max - just in case it pops up more frequently)

Dockerfile:1
--------------------
   1 | >>> # syntax=docker/dockerfile:1
   2 |     
   3 |     ARG BASE_VARIANT=alpine
--------------------
ERROR: failed to solve: frontend grpc server closed unexpectedly
Error: buildx bake failed with: ERROR: failed to solve: frontend grpc server closed unexpectedly

@gmargaritis (Contributor, Author)

@thaJeztah @crazy-max

What's the status of this PR? Are we ok here?

@thaJeztah (Member)

Sorry for the delay (again), jumping between "etoomany" things. I gave this a try, and I see a couple of things happening when trying this PR;

  • waitOnServices is waiting for services sequentially. This means that it only reports progress of the service it's currently waiting for, but won't report progress for other services that are being deployed.
  • ^^ likely due to the internal use of maps, the order in which services get deployed is randomised. This wasn't very apparent when "detached", but now becomes more visible.
  • ^^ it also becomes more apparent that the existing progress bars / progress output was clearly designed to be used for a single service. With multiple services being deployed, the progress-bars become a bit confusing;
    • There's separate output for "creating", "progress", "converging" output
    • Progress-bar and "converging" steps don't include the name of the service (so it's hard to see what service is shown) but may include service ID

Perhaps it's not a strict blocker for this PR, but it does all feel a bit "unfinished".

Perhaps we can't fix all the progress output (some of that may also be related to the daemon response - haven't checked that), but we could consider

  • Flipping the implementation, making WaitOnService accept multiple services. When used for docker service create, we would pass only a single one, but when used for docker stack deploy, it would pass multiple services.
  • When called with multiple services, can we make those print progress in parallel? (I guess there may be some complication there, because of the multiple outputs per service, so perhaps that would still be a for a follow-up)
  • This doesn't look like anything "new", but wondering if we can sort the order in which services are deployed (if we can't preserve the order from the compose file, we could sort alphabetically), at least so that the deploy order is consistent; see the sketch below.
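
For that last point, a minimal runnable sketch of deterministic ordering (assuming alphabetical order is acceptable):

package main

import (
	"fmt"
	"sort"
)

func main() {
	services := map[string]string{"c_service": "", "a_service": "", "b_service": ""}

	// Go randomizes map iteration order, so collect and sort the keys
	// first to get a stable deploy order.
	names := make([]string, 0, len(services))
	for name := range services {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Println("deploying", name) // a_service, b_service, c_service
	}
}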

Here's what I tried with this PR;

A compose file with 3 services, one of which has a health-check that takes some time to become healthy;

services:
  a_service:
    image: nginx:alpine
    healthcheck:
      test: ["CMD", "sh", "-c", "if [ ! -f \"/count\" ] ; then ctr=0; else ctr=`cat /count`; fi; ctr=`expr $${ctr} + 1`; echo \"$${ctr}\" > /count; if [ \"$$ctr\" -gt 4 ] ; then exit 0; else exit 1; fi"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 60s
    deploy:
      replicas: 3
  b_service:
    image: nginx:alpine
  c_service:
    image: nginx:alpine

When deploying the stack, the order was randomized; the first service ("c_service") deployed immediately, but was then checked for 5 seconds to verify that it was "stable". After this, the second service was waited for ("a_service"). This service has the health-check, so it now had to wait for a minute or so to become healthy;

time docker stack deploy -c ./docker-compose.yaml --detach=false mystack
Creating network mystack_default
Creating service mystack_c_service
Creating service mystack_a_service
Creating service mystack_b_service
overall progress: 3 out of 3 tasks
1/3: running   [==================================================>]
2/3: running   [==================================================>]
3/3: running   [==================================================>]
verify: Service oj3m4chiob0af8t9qrt320gmj converged
overall progress: 0 out of 3 tasks
1/3: starting  [============================================>      ]
2/3: starting  [============================================>      ]
3/3: starting  [============================================>      ]

After that, it continued with the next service ("b_service"), which in the meantime had already been deployed successfully, but due to the client-side logic to verify that it's stable, it still waited for 5 seconds;

...
verify: Service z9xhes4ie4maux02hwskinv2i converged
overall progress: 3 out of 3 tasks
1/3: running   [==================================================>]
2/3: running   [==================================================>]
3/3: running   [==================================================>]
verify: Service nbzvce7qc0vh7shn9rpdkaxdd converged

________________________________________________________
Executed in   64.18 secs      fish           external
   usr time  440.30 millis    0.18 millis  440.12 millis
   sys time  182.75 millis    1.22 millis  181.53 millis

@thaJeztah (Member)

/cc @laurazard @krissetto @Benehiko

@laurazard (Collaborator) left a comment

Overall LGTM, thank you!

Comment on lines +295 to +297
if len(errs) > 0 {
	return errors.Join(errs...)
}
@laurazard (Collaborator)

I think you can just return errors.Join(errs...) here; it'll be nil if len(errs) == 0
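
A quick check of that behavior:

package main

import (
	"errors"
	"fmt"
)

func main() {
	// errors.Join returns nil when given no (or only nil) errors, so the
	// surrounding len(errs) > 0 check is redundant.
	fmt.Println(errors.Join() == nil)         // true
	fmt.Println(errors.Join(nil, nil) == nil) // true
}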

Added --detach and --quiet/-q flags to stack deploy. Setting --detach=false
waits until all of the stack services have converged. Shows progress bars for
each individual task, unless --quiet/-q is specified.

Co-authored-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: George Margaritis <[email protected]>
@thaJeztah force-pushed the 373-support-detach-in-docker-stack-deploy branch from bc35d72 to b086d72 on March 4, 2024 08:38
@thaJeztah (Member)

Did a quick rebase and re-generated the markdown as we updated the cli-docs-tool with a fix for the boolean flags;

diff --git a/docs/reference/commandline/stack_deploy.md b/docs/reference/commandline/stack_deploy.md
index c8b116a09..82934f6f9 100644
--- a/docs/reference/commandline/stack_deploy.md
+++ b/docs/reference/commandline/stack_deploy.md
@@ -12,7 +12,7 @@ Deploy a new stack or update an existing stack
 | Name                                                     | Type          | Default  | Description                                                                                       |
 |:---------------------------------------------------------|:--------------|:---------|:--------------------------------------------------------------------------------------------------|
 | [`-c`](#compose-file), [`--compose-file`](#compose-file) | `stringSlice` |          | Path to a Compose file, or `-` to read from stdin                                                 |
-| `-d`, `--detach`                                         |               |          | Exit immediately instead of waiting for the stack services to converge                            |
+| `-d`, `--detach`                                         | `bool`        | `true`   | Exit immediately instead of waiting for the stack services to converge                            |
 | `--prune`                                                |               |          | Prune services that are no longer referenced                                                      |
 | `-q`, `--quiet`                                          |               |          | Suppress progress output                                                                          |
 | `--resolve-image`                                        | `string`      | `always` | Query the registry to resolve image digest and supported platforms (`always`, `changed`, `never`) |

I also opened a tracking ticket to look at follow-up improvements discussed above; #4907

@thaJeztah (Member) left a comment

LGTM, let's bring this one in as a first iteration.
