Skip to content

Conversation

@cpuguy83
Copy link
Collaborator

@cpuguy83 cpuguy83 commented Feb 29, 2024

This changes things to rely on a plugin server that manages all
connections made to the server.

An optional handler can be passed into the server when the caller wants
to do extra things with the connection.

It is the caller's responsibility to close the server.
When the server is closed, first all existing connections are closed
(and new connections are prevented).

Now the signal loop only needs to close the server and not deal with
net.Conn's directly (or double-indirects as the case was before this
change).

The socket, when present in the filesystem, is no longer unlinked
eagerly, as reconnections require it to be present for the lifecycle of
the plugin server.

@codecov-commenter
Copy link

codecov-commenter commented Feb 29, 2024

Codecov Report

Merging #4905 (d68cc0e) into master (4468148) will increase coverage by 0.06%.
The diff coverage is 93.87%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4905      +/-   ##
==========================================
+ Coverage   61.71%   61.78%   +0.06%     
==========================================
  Files         289      289              
  Lines       20241    20267      +26     
==========================================
+ Hits        12492    12521      +29     
+ Misses       6863     6859       -4     
- Partials      886      887       +1     

Copy link
Member

@neersighted neersighted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An excellent simplification. I've been on a bit of a "use more contexts" kick recently, but I don't know how I totally overlooked them as the ideal/correct synchronization primitive here.

I'd like to see the loop used to handle plugin connections simplified a bit.

In addition, two other thoughts:

  • FreeBSD doesn't appear to have anonymous sockets either, so really we appear to have "the Linuxes" (Windows + Linux) and "the Berkley Unixes." Based on that, I think we should have a socket_linux_windows and a socket_nolinux_nowindows instead of the current arrangement.
  • To make things flow a little smoother, why not make listen() return the UnixAddr pointer so we don't have to pull it back out of the listener and typecast it?

@neersighted neersighted changed the title Plugin conn refactor refactor: context-based plugin notification socket Feb 29, 2024
@neersighted neersighted added this to the 26.0.0 milestone Feb 29, 2024
@Benehiko
Copy link
Member

Benehiko commented Mar 1, 2024

yeah honestly I think this is a better solution.

I guess based on your comment (#4878 (comment)) we would use a pattern similar to http.Handler to pass in a function that uses the connection, instead of returning it to the caller? Thinking in terms of if we use the connection for something other than just closing it :)

func SocketListener(ctx context.Context, h func(context.Context, net.UnixConn) error ) {
 for {
  // accept conn
  conn, err := l.Accept()
  // send it off to the handler
  go h(ctx, conn)
 }
}

In the above example, we spawn many conn instances that are used inside the handler h. Thinking more in line with how we handle http connections when using http.ListenAndServe.

@cpuguy83
Copy link
Collaborator Author

cpuguy83 commented Mar 1, 2024

Can we use this for anything other than just closing it without breaking existing plugins?

@cpuguy83
Copy link
Collaborator Author

cpuguy83 commented Mar 1, 2024

To clarify:

If the plugin is never expecting to receive data on this socket and is only waiting for a read to return, I'm not sure it is possible to re-use the same socket without breaking those assumptions.

@neersighted
Copy link
Member

The existing code does a busy-read, discarding any data read until an EOF is reached, upon which it cancels the CLI plugin context (which should terminate the process).

@thaJeztah
Copy link
Member

Looks like the linter is complaining;

 > [lint 4/4] RUN --mount=type=bind,target=.     --mount=type=cache,target=/root/.cache         golangci-lint run:
62.42 cli-plugins/socket/socket.go:40:4: shadow: declaration of "conn" shadows declaration at line 31 (govet)
62.42 			conn, err := l.Accept()
62.42 			^
62.42 cli-plugins/socket/socket_test.go:20:3: shadow: declaration of "ctx" shadows declaration at line 18 (govet)
62.42 		ctx, cancel := context.WithCancel(ctx)
62.42 		^

@neersighted
Copy link
Member

neersighted commented Mar 4, 2024

Looks like the linter is complaining

With my suggested changes the shadowing the linter complains about should go away.

Copy link
Collaborator

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice change overall, thanks!

@cpuguy83 cpuguy83 force-pushed the plugin_notify_conn_cleanup branch 2 times, most recently from 72a6233 to 0b44902 Compare March 4, 2024 20:24
@cpuguy83
Copy link
Collaborator Author

cpuguy83 commented Mar 4, 2024

I've updated this:

  • No longer uses context
  • Returns a closer which is what gets called in the signal loop
  • Closer is a PluginServer which manages any/all connections internally
  • PluginServer takes an optional handler which is called for every new connection

Copy link
Member

@Benehiko Benehiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff! :)

Copy link
Member

@neersighted neersighted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you're thinking a fair bit more elaborately/with an eye to implementing the connection-handling code sooner than later. Thanks!

A couple minor nits, otherwise LGTM.

Copy link
Contributor

@krissetto krissetto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM after the other small suggestions are in, thanks!

@neersighted neersighted changed the title refactor: context-based plugin notification socket refactor: closer-based plugin notification socket Mar 7, 2024
@vvoland vvoland modified the milestones: 26.0.0, 27.0.0 Mar 14, 2024
@neersighted neersighted force-pushed the plugin_notify_conn_cleanup branch from 0b44902 to 4a09dc0 Compare March 21, 2024 13:10
Copy link
Member

@neersighted neersighted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've carried the requested review changes; LGTM now with @cpuguy83's own review.

@neersighted neersighted force-pushed the plugin_notify_conn_cleanup branch 2 times, most recently from f7889f7 to ef7991b Compare March 21, 2024 13:15
Copy link
Member

@Benehiko Benehiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@neersighted neersighted force-pushed the plugin_notify_conn_cleanup branch from ef7991b to 1f9ffd8 Compare March 21, 2024 13:20
@thaJeztah
Copy link
Member

Failure looks to be in this area?

--- FAIL: TestPluginServer (0.10s)
    --- FAIL: TestPluginServer/allows_reconnects (0.10s)
        socket_test.go:89: assertion failed: error is not nil: dial unix /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/docker_cli_eb8ce106e1522d6ad46ca1fbd263a81f: connect: no such file or directory: failed to redial server

@neersighted neersighted force-pushed the plugin_notify_conn_cleanup branch from 1f9ffd8 to a984f97 Compare March 21, 2024 13:49
@neersighted
Copy link
Member

neersighted commented Mar 21, 2024

Failure looks to be in this area?

--- FAIL: TestPluginServer (0.10s)
    --- FAIL: TestPluginServer/allows_reconnects (0.10s)
        socket_test.go:89: assertion failed: error is not nil: dial unix /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/docker_cli_eb8ce106e1522d6ad46ca1fbd263a81f: connect: no such file or directory: failed to redial server

Yes; @cpuguy83 added the ability to reconnect to the server, instead of it being a one-shot. This means we can no longer do the eager unlink (so, we're back to the behavior @krissetto expected, where we unlink at the end of the lifecycle); it is now more possible to 'leak' sockets if the process is ungracefully terminated, but that seems to be the cost of the paradigm shift to the PluginServer.

I've refactored the code around this, and pushed a new version that makes this behavior explicit and passes all tests.

Copy link
Member

@Benehiko Benehiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cpuguy83
Copy link
Collaborator Author

Sorry I forgot I had some changes pending here.

@neersighted
Copy link
Member

neersighted commented Mar 21, 2024

Sorry I forgot I had some changes pending here.

No problem! Does my set of changes look good to you (I kept reconnection as a feature, which meant dropping the early unlink... Before the for {} loop was really a poor man's synchronization point/barrier)?

@cpuguy83
Copy link
Collaborator Author

LGTM, it won't let me approve it officially since its my own PR.

@neersighted neersighted changed the title refactor: closer-based plugin notification socket plugin: closer-based plugin notification socket Mar 21, 2024
This changes things to rely on a plugin server that manages all
connections made to the server.

An optional handler can be passed into the server when the caller wants
to do extra things with the connection.

It is the caller's responsibility to close the server.
When the server is closed, first all existing connections are closed
(and new connections are prevented).

Now the signal loop only needs to close the server and not deal with
`net.Conn`'s directly (or double-indirects as the case was before this
change).

The socket, when present in the filesystem, is no longer unlinked
eagerly, as reconnections require it to be present for the lifecycle of
the plugin server.

Co-authored-by: Bjorn Neergaard <[email protected]>
Signed-off-by: Brian Goff <[email protected]>
Signed-off-by: Bjorn Neergaard <[email protected]>
@neersighted neersighted force-pushed the plugin_notify_conn_cleanup branch from a984f97 to d68cc0e Compare March 21, 2024 21:08
@neersighted neersighted added the kind/refactor PR's that refactor, or clean-up code label Mar 21, 2024
Copy link
Collaborator

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/plugins kind/bugfix PR's that fix bugs kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants