@DrJosh9000 DrJosh9000 commented Nov 25, 2025

Description

Undo the rearrangement of the agent pool error channel that happened in #3576.

Context

Fixes bug introduced in #3576.

Prior to #3576, the deferred Disconnect call ran before the error value was sent on the error channel. #3576 looked correct, but it reordered things so that Disconnect ran after the send. The agent pool can return as soon as it has read the error channel N times, and the whole agent can exit very soon after that. Because Disconnect is deferred, it only runs when the worker returns; if the error channel send happens before then, the Disconnect API request can still be in flight while the whole program is exiting, so it may never complete.
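
The ordering problem described above can be sketched in isolation. This is a minimal illustration, not the agent's actual code: `pool`, `safeWorker`, `racyWorker`, `run`, and the sleep standing in for the Disconnect API request are all hypothetical names and assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// pool is a hypothetical stand-in for the agent pool. It counts how many
// simulated Disconnect calls have completed.
type pool struct {
	disconnected atomic.Int32
}

// disconnect stands in for the Disconnect API request. The sleep is an
// assumption that simulates network latency.
func (p *pool) disconnect() {
	time.Sleep(20 * time.Millisecond)
	p.disconnected.Add(1)
}

// safeWorker returns its error to the caller. The deferred disconnect has
// finished by the time the caller gets a value it could send.
func (p *pool) safeWorker() error {
	defer p.disconnect()
	return errors.New("worker stopped")
}

// racyWorker sends its error itself. Deferred calls run only when the
// function returns, so the send happens before disconnect starts.
func (p *pool) racyWorker(errs chan<- error) {
	defer p.disconnect()
	errs <- errors.New("worker stopped")
}

// run starts n workers, reads the error channel n times, and reports how
// many disconnects had completed at that moment (the point at which the
// pool could return and the program could exit).
func run(n int, racy bool) int32 {
	p := &pool{}
	errs := make(chan error)
	for i := 0; i < n; i++ {
		if racy {
			go p.racyWorker(errs)
		} else {
			go func() { errs <- p.safeWorker() }()
		}
	}
	for i := 0; i < n; i++ {
		<-errs
	}
	return p.disconnected.Load()
}

func main() {
	fmt.Println("safe ordering, disconnects completed:", run(3, false))
	fmt.Println("racy ordering, disconnects completed:", run(3, true))
}
```

With the safe ordering, every disconnect has completed by the time the pool has read all N errors. With the racy ordering, the count at that point is usually lower, which corresponds to Disconnect requests still being in flight when the program exits.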

Changes

See Description.

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go tool gofumpt -extra -w .)

Disclosures / Credits

Me.
I guess I am both the arsonist and the firefighter in this instance?

@DrJosh9000 DrJosh9000 requested a review from a team November 25, 2025 05:39
@DrJosh9000 DrJosh9000 merged commit b4fc16f into main Nov 25, 2025
1 check passed
@DrJosh9000 DrJosh9000 deleted the fix-quit-disconnect-race branch November 25, 2025 05:50
