Skip to content

fix: Exit with non-zero status if ping or heartbeat fail unrecoverably#3687

Merged
DrJosh9000 merged 1 commit intomainfrom
ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http
Jan 28, 2026
Merged

fix: Exit with non-zero status if ping or heartbeat fail unrecoverably#3687
DrJosh9000 merged 1 commit intomainfrom
ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http

Conversation

@DrJosh9000
Copy link
Contributor

@DrJosh9000 DrJosh9000 commented Jan 28, 2026

Description

Context: A recent incident at Buildkite caused an issue to occur... wait, where have we heard this before?

It turns out the agent was exiting 0 in cases where the ping or heartbeat are unrecoverable, so this PR changes them to exit 1.

Context

https://linear.app/buildkite/issue/PS-1593/investigate-using-non-zero-exit-code-when-ping-receives-http-401

Changes

  • Return the unrecoverable error from runPingLoop.
  • Return an error from runHeartbeatLoop. Collect both errors and return either nil or one of the non-nil loop errors from agentWorker.Start. (That error bubbles up to main, which should exit 1.)
  • Remove heartbeatCtx, and instead have the heartbeat loop respond to <-a.stop, so that "ordinary" stopping is not an error.
  • Move calls to StopUngracefully out of PingandHeartbeat, instead making them the responsibility of runPingLoopandrunHeartbeatLoop`. (Could go either way on this.)
  • Replace errors.Is(err, &errUnrecoverable{}) with a slightly nicer looking isUnrecoverable(err), and remove the Is method (note that the method removal could also be achieved with errors.As(err, new(*errUnrecoverable))).
  • Add a new test checking that an unrecoverable error bubbles up from agentWorker.Start, and necessary changes to the fake API server to avoid a deadlock

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go tool gofumpt -extra -w .)

Disclosures / Credits

I did not use AI tools at all

@DrJosh9000 DrJosh9000 requested a review from a team January 28, 2026 04:49
Copy link
Contributor

@moskyb moskyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@zhming0 zhming0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the consensus mechanism between ping loop and heart beat loop isn't quite right I think.

@DrJosh9000 DrJosh9000 force-pushed the ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http branch 2 times, most recently from 9a637e2 to 7a5cd91 Compare January 28, 2026 05:25
@zhming0 zhming0 dismissed their stale review January 28, 2026 05:36

Not a problem.

@DrJosh9000 DrJosh9000 force-pushed the ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http branch 3 times, most recently from 3bd617f to 0fe0c22 Compare January 28, 2026 05:42
@DrJosh9000 DrJosh9000 enabled auto-merge January 28, 2026 05:43
@DrJosh9000 DrJosh9000 disabled auto-merge January 28, 2026 05:43
@DrJosh9000 DrJosh9000 force-pushed the ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http branch from 0fe0c22 to c3e8c31 Compare January 28, 2026 06:05
@DrJosh9000 DrJosh9000 force-pushed the ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http branch from c3e8c31 to 1f2dce2 Compare January 28, 2026 06:06
@DrJosh9000 DrJosh9000 enabled auto-merge January 28, 2026 06:09
@DrJosh9000 DrJosh9000 merged commit 5d12bc1 into main Jan 28, 2026
2 checks passed
@DrJosh9000 DrJosh9000 deleted the ps-1593-investigate-using-non-zero-exit-code-when-ping-receives-http branch January 28, 2026 06:10
@DrJosh9000 DrJosh9000 mentioned this pull request Jan 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants