Commit 216895f

maxtaco and claude authored
test: dial agent socket instead of stat'ing it in waitForSocket (#261)
Tests that stop and immediately restart the agent (e.g. TestPassphraseSetViaKex doing `y.stop(t); y.runAgent(t); verifyPassphraseLockedThenUnlock(...)`) can race on CI: `stop()` returns when the shutdown RPC is acked, which may be before the old agent's socket file is unlinked. The new agent's goroutine then races against the next CLI invocation. `os.Stat(sock) == nil` succeeds against the stale file before the new listener exists, so waitForSocket returns "ready" too early and the next dial fails with "failed to connect to agent at path ...".

Replace the stat with `net.DialTimeout("unix", sock, 50*ms)` so we only return once a connection actually completes. A stale unlinked file no longer fools us; if no one's listening the dial fails fast with ECONNREFUSED and we keep polling.

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
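The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not code from this repository: `staleSocketCheck` and the `agent.sock` file name are hypothetical, and the stale file is imitated by closing a listener with `SetUnlinkOnClose(false)` rather than by a real agent shutdown. It compares what `os.Stat` and `net.DialTimeout` each see for a socket file that has no listener behind it.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"syscall"
	"time"
)

// staleSocketCheck (hypothetical helper) leaves a Unix socket file on disk
// with no listener behind it, then reports whether os.Stat still succeeds
// and whether a dial is refused.
func staleSocketCheck() (statOK bool, dialRefused bool, err error) {
	dir, err := os.MkdirTemp("", "s")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "agent.sock") // illustrative name only

	ln, err := net.Listen("unix", sock)
	if err != nil {
		return false, false, err
	}
	// Assumption for the demo: keep the file after Close to imitate a
	// socket file that the old process has not unlinked yet.
	ln.(*net.UnixListener).SetUnlinkOnClose(false)
	if err := ln.Close(); err != nil {
		return false, false, err
	}

	// The old readiness check: does the path exist?
	_, statErr := os.Stat(sock)

	// The new readiness check: does a connection actually complete?
	conn, dialErr := net.DialTimeout("unix", sock, 50*time.Millisecond)
	if dialErr == nil {
		_ = conn.Close()
	}
	return statErr == nil, errors.Is(dialErr, syscall.ECONNREFUSED), nil
}

func main() {
	statOK, refused, err := staleSocketCheck()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("os.Stat succeeded:", statOK)
	fmt.Println("dial refused:", refused)
}
```

On a typical Linux system both values should come out true: the path exists, yet nothing accepts the connection. That is the gap the stat-based check could not see.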
1 parent 63e82eb commit 216895f

1 file changed

Lines changed: 8 additions & 3 deletions

integration-tests/cli/base_test.go

@@ -8,6 +8,7 @@ import (
 	"encoding/json"
 	"fmt"
 	"io"
+	"net"
 	"os"
 	"strings"
 	"testing"
@@ -463,19 +464,23 @@ func (a *testAgent) waitForSocket(t *testing.T) {
 	}, "ctl", "socket")
 	sock := tui.TrimmedString()
 
-	// Now poll for it, shoudln't take long
+	// Poll until the agent is actually accepting connections, not just until
+	// the socket file exists. A stale socket file from a previous agent run
+	// can satisfy os.Stat before the new listener is up, leading to a flaky
+	// "failed to connect to agent" on the next CLI invocation.
 	wait := time.Millisecond * 1
 	for i := 0; i < 20; i++ {
-		_, err := os.Stat(sock)
+		conn, err := net.DialTimeout("unix", sock, 50*time.Millisecond)
 		if err == nil {
+			_ = conn.Close()
 			return
 		}
 		time.Sleep(wait)
 		if wait < time.Millisecond*100 {
 			wait *= 2
 		}
 	}
-	t.Fatal("socket not found")
+	t.Fatal("agent socket not reachable")
 }
 
 func (a *testAgent) stop(t *testing.T) {