-
Notifications
You must be signed in to change notification settings - Fork 2.1k
test: fix flaky TestConnectAndWait/connect_goroutine_exits_after_EOF #5927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #5927 +/- ##
==========================================
- Coverage 59.33% 59.32% -0.02%
==========================================
Files 358 358
Lines 29783 29783
==========================================
- Hits 17672 17669 -3
- Misses 11142 11145 +3
Partials 969 969 🚀 New features to boost your workflow:
|
Signed-off-by: Alano Terblanche <[email protected]>
df722e9 to
ce9a64c
Compare
| return poll.Continue("waiting for connect goroutine to exit") | ||
| } | ||
| return poll.Success() | ||
| }, poll.WithDelay(1*time.Millisecond), poll.WithTimeout(10*time.Millisecond)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC we specifically wanted to assert that goroutine is stopped.
Wondering if just increasing the timeout would help with the flakiness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i honestly don't know why we would want to do that. The goroutine will stop once the callback function returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also the failure is not always on the poll, it's on the first assertion on line 190 https://github.com/docker/cli/pull/5927/files/ce9a64c8f4524e6f339f48b1892374c0a5e1580e#diff-2b3ea721470acb354ca37a32dc40b7c0ae55bd43397a084adbd922fca5964e20L190
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in this case I think we should also have a poll.WaitOn instead of a straight assert (because the goroutine may not be spawned immediately).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goroutine will stop once the callback function returns.
Unless there's a bug in ConnectAndWait 🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the code is changed and the implementation is bad, then there is no guarantee that a test would catch the regression anyway. Focusing on a hypothetical case won't solve the fact that this test fails multiple times per day.
Either we re-design the ConnectAndWait to be more resilient or we change the test as I've done here. We cannot predict when a goroutine will run or when it will exit. The changes I make here are more robust from the test perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it no longer tests what the test claims to: connect goroutine exits after EOF.
Maybe it's not possible to reliably test that, okay, but let's try to fix the actual test before actually changing it into completely different test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened: #5932
If it's still flaky after that, I'm 100% agreeing on changing this test according to your proposal :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it no longer tests what the test claims to: connect goroutine exits after EOF.
That's not true, it still tests that the goroutine exits on EOF.
The callback is only called once we receive an EOF here:
cli/cli-plugins/socket/socket.go
Lines 163 to 166 in ce9a64c
| if errors.Is(err, io.EOF) { | |
| cb() | |
| return | |
| } |
After the callback is called the goroutine will exit due to the return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it tests that the callback gets called after an EOF.
After the callback is called the goroutine will exit due to the return.
Yes, that's true with the current code, but tests are also there to detect an unwanted change in future. If this function gets changed in future and introduces a bug where the callback gets called, but there's no return out of the goroutine then this test would be able to detect it.
|
In favor of #5932 |
Closes #5837
- What I did
The test currently measures the reduction in the number of goroutines, which is strange as there is a callback function called by
ConnectAndWaitonce theio.EOFis reached.- How I did it
- How to verify it
go test -race -run=TestConnectAndWait/connect_goroutine_exits_after_EOF -count=1000 -v -failfast ./cli-plugins/socket/...- Human readable description for the release notes
- A picture of a cute animal (not mandatory but encouraged)