Description
Alongside #6913 and #6912, I ran the test/xds suite on master, since I added tests to it for my xDS Server fix #6889. I encountered numerous flakes on g3, particularly those outlined in the custom LB tests for distribution #6601. However, I have seen almost every client-side and server-side xDS test flake with a context timeout for an RPC that was expected to proceed. Each has different logs/events preceding its timeout, but every test seems susceptible. The flakes are generally rare, but because the suite contains so many tests, you can reliably trigger one by running the full test suite enough times. My initial hunch is that some synchronization is missing, or that something gets stuck in the management server/testing xDS Client flow. This also manifests as rare flakes in the tests for my xDS Server fix, where I expect something like an error that represents Accept and Close, and I get a context timeout instead.