Deflake replica selection test by relaxing cluster configurations #2672

sarthakaggarwal97 · 2025-10-02T04:43:04Z

We have relaxed the cluster-ping-interval and cluster-node-timeout so that cluster has enough time to stabilize and propagate changes.

Today's failed test run: https://github.com/valkey-io/valkey/actions/runs/18179260254/job/51751751729#step:6:11262

[err]: Node #10 should eventually replicate node #5 in tests/unit/cluster/slave-selection.tcl
#10 didn't became slave of #5

codecov · 2025-10-02T05:09:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.62%. Comparing base (3073979) to head (2525015).
⚠️ Report is 12 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2672      +/-   ##
============================================
+ Coverage     72.18%   72.62%   +0.44%     
============================================
  Files           128      128              
  Lines         70994    71273     +279     
============================================
+ Hits          51246    51762     +516     
+ Misses        19748    19511     -237

see 32 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

zuiderkwast

Do you think it will help or is it a wild guess?

sarthakaggarwal97 · 2025-10-08T04:43:57Z

@zuiderkwast The test fails with only valgrind in the past couple of weeks, so it should be related to general slowness with valgrind. Also, I have few passing valgrind runs in my local repo after this change, so it should work!

enjoy-binbin

5000 200 is quite a huge timeout and look odd to me, we have a lot of the same cluster test (i belive) under the daily, have you measured its testing time in daily ci? Do you think adjusting cluster-ping-interval and cluster-node-timeout would help?

sarthakaggarwal97 · 2025-10-08T13:46:09Z

Valgrind tests take about 3hrs 50mins ~ something that we see in daily tests too. Let me explore cluster-ping-interval and cluster-node-timeout.

sarthakaggarwal97 · 2025-10-08T23:13:06Z

@enjoy-binbin I think your suggestion has worked. I somehow didn't notice that the values for ping internal and node timeout are less by default. Just increasing for this test have gotten me 2-3 successful runs together.

Signed-off-by: Sarthak Aggarwal <[email protected]>

enjoy-binbin

Just increasing for this test have gotten me 2-3 successful runs together.

thanks, please try running it a few more times before we merge it.

sarthakaggarwal97 · 2025-10-09T20:49:04Z

@enjoy-binbin the test is green for 6 last runs in my local repo!

…lkey-io#2672) We have relaxed the `cluster-ping-interval` and `cluster-node-timeout` so that cluster has enough time to stabilize and propagate changes. Fixes this test occasional failure when running with valgrind: [err]: Node #10 should eventually replicate node #5 in tests/unit/cluster/slave-selection.tcl #10 didn't became slave of #5 Signed-off-by: Sarthak Aggarwal <[email protected]>

) We have relaxed the `cluster-ping-interval` and `cluster-node-timeout` so that cluster has enough time to stabilize and propagate changes. Fixes this test occasional failure when running with valgrind: [err]: Node #10 should eventually replicate node #5 in tests/unit/cluster/slave-selection.tcl #10 didn't became slave of #5 Backported to the 9.0 branch in #2731. Signed-off-by: Sarthak Aggarwal <[email protected]>

…lkey-io#2672) We have relaxed the `cluster-ping-interval` and `cluster-node-timeout` so that cluster has enough time to stabilize and propagate changes. Fixes this test occasional failure when running with valgrind: [err]: Node valkey-io#10 should eventually replicate node valkey-io#5 in tests/unit/cluster/slave-selection.tcl valkey-io#10 didn't became slave of valkey-io#5 Signed-off-by: Sarthak Aggarwal <[email protected]>

sarthakaggarwal97 requested a review from enjoy-binbin October 2, 2025 04:43

github-actions bot assigned sarthakaggarwal97 Oct 2, 2025

Nikhil-Manglore approved these changes Oct 2, 2025

View reviewed changes

zuiderkwast approved these changes Oct 6, 2025

View reviewed changes

zuiderkwast changed the title ~~Deflake slave selection test by increasing wait condition time~~ Deflake replica selection test by increasing wait condition time Oct 6, 2025

sarthakaggarwal97 force-pushed the slave-selection-flaky-test branch from 3d1a7a3 to 0d3b28f Compare October 8, 2025 03:33

enjoy-binbin reviewed Oct 8, 2025

View reviewed changes

sarthakaggarwal97 force-pushed the slave-selection-flaky-test branch from 0d3b28f to 65d857e Compare October 8, 2025 23:11

update cluster conf

2525015

Signed-off-by: Sarthak Aggarwal <[email protected]>

sarthakaggarwal97 force-pushed the slave-selection-flaky-test branch from 65d857e to 2525015 Compare October 8, 2025 23:13

sarthakaggarwal97 changed the title ~~Deflake replica selection test by increasing wait condition time~~ Deflake replica selection test by relaxing cluster configurations Oct 8, 2025

enjoy-binbin approved these changes Oct 9, 2025

View reviewed changes

zuiderkwast merged commit f598d11 into valkey-io:unstable Oct 9, 2025
52 checks passed

sarthakaggarwal97 mentioned this pull request Oct 9, 2025

[TEST-FAILURE] Node #10 should eventually replicate node #5 #2698

Closed

sarthakaggarwal97 added this to Valkey 9.0 Oct 10, 2025

sarthakaggarwal97 moved this to To be backported in Valkey 9.0 Oct 10, 2025

sarthakaggarwal97 mentioned this pull request Oct 13, 2025

[Backport 9.0] Deflake replica selection test by relaxing cluster configurations (#2672) #2731

Merged

zuiderkwast moved this from To be backported to Done in Valkey 9.0 Oct 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Deflake replica selection test by relaxing cluster configurations #2672

Deflake replica selection test by relaxing cluster configurations #2672

Uh oh!

sarthakaggarwal97 commented Oct 2, 2025 •

edited by zuiderkwast

Loading

Uh oh!

codecov bot commented Oct 2, 2025 •

edited

Loading

Uh oh!

zuiderkwast left a comment

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025 •

edited

Loading

Uh oh!

enjoy-binbin left a comment

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025

Uh oh!

enjoy-binbin left a comment

Uh oh!

sarthakaggarwal97 commented Oct 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Deflake replica selection test by relaxing cluster configurations #2672

Deflake replica selection test by relaxing cluster configurations #2672

Uh oh!

Conversation

sarthakaggarwal97 commented Oct 2, 2025 • edited by zuiderkwast Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Oct 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

enjoy-binbin left a comment

Choose a reason for hiding this comment

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025

Uh oh!

sarthakaggarwal97 commented Oct 8, 2025

Uh oh!

enjoy-binbin left a comment

Choose a reason for hiding this comment

Uh oh!

sarthakaggarwal97 commented Oct 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

sarthakaggarwal97 commented Oct 2, 2025 •

edited by zuiderkwast

Loading

codecov bot commented Oct 2, 2025 •

edited

Loading

sarthakaggarwal97 commented Oct 8, 2025 •

edited

Loading