Add retry mechanism for NOREPLICAS error #3647

Merged
ndyakov merged 1 commit into master from add-retry-mechanism-for-noreplicas-error on Dec 10, 2025

@ofekshenawa (Collaborator)

Whitelist NOREPLICAS as a retryable error, following the same pattern as other transient errors like LOADING, READONLY, CLUSTERDOWN, TRYAGAIN, and MASTERDOWN.
fix #3636
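
For readers unfamiliar with the pattern, here is a minimal sketch of prefix-based retry classification, assuming server error replies are matched by their leading keyword. The names `retryablePrefixes` and `isRetryableReply` are hypothetical and used only for illustration; the actual check in the library may be structured differently and may cover more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// retryablePrefixes lists the error keywords named in this PR description.
// Hypothetical name for illustration; the library's real list may differ.
var retryablePrefixes = []string{
	"LOADING",
	"READONLY",
	"CLUSTERDOWN",
	"TRYAGAIN",
	"MASTERDOWN",
	"NOREPLICAS", // the keyword this PR adds to the retryable set
}

// isRetryableReply reports whether a server error reply begins with one of
// the transient-error keywords, either as the whole message or followed by
// a space. Matching on "keyword + space" avoids false positives on longer
// words that merely share a prefix.
func isRetryableReply(msg string) bool {
	for _, p := range retryablePrefixes {
		if msg == p || strings.HasPrefix(msg, p+" ") {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRetryableReply("NOREPLICAS Not enough good replicas to write."))
	fmt.Println(isRetryableReply("ERR unknown command"))
}
```

With a check like this in place, a caller's retry loop can treat a NOREPLICAS reply seen during failover the same way it treats the other transient errors, instead of returning it to the application immediately.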

@ndyakov (Member) left a comment

Looks great, thanks @ofekshenawa!

@ndyakov added the bug label on Dec 10, 2025
@ndyakov merged commit 4edf494 into master on Dec 10, 2025
33 checks passed
@ndyakov deleted the add-retry-mechanism-for-noreplicas-error branch on December 10, 2025 16:14
ndyakov added a commit that referenced this pull request Feb 13, 2026
* [maintnotif] Cluster specific handlers (#3613)

* maint notification handlers for cluster messages

* unrelax all conns

* trigger ci on feature branches

* feat(maintnotif): lazy cluster topology reload (#3614)

* lazy cluster topology reload

* fix discrepancies between options structs

* Update osscluster_lazy_reload_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update osscluster.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(e2e): mock maintnotif e2e tests (#3639)

* lazy cluster topology reload

* fix discrepancies between options structs

* Update osscluster_lazy_reload_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update osscluster.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* wip fault with mock proxy

* make lint happy

* fix linter issues

* faster tests with mocks

* linter once again

* add complex node test

* add ci e2e

* use correct redis container

* e2e fix

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(retry): Add retry mechanism for NOREPLICAS error (#3647)

* fix(queuedNewConn): protect against nil context (#3649)

* fix(maintnotif): fix smigrated parser and add cluster state reload interval option (#3663)

* fix SMIGRATED parsing

* fix smigrated parser

* add ClusterStateReloadInterval to ClusterOptions

* fix tests

* set default cluster reload interval to 10s

* chore(lint): format

* feat(smigrated): new format & remember original host:port (#3697)

* lazy cluster topology reload

* fix discrepancies between options structs

* Update osscluster_lazy_reload_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update osscluster.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* wip fault with mock proxy

* make lint happy

* fix linter issues

* faster tests with mocks

* linter once again

* add complex node test

* add ci e2e

* use correct redis container

* e2e fix

* additional e2e tests

* fix data race

* fix random shard picker

* fix e2e tests

* fix for empty endpoint

* fix case when semaphore is full, but still need to check idle

* scenario tests

* create database from config

* wip

* feat(client): store original addrs for later use

* fix(notif): change smigrated notification

* fix(lint): fix linter

* fix(smigrated): use array

* fix(e2e): wip

* Update options.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update redis.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(notif): if the conn has no original addr, trigger reload with first target

* chore(e2e): wip cluster e2e

* chore(e2e): fix nil pointer from e2e tests

* chore(e2e): fix tests and reports

* chore(e2e): proper logging in e2e

* chore(e2e): add pubsub in the tests as well

* chore(e2e): mockproxy fixes

* chore(e2e): mockproxy fixes

* chore(e2e): mockproxy fixes

* chore(e2e): mockproxy fixes v3

* stop background routines

* fix(e2e): tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* chore(docs): add example app (#3651)

* chore(lint): fmt

* chore(docs): improve docs

* chore(docs): update features.md

* chore(github): remove example

* chore(maintnotif): rename option and address pr comments

* fix(e2e): command runner should use client timeout

* chore(e2e): refactor tests

* fix(e2e): set default timeout to 90m

* fix(e2e): skip tests if proxy cannot be started

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: ofekshenawa <104765379+ofekshenawa@users.noreply.github.com>
Co-authored-by: Elena Kolevska <elena-kolevska@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

FailoverClusterClient does not retry during failover related NOREPLICAS errors

2 participants