Reduce log noise for transient RPC failures in multi-URL setups #292

Open
ameten wants to merge 1 commit into indexsupply:main from ameten:reduce-converge-retry-log-noise


ameten commented Feb 24, 2026

Summary

  • When multiple RPC URLs are configured, transient per-URL failures are now logged at warn level instead of error
  • A consecutive failure counter escalates to error only after all configured URLs have been tried and failed
  • The per-block "loading blocks" error in load() is downgraded to warn, since the error propagates up to runTask, where it is logged at the appropriate level
  • Added NumURLs() int to the Source interface and jrpc2.Client

Motivation

Shovel's round-robin URL rotation naturally recovers from single-endpoint failures on the next retry. Logging every retry at error level creates noise in monitoring when only one of several RPCs is temporarily down, making it harder to spot real issues where all endpoints are failing.

Test plan

  • go build ./... passes
  • go test ./shovel/ -short passes
  • Manual verification with multi-URL config where one URL is intentionally unreachable

🤖 Generated with Claude Code

ameten commented Feb 24, 2026

@ryansmith3136, @lechengfan, could you please take a look at this change?

When multiple RPC URLs are configured, Shovel's round-robin naturally
rotates to healthy endpoints on retry. Previously, every single retry
was logged at error level, creating noise when only one of several RPCs
was temporarily down.

This change tracks consecutive failures and only escalates to error
level after all configured URLs have been tried and failed. Individual
URL failures are logged at warn level. The per-block loading error
in the load() function is also downgraded to warn since it propagates
up to runTask where it is logged at the appropriate level.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
ameten force-pushed the reduce-converge-retry-log-noise branch from dc9e10f to 335fd73 on February 24, 2026 16:44