Feat: sales error retry backoff #100

Open
benbierens wants to merge 7 commits into main from feat/sales-error-retry-backoff
Conversation

@benbierens
Contributor

When RPC connection troubles occur, the sales state machine responds by moving to the Errored state. From there, slots are cleaned up and the flow ends.

This is undesirable for active slots, which the host should be storing and proving.

This PR introduces a new flow that allows active slots to recover from the Errored state once connectivity is restored. It also introduces an exponential-backoff module so that retries do not flood the RPC connection.
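For readers unfamiliar with the technique, here is a minimal sketch of exponential backoff in Nim. It is not the module from this PR: the names `Backoff`, `initBackoff`, `nextDelayMs` and `resetBackoff`, as well as the default values, are assumptions made for illustration. Each failed attempt multiplies the wait by a fixed factor up to a cap, and the counter is reset once the connection works again.

```nim
# Illustrative sketch only; all names and defaults here are hypothetical
# and do not reflect the module added in this PR.
type
  Backoff = object
    initialDelayMs: int # delay before the first retry
    maxDelayMs: int     # upper bound on any single delay
    factor: int         # growth factor applied per failed attempt
    attempt: int        # failed attempts since the last reset

proc initBackoff(initialDelayMs = 1_000, maxDelayMs = 60_000, factor = 2): Backoff =
  Backoff(
    initialDelayMs: initialDelayMs,
    maxDelayMs: maxDelayMs,
    factor: factor,
    attempt: 0
  )

proc nextDelayMs(b: var Backoff): int =
  ## Returns the delay for the current attempt, then advances the counter.
  var delay = b.initialDelayMs
  for _ in 0 ..< b.attempt:
    delay = delay * b.factor
    if delay >= b.maxDelayMs:
      # Stop multiplying once the cap is reached.
      delay = b.maxDelayMs
      break
  result = min(delay, b.maxDelayMs)
  inc b.attempt

proc resetBackoff(b: var Backoff) =
  ## Call after a successful attempt so the next failure starts small again.
  b.attempt = 0

when isMainModule:
  var backoff = initBackoff(initialDelayMs = 500, maxDelayMs = 4_000)
  for _ in 0 ..< 5:
    echo backoff.nextDelayMs() # 500, 1000, 2000, 4000, 4000
  backoff.resetBackoff()
  echo backoff.nextDelayMs()   # 500
```

A production version would probably add random jitter to each delay so that several slots recovering at the same time do not retry in lockstep; that is left out here to keep the sketch short.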

benbierens requested a review from markspanbroek on March 2, 2026 14:59
    trace "Errored slot is not in MySlots."
  except CancelledError as e:
-   trace "SaleErrored.run was cancelled", error = e.msgDetail
+   trace "SaleErrored.isMySlot was cancelled", error = e.msgDetail
Contributor

Could also be the applyDelay that was cancelled

Contributor Author

This is true. I'll put it back to "run". Less informative, but better than providing details that may be incorrect.

@benbierens
Contributor Author

Not ready for merge:

I've made a system test for this recovery flow and it is not working. asyncstatemachine.run is being called instead of the run in errored.nim.
