@elfkuzco
Collaborator

Rationale

In #1307, requests from the worker to the API fail for reasons that are not yet understood, either because the request body is too large or with a bad gateway error. While the former can be configured at the nginx level, the latter is difficult to troubleshoot. Given that the errors are transient, this PR adds a retry mechanism for requests to the backend.

Changes

  • retry requests to the API on specific exceptions
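
For readers who want a concrete picture of "retry on specific exceptions", here is a minimal sketch. It is not the code from this PR: the helper name `call_with_retry`, its parameters, and the exponential backoff schedule are all assumptions made for illustration, and it uses only the standard library.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying only when it raises one of the retry_on exceptions.

    Hypothetical helper: the names and the backoff schedule are illustrative.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: surface the original error to the caller.
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Any exception type not listed in `retry_on` propagates immediately, so configuration mistakes are not hidden behind retries. The `sleep` parameter exists only so that the wait can be replaced in tests.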

@elfkuzco elfkuzco self-assigned this Oct 28, 2025
@elfkuzco elfkuzco requested a review from benoit74 October 28, 2025 01:04
Collaborator

@benoit74 benoit74 left a comment

This seems far too naive a solution, and I'm afraid it might cause more harm than good.

It will delay error messages (in the logs) when something is badly configured.

It will retry without taking into account which API was called. A PATCH reporting that the scraper is still running is issued very often, so if one fails it is maybe simply better to ignore it and wait for the next one ... but at the same time we should not have PATCH requests failing forever without the worker noticing.

It will silence errors that we should be informed of and should probably fix (transient backend unavailability should not be the norm).

It will not solve the issue of the request entity being too large. The request entity will still be too large, and we will keep re-uploading 100M to the API before eventually giving up.
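
To make the distinctions above concrete, here is a rough sketch of what a policy aware of both the status code and the kind of call could look like. Nothing in it is an agreed design: the names `decide` and `PeriodicCallMonitor`, the set of retryable status codes, and the failure threshold are all assumptions for illustration.

```python
from dataclasses import dataclass

# Assumption: only gateway-style errors are treated as transient.
RETRYABLE_STATUSES = frozenset({502, 503, 504})


def decide(status: int, *, periodic: bool) -> str:
    """Return 'ok', 'retry', 'skip' or 'fail' for one API response.

    periodic marks calls that are re-issued soon anyway, such as the
    frequent PATCH reporting that the scraper is still running.
    """
    if 200 <= status < 300:
        return "ok"
    if status in RETRYABLE_STATUSES:
        # A periodic call can wait for its next occurrence instead of retrying.
        return "skip" if periodic else "retry"
    # Everything else, including 413 (request entity too large), would fail
    # again with the same payload, so retrying only repeats the upload.
    return "fail"


@dataclass
class PeriodicCallMonitor:
    """Track consecutive failures of a periodic call so they cannot go unnoticed."""

    max_consecutive_failures: int = 5  # hypothetical threshold
    consecutive_failures: int = 0

    def record(self, status: int) -> bool:
        """Record one response; return True once the failure streak hits the threshold."""
        if 200 <= status < 300:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.max_consecutive_failures
```

With such a split, a 413 is reported at once, a 502 on a one-off call is retried, and a 502 on the periodic PATCH is skipped while the monitor still flags a long streak of failures.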

I do not feel like we are yet at a stage where we know what we want to implement; my last comment in the issue was a question.

It is still a prio1 because it has a significant impact on production, but I prefer that we take sufficient time to align on what needs to be done.

@elfkuzco elfkuzco marked this pull request as draft October 31, 2025 15:00
@elfkuzco elfkuzco closed this Nov 27, 2025

3 participants