Skip to content

Conversation

@Pijukatel
Copy link
Collaborator

@Pijukatel Pijukatel commented Jun 3, 2025

Description

Add new_error_count to the ClientSnapshot. Calculate overload based on the new error count instead of the absolute error count.

Issues

@github-actions github-actions bot added this to the 116th sprint - Tooling team milestone Jun 3, 2025
@github-actions github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Jun 3, 2025
@Pijukatel Pijukatel requested a review from Copilot June 3, 2025 11:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the ClientSnapshot overload calculation by introducing a new field, new_error_count, and updating the overload logic accordingly. Key changes include updating the snapshot creation logic in the autoscaling module, modifying unit tests to reflect the new overload rule, and updating type definitions for ClientSnapshot.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
tests/unit/_autoscaling/test_system_status.py Updated tests to initialize ClientSnapshot with a new error count and overload logic.
tests/unit/_autoscaling/test_snapshotter.py Added tests to validate the new overload behavior on client snapshots.
src/crawlee/_autoscaling/snapshotter.py Modified snapshot creation to compute new_error_count based on previous data.
src/crawlee/_autoscaling/_types.py Extended the ClientSnapshot type with the new_error_count attribute and updated overload logic.
Comments suppressed due to low confidence (2)

src/crawlee/_autoscaling/snapshotter.py:316

  • Consider ensuring that new_error_count is non-negative, for example by using max(error_count - previous_error_count, 0), to handle cases where the error count might unexpectedly decrease.
new_error_count=error_count - previous_error_count,

tests/unit/_autoscaling/test_system_status.py:205

  • [nitpick] Consider clarifying the inline comment about the overloaded snapshot ratio to better reflect how snapshots are selected and how the ratio is computed.
# Ratio of overloaded snapshots is 2/3 (2 minutes out of 3)

@Pijukatel
Copy link
Collaborator Author

(This change aligns with JS implementation.)

@Pijukatel Pijukatel marked this pull request as ready for review June 3, 2025 11:46
@Pijukatel Pijukatel requested review from janbuchar and vdusek June 3, 2025 11:46
Copy link
Collaborator

@vdusek vdusek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Pijukatel Pijukatel merged commit a4fc1b6 into master Jun 4, 2025
24 checks passed
@Pijukatel Pijukatel deleted the fix-client-snapshot-overload branch June 4, 2025 06:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate potential autoscaledpool deadlock when client_info=1

3 participants