
[Test] Add load tests and behavioral checks to incremental upgrade E2E#4541

Open
JiangJiaWei1103 wants to merge 51 commits into ray-project:master from JiangJiaWei1103:add-load-tests-incr-upgrade-e2e

Conversation

**JiangJiaWei1103** (Contributor) commented on Feb 26, 2026:

Why are these changes needed?

The existing RayService Incremental Upgrade E2E test is implemented as a single, monolithic functional test and doesn't include load testing. Hence, it doesn't verify system behavior under real traffic during upgrades. In addition, the current functional test doesn't provide comprehensive behavioral checks.

This PR introduces Locust-based load tests to validate incremental upgrade behavior under continuous traffic. It also adds more thorough behavioral checks to the functional test. Both test cases are executed across multiple upgrade strategies to improve test coverage.

Test Summary

RayCluster Setup

To run the Locust load tests, two RayClusters are required: one acting as the client (Locust) and the other managed by the RayService.

Locust Cluster (Client-Side)

The client cluster configuration follows the example defined here.

| Component | Replicas | CPU (req/limit) | Memory (req/limit) |
|---|---|---|---|
| Head | 1 | 300m / 500m | 1Gi / 2Gi |
| Worker | 0 (head-only) | - | - |

RayService Cluster (Server-Side)

Worker resource limits are intentionally not set to align with the original example proposed by @Future-Outlier here.

UPDATE:

Without resource limits, workers may consume more CPU and memory than their requested resources. This can lead to node-level resource contention and cause the incremental upgrade process to time out. To avoid this issue, resource limits are kept for worker pods, particularly in CI environments where compute resources are constrained (e.g., Buildkite runners with 8 vCPUs).

In addition, setting explicit CPU requests for the head pod can make it unschedulable in single-node test environments due to `Insufficient cpu` errors. To keep the test environment schedulable, the head is configured with `rayStartParams["num-cpus"]: 0` while retaining minimal resource requests.

| Component | Replicas (min/max) | CPU (req/limit) | Memory (req/limit) |
|---|---|---|---|
| Head | 1 | - (`rayStartParams["num-cpus"]: 0`) | - |
| Worker | 1 / 4 | 2 / 2 | 2Gi / 2Gi |
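For illustration, the head group setup described above looks roughly like this in the RayCluster spec. This is a sketch only; the exact values and surrounding fields in the PR's manifest may differ:

```yaml
headGroupSpec:
  rayStartParams:
    # Ray advertises 0 CPUs on the head, so no workloads are scheduled
    # there and no explicit CPU request is needed for scheduling headroom.
    num-cpus: "0"
  template:
    spec:
      containers:
        - name: ray-head
          resources:
            requests:
              memory: 1Gi   # minimal request only; values are illustrative
```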

Test Matrix

Both test cases are executed with multiple upgrade strategies to improve coverage.

| Upgrade Strategy | (maxSurgePercent, stepSizePercent, intervalSeconds) | Description |
|---|---|---|
| BlueGreen | (100, 100, 1) | Instant traffic cutover |
| AggressiveGradual | (50, 25, 2) | Larger traffic migration steps with shorter intervals |
| ConservativeGradual | (25, 5, 10) | Smaller traffic migration steps with longer intervals |
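For illustration, a row of the matrix maps onto the RayService upgrade strategy roughly as follows. This is a sketch: the field layout, including the `clusterUpgradeOptions` block and the `NewClusterWithIncrementalUpgrade` type name, is an assumption and should be checked against the incremental upgrade CRD:

```yaml
spec:
  upgradeStrategy:
    type: NewClusterWithIncrementalUpgrade
    clusterUpgradeOptions:
      maxSurgePercent: 25    # ConservativeGradual row above
      stepSizePercent: 5
      intervalSeconds: 10
```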

Behavioral Checks

To improve the robustness of the e2e tests, we introduce additional behavioral checks, including verification that TargetCapacity and TrafficRoutedPercent progress monotonically. See the Change Summary below for details.

Test Results

Throughput

The CI throughput (~500 RPS, occasionally ~450) is roughly half that of local runs (~1000+ RPS). This is likely due to the smaller compute capacity of the Buildkite hosted runners:

Buildkite large instances provide 8 vCPUs, where each vCPU typically maps to one logical thread (hyper-thread) on a physical core. This means the available compute might be roughly half that of our local test machine. Furthermore, CI providers may enforce cgroup limits or CPU throttling, which can cause additional performance degradation. For full instance specifications, refer to the Buildkite hosted Linux sizes documentation.

NOTE: The largest instance size supported by the Ray ecosystem CI is `large`. For the error that occurs when selecting `xlarge`, please see this CI failure.

Screenshots (2026-03-05): the local run sustains ~1000+ RPS, while the CI run sustains ~500 RPS.

Overall E2E

Screenshots (2026-03-16): results for TestRayServiceIncrementalUpgrade, TestRayServiceIncrementalUpgradeWithLocust, and TestRayServiceIncrementalUpgradeRollback.

Change Summary

TestRayServiceIncrementalUpgrade

Add comprehensive behavioral checks covering:

  • The current state value matches the expected value, or the RayService has already finished upgrading
  • Both old and new versions serve traffic during the upgrade, ensuring no requests are dropped
  • Traffic migration respects the configured interval seconds
  • Active TargetCapacity is monotonically decreasing while Pending TargetCapacity is monotonically increasing
  • Active TrafficRoutedPercent is monotonically decreasing while Pending TrafficRoutedPercent is monotonically increasing
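The two monotonicity checks above can be sketched as a small helper. This is a simplified illustration, not the PR's actual test code (which asserts against RayService status fields with Gomega); `monotonic` and the sample values are hypothetical:

```go
package main

import "fmt"

// monotonic reports whether successive samples never move against the
// expected direction: dir = +1 allows only non-decreasing values (the
// Pending side), dir = -1 only non-increasing values (the Active side).
// Plateaus are allowed, since the controller holds each step until
// intervalSeconds elapses.
func monotonic(samples []int32, dir int) bool {
	for i := 1; i < len(samples); i++ {
		if dir > 0 && samples[i] < samples[i-1] {
			return false
		}
		if dir < 0 && samples[i] > samples[i-1] {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative TargetCapacity samples taken while an upgrade progresses.
	active := []int32{100, 75, 75, 50, 25, 0}
	pending := []int32{0, 25, 25, 50, 75, 100}
	fmt.Println(monotonic(active, -1), monotonic(pending, +1)) // true true
}
```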

TestRayServiceIncrementalUpgradeWithLocust

  • Add a new E2E test case TestRayServiceIncrementalUpgradeWithLocust that runs Locust load tests
    • Run Locust in a background goroutine
    • Warm up Locust first (entering a steady state) before triggering the upgrade
    • Verify that no requests are dropped during the upgrade under load
  • Add RayCluster and ConfigMap manifests required to run the Locust load tests
  • Add a new ServeConfigV2 configuration with a lightweight Serve application that directly returns a response
    • This is designed for high-RPS load testing scenarios
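The warm-up step above can be sketched as follows. This is a simplified model of the steady-state detection (consecutive polls at or above an RPS threshold); `warmedUp`, the negative sentinel for failed stats polls, and all values are illustrative rather than the PR's implementation:

```go
package main

import "fmt"

// warmedUp reports whether the measured RPS stayed at or above threshold
// for `window` consecutive successful polls. A failed or unparsable poll
// (modeled here as a negative sample) is skipped without resetting the
// counter, since it does not prove the real RPS dropped below threshold.
func warmedUp(samples []float64, threshold float64, window int) bool {
	stable := 0
	for _, rps := range samples {
		if rps < 0 { // poll failed: neither counts toward nor resets the window
			continue
		}
		if rps >= threshold {
			stable++
			if stable >= window {
				return true
			}
		} else {
			stable = 0 // a genuinely low reading restarts the stable window
		}
	}
	return false
}

func main() {
	samples := []float64{120, 480, 510, -1, 505, 470}
	// true: three successful polls >= 450 (480, 510, 505), failed poll skipped
	fmt.Println(warmedUp(samples, 450, 3))
}
```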

Limitations and Future Improvements

  • The Serve application source code is currently hosted in my personal repo. It should be migrated to an official ray-project organization repository.
  • Ray and image versions are currently hardcoded in locust-cluster.incremental-upgrade.yaml, following the same pattern used in the existing HA E2E tests here.
  • Locust warm-up parameters are currently defined as named constants. We may need to further tune the ServeConfigV2 implementation and the RayCluster configuration to support higher RPS in CI:
    • Current throughput: ~500 RPS
    • Target throughput: 1000+ RPS

Related issue number

#3209

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Signed-off-by: JiangJiaWei1103 <waynechuang97@gmail.com>
**JiangJiaWei1103** (Contributor, Author) commented:
Current Local E2E Test Results

(Screenshot, 2026-02-26: local E2E test results)

The next steps are:

  • Address flaky parts (see TODO in the e2e test file)
  • Add more test cases with diverse (stepSize, interval, maxSurge) combinations

```yaml
- kind create cluster --wait 900s --config ./ci/kind-config-buildkite-1-29.yml
- kubectl config set clusters.kind-kind.server https://docker:6443

# Install MetalLB for LoadBalancer IPs on Kind
```
**JiangJiaWei1103** (Contributor, Author) commented:

The setup order is rearranged to align with the official Ray docs example, so developers can follow along without wondering why the steps differ.

**JiangJiaWei1103** (Contributor, Author) commented:

Reverted at dc98432 due to duplicate installation of Istio GatewayClass. We might revisit this in the future.

Future-Outlier and others added 6 commits on March 16, 2026.
```go
	Should(Not(BeNil()))
LogWithTimestamp(test.T(), "Verifying both old and new versions served traffic during the upgrade")
g.Expect(oldVersionServed).To(BeTrue(), "The old version of the service should have served traffic during the upgrade.")
g.Expect(newVersionServed).To(BeTrue(), "The new version of the service should have served traffic during the upgrade.")
```
**Cursor Bugbot** commented:
BlueGreen scenario may fail both-versions-served assertion

Medium Severity

For the BlueGreen scenario (stepSize=100, interval=1, maxSurge=100), the upgrade may complete during the preceding validation steps (pending head pod readiness, HTTPRoute backend checks at lines 106–123), since the entire traffic shift happens in a single 1-second step. When the behavioral check loop starts, g.Eventually can succeed via the !IsRayServiceUpgrading(svc) escape clause, the curl returns only "8" (new version), and the loop breaks — leaving oldVersionServed as false. The assertion at line 205 then fails. This is a race condition that could cause flaky test failures.


**JiangJiaWei1103** (Contributor, Author) commented:

This can happen when the upgrade completes too quickly, even before the test enters the upgradeSteps loop. In practice, this scenario is rare.

For now, I suggest keeping the current behavior unchanged. We can revisit whether it is reasonable to assert that both clusters should serve traffic during the upgrade for the Blue/Green strategy, which is effectively a single-step upgrade rather than a gradual traffic migration.

```yaml
- mkdir -p "$(pwd)/tmp" && export KUBERAY_TEST_OUTPUT_DIR=$(pwd)/tmp
- echo "KUBERAY_TEST_OUTPUT_DIR=$$KUBERAY_TEST_OUTPUT_DIR"
- KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=5m KUBERAY_TEST_TIMEOUT_LONG=10m go test -timeout 30m -v ./test/e2eincrementalupgrade 2>&1 | awk -f ../.buildkite/format.awk | tee $$KUBERAY_TEST_OUTPUT_DIR/gotest.log || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay | tee $$KUBERAY_TEST_OUTPUT_DIR/kuberay-operator.log && cd $$KUBERAY_TEST_OUTPUT_DIR && find . -name "*.log" | tar -cf /artifact-mount/e2e-log.tar -T - && exit 1)
- KUBERAY_TEST_TIMEOUT_SHORT=1m KUBERAY_TEST_TIMEOUT_MEDIUM=10m KUBERAY_TEST_TIMEOUT_LONG=20m go test -timeout 60m -v ./test/e2eincrementalupgrade 2>&1 | awk -f ../.buildkite/format.awk | tee $$KUBERAY_TEST_OUTPUT_DIR/gotest.log || (kubectl logs --tail -1 -l app.kubernetes.io/name=kuberay | tee $$KUBERAY_TEST_OUTPUT_DIR/kuberay-operator.log && cd $$KUBERAY_TEST_OUTPUT_DIR && find . -name "*.log" | tar -cf /artifact-mount/e2e-log.tar -T - && exit 1)
```
**JiangJiaWei1103** (Contributor, Author) commented:

We increase the timeout to deflake the e2e test.

Comment on lines +14 to +20
```yaml
resources:
  requests:
    cpu: 300m
    memory: 1G
  limits:
    cpu: 500m
    memory: 2G
```
**JiangJiaWei1103** (Contributor, Author) commented:

For the resource setup of the Locust RayCluster, we follow the practices here:

```yaml
resources:
  requests:
    cpu: 300m
    memory: 1G
  limits:
    cpu: 500m
    memory: 2G
```

```yaml
import_path: simple_serve.app
route_prefix: /test
runtime_env:
  working_dir: "https://github.com/jiangjiawei1103/incr-upgrade-locust/archive/a185bb29374388e801db4331ae73af3ad1e79a5f.zip"
```
**JiangJiaWei1103** (Contributor, Author) commented:

Thanks Ryan!! I'll change the URL once the PR is merged.

```go
			corev1ac.ContainerPort().WithName(utils.DashboardPortName).WithContainerPort(utils.DefaultDashboardPort),
			corev1ac.ContainerPort().WithName(utils.ClientPortName).WithContainerPort(utils.DefaultClientPort),
		).
		WithResources(corev1ac.ResourceRequirements().
```
**JiangJiaWei1103** (Contributor, Author) commented:

The resource setup is mainly constrained by Buildkite hardware limitations (8 vCPUs). For details, please refer to the PR description.

```diff
@@ -137,12 +168,12 @@ func IncrementalUpgradeRayServiceApplyConfiguration(
 		WithImage(GetRayImage()).
 		WithResources(corev1ac.ResourceRequirements().
 			WithRequests(corev1.ResourceList{
```
**JiangJiaWei1103** (Contributor, Author) commented:

ditto

```go
	serveConfigV2 serveConfigV2,
) *rayv1ac.RayServiceSpecApplyConfiguration {
	return rayv1ac.RayServiceSpec().
		WithUpgradeStrategy(rayv1ac.RayServiceUpgradeStrategy().
```
**ryanaoleary** (Collaborator) commented:

I recommend adding:

```go
WithRayClusterDeletionDelaySeconds(0).
```

here since the default deletion delay is 60 seconds, which adds unnecessary lag to the test since we check for cluster deletion. We could lower it to 0 or even just a value like 10 seconds to speed up these tests.

**JiangJiaWei1103** (Contributor, Author) commented:

Hi @ryanaoleary,

I noticed that only TestRayServiceIncrementalUpgradeRollback verifies whether the pending cluster is deleted after the rollback completes.

For TestRayServiceIncrementalUpgrade and TestRayServiceIncrementalUpgradeWithLocust, neither test checks whether the previous active clusters are cleaned up. Instead, they focus on verifying that traffic is correctly served by the new cluster (i.e., the newly promoted active cluster). Therefore, the RayClusterDeletionDelaySeconds setup wouldn't help speed up the e2e tests at this stage.

I suggest the following adjustments in this PR:

  1. Remove the rollback E2E tests for now, since this PR will be merged before the rollback logic itself.
  2. Reintroduce the rollback logic along with the corresponding rollback E2E tests. For each test, we should:
    • Run it under Locust load
    • Enable WithRayClusterDeletionDelaySeconds to accelerate verification that the pending cluster is cleaned up.

WDYT? If I misunderstood anything, please let me know. Thanks!

**JiangJiaWei1103** (Contributor, Author) commented:

The rollback e2e has been removed at baf2d6c. I'll merge master again once #4604 is merged. Thanks!

**JiangJiaWei1103** (Contributor, Author) commented:

We added the basic rollback e2e back at 44fdb7e and will enhance its coverage in follow-up PRs.

**ryanaoleary** (Collaborator) left a comment:

LGTM - just one small comment on the config that gets applied and the test path needs to be updated when ray-project/serve_config_examples#15 is merged.

This reverts commit baf2d6c, reversing changes made to 73e6637.
**Cursor Bugbot** left a comment:

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


```go
	test.T().Logf("failed to parse RPS, retrying in 2 seconds: %s", err.Error())
	time.Sleep(2 * time.Second)
	continue
}
```

Warmup stableCount not reset on error retries

Medium Severity

In warmupLocust, when the stats query fails (stderr non-empty/stdout empty), the stats slice is too short, or float parsing fails, the loop continues without resetting stableCount. Since locustWarmupStableWindowSeconds represents consecutive seconds of stability, intermittent failures during the stable window are silently skipped, and the function can prematurely declare steady state based on non-consecutive stable checks.


**JiangJiaWei1103** (Contributor, Author) commented:

We reset stableCount only when the RPS value is successfully queried and parsed, and is below the rpsThreshold:

https://github.com/ray-project/kuberay/pull/4541/changes#diff-e421bf294e3026a8e3ee0aad96d7d11b2e1714e705fb859d1889bceac8cd2ba5R426-R430

The three cases you mentioned are commonly observed formatting/parsing issues, which don't reliably indicate that the actual RPS is below the threshold. Therefore, we don't reset stableCount in those scenarios.

**ryanaoleary** (Collaborator) commented:

Since #4604 is closed, I think we should actually keep the changes from 44fdb7e.

**win5923** (Member) left a comment:

Sorry for the late review; I spent some time getting familiar with the incremental upgrade PR.
Overall LGTM. Although I wasn't able to reach 400 RPS in my local tests, the CI results look stable, which should be sufficient.

JiangJiaWei1103 and others added 2 commits on March 19, 2026.