Conversation

@zhzhuang-zju
Contributor

What type of PR is this?
/kind flake

What this PR does / why we need it:
When updating resources and a conflict occurs, the operation should retry instead of returning an error directly. This prevents unexpected e2e test failures.
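For readers unfamiliar with how such a conflict arises, here is a minimal sketch using only the Go standard library. The `store` and `object` types and the `setReplicas` helper are hypothetical stand-ins for the API server and a versioned resource, not Karmada or client-go code. Two writers read the same version, the second write is rejected, and a fresh read followed by another update succeeds.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's conflict response (assumption).
var errConflict = errors.New("conflict: object has been modified")

// object is a hypothetical resource that carries a version counter.
type object struct {
	Version  int
	Replicas int
}

// store is a toy in-memory server that enforces optimistic concurrency.
type store struct{ current object }

// Get returns a copy of the stored object.
func (s *store) Get() object { return s.current }

// Update accepts the write only if the caller saw the latest version.
func (s *store) Update(o object) error {
	if o.Version != s.current.Version {
		return errConflict
	}
	o.Version++
	s.current = o
	return nil
}

// setReplicas re-reads the object before every attempt, so a conflict on
// one attempt can be resolved by the next one.
func setReplicas(s *store, replicas, maxAttempts int) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		o := s.Get()
		o.Replicas = replicas
		if err = s.Update(o); !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	s := &store{current: object{Version: 1, Replicas: 1}}

	// Two clients read the same version of the object.
	a, b := s.Get(), s.Get()
	a.Replicas = 2
	b.Replicas = 3

	fmt.Println("first writer:", s.Update(a))  // accepted, version is bumped
	fmt.Println("second writer:", s.Update(b)) // rejected, b is stale
	fmt.Println("after retry:", setReplicas(s, 3, 3), s.Get().Replicas)
}
```

Returning the conflict error directly, as the old helpers did, makes the second writer's test fail even though nothing is wrong with the cluster.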

Which issue(s) this PR fixes:

Fixes #6842
Part of #6841

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

None.

@karmada-bot karmada-bot added the kind/flake Categorizes issue or PR as related to a flaky test. label Oct 17, 2025
@gemini-code-assist

Summary of Changes

Hello @zhzhuang-zju, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness of end-to-end tests by introducing a retry mechanism for resource update operations. By wrapping update calls in gomega.Eventually blocks, the tests are now more resilient to transient conflicts that can occur during resource modifications, thereby reducing test flakiness and improving the overall reliability of the e2e test suite.

Highlights

  • Improved E2E Test Stability: Implemented retry logic for resource update operations across several e2e test framework functions to prevent flaky test failures caused by update conflicts.
  • Retry Mechanism for Resource Updates: Wrapped existing resource Get and Update calls within gomega.Eventually blocks, allowing these operations to automatically retry for a defined period (PollTimeout, PollInterval) until successful or the timeout is reached.
  • Affected Resources: The retry logic has been applied to update functions for ClusterPropagationPolicy, CronFederatedHPA, HPA, PropagationPolicy, Karmada instances, and ResourceRegistry.
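The polling behaviour described in the highlights can be sketched with the standard library alone. The `eventually` helper below is a hypothetical stand-in for a polling assertion such as gomega.Eventually, not its real implementation. The constants borrow the PollTimeout and PollInterval names from the summary, but their values are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Illustrative values only; the framework's real settings are not shown
// in this conversation.
const (
	pollTimeout  = 200 * time.Millisecond
	pollInterval = 10 * time.Millisecond
)

// errTransient simulates a transient update conflict.
var errTransient = errors.New("transient conflict")

// eventually calls fn until it returns nil or the timeout would be
// exceeded, sleeping for interval between attempts.
func eventually(timeout, interval time.Duration, fn func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("timed out after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	err := eventually(pollTimeout, pollInterval, func() error {
		attempts++
		// A real helper would Get the latest object and Update it here.
		if attempts < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(err, attempts)
}
```

Note that a loop of this shape retries on any error until the timeout, not only on conflicts, so a permanent failure is reported only after the full timeout has elapsed.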

@karmada-bot karmada-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 17, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to improve the reliability of end-to-end tests by introducing a retry mechanism for resource updates, which is a great way to handle potential conflicts. The changes across most files correctly implement a get-and-update pattern within a gomega.Eventually block. However, I've found a logical issue in test/e2e/framework/resourceregistry.go where the retry logic is implemented incorrectly and will not resolve update conflicts as intended. I've left a specific comment with a suggested fix for this issue. The other changes look solid.
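The inline comment on resourceregistry.go is not reproduced in this conversation, so the exact defect is unknown. One common way for a retry like this to fail, shown here purely as an assumption, is reading the object once outside the retried function so that every attempt resends the same stale version. The `fakeClient`, `registry`, and `retry` names below are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict")

// registry is a hypothetical versioned resource.
type registry struct {
	ResourceVersion int
	Spec            string
}

// fakeClient rejects updates that carry an outdated ResourceVersion.
type fakeClient struct{ stored registry }

func (c *fakeClient) Get() registry { return c.stored }

func (c *fakeClient) Update(r registry) error {
	if r.ResourceVersion != c.stored.ResourceVersion {
		return errConflict
	}
	r.ResourceVersion++
	c.stored = r
	return nil
}

// retry calls fn up to attempts times and returns the last error.
func retry(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
	}
	return err
}

// updateStale reuses an object read before the retry loop started, so
// each attempt sends the same outdated version and conflicts again.
func updateStale(c *fakeClient, stale registry, spec string) error {
	stale.Spec = spec
	return retry(5, func() error { return c.Update(stale) })
}

// updateFresh re-reads inside the retried function, so each attempt
// works on the latest version.
func updateFresh(c *fakeClient, spec string) error {
	return retry(5, func() error {
		latest := c.Get()
		latest.Spec = spec
		return c.Update(latest)
	})
}

func main() {
	c := &fakeClient{stored: registry{ResourceVersion: 1}}
	stale := c.Get()
	_ = c.Update(c.Get()) // another writer bumps the version first

	fmt.Println("stale:", updateStale(c, stale, "v2"))
	fmt.Println("fresh:", updateFresh(c, "v2"), c.Get().Spec)
}
```

Whatever the actual defect was, the get-and-update pattern the review praises in the other files corresponds to `updateFresh` here.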

Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

When tests run in parallel, update conflicts are relatively common, so retrying the operation on a conflict is acceptable. Is there similar logic elsewhere?

@zhzhuang-zju
Contributor Author

In e2e tests, the common resource operations are create, update, delete, and patch. Of these, create and delete do not have this issue, and patch operations have a relatively low chance of conflict. If needed, I can also make the patch operations in this PR handle conflicts more robustly.

@codecov-commenter

codecov-commenter commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 45.62%. Comparing base (b0cb266) to head (71b0683).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6843      +/-   ##
==========================================
- Coverage   45.64%   45.62%   -0.02%     
==========================================
  Files         692      692              
  Lines       57703    57703              
==========================================
- Hits        26337    26329       -8     
- Misses      29720    29726       +6     
- Partials     1646     1648       +2     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 45.62% <ø> (-0.02%) ⬇️ |

Member

@RainbowMango RainbowMango left a comment

/assign

@karmada-bot karmada-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 20, 2025
@zhzhuang-zju
Contributor Author

cc @RainbowMango @XiShanYongYe-Chang How does it look now?

@RainbowMango
Member

/gemini review

@RainbowMango RainbowMango requested a review from Copilot October 22, 2025 07:37
Contributor

Copilot AI left a comment

Pull Request Overview

This PR addresses flaky e2e tests by implementing retry logic with conflict handling when updating Kubernetes resources. The changes prevent test failures when concurrent modifications cause update conflicts by automatically retrying the operation with the latest resource version.

Key changes:

  • Added retry.RetryOnConflict wrapper to all resource update operations in the e2e test framework
  • Updated error handling to return errors from retry functions instead of using inline assertions
  • Renamed UpdateResourceRegistry to UpdateResourceRegistryWithSpec for consistency
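The conflict-only retry described above can be approximated with the standard library. The `retryOnConflict` function below is a hypothetical imitation of the idea behind retry.RetryOnConflict, not client-go's code: its fixed delay, step count, and sentinel errors are assumptions for illustration. The point it shows is that only conflict errors trigger another attempt, while any other error is returned at once.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Sentinel errors standing in for API error classes (assumption).
var (
	errConflict = errors.New("conflict")
	errNotFound = errors.New("not found")
)

// retryOnConflict runs fn up to steps times. It retries only while fn
// returns a conflict; nil and any other error are returned immediately.
func retryOnConflict(steps int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < steps; i++ {
		err = fn()
		if !errors.Is(err, errConflict) {
			return err
		}
		time.Sleep(delay)
	}
	return err
}

func main() {
	calls := 0
	err := retryOnConflict(5, time.Millisecond, func() error {
		calls++
		// A real helper would Get the latest object and Update it here,
		// returning the error instead of asserting on it inline.
		if calls < 3 {
			return errConflict
		}
		return nil
	})
	fmt.Println(err, calls)

	calls = 0
	err = retryOnConflict(5, time.Millisecond, func() error {
		calls++
		return errNotFound
	})
	fmt.Println(err, calls)
}
```

Returning the error from the retried function, rather than asserting inside it, is what lets the wrapper decide whether another attempt is worthwhile. An inline assertion would fail the test on the first conflict.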

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/e2e/suites/base/search_test.go | Updated function call to use renamed UpdateResourceRegistryWithSpec |
| test/e2e/framework/resourceregistry.go | Renamed function and added retry logic with conflict handling for ResourceRegistry updates |
| test/e2e/framework/resource/operator/karmada.go | Added retry logic with conflict handling for Karmada instance updates |
| test/e2e/framework/propagationpolicy.go | Added retry logic with conflict handling for PropagationPolicy updates |
| test/e2e/framework/hpa.go | Added retry logic with conflict handling for HPA updates |
| test/e2e/framework/cronfederatedhpa.go | Added retry logic with conflict handling for CronFederatedHPA updates |
| test/e2e/framework/clusterpropagationpolicy.go | Added retry logic with conflict handling for ClusterPropagationPolicy updates |

Member

@RainbowMango RainbowMango left a comment

/lgtm
/approve

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 22, 2025
@karmada-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 22, 2025
@karmada-bot karmada-bot merged commit 9727144 into karmada-io:master Oct 22, 2025
24 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
kind/flake Categorizes issue or PR as related to a flaky test.
lgtm Indicates that a PR is ready to be merged.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Flake] Karmada Operator E2E test: PriorityClass configuration testing

5 participants