fix(plan): deflake plan mode integration tests#20477

Merged
Adib234 merged 32 commits into main from adibakm/deflake-plan-mode on Mar 2, 2026

@Adib234
Contributor

@Adib234 Adib234 commented Feb 26, 2026

Summary

As the title suggests, this PR makes two main fixes: it hardens the directory where plans are stored so that the policy check passes, and it converts the test to interactive mode, which aligns with how plan mode writes can happen.

Details

Related Issues

Fixes #20458

How to Validate

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@Adib234 Adib234 self-assigned this Feb 26, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Adib234, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on improving the reliability and coverage of existing tests. It re-enables a previously skipped integration test to ensure proper validation of file writing behavior in plan mode and enhances the test utility's ability to accurately determine the success status of tool calls by handling varied data types for the success attribute.

Highlights

  • Test Enablement: An integration test in integration-tests/plan-mode.test.ts that verifies write_file behavior within the plans directory in plan mode has been unskipped, enabling its execution.
  • Test Utility Improvement: The TestRig utility now robustly parses the success attribute for tool calls, correctly interpreting both boolean true and string 'true' values, which addresses potential flakiness in test reporting.

Changelog

  • integration-tests/plan-mode.test.ts
    • Unskipped an integration test to validate write_file behavior in plan mode.
  • packages/test-utils/src/test-rig.ts
    • Improved tool call success parsing to handle both boolean true and string 'true' values.
    • Added intermediate variables for successValue and success for clarity in tool call logging logic.

Activity

  • The pull request was created by Adib234.
  • No specific review comments or activity have been recorded yet based on the provided context.
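
The success parsing described in the changelog can be sketched roughly as follows. This is a minimal illustration, not the actual code in packages/test-utils/src/test-rig.ts: the function name `parseSuccess` and the sample values are assumptions, and only the idea of accepting both boolean true and string 'true' comes from the summary above.

```typescript
// Hypothetical helper, not taken from test-rig.ts.
// Assumption: the `success` attribute of a logged tool call may arrive either
// as a boolean or as a string, depending on how the log entry was serialized.
function parseSuccess(successValue: unknown): boolean {
  // Only boolean true and the exact string 'true' count as success.
  return successValue === true || successValue === 'true';
}

// Made-up sample values, for illustration only.
const samples: unknown[] = [true, 'true', false, 'false', undefined, 1];
const results = samples.map(parseSuccess);
console.log(results); // [ true, true, false, false, false, false ]
```

Strict equality is used here so that other truthy values (such as 1 or a non-empty string) are not silently treated as a successful tool call.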

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-cli
Contributor

gemini-cli bot commented Feb 26, 2026

Hi @Adib234, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

@Adib234 Adib234 changed the title from "deflake" to "fix(plan): deflake plan mode integration tests" Feb 26, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses test flakiness by enabling a previously skipped integration test for plan mode. The underlying fix improves the robustness of the test rig by handling both boolean and string values for the 'success' attribute when parsing tool call logs. This is a good change to improve test stability. I've kept the original comment suggesting a small refactoring to address duplicated logic in the test rig, which will enhance maintainability.

@github-actions

github-actions bot commented Feb 26, 2026

Size Change: -2 B (0%)

Total Size: 25.8 MB

ℹ️ View Unchanged
| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/gemini.js | 25.3 MB | -2 B (0%) |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js | 221 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js | 227 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js | 11.5 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js | 132 B | 0 B |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B | 0 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB | 0 B |
| ./bundle/sandbox-macos-strict-open.sb | 4.82 kB | 0 B |
| ./bundle/sandbox-macos-strict-proxied.sb | 5.02 kB | 0 B |

compressed-size-action

@Adib234 Adib234 force-pushed the adibakm/deflake-plan-mode branch from a54ad70 to b734b75 February 26, 2026 19:54
@gemini-cli gemini-cli bot added the area/platform (Issues related to Build infra, Release mgmt, Testing, Eval infra, Capacity, Quota mgmt) and 🔒 maintainer only (⛔ Do not contribute. Internal roadmap item.) labels Feb 26, 2026
@Adib234
Contributor Author

Adib234 commented Feb 26, 2026

deflake.yaml was changed to run my integration test multiple times, to show that it succeeds consistently and is no longer flaky.
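
The idea of rerunning a test to check for flakiness can be sketched in TypeScript as below. This is not the contents of deflake.yaml (which this thread does not show) and not code from the repository: `runRepeatedly`, `makeEveryThirdFails`, and the run count are all hypothetical names and values chosen for illustration.

```typescript
type AsyncTest = () => Promise<void>;

interface RepeatSummary {
  runs: number;
  failures: number;
}

// Runs `test` sequentially `runs` times and counts how many runs rejected.
async function runRepeatedly(test: AsyncTest, runs: number): Promise<RepeatSummary> {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    try {
      await test();
    } catch {
      failures += 1;
    }
  }
  return { runs, failures };
}

// Hypothetical stand-in for a flaky test: it fails on every third call.
function makeEveryThirdFails(): AsyncTest {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls % 3 === 0) {
      throw new Error(`simulated failure on call ${calls}`);
    }
  };
}

runRepeatedly(makeEveryThirdFails(), 9).then((summary) => {
  console.log(summary); // { runs: 9, failures: 3 }
});
```

A test that is no longer flaky should report zero failures across all runs; a single pass says little, which is why repeated runs are used as evidence here.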

@jerop jerop self-assigned this Feb 27, 2026
@Adib234 Adib234 marked this pull request as ready for review March 1, 2026 19:53
@Adib234 Adib234 requested a review from a team as a code owner March 1, 2026 19:53
gemini-code-assist[bot]

This comment was marked as off-topic.

core: ['write_file', 'read_file', 'list_directory'],
allowed: ['write_file'],
it('should allow write_file to the plans directory in plan mode', async () => {
const plansDir = '.gemini/tmp/v1/session/plans';
Contributor

This path can be confusing for someone trying to update plans. Can we use a realistic path like:

Suggested change
- const plansDir = '.gemini/tmp/v1/session/plans';
+ const plansDir = '.gemini/tmp/foo/123/plans';

Same for the other examples below.

@Adib234 Adib234 enabled auto-merge March 2, 2026 15:23
@Adib234 Adib234 added this pull request to the merge queue Mar 2, 2026
Merged via the queue into main with commit 2e1efae Mar 2, 2026
27 checks passed
@Adib234 Adib234 deleted the adibakm/deflake-plan-mode branch March 2, 2026 20:04
BryanBradfo pushed a commit to BryanBradfo/gemini-cli that referenced this pull request Mar 5, 2026
struckoff pushed a commit to struckoff/gemini-cli that referenced this pull request Mar 6, 2026
liamhelmer pushed a commit to badal-io/gemini-cli that referenced this pull request Mar 12, 2026
Labels

area/platform Issues related to Build infra, Release mgmt, Testing, Eval infra, Capacity, Quota mgmt 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item.

Development

Successfully merging this pull request may close these issues.

Fix flaky plan mode integration test

3 participants