ci(evals): only run evals in CI if prompts or tools changed #20898

Merged
gundermanc merged 7 commits into main from
gundermanc/optimize-evals-trigger
Mar 3, 2026

@gundermanc gundermanc commented Mar 2, 2026

Update the CI to require nightly evals to pass, but only when making changes to prompts and tools.

This change comes in the wake of an issue where the CI was blocked on evals this morning. I temporarily removed the block: #20870

I then investigated and fixed the regression that caused the failure: #20890

Now that the tests are once again passing, I am making them required, but only for changes to prompts and tools, to minimize the impact of future failures.

github-actions bot commented Mar 2, 2026

Size Change: -2 B (0%)

Total Size: 25.8 MB

ℹ️ View Unchanged
| Filename | Size | Change |
| --- | --- | --- |
| ./bundle/gemini.js | 25.3 MB | -2 B (0%) |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js | 221 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js | 227 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js | 11.5 kB | 0 B |
| ./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js | 132 B | 0 B |
| ./bundle/sandbox-macos-permissive-open.sb | 890 B | 0 B |
| ./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB | 0 B |
| ./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB | 0 B |
| ./bundle/sandbox-macos-strict-open.sb | 4.82 kB | 0 B |
| ./bundle/sandbox-macos-strict-proxied.sb | 5.02 kB | 0 B |

compressed-size-action

@gundermanc gundermanc marked this pull request as ready for review March 2, 2026 23:38
@gundermanc gundermanc requested review from a team as code owners March 2, 2026 23:38
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the continuous integration process by implementing a conditional execution strategy for nightly evaluation tests. The primary goal is to prevent unnecessary CI blockages by ensuring these resource-intensive tests only run when changes directly impacting prompts or tools are detected, thereby streamlining the development workflow and improving CI efficiency.

Highlights

  • CI/CD Workflow Enhancement: Modified the CI process to conditionally execute nightly evaluation tests, aiming to reduce unnecessary CI blockages and improve efficiency.
  • Conditional Evaluation Trigger: Evaluation tests will now only run if changes are detected within the packages/core/src/prompts/ or packages/core/src/tools/ directories.
  • New Script for Change Detection: Introduced a new Node.js script, scripts/changed_prompt.js, responsible for determining whether evaluation tests should be triggered based on file changes.
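
For a concrete picture of the prefix check described in the highlights, here is a minimal sketch. The two prefixes are the directories named above; the function name `shouldRunEvals` and the sample paths are illustrative assumptions, not code taken from scripts/changed_prompt.js.

```javascript
// Hypothetical sketch: decide whether evals should run, given changed paths.
// The prefixes are the two directories named in the summary above.
const EVALS_FILE_PREFIXES = [
  'packages/core/src/prompts/',
  'packages/core/src/tools/',
];

// Returns true if at least one changed file lives under a watched prefix.
function shouldRunEvals(changedFiles, prefixes = EVALS_FILE_PREFIXES) {
  return changedFiles.some((file) =>
    prefixes.some((prefix) => file.startsWith(prefix)),
  );
}

// Sample paths below are made up for illustration.
console.log(shouldRunEvals(['packages/core/src/tools/example.ts'])); // true
console.log(shouldRunEvals(['docs/index.md'])); // false
```

Keeping the trailing slash on each prefix means a sibling directory such as `packages/core/src/prompts-old/` would not match by accident.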

Changelog

  • scripts/changed_prompt.js
    • Added a new script that uses Git commands to identify changed files and determine if evaluation tests should be executed based on predefined file path prefixes.

Ignored Files

  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/chained_e2e.yml

Activity

  • No human activity (comments, reviews) has occurred on this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a script to conditionally run CI evaluations based on whether prompt or tool files have been modified. The logic is sound, defaulting to running evaluations if any error occurs. However, I've identified a potential issue where the script hardcodes the main branch for comparison, which could lead to incorrect behavior on PRs targeting other branches. My review includes a suggestion to make the script more robust by dynamically determining the target branch from CI environment variables.
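
The two behaviours mentioned in this review (taking the target branch from the CI environment, and defaulting to running evals on error) could look roughly like the sketch below. `GITHUB_BASE_REF` is the variable GitHub Actions sets for `pull_request` and `pull_request_target` events; whether the merged script reads it, and whether it is populated in the workflow that calls the script, is not shown in this thread, so treat that as an assumption. The function names are hypothetical.

```javascript
// Sketch only: function names are hypothetical, not taken from the PR.
// GITHUB_BASE_REF holds the PR's target branch on pull_request events and is
// empty or unset for other event types, hence the fallback.
function resolveTargetBranch(env = process.env, fallback = 'main') {
  const baseRef = (env.GITHUB_BASE_REF ?? '').trim();
  return baseRef.length > 0 ? baseRef : fallback;
}

// Fail open: if change detection throws, run the evals rather than skip them.
function decideWithFallback(detect) {
  try {
    return Boolean(detect());
  } catch {
    return true;
  }
}

console.log(resolveTargetBranch({ GITHUB_BASE_REF: 'release/v1' })); // release/v1
console.log(resolveTargetBranch({})); // main
console.log(decideWithFallback(() => { throw new Error('git failed'); })); // true
```

Failing open trades some CI time for safety: a broken detection step costs an extra eval run instead of silently skipping a required check.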

@gemini-cli gemini-cli bot added the status/need-issue Pull requests that need to have an associated issue. label Mar 2, 2026
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@gundermanc gundermanc changed the title ci(evals): only run evals if prompts or tools changed ci(evals): only run evals in CI if prompts or tools changed Mar 2, 2026
stdio: 'ignore',
});

// Find the merge base with main
Contributor

It looks like the code finds the merge base with the target branch instead of main?
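
To make the point about the merge base concrete, here is a rough sketch of listing changed files relative to a target branch. `listChangedFiles` and the injected `run` callback are hypothetical names. The sketch assumes the remote is called `origin` and that the target branch has been fetched. The Git commands themselves (`git merge-base`, `git diff --name-only`) are standard.

```javascript
// Sketch only: listChangedFiles and the `run` callback are hypothetical.
// `run` takes a git argument array and returns stdout as a string; in a real
// script it could wrap execFileSync('git', args, { encoding: 'utf8' }) from
// node:child_process.
function listChangedFiles(run, targetBranch) {
  // Common ancestor of the PR head and the target branch.
  const mergeBase = run(['merge-base', 'HEAD', `origin/${targetBranch}`]).trim();
  // Paths touched between that ancestor and HEAD, one per line.
  const diff = run(['diff', '--name-only', mergeBase, 'HEAD']);
  return diff
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

// Dry run with a fake runner, so the example needs no repository.
const fakeRun = (args) =>
  args[0] === 'merge-base'
    ? 'abc123\n'
    : 'packages/core/src/tools/a.ts\nREADME.md\n';

console.log(listChangedFiles(fakeRun, 'main'));
```

Passing the branch in as a parameter is what makes the code comment ("merge base with main") accurate or not, depending on the caller; renaming the comment to say "target branch" would match this shape.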

*/
import { execSync } from 'node:child_process';

const EVALS_FILE_PREFIXES = [
Contributor

Consider adding the evals/ directory too.

Member Author

Ok, added. Note that as written, only the ALWAYS_PASSES evals will end up getting run during CI. Breaking changes to USUALLY_PASSES evals will not be caught.

I guess we could make it smart enough to compute the delta, but I'd rather aspire to stabilizing as many tests as possible and running them during the CI by default.

@gundermanc gundermanc added this pull request to the merge queue Mar 3, 2026
Merged via the queue into main with commit 46231a1 Mar 3, 2026
27 checks passed
@gundermanc gundermanc deleted the gundermanc/optimize-evals-trigger branch March 3, 2026 00:41
BryanBradfo pushed a commit to BryanBradfo/gemini-cli that referenced this pull request Mar 5, 2026
…emini#20898)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
struckoff pushed a commit to struckoff/gemini-cli that referenced this pull request Mar 6, 2026
…emini#20898)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
liamhelmer pushed a commit to badal-io/gemini-cli that referenced this pull request Mar 12, 2026
…emini#20898)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>