ci(evals): only run evals in CI if prompts or tools changed #20898
gundermanc merged 7 commits into main
Conversation
Size Change: -2 B (0%) Total Size: 25.8 MB
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request refines the continuous integration process by implementing conditional execution for nightly evaluation tests: these resource-intensive tests now run only when changes that directly impact prompts or tools are detected, preventing unnecessary CI blockages, streamlining the development workflow, and improving CI efficiency.
Code Review
This pull request introduces a script to conditionally run CI evaluations based on whether prompt or tool files have been modified. The logic is sound, defaulting to running evaluations if any error occurs. However, I've identified a potential issue where the script hardcodes the main branch for comparison, which could lead to incorrect behavior on PRs targeting other branches. My review includes a suggestion to make the script more robust by dynamically determining the target branch from CI environment variables.
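The review's suggestion could look roughly like the following sketch (the function name and fallback are illustrative, not the PR's actual code; `GITHUB_BASE_REF` is the variable GitHub Actions sets on `pull_request` events):

```javascript
// Sketch of the reviewer's suggestion: resolve the comparison branch from
// CI environment variables instead of hardcoding 'main'. GITHUB_BASE_REF
// is populated by GitHub Actions on pull_request events; the 'main'
// fallback covers local runs and other contexts.
function resolveTargetBranch(env = process.env) {
  return env.GITHUB_BASE_REF || 'main';
}
```

With this, a PR targeting a release branch would be diffed against that branch rather than `main`.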
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
scripts/changed_prompt.js
Outdated
    stdio: 'ignore',
  });

  // Find the merge base with main
it looks like the code should find the merge base with the target branch instead of main?
 */
import { execSync } from 'node:child_process';

const EVALS_FILE_PREFIXES = [
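The prefix list drives a simple matching step, roughly like this sketch (the prefix values here are hypothetical; the real list lives in `scripts/changed_prompt.js` and may differ):

```javascript
// Hypothetical prefix values for illustration only.
const EXAMPLE_PREFIXES = [
  'packages/core/src/prompts/',
  'packages/core/src/tools/',
];

// Return true if any changed path falls under an evals-relevant prefix.
function touchesEvals(changedPaths, prefixes = EXAMPLE_PREFIXES) {
  return changedPaths.some((path) =>
    prefixes.some((prefix) => path.startsWith(prefix)),
  );
}
```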
consider adding the evals/ directory too
Ok, added. Note that as written, only the ALWAYS_PASSES evals will end up getting run in CI; breaking changes to USUALLY_PASSES ones will not be caught.
I guess we could make it smart enough to compute the delta, but I'd rather aspire to stabilizing as many tests as possible and running them during the CI by default.
…emini#20898) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Update the CI to require nightly evals to pass, but only when making changes to prompts and tools.
This change comes in the wake of an issue where the CI was blocked on evals this morning. I temporarily removed the block: #20870
I then investigated, and identified and fixed this regression, which caused the failure: #20890
Now that the tests are once again passing, I am making them required, but only for changes to prompts and tools, to minimize the impact of future failures.
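The fail-open behavior noted in the review (default to running evaluations if any error occurs) can be sketched as follows; the wrapper name is illustrative, not the script's actual API:

```javascript
// Fail-open wrapper: if the changed-file check itself throws (shallow
// clone, missing remote, git not available, etc.), err on the side of
// running the evals rather than silently skipping them.
function shouldRunEvals(check) {
  try {
    return check();
  } catch {
    return true;
  }
}
```

This way a broken check can only cost extra eval runs, never a missed regression.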