Skip to content

fix(rules): stop flag alternation from consuming placeholder as flag value#154

Merged
fohte merged 7 commits intomainfrom
fohte/fix-parser-flag-placeholder
Mar 6, 2026
Merged

fix(rules): stop flag alternation from consuming placeholder as flag value#154
fohte merged 7 commits intomainfrom
fohte/fix-parser-flag-placeholder

Conversation

@fohte
Copy link
Owner

@fohte fohte commented Mar 5, 2026

Why

  • Patterns like find * -exec|-execdir|-ok|-okdir <cmd> \;|+ parsed <cmd> as a flag value instead of an independent placeholder
    • should_consume_as_value() fell through to _ => true for LexToken::Placeholder

What

  • Prevent placeholder tokens from being consumed as flag values after flag alternation
  • Unescape backslash-escaped pattern literals (e.g. \; to ;) during matching so they correctly match command tokens

Open with Devin

`should_consume_as_value()` fell through to `_ => true` for
`LexToken::Placeholder`, causing patterns like
`find * -exec|-execdir <cmd> \;|+` to parse `<cmd>` as a flag value
instead of an independent placeholder. This broke wrapper recursive
evaluation for flag-alternation + placeholder patterns.

Also fix `literal_matches` to unescape backslash-escaped pattern
tokens (e.g. `\;` -> `;`) so they match the command tokenizer output,
which resolves shell escapes before comparison.
@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses parsing inaccuracies in command patterns, specifically resolving two issues: the incorrect consumption of placeholder tokens as flag values after flag alternation, and the mishandling of backslash-escaped characters in pattern literals. By preventing placeholders from being treated as flag values and by correctly unescaping backslashes during pattern matching, the changes ensure more robust and accurate evaluation of command rules, particularly for complex find commands.

Highlights

  • Corrected Flag Value Consumption: Fixed an issue where flag alternation patterns (e.g., -exec|-execdir) incorrectly consumed subsequent placeholder tokens (e.g., <cmd>) as part of the flag's value, leading to incorrect parsing.
  • Improved Backslash Handling: Implemented logic to unescape backslash-escaped characters in pattern literals during matching, ensuring that patterns like \; correctly match command tokens like ;.
Changelog
  • src/rules/pattern_matcher.rs
    • Implemented unescape_backslashes function.
    • Modified literal_matches to unescape backslashes in patterns before comparison.
  • src/rules/pattern_parser.rs
    • Prevented LexToken::Placeholder from being consumed as a flag value in should_consume_as_value.
    • Updated parse_flag_with_value test cases to reflect correct parsing of flag alternation followed by placeholders.
    • Added a new test case flag_alternation_then_placeholder to validate parsing of complex flag patterns.
  • tests/integration/wrapper_recursive_evaluation.rs
    • Introduced new integration tests for find -exec wrapper patterns to confirm correct recursive evaluation.
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses the reported bug where placeholder tokens were incorrectly consumed as flag values after flag alternation. The changes in pattern_parser.rs correctly prevent this behavior, and the new unescape_backslashes function in pattern_matcher.rs ensures that backslash-escaped pattern literals are matched accurately. The addition of comprehensive unit and integration tests thoroughly validates these fixes, demonstrating adherence to the repository's testing style guide by using rstest with named cases and indoc! for multiline strings. Overall, the changes improve the robustness and correctness of pattern matching.

devin-ai-integration[bot]

This comment was marked as resolved.

…e docs

should_consume_as_value_strict became identical to should_consume_as_value
after the Placeholder exclusion was added. Inline the single call site and
remove the now-redundant function to avoid misleading future readers.

Document backslash escape matching behavior and add find -exec wrapper
example to the placeholders page.
devin-ai-integration[bot]

This comment was marked as resolved.

@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 97.67442% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.78%. Comparing base (1c5b101) to head (40c81d2).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/rules/pattern_matcher.rs 97.56% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #154      +/-   ##
==========================================
+ Coverage   89.75%   89.78%   +0.02%     
==========================================
  Files          37       37              
  Lines        7128     7155      +27     
==========================================
+ Hits         6398     6424      +26     
- Misses        730      731       +1     
Flag Coverage Δ
Linux 89.64% <97.67%> (+0.02%) ⬆️
macOS 91.52% <97.67%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

When a pattern contained both `\` and `*` (e.g. `\*` for a literal
asterisk), the `*` check took priority over backslash unescaping,
causing `\*` to be treated as a glob wildcard instead of a literal `*`.

Check for backslash escapes first, unescape them, then only use glob
matching if unescaped `*` characters remain (i.e. ones that were not
preceded by `\`).
glob_match_with_sentinel was a near-duplicate of glob_match, differing
only in sentinel restoration. Inline the sentinel handling into
glob_match itself (guarded by a has_sentinel check) and remove the
separate function.
devin-ai-integration[bot]

This comment was marked as resolved.

fohte and others added 2 commits March 6, 2026 23:36
The find wrapper with <cmd> placeholder was only covered by integration
tests. Add E2E tests that exercise the full binary path (config parsing
-> command evaluation -> JSON output) to verify the fix works end-to-end.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…-placeholder

# Conflicts:
#	tests/e2e/check_generic.rs
@fohte fohte merged commit 24a951a into main Mar 6, 2026
9 checks passed
@fohte fohte deleted the fohte/fix-parser-flag-placeholder branch March 6, 2026 15:02
@fohte-bot fohte-bot bot mentioned this pull request Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant