support gitignore filter #9

jiacai2050 · 2023-06-12T13:00:24Z

No description provided.

jiacai2050 · 2025-05-16T23:20:12Z

/gemini help

gemini-code-assist · 2025-05-16T23:20:15Z

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

jiacai2050 · 2025-05-16T23:21:20Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces support for .gitignore style filtering, which is a valuable addition. The implementation includes a state machine for pattern matching and a parser for ignore rules. The code is generally well-structured.

I've found a couple of areas that need attention, including a critical issue in the parser logic and a medium-severity issue in how directory-specific rules are checked. Addressing these will improve the correctness and robustness of the new feature.

Summary of Findings

Parser Logic Error: A critical error in IgnoreParser.parse (lines 151-163) causes double processing of the first path component in certain patterns, leading to incorrect ignore rule state machines.
Directory-Specific Rule Handling: In IgnoreRule.check (lines 88-90), directory-specific rules might incorrectly ignore or exclude non-directory entries instead of returning .None.

Merge Readiness

This pull request makes good progress on adding .gitignore support. However, due to the critical parsing issue identified (and a medium-severity logic concern), I recommend that these changes be addressed before merging. Once these issues are resolved, the feature will be much more robust. I am not authorized to approve pull requests, so please ensure further review and approval after the changes are made.

gemini-code-assist · 2025-05-16T23:24:33Z

src/walk_dir.zig

+        var first_item = it.first();
+        if (!std.mem.eql(u8, "", first_item)) {
+            try rule.pushState(State{ .anything = true });
+            try rule.pushState(State{ .exact = first_item });
+        }
+
+        while (it.next()) |item| {
+            if (std.mem.eql(u8, "**", item)) {
+                try rule.pushState(State{ .anything = true });
+            } else {
+                try rule.pushState(State{ .exact = item });
+            }
+        }


There appears to be a critical issue in the parsing logic for rule components. The current approach using it.first() (line 151) to peek at the first component, potentially processing it in lines 153-154, and then iterating with while (it.next()) (line 157) can lead to the first component being processed twice. This is because it.first() does not advance the iterator.

For example, with a pattern like "a/b":

input[start..end] would be "a/b".

first_item (line 151) would be "a".

The if condition on line 152 would be true.

Lines 153-154 would push State{ .anything = true } and then State{ .exact = "a" }.

The while (it.next()) loop on line 157 would then again process "a" as its first item, followed by "b".
This would result in an incorrect state machine like [any, a, a, b] instead of the expected [any, a, b] for a pattern like "a/b" (which implies **/a/b).

This could lead to incorrect matching behavior. Consider refactoring this section to correctly handle the implicit ** for non-anchored patterns and then iterate through all components once.

A revised approach might involve:

Checking if input[start..end] starts with /.

If not (and the pattern is not empty), add the initial State{ .anything = true }.

Then, use a single while (it.next()) loop to process all components, ensuring to skip any empty components that arise (e.g., from a leading / or //).

// Corrected logic to avoid double-processing of the first component.
const current_pattern_slice = input[start..end];
if (current_pattern_slice.len > 0 and !std.mem.startsWith(u8, current_pattern_slice, "/")) {
    // Pattern is not anchored and not empty, so prepend 'anything' (like '**').
    try rule.pushState(State{ .anything = true });
}

// 'it' is already initialized from input[start..end] on line 150.
// Now, iterate through its components.
while (it.next()) |component| {
    if (component.len == 0) {
        // Skip empty components. These arise from:
        // - A leading slash (e.g., "/foo" -> "" then "foo"). The "" is skipped.
        // - Consecutive slashes (e.g., "foo//bar" -> "foo", "", "bar"). The "" is skipped.
        continue;
    }
    if (std.mem.eql(u8, "**", component)) {
        // Add 'anything' state, but avoid duplicates if the last state was also 'anything'.
        // std.meta.eql is used because Zig has no `!=` operator for union values.
        const items = rule.state_machine.items;
        if (items.len == 0 or !std.meta.eql(items[items.len - 1], State{ .anything = true })) {
            try rule.pushState(State{ .anything = true });
        }
    } else {
        try rule.pushState(State{ .exact = component });
    }
}
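
For anyone who wants to try the single-loop idea in isolation, here is a minimal, self-contained sketch. It is not the PR's code: the `State` union is a stand-in guessed from the constructor calls quoted above, and `parsePattern` with its fixed-size output buffer is a hypothetical helper used only to avoid depending on the PR's `rule.pushState`. The sketch never calls `first()`, so it does not depend on whether `first()` advances the iterator, which is worth checking against the standard library source for the Zig version in use.

```zig
const std = @import("std");

// Stand-in for the PR's State type, guessed from the `State{ .anything = true }`
// and `State{ .exact = item }` calls in the diff; the real definition is not shown.
const State = union(enum) {
    anything: bool,
    exact: []const u8,
};

// Hypothetical helper: writes the states for one pattern into `buf` and
// returns the filled prefix of `buf`.
fn parsePattern(pattern: []const u8, buf: []State) error{OutOfSpace}![]State {
    var len: usize = 0;

    // Unanchored, non-empty patterns get an implicit leading `**`.
    if (pattern.len > 0 and pattern[0] != '/') {
        if (buf.len == 0) return error.OutOfSpace;
        buf[0] = State{ .anything = true };
        len = 1;
    }

    var it = std.mem.splitSequence(u8, pattern, "/");
    while (it.next()) |component| {
        // Empty components come from a leading `/` or from `//`.
        if (component.len == 0) continue;

        const is_any = std.mem.eql(u8, component, "**");
        // Collapse a `**` that directly follows another `anything` state.
        if (is_any and len > 0 and std.meta.activeTag(buf[len - 1]) == .anything) continue;

        if (len == buf.len) return error.OutOfSpace;
        buf[len] = if (is_any) State{ .anything = true } else State{ .exact = component };
        len += 1;
    }
    return buf[0..len];
}

test "a/b yields any, a, b" {
    var buf: [8]State = undefined;
    const states = try parsePattern("a/b", &buf);
    try std.testing.expectEqual(@as(usize, 3), states.len);
    try std.testing.expect(std.meta.activeTag(states[0]) == .anything);
    try std.testing.expectEqualStrings("a", states[1].exact);
    try std.testing.expectEqualStrings("b", states[2].exact);
}
```

Running the file with `zig test` exercises the "a/b" case from the comment above. The fixed buffer is only a convenience for the sketch; the PR itself appears to push states into a list owned by the rule.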

gemini-code-assist · 2025-05-16T23:24:33Z

src/walk_dir.zig

+        if (self.is_dir and file_entry.kind != .directory) {
+            return if (self.is_exclude) .Exclude else .Ignore;
+        }


The current logic for directory-specific rules seems to have an issue. If a rule is specific to directories (i.e., self.is_dir is true, often from a pattern ending in /), and the file_entry being checked is not a directory, this rule should simply not apply to the entry. In such a case, the function should return .None.

Currently, if self.is_dir is true and file_entry.kind != .directory, it returns .Ignore (or .Exclude if self.is_exclude is true). This could lead to incorrectly ignoring or excluding files.

For example, if the rule is build/ (ignore the build directory) and we are checking a file named other_file.txt, this rule should not cause other_file.txt to be ignored. However, the current logic would return .Ignore.

Could you clarify if this behavior is intended, or if it should return .None when a directory-specific rule encounters a non-directory entry?

// If the rule is for a directory, but the entry is not a directory,
// this rule does not apply to this entry.
return .None;
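
A small sketch of the behaviour being asked about, under stated assumptions: `Kind`, `Rule`, and the `matched` parameter are hypothetical stand-ins (the PR's `check` takes a path and an `fs.IterableDir.Entry` and runs its own matching), and only the `CheckResult` variant names `.None`, `.Ignore`, and `.Exclude` are taken from the review.

```zig
const std = @import("std");

// Hypothetical stand-ins; the PR's real types are only partly visible in the review.
const Kind = enum { file, directory };
const CheckResult = enum { None, Ignore, Exclude };

const Rule = struct {
    is_dir: bool,
    is_exclude: bool,

    // `matched` stands in for the outcome of the PR's path matching.
    fn check(self: Rule, kind: Kind, matched: bool) CheckResult {
        // A directory-only rule says nothing about non-directory entries.
        if (self.is_dir and kind != .directory) return .None;
        if (!matched) return .None;
        return if (self.is_exclude) .Exclude else .Ignore;
    }
};

test "directory-only rule skips plain files" {
    const rule = Rule{ .is_dir = true, .is_exclude = false };
    try std.testing.expectEqual(CheckResult.None, rule.check(.file, true));
    try std.testing.expectEqual(CheckResult.Ignore, rule.check(.directory, true));
}
```

With this shape, a rule such as build/ leaves a plain file untouched even when the path itself would match, and later rules remain free to decide about it.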

Copilot

Pull Request Overview

This PR adds support for gitignore-style filtering by introducing a state-machine-based path matcher and a parser for ignore rules.

Implements match_iter for matching path segments against patterns including ** wildcards, with unit tests.
Adds IgnoreRule to represent parsed ignore/exclude rules and a check method to apply them to file entries.
Introduces IgnoreParser to parse gitignore syntax into IgnoreRule instances, with parser tests.

Comments suppressed due to low confidence (2)

src/walk_dir.zig:9

[nitpick] The alias PathIter is declared but never used; consider removing it to reduce dead code.

const PathIter = mem.SplitIterator(u8, .sequence);

src/walk_dir.zig:87

[nitpick] The check method of IgnoreRule is not covered by any unit tests. Consider adding tests for both .Ignore and .Exclude scenarios on files and directories to ensure the matching logic works as intended.

fn check(self: Self, path: []const u8, file_entry: fs.IterableDir.Entry) !CheckResult {

Copilot · 2025-05-16T23:24:34Z

src/walk_dir.zig

+
+        const remainings = mem.trimLeft(u8, path, self.dir);
+        var path_iter = mem.splitSequence(u8, remainings, "/");
+        var paths = std.ArrayList([]const u8);


The paths ArrayList is declared without calling .init(allocator) and is never deinitialized, leading to a panic on append and a memory leak. Initialize it with the allocator (e.g., std.ArrayList([]const u8).init(self.allocator)) and defer paths.deinit().

Suggested change

var paths = std.ArrayList([]const u8);

var paths = std.ArrayList([]const u8).init(self.state_machine.allocator);

defer paths.deinit();

jiacai2050 force-pushed the feat-walk branch from df93dd3 to 04e5c0e · June 24, 2023 12:04

jiacai2050 force-pushed the feat-walk branch from 04e5c0e to 88c5390 · July 10, 2023 12:23

jiacai2050 force-pushed the main branch from 43fd164 to 5469b7d · November 10, 2023 14:03

jiacai2050 force-pushed the main branch 2 times, most recently from 59b9ed9 to 23bf339 · March 24, 2024 04:14

jiacai2050 force-pushed the main branch 2 times, most recently from 0e885fc to c0c48df · April 25, 2024 00:33

jiacai2050 force-pushed the main branch 2 times, most recently from b358c93 to f9beb47 · May 2, 2024 10:02

jiacai2050 force-pushed the main branch 2 times, most recently from 85952e8 to 0a4d869 · August 17, 2024 10:53

jiacai2050 force-pushed the main branch 8 times, most recently from 408a2d7 to 9e01903 · September 28, 2024 04:58

jiacai2050 force-pushed the main branch from 3a353b6 to 38be678 · October 30, 2024 01:00

jiacai2050 added 5 commits April 7, 2025 21:08

add work dir

9ff525f

add parser

b3c78ef

add dir

521dc9a

add match_iter

6d3bbbb

add special test case

aa98c92

jiacai2050 force-pushed the feat-walk branch from 32a57aa to aa98c92 · April 7, 2025 13:08

jiacai2050 requested a review from Copilot May 16, 2025 23:22

gemini-code-assist bot suggested changes May 16, 2025

Copilot AI reviewed May 16, 2025
