-
Notifications
You must be signed in to change notification settings - Fork 2.1k
[Fix][FTP]Fix 'file_filterpattern' cannot filter directories #10049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
[Fix][FTP]Fix 'file_filterpattern' cannot filter directories #10049
Conversation
|
@davidzollo Could you please help me review the code again |
davidzollo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 if CI passed
| There are some examples. | ||
|
|
||
| File Structure Example: | ||
| If the `path` is `/data/setunnel`, and the file structure example is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| If the `path` is `/data/setunnel`, and the file structure example is: | |
| If the `path` is `/data/seatunnel`, and the file structure example is: |
| There are some examples. | ||
|
|
||
| File Structure Example: | ||
| If the `path` is `/data/setunnel`, and the file structure example is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| There are some examples. | ||
|
|
||
| File Structure Example: | ||
| If the `path` is `/data/setunnel`, and the file structure example is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| if (pattern.pattern().startsWith(path)) { | ||
| // filter based on the file directory at the same time | ||
| return pattern.matcher(fileStatus.getPath().getName()).matches(); | ||
| } | ||
| // filter based on file names | ||
| return pattern.matcher(fileStatus.getPath().toUri().getPath()).matches(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the logic reversed?
pattern.pattern().startsWith(path) = true ,Should we filter directories or files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your review. I have made improvements and the fileStatus in this method will only be files, not directories
|
thanks @LiJie20190102 |
Can you help me check why it failed? I am referring to 'org.apache.seatunnel.connectors.seatunnel.file.writer.XmlReadStrategyTest', which can pass
|
| @Test | ||
| public void testJsonFilterPatternWithFilePath() throws URISyntaxException, IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me check why it failed? I am referring to 'org.apache.seatunnel.connectors.seatunnel.file.writer.XmlReadStrategyTest', which can pass
What do you think about using the annotation @DisabledOnOs(OS.WINDOWS) here? cc @davidzollo @zhangshenghang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me check why it failed? I am referring to 'org.apache.seatunnel.connectors.seatunnel.file.writer.XmlReadStrategyTest', which can pass
What do you think about using the annotation
@DisabledOnOs(OS.WINDOWS)here? cc @davidzollo @zhangshenghang
@dybyte I agree with your idea, @LiJie20190102 but can you provide a screenshot of the successful run under Windows ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidzollo @zhangshenghang If I provide a locally successful snapshot, does it not require adding e2e?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @Test | ||
| public void testJsonFilterPatternWithFileName() throws URISyntaxException, IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
Considering that this PR affects lots of file connectors. Please add related e2e test, thx |
e2e has already been added. @davidzollo @zhangshenghang |
| // skip hidden tmp directory, such as .hive-staging_hive | ||
| if (!fileStatus.getPath().getName().startsWith(".")) { | ||
| fileNames.addAll(getFileNamesByPath(fileStatus.getPath().toString())); | ||
| fileNames.addAll(getFileNamesByPath(fileStatus.getPath().toUri().getPath())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be any problem with this change? If the file path is this: s3a://my-bucket/data/2025/11/report.csv, is there a problem with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, after careful consideration, this is the area where I adapted the code previously modified, and this code can be left unchanged
| Assertions.assertEquals(0, extraCommands.getExitCode()); | ||
| }; | ||
|
|
||
| @TestTemplate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current S3 tests are disabled. Could you please modify the changes to enable the S3 tests as well? It should be similar to how MinIO supports the S3 protocol.






Purpose of this pull request
fix #10012. I previously submitted a PR, which is #10014 , but I didn't handle it properly, so I resubmitted it
Does this PR introduce any user-facing change?
How was this patch tested?
If the
pathis/data/setunnel, and the file structure example is:If you only want to filter based on file names, simply write the regular file names; If you want to filter based on the file directory at the same time, the expression needs to start with
path.Example 1: Match all .txt files,Regular Expression:
The result of this example matching is:
Example 2: Match third level folders starting with 202410 and files ending with .csv, the Regular Expression:
The result of this example matching is:
Check list
New License Guide