@ConcurrencyPractitioner commented Feb 27, 2018

What changes were proposed in this pull request?

In this PR, an extra boolean expression was added to test if a regex was present. If it returns true, the file is excluded.

How was this patch tested?

No tests were added.

@ConcurrencyPractitioner
Author

Jenkins test this please

@AmplabJenkins

Can one of the admins verify this patch?

@ConcurrencyPractitioner changed the title from "[SPARK-8605] Exclude files in StreamingContext. textFileStream(direct…" to "[SPARK-8605] Exclude files in StreamingContext. textFileStream" on Feb 27, 2018
@jerryshao
Contributor

a extra boolean expression was added to test if a regex was present.

Can you please explain what "if a regex was present" means?

The fix does not seem necessary. If you want to filter out some temp files, you can write your own filter instead of using Spark Streaming's default one.
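
For illustration, a minimal sketch of the kind of predicate a custom filter could apply. Everything here is an assumption rather than code from this PR: the names `EXCLUDE_PATTERN` and `accept_path` are hypothetical, and the pattern (skip dot-prefixed names and names ending in `.tmp`) is only one plausible choice. In Spark's Scala API, a predicate of this shape over a Hadoop `Path` can be supplied as the `filter` argument of `StreamingContext.fileStream`; the sketch below shows only the matching logic in plain Python.

```python
import re
from pathlib import PurePosixPath

# Hypothetical exclusion rule: hidden files (leading ".") and files
# whose names end in ".tmp" are treated as not ready for processing.
EXCLUDE_PATTERN = re.compile(r"\..*|.*\.tmp")


def accept_path(path: str) -> bool:
    """Return True if the file at `path` should be picked up by the stream."""
    # Only the final path component is tested, not the parent directories.
    name = PurePosixPath(path).name
    # fullmatch requires the whole name to match one of the alternatives.
    return EXCLUDE_PATTERN.fullmatch(name) is None
```

Matching on the file name alone, rather than on the full path, keeps a directory such as `/data/.staging/` from accidentally affecting the decision for the files inside it; whether that is the desired behavior depends on the deployment.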

@ConcurrencyPractitioner
Author

@jerryshao In Spark Streaming, I think the .tmp suffix is used to indicate that the object is a file, although I do not know whether this is universal.

@gaborgsomogyi
Contributor

I don't really understand the issue itself. Which filesystem is used in this case? Why is it not possible to use a Hadoop-compatible filesystem such as HDFS, which supports atomic rename? See here
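
The atomic-rename approach mentioned above can be sketched on a local POSIX filesystem as follows. This is an analogue under stated assumptions, not Spark or HDFS code: the function name `publish_atomically` and its parameters are hypothetical, and the staging directory is assumed to live on the same filesystem as the watched directory so that the rename is a single atomic step.

```python
import os
import tempfile


def publish_atomically(data: bytes, staging_dir: str, watched_dir: str, name: str) -> str:
    """Write `data` in a staging directory, then rename it into `watched_dir`."""
    # Write the complete contents outside the watched directory first,
    # so a reader scanning the watched directory never sees a partial file.
    fd, staging_path = tempfile.mkstemp(dir=staging_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # os.replace is an atomic rename on POSIX when source and destination
    # are on the same filesystem: the file appears under its final name
    # only once it is fully written.
    final_path = os.path.join(watched_dir, name)
    os.replace(staging_path, final_path)
    return final_path
```

With this pattern the watched directory only ever contains finished files, so no name-based exclusion of temporary files is needed on the reading side.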

@srowen srowen mentioned this pull request May 11, 2018
@asfgit asfgit closed this in 348ddfd May 12, 2018
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#20458
Closes apache#20530
Closes apache#20557
Closes apache#20966
Closes apache#20857
Closes apache#19694
Closes apache#18227
Closes apache#20683
Closes apache#20881
Closes apache#20347
Closes apache#20825
Closes apache#20078

Closes apache#21281
Closes apache#19951
Closes apache#20905
Closes apache#20635

Author: Sean Owen <[email protected]>

Closes apache#21303 from srowen/ClosePRs.