Skip to content

Conversation

@sadikovi
Copy link
Contributor

@sadikovi sadikovi commented Nov 8, 2022

What changes were proposed in this pull request?

Follow-up for #38504.

This PR fixes an issue where a unit test for SymlinkTextInputFormat would fail in JDK 11. The fix is to return the record reader directly instead of wrapping it with HiveRecordReader.

The issue could be reproduced by running a test right before SPARK-40815 tests with JDK 11 and appears to be related to ExecMapper.getDone() (https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/io/HiveRecordReader.java#L76). This check is for a static variable that is set when the map job is done to limit results. Note that there is no synchronisation around getDone() or setDone(), it is just a static mutable variable.

Once that is set, the record reader returns 0 records. I don't know why it works in JDK 8 but not in JDK 11 but my current suspicion is something related to class loader in JDK similar to this patch: apache/hive@a234475 (maybe this is the fix but I was not able to verify).

Why are the changes needed?

Re-enables the test and fixes an issue with DelegateSymlinkTextInputFormat returning 0 records. Note that this is still an issue with SymlinkTextInputFormat as this needs to be addressed in Hive.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

I re-enabled the unit tests that used to fail in JDK 11 and extended a test to cover LIMIT.

@github-actions github-actions bot added the SQL label Nov 8, 2022
@sadikovi
Copy link
Contributor Author

sadikovi commented Nov 8, 2022

@dongjoon-hyun @sunchao Could you review this PR? Thank you. This is my attempt at fixing the test (and the code) in JDK 11.

@sadikovi
Copy link
Contributor Author

cc @cloud-fan as well.

Let me know if there are concerns with this approach.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 6308f54 Nov 10, 2022
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@sadikovi
Copy link
Contributor Author

Thank you.

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…tInputFormat to avoid Hive ExecMapper.getDone() check

### What changes were proposed in this pull request?

Follow-up for apache#38504.

This PR fixes an issue where a unit test for SymlinkTextInputFormat would fail in JDK 11. The fix is to return the record reader directly instead of wrapping it with `HiveRecordReader`.

The issue could be reproduced by running a test right before SPARK-40815 tests with JDK 11 and appears to be related to `ExecMapper.getDone()` (https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/io/HiveRecordReader.java#L76). This check is for a static variable that is set when the map job is done to limit results. Note that there is no synchronisation around `getDone()` or `setDone()`, it is just a static mutable variable.

Once that is set, the record reader returns 0 records. I don't know why it works in JDK 8 but not in JDK 11 but my current suspicion is something related to class loader in JDK similar to this patch: apache/hive@a234475 (maybe this is the fix but I was not able to verify).

### Why are the changes needed?

Re-enables the test and fixes an issue with DelegateSymlinkTextInputFormat returning 0 records. Note that this is still an issue with SymlinkTextInputFormat as this needs to be addressed in Hive.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I re-enabled the unit tests that used to fail in JDK 11 and extended a test to cover LIMIT.

Closes apache#38544 from sadikovi/fix-symlink-test-2.

Authored-by: Ivan Sadikov <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants