pan3793 commented Mar 21, 2024

What changes were proposed in this pull request?

This is an alternative to #4923 for addressing the jodd CVE.

Hive uses jodd's JDateTime to handle the Julian calendar when processing Parquet timestamps. Calendar handling is tricky and bug-prone, and branch-2.3 has been in maintenance mode for a long time, so I prefer to copy the corresponding code rather than risk any unexpected behavior change.
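For readers unfamiliar with why this conversion is delicate, here is a minimal sketch of the kind of arithmetic involved. It assumes the Parquet INT96 timestamp layout (a Julian day number plus nanoseconds within the day) and uses `java.time`, which follows the proleptic ISO calendar. It is not the code copied in this PR: the class and method names are hypothetical, and a library that switches to the Julian calendar for dates before the 1582 Gregorian cutover can give different results for old dates, which is exactly the kind of behavior change this PR tries to avoid.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper for illustration only; not part of Hive or jodd.
final class Int96TimestampSketch {
    // Julian day number of 1970-01-01, the Unix epoch date.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;

    // Converts (Julian day, nanoseconds of day) to a date-time in the
    // proleptic ISO calendar. No time zone adjustment is applied here.
    static LocalDateTime toLocalDateTime(int julianDay, long nanosOfDay) {
        LocalDate date = LocalDate.ofEpochDay(julianDay - JULIAN_DAY_OF_EPOCH);
        return date.atTime(LocalTime.ofNanoOfDay(nanosOfDay));
    }

    // Inverse direction: the Julian day number for a proleptic ISO date.
    static int toJulianDay(LocalDate date) {
        return Math.toIntExact(date.toEpochDay() + JULIAN_DAY_OF_EPOCH);
    }
}
```

Swapping in an implementation like this would be a behavior change for historical dates, so it only illustrates the arithmetic and is not a suggested replacement.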

Why are the changes needed?

To address CVE-2018-21234.

The jodd CVE is at the top of the list in https://issues.apache.org/jira/browse/SPARK-44757, with a score of 9.8.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

Unit tests (UT).

pan3793 commented Mar 21, 2024

cc @sunchao: this is an alternative to #4923. I prefer this approach for safety; please let me know what you think.

<exclude>**/*.html</exclude>
<exclude>**/sit</exclude>
<exclude>**/test/queries/**/*.sql</exclude>
<exclude>**/ql/io/parquet/timestamp/datetime/**</exclude>
pan3793 (Member Author) commented:

According to https://www.apache.org/legal/src-headers.html#3party, we should not add the AL2 license header to the copied source files, hence this exclusion.

sunchao left a comment

LGTM

pan3793 commented Mar 26, 2024

Thanks for your approval, @sunchao

sunchao commented Mar 28, 2024

Thanks @pan3793 !

dongjoon-hyun pushed a commit to apache/spark that referenced this pull request May 10, 2024
### What changes were proposed in this pull request?

Remove a jar that has CVE GHSA-jrg3-qq99-35g7

### Why are the changes needed?

Previously, `jodd-core` came in as a transitive dependency of Hive. apache/hive#5151 (Hive 2.3.10) removed it, so we can now remove it from Spark.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46520 from pan3793/SPARK-48230.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit to apache/spark that referenced this pull request Jul 22, 2025
### What changes were proposed in this pull request?

Previously, `jodd-core` came in as a transitive dependency of Hive. apache/hive#5151 (Hive 2.3.10) removed it, so we can now remove it from Spark.

Note: The jars shipped in Spark binary releases vary between versions. For UDFs that depend on `jodd-core` classes, it is the user's responsibility to handle transitive dependencies (e.g. by shading and relocating the transitive classes into the UDF jars).
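As a rough illustration of the note above, a UDF author could check at startup whether a class the UDF relies on is still visible on the classpath, and fail with a clear message if it is not. The helper below is a hypothetical sketch, not part of Spark or Hive; the class name `jodd.datetime.JDateTime` in the usage note is only an example of a `jodd-core` class a UDF might use.

```java
// Hypothetical helper for a UDF project; not part of Spark or Hive.
final class ClasspathCheck {
    // Returns true if the named class can be located by this class's loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

For example, `ClasspathCheck.isClassAvailable("jodd.datetime.JDateTime")` would tell a UDF whether it needs its own shaded copy of the class on a given Spark release.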

### Why are the changes needed?

Remove an unused dependency.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51618 from pan3793/SPARK-48230.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>