Commit bbda3f9

wangyum authored and gengliangwang committed
[SPARK-46396][SQL] Timestamp inference should not throw exception (apache#363)
### What changes were proposed in this pull request?

When setting `spark.sql.legacy.timeParserPolicy=LEGACY`, Spark will use the LegacyFastTimestampFormatter to infer potential timestamp columns. The inference shouldn't throw exception.

However, when the input is 23012150952, there is exception:

```
For input string: "23012150952"
java.lang.NumberFormatException: For input string: "23012150952"
	at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
	at java.base/java.lang.Integer.parseInt(Integer.java:668)
	at java.base/java.lang.Integer.parseInt(Integer.java:786)
	at org.apache.commons.lang3.time.FastDateParser$NumberStrategy.parse(FastDateParser.java:304)
	at org.apache.commons.lang3.time.FastDateParser.parse(FastDateParser.java:1045)
	at org.apache.commons.lang3.time.FastDateFormat.parse(FastDateFormat.java:651)
	at org.apache.spark.sql.catalyst.util.LegacyFastTimestampFormatter.parseOptional(TimestampFormatter.scala:418)
```

This PR is to fix the issue.

### Why are the changes needed?

Bug fix, Timestamp inference should not throw exception

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

New test case + existing tests

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#44338 from gengliangwang/fixParseOptional.

Authored-by: Gengliang Wang <[email protected]>
(cherry picked from commit 4a79ae9)
Signed-off-by: Gengliang Wang <[email protected]>
Co-authored-by: Gengliang Wang <[email protected]>
1 parent 94f10e1 commit bbda3f9

2 files changed

Lines changed: 10 additions & 5 deletions

sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala

Lines changed: 8 additions & 4 deletions
@@ -412,10 +412,14 @@ class LegacyFastTimestampFormatter(
 
   override def parseOptional(s: String): Option[Long] = {
     cal.clear() // Clear the calendar because it can be re-used many times
-    if (fastDateFormat.parse(s, new ParsePosition(0), cal)) {
-      Some(extractMicros(cal))
-    } else {
-      None
+    try {
+      if (fastDateFormat.parse(s, new ParsePosition(0), cal)) {
+        Some(extractMicros(cal))
+      } else {
+        None
+      }
+    } catch {
+      case NonFatal(_) => None
     }
   }
 
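
The shape of this fix can be tried in isolation. The sketch below is not Spark code: `looseParse` is a hypothetical stand-in for a parser that, like `fastDateFormat.parse` in the stack trace, reports an ordinary mismatch by returning `false` but can still throw on some inputs. Wrapping the call in `try`/`catch` with `NonFatal` turns both outcomes into `None`.

```scala
import scala.util.control.NonFatal

object ParseOptionalSketch {
  // Hypothetical stand-in (not a Spark or commons-lang3 API): returns false on a
  // plain mismatch, but throws NumberFormatException when the digits overflow Int.
  def looseParse(s: String, out: Array[Int]): Boolean = {
    if (s.isEmpty || !s.forall(c => c >= '0' && c <= '9')) {
      false
    } else {
      out(0) = Integer.parseInt(s) // may throw on overflow
      true
    }
  }

  // Same structure as the patched parseOptional: any non-fatal exception
  // raised while parsing is treated as "could not parse".
  def parseOptional(s: String): Option[Int] = {
    val out = new Array[Int](1)
    try {
      if (looseParse(s, out)) Some(out(0)) else None
    } catch {
      case NonFatal(_) => None
    }
  }
}
```

`NonFatal` does not match errors such as `OutOfMemoryError` or `InterruptedException`, so those still propagate instead of being reported as a failed parse.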

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala

Lines changed: 2 additions & 1 deletion
@@ -502,9 +502,10 @@ class TimestampFormatterSuite extends DatetimeFormatterSuite {
 
     assert(fastFormatter.parseOptional("2023-12-31 23:59:59.9990").contains(1704067199999000L))
     assert(fastFormatter.parseOptional("abc").isEmpty)
+    assert(fastFormatter.parseOptional("23012150952").isEmpty)
 
     assert(simpleFormatter.parseOptional("2023-12-31 23:59:59.9990").contains(1704067208990000L))
     assert(simpleFormatter.parseOptional("abc").isEmpty)
-
+    assert(simpleFormatter.parseOptional("23012150952").isEmpty)
   }
 }
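
The new assertions use the 11-digit input from the reported stack trace, where the exception message shows that exact string reaching `Integer.parseInt`. A minimal standalone check, using only the Scala and Java standard libraries (the object and value names are illustrative), suggests why that call fails: the value does not fit in an `Int`.

```scala
import scala.util.Try

object OverflowCheck {
  val input = "23012150952"

  // The digits parse fine as a Long, but the value is larger than Int.MaxValue.
  val asLong: Long = input.toLong
  val exceedsInt: Boolean = asLong > Int.MaxValue

  // Integer.parseInt rejects out-of-range values with NumberFormatException.
  val parseAttempt: Try[Int] = Try(Integer.parseInt(input))
}
```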
