Commit 17974e2
[SPARK-28015][SQL] Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats
Fix `stringToDate()` for the formats `yyyy` and `yyyy-[m]m`, which assumed that there are no additional chars after the last components `yyyy` and `[m]m`. In the PR, I propose to check that the entire input was consumed for these formats.
After the fix, the input `1999 08 01` is invalid because it matches the pattern `yyyy` but the string contains the additional chars ` 08 01`.
From Spark 1.6.3 through 2.4.3, the behavior has been the same:
```
spark-sql> SELECT CAST('1999 08 01' AS DATE);
1999-01-01
```
This PR makes it return NULL like Hive.
```
spark-sql> SELECT CAST('1999 08 01' AS DATE);
NULL
```
Added new checks to `DateTimeUtilsSuite` for the `1999 08 01` and `1999 08` inputs.
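The check described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `StringToDateSketch` and `parseDate` are hypothetical names, the parser works on a plain `String` with `java.time.LocalDate`, and it assumes (for illustration only) that a time part introduced by a space or `T` is tolerated only after a full `yyyy-[m]m-[d]d` date.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical, simplified stand-in for stringToDate(); names and types are assumptions.
object StringToDateSketch {
  def parseDate(input: String): Option[LocalDate] = {
    val s = input.trim
    val segments = Array(1, 1, 1) // year, month, day; month and day default to 1
    var i = 0       // index of the segment currently being parsed
    var j = 0       // current position in the input
    var current = 0 // numeric value of the current segment
    var digits = 0  // number of digits read for the current segment

    // Read date segments until the input ends or a ' ' / 'T' separator is reached.
    while (j < s.length && s(j) != ' ' && s(j) != 'T') {
      val c = s(j)
      if (c == '-' && i < 2) {
        // A segment must not be empty, and the year must have exactly 4 digits.
        if (digits == 0 || (i == 0 && digits != 4)) return None
        segments(i) = current
        current = 0
        digits = 0
        i += 1
      } else if (c >= '0' && c <= '9') {
        val maxDigits = if (i == 0) 4 else 2
        if (digits >= maxDigits) return None
        current = current * 10 + (c - '0')
        digits += 1
      } else {
        return None
      }
      j += 1
    }
    if (digits == 0 || (i == 0 && digits != 4)) return None

    // The idea of the fix: for `yyyy` and `yyyy-[m]m`, the entire input must be consumed.
    if (i < 2 && j < s.length) return None

    segments(i) = current
    // LocalDate.of rejects out-of-range months and days.
    Try(LocalDate.of(segments(0), segments(1), segments(2))).toOption
  }
}
```

Under these assumptions, `StringToDateSketch.parseDate("1999 08 01")` and `StringToDateSketch.parseDate("1999 08")` both return `None`, while `"1999"` and `"1999-08"` still parse with the missing components defaulting to 1.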
Closes #25097 from MaxGekk/spark-28015-invalid-date-format.
Authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
1 parent 1abac14 commit 17974e2
2 files changed
Lines changed: 10 additions & 0 deletions
File tree
- sql/catalyst/src
- main/scala/org/apache/spark/sql/catalyst/util
- test/scala/org/apache/spark/sql/catalyst/util
Lines changed: 4 additions & 0 deletions (new lines 501-504)
Lines changed: 6 additions & 0 deletions (new lines 163-165 and 342-344)