[SPARK-29873][SQL][TESTS] Support --import directive to load queries from another test case in SQLQueryTestSuite
#26497
Conversation
Test build #113681 has finished for PR 26497 at commit
select interval;
select interval 1 fake_unit;
select interval 1 year to month;
select 1 year to month;
we should move some tests to ansi/interval.sql. They are not duplicate tests; they test without the leading "interval".
Ah, I see. I'll recheck later. Thanks!
@@ -0,0 +1,2 @@
-- throw an exception instead of returning NULL, according to SQL ANSI 2011
can we add a little more description? when will we throw an exception?
@@ -0,0 +1,2 @@
-- malformed interval literal with ansi mode
hmm, do we still have interval related tests in literals.sql?
Is it ok to move all the interval-related queries here into interval.sql or ansi/interval.sql?
I think so. @yaooqinn do you have time to take it later? IIRC there are a few interval tests in group-by.sql as well.
ok. As for literals.sql, I'll move all the interval-related queries into the correct files in this PR.
some cases use interval but actually verify literals, such as:
-- awareness of the negative sign before type
select -integer '7';
select -date '1999-01-01';
select -timestamp '1999-01-01';
select -x'2379ACFe';
select +integer '7';
select +interval '1 second';
the group-by.sql contains avg and sum support, which should be moved to interval.sql
Test build #113698 has finished for PR 26497 at commit
// vol used by boolean.sql and case.sql.
localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
localSparkSession.conf.set(SQLConf.CROSS_JOINS_ENABLED.key, true)
not related to this PR, but can we remove it here? CROSS JOIN is already enabled by default now.
Oh, yes! I'll remove it.
localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
localSparkSession.conf.set(SQLConf.CROSS_JOINS_ENABLED.key, true)
localSparkSession.conf.set(SQLConf.ANSI_ENABLED.key, true)
We should remove it in a follow-up and see if it fails tests. Ideally, the pgsql dialect should not be affected by the ANSI mode config. cc @gengliangwang
thanks for adding it!
select interval '12:11:10' hour to second '1' year;

-- awareness of the negative sign before type
select +integer '7';
we shouldn't test integer in interval.sql, but we should test -interval '1 second' to match the comment
Ur, I moved this by mistake. I'll revert.
localSparkSession.udf.register("boolne", (b1: Boolean, b2: Boolean) => b1 != b2)
// vol used by boolean.sql and case.sql.
localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
nit: this comment can be removed.
ok
Test build #113707 has finished for PR 26497 at commit
Test build #113705 has finished for PR 26497 at commit
Test build #113709 has finished for PR 26497 at commit
-- SPARK-23179: SQL ANSI 2011 states that in case of overflow during arithmetic operations,
-- an exception should be thrown instead of returning NULL.
-- This is what most of the SQL DBs do (eg. SQLServer, DB2).
--import decimalArithmeticOperations.sql
This seems to fail still.
[info] - ansi/decimalArithmeticOperations.sql *** FAILED *** (2 seconds, 158 milliseconds)
Yea, thanks for catching it! I've fixed it now.
@@ -0,0 +1,2 @@
-- malformed interval literal with ansi mode
--import literals.sql
This one still fails, too.
[info] - literals.sql *** FAILED *** (1 second, 989 milliseconds)
-- SPARK-23179: SQL ANSI 2011 states that in case of overflow during arithmetic operations,
-- an exception should be thrown instead of returning NULL.
-- This is what most of the SQL DBs do (eg. SQLServer, DB2).
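The ANSI overflow behavior quoted above can be illustrated outside Spark. The sketch below is hypothetical Python, not Spark's actual Decimal code; the function name `to_precision` and its signature are made up for illustration. It shows the two behaviors under discussion: with ANSI mode an overflow raises, without it the value silently becomes NULL (None here).

```python
# Hypothetical sketch of ANSI vs. non-ANSI decimal overflow handling.
from decimal import Decimal, ROUND_HALF_UP

def to_precision(value: Decimal, precision: int, scale: int, ansi: bool):
    """Round `value` to `scale` fractional digits; if the result needs more
    than `precision` total digits, either raise (ANSI) or return None."""
    rounded = value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
    if len(rounded.as_tuple().digits) > precision:
        if ansi:
            raise ArithmeticError(
                f"{value} cannot be represented as Decimal({precision}, {scale})")
        return None  # non-ANSI mode: silently return NULL
    return rounded

print(to_precision(Decimal("12345.678"), 7, 2, ansi=True))   # 12345.68
print(to_precision(Decimal("12345.678"), 6, 2, ansi=False))  # None
```

With `ansi=True` the second call would instead raise an ArithmeticError, which is the behavior the quoted test comment describes.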
Since inputs/decimalArithmeticOperations.sql has some nondeterministic output with ansi=true, I inlined the ANSI-related queries in it.
this is surprising. What causes the nondeterminism?
For example, the queries below in inputs/decimalArithmeticOperations.sql throw multiple exceptions with different error messages in the executors with ansi=true:
sql("SET spark.sql.decimalOperations.allowPrecisionLoss=false")
sql("SET spark.sql.ansi.enabled=true")
sql("create table decimals_test(id int, a decimal(38,18), b decimal(38,18)) using parquet")
sql("insert into decimals_test values(1, 100.0, 999.0), (2, 12345.123, 12345.123), (3, 0.1234567891011, 1234.1), (4, 123456789123456789.0, 1.123456789123456789)")
sql("select id, a+b, a-b, a*b, a/b from decimals_test order by id").show()
java.lang.ArithmeticException: Decimal(expanded,138698367904130467.51562262075019052100,38,20}) cannot be represented as Decimal(38, 36).
at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
...
java.lang.ArithmeticException: Decimal(expanded,99900.000000000000000000000000000000000,38,33}) cannot be represented as Decimal(38, 36).
at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
...
java.lang.ArithmeticException: Decimal(expanded,152.35802342966751000000000000000000000,38,35}) cannot be represented as Decimal(38, 36).
at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
...
<more exceptions below>
So, the output string printed in decimalArithmeticOperations.sql.out depends on timing.
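The timing dependence can be sketched in plain Python (a hypothetical illustration, not Spark code): several "tasks" each fail with a different message, and whichever failure is observed first becomes the reported error, so repeated runs can report different messages.

```python
# Hypothetical sketch: the first task failure to complete wins the race.
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

MESSAGES = [
    "Decimal(...,38,20) cannot be represented as Decimal(38, 36)",
    "Decimal(...,38,33) cannot be represented as Decimal(38, 36)",
    "Decimal(...,38,35) cannot be represented as Decimal(38, 36)",
]

def task(msg):
    time.sleep(random.random() / 100)  # simulate per-task scheduling jitter
    raise ArithmeticError(msg)

def run_query():
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(task, m) for m in MESSAGES]
        # The first failure to complete becomes the query's reported error.
        first = next(as_completed(futures))
        return str(first.exception())

print(run_query())  # prints one of the three messages, chosen by timing
```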
ah this is nasty, but no better ideas.
Yea, I have no better idea either.
Test build #113741 has finished for PR 26497 at commit
thanks, merging to master!
Thanks for merging!
…f ansi mode config

What changes were proposed in this pull request?
Fix the inconsistent behavior of the built-in SQL functions LEFT/RIGHT.

Why are the changes needed?
As noted in the #26497 (comment), the PostgreSQL dialect should not be affected by the ANSI mode config. While rerunning the existing tests, only the LEFT/RIGHT built-in SQL functions broke that assumption. We fix this by following https://www.postgresql.org/docs/12/sql-keywords-appendix.html: `LEFT/RIGHT reserved (can be function or type)`.

Does this PR introduce any user-facing change?
Yes, the PostgreSQL dialect will no longer be affected by the ANSI mode config.

How was this patch tested?
Existing UT.

Closes #26584 from xuanyuanking/SPARK-29951.
Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This PR is to support an `--import` directive to load queries from another test case in SQLQueryTestSuite. This fix comes from the @cloud-fan suggestion in #26479 (comment).
Why are the changes needed?
This functionality might reduce duplicate test code in SQLQueryTestSuite.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Run SQLQueryTestSuite.
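As a rough sketch of the idea (in Python rather than the suite's actual Scala implementation, with a made-up function name and simplified one-query-per-line parsing), an `--import` directive can be expanded by splicing the imported file's queries in place of the directive line:

```python
# Hypothetical sketch of expanding `--import other.sql` directives.
from pathlib import Path

def load_queries(path: Path) -> list[str]:
    """Collect queries from a test file, expanding --import directives.
    For simplicity, each non-comment line is treated as one query."""
    queries: list[str] = []
    for line in path.read_text().splitlines():
        stripped = line.strip()
        if stripped.startswith("--import "):
            # Resolve the imported file relative to the current test file.
            imported = path.parent / stripped[len("--import "):].strip()
            queries.extend(load_queries(imported))  # recurse into the import
        elif stripped and not stripped.startswith("--"):
            queries.append(stripped)
    return queries
```

For example, if base.sql contains `select 1;` and main.sql starts with `--import base.sql` followed by `select 2;`, loading main.sql would yield both queries in order, which is how the ANSI variants above reuse the base test files.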