Conversation

@maropu (Member) commented Nov 13, 2019

What changes were proposed in this pull request?

This PR adds support for an --import directive that loads queries from another test case in SQLQueryTestSuite.

This fix comes from the @cloud-fan suggestion in #26479 (comment)
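For example, an input test file can pull in every query from another input file with a single directive, as in the two-line ansi/literals.sql input file added in this PR:

-- malformed interval literal with ansi mode
--import literals.sql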

Why are the changes needed?

This functionality might reduce duplicate test code in SQLQueryTestSuite.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Ran SQLQueryTestSuite.
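For reference, the suite can be run on its own, e.g. via build/sbt "sql/test-only *SQLQueryTestSuite" (the exact sbt task name may differ depending on the sbt version in use).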

@SparkQA commented Nov 13, 2019

Test build #113681 has finished for PR 26497 at commit bfa07a5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • protected case class AnsiTestCase(

select interval;
select interval 1 fake_unit;
select interval 1 year to month;
select 1 year to month;
Contributor

We should move some tests to ansi/interval.sql. They are not duplicated tests; they test the same intervals without the leading "interval" keyword.

Member Author

Ah, I see. I'll recheck later. Thanks!

@@ -0,0 +1,2 @@
-- throw an exception instead of returning NULL, according to SQL ANSI 2011
Contributor

Can we add a little more description? When will we throw an exception?

@@ -0,0 +1,2 @@
--- malformed interval literal with ansi mode
Contributor

Hmm, do we still have interval-related tests in literals.sql?

Member Author

Is it ok to move all the interval-related queries here into interval.sql or ansi/interval.sql?

Contributor

I think so. @yaooqinn do you have time to take it later? IIRC there are a few interval tests in group-by.sql as well.

Member Author

OK. As for literals.sql, I'll move all the interval-related queries into the correct files in this PR.

Member

Some cases use interval but actually verify literal syntax, such as:

-- awareness of the negative sign before type
select -integer '7';
select -date '1999-01-01';
select -timestamp '1999-01-01';
select -x'2379ACFe';
select +integer '7';
select +interval '1 second';

group-by.sql contains avg and sum support for intervals, which should be moved to interval.sql (see the sketch below).
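A sketch of the kind of queries meant here (illustrative, not the exact lines from group-by.sql):

select avg(v), sum(v) from values (interval '1 seconds'), (interval '3 seconds') t(v);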

@SparkQA commented Nov 13, 2019

Test build #113698 has finished for PR 26497 at commit 56ceb9f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • protected case class AnsiTestCase(

// vol used by boolean.sql and case.sql.
localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
localSparkSession.conf.set(SQLConf.CROSS_JOINS_ENABLED.key, true)
Contributor

Not related to this PR, but can we remove it here? CROSS JOIN is already enabled by default now.

Member Author

Oh, yes! I'll remove it.

localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
localSparkSession.conf.set(SQLConf.CROSS_JOINS_ENABLED.key, true)
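// Enable ANSI SQL mode for the pgSQL tests.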
localSparkSession.conf.set(SQLConf.ANSI_ENABLED.key, true)
Contributor

We should remove it in a follow-up and see if any tests fail. Ideally the pgSQL dialect should not be affected by the ANSI mode config. cc @gengliangwang

@cloud-fan (Contributor)

thanks for adding it!

select interval '12:11:10' hour to second '1' year;

-- awareness of the negative sign before type
select +integer '7';
Contributor

We shouldn't test integer in interval.sql; we should test -interval '1 second' to match the comment (see the sketch below).
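That is, a sketch of the suggested block:

-- awareness of the negative sign before type
select -interval '1 second';
select +interval '1 second';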

Member Author

Ur, I moved this by mistake. I'll revert.

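// boolne compares two booleans for inequality, mirroring PostgreSQL's boolne.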
localSparkSession.udf.register("boolne", (b1: Boolean, b2: Boolean) => b1 != b2)
// vol used by boolean.sql and case.sql.
localSparkSession.udf.register("vol", (s: String) => s)
// PostgreSQL enabled cartesian product by default.
Contributor

nit: this comment can be removed.

Member Author

ok

@SparkQA commented Nov 13, 2019

Test build #113707 has finished for PR 26497 at commit 615347e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 13, 2019

Test build #113705 has finished for PR 26497 at commit 0d9e462.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 13, 2019

Test build #113709 has finished for PR 26497 at commit 93051fc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

-- SPARK-23179: SQL ANSI 2011 states that in case of overflow during arithmetic operations,
-- an exception should be thrown instead of returning NULL.
-- This is what most of the SQL DBs do (eg. SQLServer, DB2).
--import decimalArithmeticOperations.sql
Member

This still seems to fail.

[info] - ansi/decimalArithmeticOperations.sql *** FAILED *** (2 seconds, 158 milliseconds)

Member Author

Yea, thanks for catching it! It's fixed now.

@@ -0,0 +1,2 @@
--- malformed interval literal with ansi mode
--import literals.sql
Member

This one still fails, too.

[info] - literals.sql *** FAILED *** (1 second, 989 milliseconds)

-- SPARK-23179: SQL ANSI 2011 states that in case of overflow during arithmetic operations,
-- an exception should be thrown instead of returning NULL.
-- This is what most of the SQL DBs do (eg. SQLServer, DB2).

Member Author

Since inputs/decimalArithmeticOperations.sql produces nondeterministic output with ansi=true, I inlined the ANSI-related queries instead of importing the file.

Contributor

This is surprising. What causes the nondeterminism?

Member Author

For example, the queries below, taken from inputs/decimalArithmeticOperations.sql, throw multiple exceptions with different error messages across executors when ansi=true:

sql("SET spark.sql.decimalOperations.allowPrecisionLoss=false")
sql("SET spark.sql.ansi.enabled=true")
sql("create table decimals_test(id int, a decimal(38,18), b decimal(38,18)) using parquet")
sql("insert into decimals_test values(1, 100.0, 999.0), (2, 12345.123, 12345.123), (3, 0.1234567891011, 1234.1), (4, 123456789123456789.0, 1.123456789123456789)")
sql("select id, a+b, a-b, a*b, a/b from decimals_test order by id").show()

java.lang.ArithmeticException: Decimal(expanded,138698367904130467.51562262075019052100,38,20}) cannot be represented as Decimal(38, 36).
	at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
...
java.lang.ArithmeticException: Decimal(expanded,99900.000000000000000000000000000000000,38,33}) cannot be represented as Decimal(38, 36).
	at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
...
java.lang.ArithmeticException: Decimal(expanded,152.35802342966751000000000000000000000,38,35}) cannot be represented as Decimal(38, 36).
	at org.apache.spark.sql.types.Decimal.toPrecision(Decimal.scala:357)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
...
<more exceptions below>

So the output string printed in decimalArithmeticOperations.sql.out depends on timing: whichever task fails first determines which exception message is reported.

Contributor

Ah, this is nasty, but I have no better ideas.

@maropu (Member Author) Nov 14, 2019

Yea, I have no better idea, either.

@SparkQA commented Nov 14, 2019

Test build #113741 has finished for PR 26497 at commit ab0730b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in b5a02d3 Nov 14, 2019
@cloud-fan (Contributor)

thanks, merging to master!

@maropu (Member Author) commented Nov 14, 2019

Thanks for merging!

cloud-fan pushed a commit that referenced this pull request Nov 20, 2019
…f ansi mode config

### What changes were proposed in this pull request?
Fix the inconsistent behavior of the built-in SQL functions LEFT/RIGHT.

### Why are the changes needed?
As noted in the comment in #26497 (comment), the PostgreSQL dialect should not be affected by the ANSI mode config.
While rerunning the existing tests, only the LEFT/RIGHT built-in SQL functions broke this assumption. We fix this by following https://www.postgresql.org/docs/12/sql-keywords-appendix.html: `LEFT/RIGHT reserved (can be function or type)`
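For context, LEFT/RIGHT here are the built-in string functions, e.g. (illustrative queries, not taken from the patch):

select left('Spark SQL', 5);  -- returns 'Spark'
select right('Spark SQL', 3); -- returns 'SQL'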

### Does this PR introduce any user-facing change?
Yes, the PostgreSQL dialect will no longer be affected by the ANSI mode config.

### How was this patch tested?
Existing UT.

Closes #26584 from xuanyuanking/SPARK-29951.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
xuanyuanking added a commit to xuanyuanking/spark that referenced this pull request Dec 9, 2019
…f ansi mode config
