Skip to content

Conversation

@beliefer
Copy link
Contributor

@beliefer beliefer commented Nov 26, 2020

What changes were proposed in this pull request?

Currently in Spark one could redefine a window. For instance:

select count(*) OVER w FROM tenk1 WINDOW w AS (ORDER BY unique1), w AS (ORDER BY unique1);
The window w is defined two times. In PgSQL, on the other hand, a thrown will happen:

ERROR: window "w" is already defined

Why are the changes needed?

The current implement gives the following window definitions a higher priority. But it wasn't Spark's intention and users can't know from any document of Spark.
This PR fixes the bug.

Does this PR introduce any user-facing change?

Yes.
There is an example query output with/without this fix.

SELECT
    employee_name,
    salary,
    first_value(employee_name) OVER w highest_salary,
    nth_value(employee_name, 2) OVER w second_highest_salary
FROM
    basic_pays
WINDOW
    w AS (ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING),
    w AS (ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 2 FOLLOWING)
ORDER BY salary DESC

The output before this fix:

Larry Bott	11798	Larry Bott	Gerard Bondur
Gerard Bondur	11472	Larry Bott	Gerard Bondur
Pamela Castillo	11303	Larry Bott	Gerard Bondur
Barry Jones	10586	Larry Bott	Gerard Bondur
George Vanauf	10563	Larry Bott	Gerard Bondur
Loui Bondur	10449	Larry Bott	Gerard Bondur
Mary Patterson	9998	Larry Bott	Gerard Bondur
Steve Patterson	9441	Larry Bott	Gerard Bondur
Julie Firrelli	9181	Larry Bott	Gerard Bondur
Jeff Firrelli	8992	Larry Bott	Gerard Bondur
William Patterson	8870	Larry Bott	Gerard Bondur
Diane Murphy	8435	Larry Bott	Gerard Bondur
Leslie Jennings	8113	Larry Bott	Gerard Bondur
Gerard Hernandez	6949	Larry Bott	Gerard Bondur
Foon Yue Tseng	6660	Larry Bott	Gerard Bondur
Anthony Bow	6627	Larry Bott	Gerard Bondur
Leslie Thompson	5186	Larry Bott	Gerard Bondur

The output after this fix:

struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException

The definition of window 'w' is repetitive(line 8, pos 0)

How was this patch tested?

Jenkins test.

beliefer and others added 30 commits June 19, 2020 10:36
@github-actions github-actions bot added the SQL label Nov 26, 2020
@SparkQA
Copy link

SparkQA commented Nov 26, 2020

Test build #131824 has finished for PR 30512 at commit 43e8542.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 26, 2020

Test build #131832 has finished for PR 30512 at commit 9f29766.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@beliefer beliefer changed the title [SPARK-28645][SQL] Throw ParseException on window redefinition [SPARK-28645][SQL] ParseException is thrown when the window is redefined Nov 27, 2020
@beliefer
Copy link
Contributor Author

cc @cloud-fan

@maropu
Copy link
Member

maropu commented Nov 27, 2020

The fix looks fine. In the PR description, could you add an example query output with/without this fix?

@beliefer
Copy link
Contributor Author

The fix looks fine. In the PR description, could you add an example query output with/without this fix?

OK

FROM
basic_pays
WINDOW
w AS (ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the window spec name case insensitive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the PgSQL and find the alias of window frame is case insensitive.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in e432550 Nov 27, 2020
@beliefer
Copy link
Contributor Author

@cloud-fan @maropu Thanks for your work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants