Skip to content

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Sep 9, 2024

What changes were proposed in this pull request?

The pr is following up #48019.

Why are the changes needed?

Fix flaky CollationSuite .
After this PR, GA's CollationSuite failed, let's fix it first.
https://github.com/panbingkun/spark/actions/runs/10765984604/job/29851047311
image

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Update existed UT.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Sep 9, 2024
@panbingkun panbingkun changed the title [SPARK-49246][TESTS] Fix CollationSuite [SPARK-49246][FOLLOWUP][TESTS] Fix CollationSuite Sep 9, 2024
@panbingkun panbingkun marked this pull request as ready for review September 9, 2024 06:59
Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @amaliujia @cloud-fan Please, review this since your changes cause the test failure.

@cloud-fan
Copy link
Contributor

I'm confused, the test were all green when I merged it.

val result = sql(s"select distinct a from $table").collect()
assert(result.length === 1)
val data = result.head.get(0).asInstanceOf[scala.collection.mutable.ArraySeq[String]]
assert(data === Array("aaa") || data === Array("AAA"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the test becomes flaky? Reading the test query I think both aaa and AAA are correct result. cc @stefankandic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it becomes flaky. The returned value is not non-deterministic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean just non-deterministic, I guess.

@MaxGekk
Copy link
Member

MaxGekk commented Sep 9, 2024

I guess we should extend checkAnswer, so, it should be aware of collations in comparing of resulted strings. WDYT?

@panbingkun
Copy link
Contributor Author

I guess we should extend checkAnswer, so, it should be aware of collations in comparing of resulted strings. WDYT?

I want to do this, but maybe we can let it in first and turn GA green.

@MaxGekk
Copy link
Member

MaxGekk commented Sep 9, 2024

I want to do this, but maybe we can let it in first and turn GA green.

Agree, let's unblock other PRs first of all.

@MaxGekk
Copy link
Member

MaxGekk commented Sep 9, 2024

@panbingkun branch-3.5 is flaky too? Should we backport your changes to branch-3.5?

@panbingkun
Copy link
Contributor Author

@panbingkun branch-3.5 is flaky too? Should we backport your changes to branch-3.5?

Branch-3.5 does not have this class, as follows:
(base) ➜ spark-community git:(branch-3.5) ✗ find . -name "CollationSuite.scala"
(base) ➜ spark-community git:(branch-3.5) ✗

@LuciferYang LuciferYang changed the title [SPARK-49246][FOLLOWUP][TESTS] Fix CollationSuite [SPARK-49246][SQL][TESTS][FOLLOWUP] Fix CollationSuite Sep 9, 2024
checkAnswer(sql(s"select distinct a from $table"), Seq(Row(Seq("aaa"))))
val result = sql(s"select distinct a from $table").collect()
assert(result.length === 1)
val data = result.head.get(0).asInstanceOf[scala.collection.mutable.ArraySeq[String]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val data = result.head.get(0).asInstanceOf[scala.collection.mutable.ArraySeq[String]]
val data = result.head.getSeq[String](0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

@panbingkun panbingkun changed the title [SPARK-49246][SQL][TESTS][FOLLOWUP] Fix CollationSuite [SPARK-49246][SQL][TESTS][FOLLOWUP] Fix flaky CollationSuite Sep 9, 2024
@zhengruifeng
Copy link
Contributor

thanks @panbingkun for working on it, I was also confused for a while

@cloud-fan
Copy link
Contributor

We should discuss the general fix carefully. I don't think we should do case insensitive string comparison in checkAnswer blindly. For example, aAa is not a correct answer while aaa and AAA are.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 20643bb Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants