Use logical null count in case_when_with_expr
#18872
comphead
left a comment
Thanks @pepijnve, I ran those tests on main now and they passed 🤔
I would expect them to fail without this PR change?
No, that's expected. The code was computing the correct result in spite of this. With this change the null handling code path is taken and the null values get filtered out before reaching the when branch handling.
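To illustrate why the tests pass either way, here is a rough row-at-a-time sketch, not the actual vectorized DataFusion code. The function `case_when` and its `filter_nulls_first` flag are hypothetical names made up for this example. It assumes a null base value never matches a WHEN value and therefore falls through to ELSE, so both paths produce the same output and only the number of equality comparisons differs.

```rust
/// Toy model of `CASE <expr> WHEN <v> THEN <r> ... ELSE <e> END` over nullable rows.
/// Hypothetical helper for illustration only; not DataFusion's implementation.
/// Returns the per-row results and the number of equality comparisons performed.
fn case_when(
    rows: &[Option<i32>],
    whens: &[(i32, &'static str)],
    else_result: &'static str,
    filter_nulls_first: bool,
) -> (Vec<&'static str>, usize) {
    let mut comparisons = 0;
    let mut out = Vec::with_capacity(rows.len());
    for row in rows {
        // Early null path: a null base value cannot equal any WHEN value,
        // so it goes straight to ELSE without a single comparison.
        if filter_nulls_first && row.is_none() {
            out.push(else_result);
            continue;
        }
        let mut result = else_result;
        for (when_value, then_result) in whens {
            comparisons += 1;
            // `None == Some(_)` is false, mirroring that NULL never matches.
            if *row == Some(*when_value) {
                result = *then_result;
                break;
            }
        }
        out.push(result);
    }
    (out, comparisons)
}
```

With rows `[Some(1), None, Some(2), None]` and WHEN values `1` and `2`, both settings of the flag return the same results, but the early null path does fewer comparisons because the null rows never reach the WHEN loop.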
alamb
left a comment
Thanks @pepijnve -- I went over the tests carefully and it looks good to me
comphead
left a comment
Thanks @pepijnve for the explanation, makes a lot of sense
## Which issue does this PR close?

- None, followup to apache#18152

## Rationale for this change

`case_when_with_expr` has a code path to handle `null` values early on in the evaluation. It determines the presence of nulls using `value_array.null_count() > 0`. For nested arrays this may not be correct. `logical_null_count()` should be used instead.

## What changes are included in this PR?

Check `logical_null_count` instead of `null_count`.

## Are these changes tested?

Additional tests added to ensure the nested array case is tested. The test already passed before this change, but was a bit less efficient since the null values were tested for equality against each possible when value.

## Are there any user-facing changes?

No
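For readers unfamiliar with the distinction, the following is a minimal std-only sketch of how a physical and a logical null count can disagree. It uses a dictionary-style column as the example, which is an assumption on my part; the PR text only says "nested arrays". `DictColumn` and `sample_column` are hypothetical names for illustration and are not the arrow-rs types; only the method names mirror `null_count` and `logical_null_count`.

```rust
/// Toy stand-in for a dictionary-encoded column.
/// Hypothetical type for illustration; this is not the arrow-rs API.
struct DictColumn {
    /// One key per row; `None` marks a row that is physically null.
    keys: Vec<Option<usize>>,
    /// Dictionary values; an entry may itself be null.
    values: Vec<Option<&'static str>>,
}

impl DictColumn {
    /// Physical null count: only the keys' own validity is inspected.
    fn null_count(&self) -> usize {
        self.keys.iter().filter(|k| k.is_none()).count()
    }

    /// Logical null count: a row is null if its key is null
    /// or if its key points at a null dictionary value.
    fn logical_null_count(&self) -> usize {
        self.keys
            .iter()
            .filter(|k| match **k {
                None => true,
                Some(i) => self.values[i].is_none(),
            })
            .count()
    }
}

/// Four rows with valid keys, two of which point at a null value.
fn sample_column() -> DictColumn {
    DictColumn {
        keys: vec![Some(0), Some(1), Some(0), Some(1)],
        values: vec![Some("a"), None],
    }
}
```

For `sample_column()`, a check like `null_count() > 0` is false even though two rows are logically null, which is the situation where the early null handling path would be skipped.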