Use prep_null_mask_filter to handle nulls in selection mask #9163
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -1209,7 +1209,14 @@ impl SMJStream { | |
| ) { | ||
| // The reverse of the selection mask. For the rows not pass join filter above, | ||
| // we need to join them (left or right) with null rows for outer joins. | ||
| let not_mask = compute::not(mask)?; | ||
| let not_mask = if mask.null_count() > 0 { | ||
| // If the mask contains nulls, we need to use `prep_null_mask_filter` to | ||
| // handle the nulls in the mask as false. | ||
viirya marked this conversation as resolved.
| compute::not(&compute::prep_null_mask_filter(mask))? | ||
Member
Author
Using the test in … For the query … Here we take the reverse of the mask. Because the mask value is null, its reverse is still null, which the filter treats as false, but this row should actually be selected here. So we need to call …
| } else { | ||
| compute::not(mask)? | ||
| }; | ||
| let null_joined_batch = | ||
| compute::filter_record_batch(&output_batch, &not_mask)?; | ||
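To illustrate why the null check in this hunk matters, here is a minimal stdlib-only sketch. It does not use the arrow `compute` API: the nullable mask is modelled as `Vec<Option<bool>>`, and `not`, `nulls_to_false`, `filter`, `null_joined_naive`, and `null_joined` are hypothetical helpers named for this example only. The assumption is that a filter keeps a row only when its mask value is a non-null `true`, so negating a null leaves it null and the row is dropped from both the matched and the null-joined output.

```rust
/// A nullable boolean mask, modelled with `Option<bool>` (`None` = null).
/// This is a stand-in for illustration, not the arrow `BooleanArray`.
type Mask = Vec<Option<bool>>;

/// Element-wise negation that leaves nulls as nulls.
fn not(mask: &Mask) -> Mask {
    mask.iter().map(|v| v.map(|b| !b)).collect()
}

/// Replace every null with `false`, so the mask has no nulls afterwards.
fn nulls_to_false(mask: &Mask) -> Mask {
    mask.iter().map(|v| Some(v.unwrap_or(false))).collect()
}

/// Keep only the rows whose mask value is a non-null `true`.
fn filter<'a>(rows: &[&'a str], mask: &Mask) -> Vec<&'a str> {
    rows.iter()
        .zip(mask)
        .filter(|(_, m)| **m == Some(true))
        .map(|(r, _)| *r)
        .collect()
}

/// Negate the mask directly: rows with a null mask value are lost.
fn null_joined_naive<'a>(rows: &[&'a str], mask: &Mask) -> Vec<&'a str> {
    filter(rows, &not(mask))
}

/// Treat nulls as `false` first, then negate: rows with a null mask value
/// are selected for the null-joined output.
fn null_joined<'a>(rows: &[&'a str], mask: &Mask) -> Vec<&'a str> {
    filter(rows, &not(&nulls_to_false(mask)))
}
```

With rows `["a", "b", "c"]` and mask `[true, false, null]`, the naive version selects only `"b"`, while the null-aware version selects `"b"` and `"c"`.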
@@ -1254,6 +1261,20 @@ impl SMJStream { | |
| // For full join, we also need to output the null joined rows from the buffered side | ||
| if matches!(self.join_type, JoinType::Full) { | ||
| // Handle not mask for buffered side further. | ||
| // For buffered side, we want to output the rows that are not null joined with | ||
| // the streamed side. i.e. the rows that are not null in the `buffered_indices`. | ||
| let not_mask = if buffered_indices.null_count() > 0 { | ||
| let nulls = buffered_indices.nulls().unwrap(); | ||
| let mask = not_mask.values() & nulls.inner(); | ||
| BooleanArray::new(mask, None) | ||
Member
Author
For a full outer join, we need to output buffered rows that fail the join filter. But in the …
| } else { | ||
| not_mask | ||
| }; | ||
| let null_joined_batch = | ||
| compute::filter_record_batch(&output_batch, &not_mask)?; | ||
| let mut streamed_columns = self | ||
| .streamed_schema | ||
| .fields() | ||
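The buffered-side step in this hunk combines the negated mask with the validity of `buffered_indices`. A rough stdlib-only sketch of that combination follows. It assumes the negated mask is already null-free and models the indices as `Option<u64>`. The function name `buffered_not_mask` is hypothetical and not part of the PR.

```rust
/// Combine a null-free negated mask with the validity of the buffered indices.
/// A row is kept only if it failed the join filter (`not_mask` is true) AND
/// its buffered index is non-null, i.e. there is a real buffered row to emit.
/// Both inputs are plain stand-ins for the arrow arrays used in the diff.
fn buffered_not_mask(not_mask: &[bool], buffered_indices: &[Option<u64>]) -> Vec<bool> {
    not_mask
        .iter()
        .zip(buffered_indices)
        .map(|(keep, idx)| *keep && idx.is_some())
        .collect()
}
```

For example, with a negated mask of `[true, true, false]` and buffered indices `[0, null, 2]`, only the first row stays selected. The second has no buffered row, and the third passed the join filter.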
I re-ran the test added in #9080 on a new laptop and found this bug. I'm not sure why the test previously passed both locally and in CI in #9080. 🤔
I re-checked the results in `sort_merge_join.slt` and they should be correct (they are the same as in `join.slt`, which is produced by the hash join operator).