perf: Make is_in row-group pruning precise on null-containing haystacks #27495

Merged
ritchie46 merged 2 commits into pola-rs:main from azimafroozeh:fix/issues/27416_follow_up
May 4, 2026

Conversation

@azimafroozeh
Collaborator

@azimafroozeh azimafroozeh commented May 4, 2026

Follow-up to #27475. The helper try_extract_is_in_haystack used to give up on haystacks that contain nulls under the default nulls_equal=false. It now drops the nulls instead (as pyarrow does). Under nulls_equal=true, if a null was dropped, we add null_count(col) == 0 to the skip predicate so that row groups containing nulls aren't skipped.
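To make the skip rule concrete, here is a minimal sketch in plain Python. It is a toy model and not the Polars implementation: `RowGroupStats` and `can_skip` are hypothetical names, and the model assumes an integer column with only min, max and null-count statistics.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group statistics for one integer column."""
    min: Optional[int]  # None when the row group holds only nulls
    max: Optional[int]
    null_count: int


def can_skip(
    stats: RowGroupStats,
    haystack: Sequence[Optional[int]],
    nulls_equal: bool,
) -> bool:
    """Return True when no row in the group can satisfy col.is_in(haystack)."""
    # Drop nulls from the haystack instead of giving up on pruning.
    non_null = [v for v in haystack if v is not None]
    dropped_null = len(non_null) != len(haystack)

    if stats.min is None or stats.max is None:
        # No non-null values in this row group, so no value can match.
        no_value_match = True
    else:
        # Every remaining needle lies outside [min, max].
        no_value_match = all(v < stats.min or v > stats.max for v in non_null)

    if nulls_equal and dropped_null:
        # A null row would match the dropped null, so also require zero nulls.
        return no_value_match and stats.null_count == 0
    return no_value_match
```

With nulls_equal=false the dropped null never affects the result, so a row group with nulls can still be skipped. With nulls_equal=true the same row group must be kept whenever the haystack contained a null.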

Also fixes a hidden bug: col_has_no_nulls = col_nc.has_no_nulls(arena) computed a column-wide value that is always true when evaluated against the stats table, so the check had no effect. It is replaced with col_nc.eq(idx_zero, arena), which checks the null count per row group.
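A rough illustration of the difference, again in plain Python rather than the actual expression arena. The function names are hypothetical, and the reason the old check is always true is an assumption here: the sketch supposes it asked whether the stats table's null-count column itself contains nulls, which it never does because every count is a concrete integer.

```python
from typing import List

# Hypothetical stats table: one null_count entry per row group.
null_counts: List[int] = [0, 3, 0]


def column_wide_has_no_nulls(counts: List[int]) -> List[bool]:
    """Old shape (assumed): one boolean, broadcast to every row group."""
    # The counts are always concrete integers, so this flag is always True.
    flag = all(c is not None for c in counts)
    return [flag] * len(counts)


def per_row_group_has_no_nulls(counts: List[int]) -> List[bool]:
    """New shape: compare each row group's null count to zero."""
    return [c == 0 for c in counts]
```

In this model the column-wide version reports "no nulls" for the second row group even though its count is 3, while the per-row-group comparison flags it correctly.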

@github-actions github-actions Bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels May 4, 2026
Comment thread crates/polars-plan/src/plans/aexpr/predicates/column_expr.rs Outdated
nameexhaustion previously approved these changes May 4, 2026
Collaborator

@nameexhaustion nameexhaustion left a comment

@codecov

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 87.50000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.14%. Comparing base (62f38ef) to head (5a9c1d4).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...rs-plan/src/plans/aexpr/predicates/skip_batches.rs 80.95% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #27495      +/-   ##
==========================================
+ Coverage   80.54%   81.14%   +0.60%     
==========================================
  Files        1842     1842              
  Lines      254718   254720       +2     
  Branches     3181     3181              
==========================================
+ Hits       205173   206705    +1532     
+ Misses      48722    47192    -1530     
  Partials      823      823              

@azimafroozeh azimafroozeh force-pushed the fix/issues/27416_follow_up branch from 17c87ab to 5a9c1d4 Compare May 4, 2026 11:55
@ritchie46 ritchie46 merged commit cde27ea into pola-rs:main May 4, 2026
32 checks passed
Labels

performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars

3 participants