Commit 42af342
Main entry point for stripe filtering with predicate pushdown.
Mirrors Parquet's FilterRowGroups pattern.
Implementation:
1. Ensure complete metadata is loaded (file, manifest, statistics cache)
2. Call TestStripes to evaluate predicate against stripe statistics
3. Filter results to include only stripes where:
- Predicate is satisfiable (not literal(false))
- Stripe is non-empty (num_rows > 0)
4. Return vector of selected stripe indices
Stripes are skipped if:
- The predicate simplifies to literal(false) given statistics
- The stripe contains zero rows
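
A minimal sketch of the selection step described above, not the code from this commit. It assumes the predicate has already been tested against each stripe's statistics (the job of TestStripes in step 2). The names `StripeInfo`, `Simplified`, and `FilterStripesSketch` are hypothetical stand-ins introduced for illustration and are not part of the Arrow API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the simplified predicate of one stripe:
// kAlwaysFalse models literal(false); kMaybeTrue models anything else.
enum class Simplified { kAlwaysFalse, kMaybeTrue };

// Hypothetical per-stripe summary (assumed shape, for illustration only).
struct StripeInfo {
  int64_t num_rows;      // row count taken from stripe metadata
  Simplified predicate;  // predicate after simplification against statistics
};

// Returns the indices of stripes that still need to be read.
std::vector<int> FilterStripesSketch(const std::vector<StripeInfo>& stripes) {
  std::vector<int> selected;
  selected.reserve(stripes.size());
  for (int i = 0; i < static_cast<int>(stripes.size()); ++i) {
    const StripeInfo& stripe = stripes[i];
    // Skip: statistics prove that no row can match the predicate.
    if (stripe.predicate == Simplified::kAlwaysFalse) continue;
    // Skip: the stripe holds no rows at all.
    if (stripe.num_rows <= 0) continue;
    selected.push_back(i);
  }
  return selected;
}
```

Returning indices rather than stripe objects keeps the result cheap to copy and lets each caller (scan, subset, count) look up whatever stripe metadata it needs.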
This function is called by:
- ScanBatchesAsync (for scan optimization)
- Subset (for fragment splitting)
- TryCountRows (for count optimization)
Verified: Mirrors cpp/src/arrow/dataset/file_parquet.cc FilterRowGroups (lines 918-931)
Co-authored-by: Claude Sonnet 4.5 <[email protected]>
1 parent c127f20, commit 42af342
1 file changed, 50 additions and 4 deletions
Changed region: original lines 757-772, new lines 757-818.