Commit 9d75f87
[Parquet]Optimize the performance in record reader (#8607)
# Which issue does this PR close?
Related to:
- #7456
- #8565
# Rationale for this change
Improve the performance of `ParquetRecordBatchReader`, especially when
the `RowSelector`s are short.
- By changing a hash map to an enum array
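
To illustrate the general idea of swapping a hash map for an enum array, here is a minimal sketch. It is not the code from this PR: `Slot`, `ArrayCache`, and `batch_id` are hypothetical names, and the sketch assumes the keys are small, dense integers so they can index a `Vec` directly instead of being hashed on every lookup.

```rust
/// State of one cache slot (hypothetical type, not taken from the PR).
enum Slot<T> {
    /// Nothing is stored for this id.
    Empty,
    /// A value is cached for this id.
    Cached(T),
}

/// Cache indexed directly by a dense integer id, replacing a `HashMap<usize, T>`.
struct ArrayCache<T> {
    slots: Vec<Slot<T>>,
}

impl<T> ArrayCache<T> {
    fn new() -> Self {
        Self { slots: Vec::new() }
    }

    /// Store `value` under `batch_id`, growing the array with `Empty` slots if needed.
    fn insert(&mut self, batch_id: usize, value: T) {
        if batch_id >= self.slots.len() {
            self.slots.resize_with(batch_id + 1, || Slot::Empty);
        }
        self.slots[batch_id] = Slot::Cached(value);
    }

    /// Look up by index; no hashing is involved.
    fn get(&self, batch_id: usize) -> Option<&T> {
        match self.slots.get(batch_id) {
            Some(Slot::Cached(value)) => Some(value),
            _ => None,
        }
    }

    /// Take the value out and leave the slot `Empty`.
    fn remove(&mut self, batch_id: usize) -> Option<T> {
        let slot = self.slots.get_mut(batch_id)?;
        match std::mem::replace(slot, Slot::Empty) {
            Slot::Cached(value) => Some(value),
            Slot::Empty => None,
        }
    }
}
```

The trade-off in a design like this is memory for speed: the array holds a slot for every id up to the largest one seen, so it only pays off when ids are dense and lookups are frequent, which would be the case when many short selections touch the cache repeatedly.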
# What changes are included in this PR?
For `parquet/src/arrow/array_reader/cached_array_reader.rs`, update the
hash function
# Are these changes tested?
The hash maps are already covered by existing tests.
Also tested manually by reading Parquet files.
# Are there any user-facing changes?
No
# Performance results in arrow_reader_row_filter.rs
Measured on my 3950X:

Benchmark | Change | Verdict
-- | -- | --
int64 == 9999 / all_columns / async | 🟢 -1.61% | Improved
int64 == 9999 / all_columns / sync | 🔴 +1.56% | Regressed
int64 == 9999 / exclude_filter_column / async | 🟢 -1.11% | Improved
int64 == 9999 / exclude_filter_column / sync | ⚪ -0.97% | Within noise
float64 > 99.0 / all_columns / async | 🟢 -6.25% | Improved
float64 > 99.0 / all_columns / sync | 🟢 -11.24% | Improved
float64 > 99.0 / exclude_filter_column / async | 🟢 -11.10% | Improved
float64 > 99.0 / exclude_filter_column / sync | 🟢 -3.31% | Improved
ts ≥ 9000 / all_columns / async | 🔴 +2.77% | Regressed
ts ≥ 9000 / all_columns / sync | ⚪ -0.06% | Within noise
ts ≥ 9000 / exclude_filter_column / async | 🟢 -2.54% | Improved
ts ≥ 9000 / exclude_filter_column / sync | ⚪ +0.28% | Within noise
int64 > 90 / all_columns / async | 🟢 -14.68% | Improved
int64 > 90 / all_columns / sync | 🟢 -21.00% | Improved
int64 > 90 / exclude_filter_column / async | 🟢 -17.66% | Improved
int64 > 90 / exclude_filter_column / sync | 🟢 -14.53% | Improved
float64 ≤ 99.0 / all_columns / async | 🟢 -9.20% | Improved
float64 ≤ 99.0 / all_columns / sync | 🟢 -11.07% | Improved
float64 ≤ 99.0 / exclude_filter_column / async | 🟢 -10.01% | Improved
float64 ≤ 99.0 / exclude_filter_column / sync | 🟢 -11.80% | Improved
ts < 9000 / all_columns / async | 🟢 -3.43% | Improved
ts < 9000 / all_columns / sync | 🟢 -6.23% | Improved
ts < 9000 / exclude_filter_column / async | 🟢 -4.00% | Improved
ts < 9000 / exclude_filter_column / sync | 🟢 -3.91% | Improved
utf8View <> '' / all_columns / async | 🟢 -16.56% | Improved
utf8View <> '' / all_columns / sync | 🟢 -12.10% | Improved
utf8View <> '' / exclude_filter_column / async | 🟢 -13.00% | Improved
utf8View <> '' / exclude_filter_column / sync | 🟢 -17.29% | Improved
float64 > 99.0 AND ts ≥ 9000 / all_columns / async | 🔴 +3.51% | Regressed
float64 > 99.0 AND ts ≥ 9000 / all_columns / sync | 🟢 -2.19% | Improved
float64 > 99.0 AND ts ≥ 9000 / exclude_filter_column / async | 🟢 -2.63% | Improved
float64 > 99.0 AND ts ≥ 9000 / exclude_filter_column / sync | 🟢 -2.72% | Improved

1 parent 4fc9302 commit 9d75f87
2 files changed, +39 -29 lines changed