@alamb
Contributor

@alamb alamb commented Jan 29, 2026

Which issue does this PR close?

Rationale for this change

While reviewing #9294 from @Dandandan, I noticed some other places where we can avoid creating ArrayData and thus save some allocations (and some unsafe code).

I don't expect this to make a huge performance difference, but every avoided allocation helps, and I think the change is justified simply because it removes some more unsafe code.

What changes are included in this PR?

Construct primitive arrays directly, rather than going through ArrayData

Are these changes tested?

By existing CI

Are there any user-facing changes?

@alamb alamb marked this pull request as ready for review January 29, 2026 12:40
@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 29, 2026
@alamb alamb changed the title Remove some unsafe and allocations when creating PrimitiveArrays from Vec and from_trusted_len_iter Remove some unsafe and allocations when creating PrimitiveArrays from Vec and from_trusted_len_iter Jan 29, 2026
};
PrimitiveArray::from(data)
let nulls =
Some(NullBuffer::new(BooleanBuffer::new(null, 0, len))).filter(|n| n.null_count() > 0);
Contributor Author

Note that this change:

  1. no longer needs unsafe
  2. avoids the vec![buffer] allocation
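
For readers unfamiliar with the `Some(...).filter(|n| n.null_count() > 0)` idiom in the snippet above, here is a minimal std-only sketch of the same pattern. `ToyNullBuffer`, `ToyPrimitiveArray`, and `from_options` are hypothetical stand-ins invented for illustration; they are not the arrow crate's types or this PR's actual code. The sketch only shows how the array can be assembled directly from its parts in safe code, keeping the null mask only when at least one value is null.

```rust
/// Toy validity mask: `true` marks a valid (non-null) slot.
/// Hypothetical stand-in for illustration, not arrow's `NullBuffer`.
#[derive(Debug, PartialEq)]
pub struct ToyNullBuffer {
    valid: Vec<bool>,
}

impl ToyNullBuffer {
    pub fn new(valid: Vec<bool>) -> Self {
        Self { valid }
    }

    /// Number of slots marked as null.
    pub fn null_count(&self) -> usize {
        self.valid.iter().filter(|v| !**v).count()
    }
}

/// Toy primitive array: a values buffer plus an optional validity mask.
/// Hypothetical stand-in for illustration, not arrow's `PrimitiveArray`.
#[derive(Debug, PartialEq)]
pub struct ToyPrimitiveArray {
    values: Vec<i32>,
    nulls: Option<ToyNullBuffer>,
}

impl ToyPrimitiveArray {
    /// Build the array directly from its parts, with no intermediate
    /// untyped container and no unsafe code.
    pub fn from_options(items: Vec<Option<i32>>) -> Self {
        // Null slots get a default value; the mask records validity.
        let (values, valid): (Vec<i32>, Vec<bool>) = items
            .into_iter()
            .map(|v| (v.unwrap_or_default(), v.is_some()))
            .unzip();
        // Same idiom as in the diff: keep the mask only if something is null.
        let nulls = Some(ToyNullBuffer::new(valid)).filter(|n| n.null_count() > 0);
        Self { values, nulls }
    }

    pub fn values(&self) -> &[i32] {
        &self.values
    }

    pub fn null_count(&self) -> usize {
        self.nulls.as_ref().map_or(0, |n| n.null_count())
    }

    pub fn has_null_mask(&self) -> bool {
        self.nulls.is_some()
    }
}
```

With this sketch, `ToyPrimitiveArray::from_options(vec![Some(1), Some(2)])` ends up with no null mask at all, while an input containing a `None` keeps the mask.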

@alamb
Contributor Author

alamb commented Jan 29, 2026

run benchmark arrow_statistics

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/faster_safer (027d9e2) to 6c54276 diff
BENCH_NAME=arrow_statistics
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_statistics
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_faster_safer
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                                      alamb_faster_safer                     main
-----                                                                                                      ------------------                     ----
Extract data page statistics for Dictionary(Int32, String)/extract_statistics/Dictionary(Int32, String)    1.00     70.2±1.03µs        ? ?/sec    1.01     70.8±0.85µs        ? ?/sec
Extract data page statistics for F64/extract_statistics/F64                                                1.00     11.6±0.42µs        ? ?/sec    1.00     11.6±0.22µs        ? ?/sec
Extract data page statistics for Int64/extract_statistics/Int64                                            1.02     13.5±0.15µs        ? ?/sec    1.00     13.2±0.44µs        ? ?/sec
Extract data page statistics for String/extract_statistics/String                                          1.00     69.7±0.36µs        ? ?/sec    1.01     70.3±0.48µs        ? ?/sec
Extract data page statistics for UInt64/extract_statistics/UInt64                                          1.01     11.7±0.11µs        ? ?/sec    1.00     11.6±0.08µs        ? ?/sec
Extract row group statistics for Dictionary(Int32, String)/extract_statistics/Dictionary(Int32, String)    1.00  1026.6±35.01ns        ? ?/sec    1.01  1037.8±30.66ns        ? ?/sec
Extract row group statistics for F64/extract_statistics/F64                                                1.01   541.9±12.66ns        ? ?/sec    1.00    535.8±9.72ns        ? ?/sec
Extract row group statistics for Int64/extract_statistics/Int64                                            1.00    544.0±7.37ns        ? ?/sec    1.00   544.9±24.83ns        ? ?/sec
Extract row group statistics for String/extract_statistics/String                                          1.00  1029.0±11.00ns        ? ?/sec    1.00  1030.4±10.23ns        ? ?/sec
Extract row group statistics for UInt64/extract_statistics/UInt64                                          1.01    542.9±9.59ns        ? ?/sec    1.00   538.2±21.77ns        ? ?/sec

@alamb
Contributor Author

alamb commented Jan 29, 2026

run benchmark builder

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/faster_safer (027d9e2) to 6c54276 diff
BENCH_NAME=builder
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench builder
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_faster_safer
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                          alamb_faster_safer                     main
-----                                          ------------------                     ----
bench_bool/bench_bool                          1.00  1033.7±15.97µs   483.7 MB/sec    1.00  1034.9±27.70µs   483.1 MB/sec
bench_decimal128_builder                       1.00     80.9±0.27µs        ? ?/sec    1.03     83.5±1.37µs        ? ?/sec
bench_decimal256_builder                       1.00     82.8±1.48µs        ? ?/sec    1.04     85.8±2.66µs        ? ?/sec
bench_decimal32_builder                        1.00     55.9±0.16µs        ? ?/sec    1.01     56.7±1.85µs        ? ?/sec
bench_decimal64_builder                        1.00     45.8±0.38µs        ? ?/sec    1.00     45.8±0.21µs        ? ?/sec
bench_primitive/bench_primitive                1.03    181.1±5.88µs    21.6 GB/sec    1.00    176.0±5.75µs    22.2 GB/sec
bench_primitive/bench_string                   1.00      8.9±0.36ms   729.8 MB/sec    1.00      8.9±0.38ms   732.6 MB/sec
bench_primitive_nulls/bench_primitive_nulls    1.00  1213.4±19.73µs        ? ?/sec    1.00  1216.7±37.96µs        ? ?/sec

@alamb
Contributor Author

alamb commented Jan 29, 2026

There doesn't seem to be any measurable difference in the benchmarks.

@alamb alamb merged commit 1db1053 into apache:main Jan 30, 2026
26 checks passed
@alamb alamb deleted the alamb/faster_safer branch January 30, 2026 13:18
Labels

arrow Changes to the arrow crate performance
