@alamb
Contributor

@alamb alamb commented Jun 18, 2025

Which issue does this PR close?

Rationale for this change

I am exploring the hypothesis that reusing the input batches (and their allocations) in the coalesce_kernels benchmark is skewing the results.

What changes are included in this PR?

Deep copy the batches passed to the benchmark to more accurately reflect real-world use, where data is produced by one source before being coalesced.
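To illustrate the distinction this change relies on, here is a minimal standard-library sketch of a shallow copy (which shares the underlying allocation) versus a deep copy (which gets a fresh one). It does not use the arrow crate: `Column`, `shallow_copy`, `deep_copy`, and `shares_allocation` are hypothetical names standing in for reference-counted Arrow buffers, and this is not the code in this PR.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: a reference-counted byte buffer.
#[derive(Clone)]
struct Column {
    data: Arc<Vec<u8>>,
}

/// Cloning each column only bumps a reference count, so the copy
/// points at the same allocation as the input.
fn shallow_copy(batch: &[Column]) -> Vec<Column> {
    batch.to_vec()
}

/// Copying the bytes into a new `Vec` gives every column its own allocation,
/// as if the data had just been produced by an upstream source.
fn deep_copy(batch: &[Column]) -> Vec<Column> {
    batch
        .iter()
        .map(|c| Column {
            data: Arc::new((*c.data).clone()),
        })
        .collect()
}

/// True when both columns point at the same underlying allocation.
fn shares_allocation(a: &Column, b: &Column) -> bool {
    Arc::ptr_eq(&a.data, &b.data)
}

fn main() {
    let batch = vec![Column {
        data: Arc::new(vec![1, 2, 3]),
    }];
    let shallow = shallow_copy(&batch);
    let deep = deep_copy(&batch);
    println!("shallow shares allocation: {}", shares_allocation(&batch[0], &shallow[0]));
    println!("deep shares allocation: {}", shares_allocation(&batch[0], &deep[0]));
}
```

In a benchmark loop, feeding the same shallow copies on every iteration means the kernel repeatedly touches the same memory, which is the effect the hypothesis above is concerned with.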

Are there any user-facing changes?

No, this is a benchmark-only change.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 18, 2025
@alamb alamb force-pushed the alamb/more_realistic_benchmark branch from 7364db2 to 7c8112e Compare June 18, 2025 13:53
@alamb
Contributor Author

alamb commented Jun 18, 2025

Using this updated benchmark

Running this command:

cargo bench --bench coalesce_kernels  -- "mixed_utf8view \(max_string_len=128\), 8192, nulls: 0.1, selectivity: 0.8"

Results in these measurements on main and branch

Benchmarking filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.9s, enable flat sampling, or reduce sample count to 40.
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8
                        time:   [1.9285 ms 1.9592 ms 1.9921 ms]
                        change: [−29.038% −28.009% −26.891%] (p = 0.00 < 0.05)
                        Performance has improved.
                        time:   [1.8064 ms 1.8350 ms 1.8648 ms]
                        time:   [1.9849 ms 2.0140 ms 2.0429 ms]
                        time:   [1.9166 ms 1.9308 ms 1.9466 ms]

On the branch

                        time:   [2.5685 ms 2.5797 ms 2.5912 ms]
                        time:   [2.6001 ms 2.6277 ms 2.6642 ms]
                        time:   [2.5834 ms 2.6086 ms 2.6374 ms]                        

So something else is still going on
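As a rough way to quantify the gap, the sketch below averages the middle (point) estimates from the `time:` lines above and takes their ratio. It assumes the first group of timings was measured on main and the second on the branch; the `mean` helper and variable names are illustrative only.

```rust
/// Arithmetic mean of a slice of timings (in milliseconds).
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

fn main() {
    // Middle values of each `time: [low mid high]` line quoted above.
    // Assumption: the first group is main, the second is the branch.
    let main_ms = [1.9592, 1.8350, 2.0140, 1.9308];
    let branch_ms = [2.5797, 2.6277, 2.6086];

    let ratio = mean(&branch_ms) / mean(&main_ms);
    println!("branch / main = {:.2}", ratio);
}
```

Under that assumption the branch takes roughly a third longer per iteration than main.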

@alamb alamb closed this Jun 18, 2025