Improve coalesce_kernel benchmark to capture inline vs non inline views #7619

alamb · 2025-06-06T18:51:00Z

Which issue does this PR close?

Follow on to Add coalesce kernel and BatchCoalescer for statefully combining selected b…atches: #7597

Rationale for this change

While reviewing the code and the concat kernel for

Add concatenate kernel benchmark for StringViewArray #7617

I realized there is a non-trivial difference between all inlined views, some inlined views, and mostly large strings, so the benchmarks should capture that.
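For context, a minimal sketch of the distinction the benchmarks need to cover. It does not use the arrow crate; it only relies on the Arrow view layout, where a 16-byte view stores strings of up to 12 bytes inline and longer strings in a separate data buffer. The names `MAX_INLINE_LEN`, `ViewKind` and `view_kind` are made up for illustration.

```rust
/// Maximum number of string bytes stored directly inside a 16-byte view.
/// (Illustrative constant, not taken from the arrow crate.)
const MAX_INLINE_LEN: usize = 12;

/// Where the bytes of a string would live in a StringViewArray.
#[derive(Debug, PartialEq)]
enum ViewKind {
    /// Bytes are stored inside the view itself.
    Inline,
    /// The view holds a prefix plus a reference into a data buffer.
    Buffer,
}

/// Classify a string by its byte length (hypothetical helper).
fn view_kind(s: &str) -> ViewKind {
    if s.len() <= MAX_INLINE_LEN {
        ViewKind::Inline
    } else {
        ViewKind::Buffer
    }
}

fn main() {
    for s in ["short", "exactly12byt", "a string longer than twelve bytes"] {
        println!("{:?} ({} bytes): {:?}", s, s.len(), view_kind(s));
    }
}
```

Inlined views can be copied without touching any data buffer, while buffer-backed views also require handling the referenced bytes, which is presumably why the mix of the two matters for the measurements.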

What changes are included in this PR?

Add variations of the benchmark with different string sizes in StringViewArray

Are there any user-facing changes?

If there are user-facing changes then we may require documentation to be updated before approving the PR.

If there are any breaking changes to public APIs, please call them out.

alamb · 2025-06-06T18:52:32Z

arrow/benches/coalesce_kernels.rs

            }
            .build();

+            // Model mostly short strings, but some longer ones


Previously all the benchmarks used a max size of 30. Now I have 20 (12/20 = 60% will be inlined views) and 128, where only 12/128 ~ 9% will be inlined views.
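As a rough check of these ratios, here is a small sketch that estimates the inlined fraction for a given maximum string length. It assumes string lengths are spread uniformly over `1..=max_len`, which is an assumption of this example and not something confirmed from the benchmark code. `approx_inline_fraction` is a hypothetical helper.

```rust
/// Strings of up to 12 bytes fit inline in a 16-byte view.
const MAX_INLINE_LEN: usize = 12;

/// Approximate fraction of views that are inlined, assuming lengths are
/// uniformly distributed over 1..=max_len (an assumption of this sketch).
fn approx_inline_fraction(max_len: usize) -> f64 {
    assert!(max_len > 0, "max_len must be positive");
    MAX_INLINE_LEN.min(max_len) as f64 / max_len as f64
}

fn main() {
    // 30 was the previous max size; 20 and 128 are the new variations.
    for max_len in [20, 30, 128] {
        println!(
            "max_len = {:>3}: ~{:.1}% inlined",
            max_len,
            100.0 * approx_inline_fraction(max_len)
        );
    }
}
```

Under that assumption the fraction is 0.6 for a max size of 20, 0.4 for 30, and 0.09375 for 128.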

Dandandan

I think this is a nice extension :)

Improve coalesce_kernel benchmark

8c01689

github-actions bot added the arrow Changes to the arrow crate label Jun 6, 2025

alamb commented Jun 6, 2025


alamb changed the title ~~Improve coalesce_kernel benchmark~~ Improve coalesce_kernel benchmark to capture inline vs non inline views Jun 6, 2025

alamb mentioned this pull request Jun 6, 2025

Optimize coalesce kernel for StringViewArray (5-10%) #7620

Closed

Dandandan approved these changes Jun 6, 2025


Dandandan merged commit 44d7194 into apache:main Jun 6, 2025
24 checks passed

alamb deleted the alamb/improve_coalesce_benchmark branch June 6, 2025 23:47
