Improve data density visualization by sampling dense chunks#11766
Conversation
Web viewer built successfully.
View image diff on kitdiff. Note: This comment is updated whenever you push a commit.
emilk left a comment
Nice idea!
Please make sure crates/viewer/re_time_panel/benches/bench_density_graph.rs covers this, and share some numbers of what effect it has 🙏
// When chunks are too large to render all events, sample this many events uniformly
// to create a good enough density estimate.
max_sampled_events_per_chunk: 4_000,
What's the motivation behind this particular number?
The limit is configured at 8k for unsorted chunks, so I figured let's take half. However, after doing some benchmarking, it seems we can use 8k without much of a loss.
For chunks with 5k rows (around the threshold) I get:
// 5k
sampling/5000/sample_0 time: [2.1716 µs 2.2432 µs 2.3375 µs]
sampling/5000/sample_4000 time: [57.836 µs 57.937 µs 58.033 µs]
sampling/5000/sample_8000 time: [62.420 µs 68.867 µs 76.966 µs]
So around the threshold, we see that for 5k rows with max_sampled_events_per_chunk=4000, we sample 4k events, which is actually faster than sampling the full chunk at max_sampled_events_per_chunk=8000.
For chunks with 20k-100k rows I get the following:
// 20k
sampling/20000/sample_0 time: [2.1293 µs 2.1467 µs 2.1631 µs]
sampling/20000/sample_4000 time: [130.47 µs 131.00 µs 131.39 µs]
sampling/20000/sample_8000 time: [159.53 µs 159.68 µs 159.78 µs]
// 50k
sampling/50000/sample_0 time: [2.1483 µs 2.2347 µs 2.3770 µs]
sampling/50000/sample_4000 time: [278.60 µs 279.56 µs 281.00 µs]
sampling/50000/sample_8000 time: [307.62 µs 317.30 µs 345.24 µs]
// 100k
sampling/100000/sample_0 time: [2.1437 µs 2.2881 µs 2.6224 µs]
sampling/100000/sample_4000 time: [521.40 µs 522.86 µs 525.53 µs]
sampling/100000/sample_8000 time: [552.32 µs 553.91 µs 555.89 µs]
Here, sample_0 is the original behavior. We'd be able to go with 8000 samples without much of a performance hit.
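For reference, here is a minimal, self-contained sketch of how a sampling/{rows}/sample_{max} benchmark of this shape can be set up with criterion. The chunk and the density work below are simplified stand-ins, not the real code in bench_density_graph.rs:

```rust
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use std::hint::black_box;

/// Simplified stand-in for the per-chunk density work: bucket (a sample of) the
/// event times. `max_sampled_events_per_chunk == 0` mimics the original behavior
/// of skipping chunks that are too dense to render every event.
fn estimate_density(times: &[i64], max_sampled_events_per_chunk: usize) -> Vec<u32> {
    let mut buckets = vec![0u32; 256];
    if times.is_empty() || max_sampled_events_per_chunk == 0 {
        return buckets; // dense chunk skipped entirely (old behavior)
    }

    let (min, max) = (times[0], *times.last().unwrap());
    let span = (max - min).max(1) as f64;

    let num_samples = times.len().min(max_sampled_events_per_chunk);
    let step = times.len() as f64 / num_samples as f64;

    for i in 0..num_samples {
        let idx = ((i as f64 * step) as usize).min(times.len() - 1);
        let bucket = ((times[idx] - min) as f64 / span * 255.0) as usize;
        buckets[bucket] += 1;
    }
    buckets
}

fn bench_sampling(c: &mut Criterion) {
    let mut group = c.benchmark_group("sampling");

    for &num_rows in &[5_000usize, 20_000, 50_000, 100_000] {
        // A sorted run of timestamps stands in for a real, time-sorted chunk.
        let times: Vec<i64> = (0..num_rows as i64).collect();

        // 0 = original behavior (skip), 4k = proposed default, 8k = unsorted-chunk limit.
        for &max_samples in &[0usize, 4_000, 8_000] {
            group.bench_function(
                BenchmarkId::new(num_rows.to_string(), format!("sample_{max_samples}")),
                |b| b.iter(|| black_box(estimate_density(black_box(&times), max_samples))),
            );
        }
    }

    group.finish();
}

criterion_group!(benches, bench_sampling);
criterion_main!(benches);
```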
Force-pushed from c5bb0b5 to fa872db
Related
What
The data density graph now uniformly samples rows from chunks with many rows, instead of skipping those chunks entirely.
This does not fully solve the problem; a better, albeit more involved, solution is discussed in #7200.
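As a rough illustration (a minimal sketch, not the actual implementation), the sampling amounts to visiting at most max_sampled_events_per_chunk evenly spaced indices per chunk instead of every event; the helper below is hypothetical:

```rust
/// Hypothetical helper illustrating the idea: pick at most
/// `max_sampled_events_per_chunk` indices spread evenly across a chunk of
/// `num_events` events, instead of visiting every event.
fn uniform_sample_indices(
    num_events: usize,
    max_sampled_events_per_chunk: usize,
) -> impl Iterator<Item = usize> {
    let num_samples = num_events.min(max_sampled_events_per_chunk);
    // Constant stride, so the samples cover the whole chunk and the overall shape
    // of the density estimate is preserved.
    let step = num_events as f64 / num_samples.max(1) as f64;
    (0..num_samples).map(move |i| ((i as f64 * step) as usize).min(num_events.saturating_sub(1)))
}

fn main() {
    // A 20_000-row chunk capped at 4_000 samples yields a uniform 1-in-5 subsample.
    let picked: Vec<usize> = uniform_sample_indices(20_000, 4_000).collect();
    assert_eq!(picked.len(), 4_000);
    assert_eq!(&picked[..3], &[0, 5, 10]);
}
```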
Benchmarks (M3 Pro):
For chunks with 5k rows (around the threshold), max_sampled_events_per_chunk=4000 samples 4k events, which is actually faster than sampling the full chunk at max_sampled_events_per_chunk=8000. The same holds for chunks with 20k-100k rows, where sample_0 is the original behavior: we'd be able to go with 8000 samples without much of a performance hit. (Full numbers are in the review thread above.)