Tr/mem access patterns #2396

imreddyTeja · 2025-11-24T23:03:27Z

When a low-resolution simulation is run, the solver, which runs one thread per column, does not saturate the GPU. With the current launch configuration, a portion of the multiprocessors remains unused. This adds a function that catches that case and uses smaller block sizes.
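To illustrate the idea, here is a minimal Python sketch of one way to pick a smaller block size when there are too few columns to fill the device. It is not the function added in this PR: `pick_block_size`, `max_threads`, and the halving strategy are assumptions made for illustration only.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def pick_block_size(n_columns, n_sms, max_threads=256):
    """Return (threads_per_block, n_blocks) for a one-thread-per-column launch.

    Hypothetical helper: halves the block size until there is at least one
    block per multiprocessor, but never goes below a single warp.
    """
    threads = max_threads
    while threads > WARP_SIZE and math.ceil(n_columns / threads) < n_sms:
        threads //= 2
    n_blocks = math.ceil(n_columns / threads)
    return threads, n_blocks


# Low resolution: too few columns to give every multiprocessor a full block.
low_res = pick_block_size(n_columns=1536, n_sms=108)
# High resolution: the default block size already yields enough blocks.
high_res = pick_block_size(n_columns=86400, n_sms=108)
```

In the low-resolution case, the sketch falls back to one warp per block, which spreads the columns over 48 blocks instead of 6.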

Threads for stencils and pointwise operations were being launched by converting a single block index and a single grid index into a linear index, and then converting that into a Cartesian index of (I, J, V, H), but the data is stored as (VIJFH). This adds special cases for v=63 and v=64, when i=j=4, that use multiple block and grid dimensions and index so that adjacent threads are likely to be in the same column. Profiling shows this substantially improves sector utilization for L2 cache accesses:
Before: 2.0 of the 32 bytes transmitted per sector are utilized (L2)
After: 26.2 of the 32 bytes transmitted per sector are utilized (L2)
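The effect of thread ordering can be seen in a toy model. The Python sketch below counts how many 32-byte sectors one warp touches under two thread-to-index mappings. It assumes 4-byte elements, a single field (so F is omitted), V as the fastest-varying dimension in memory, and one load per thread; `unravel` and `useful_bytes_per_sector` are hypothetical names. It is an idealization and is not expected to reproduce the profiler numbers above.

```python
def unravel(t, dims):
    """Split linear index t into per-dimension indices, first dimension fastest."""
    idx = []
    for n in dims:
        idx.append(t % n)
        t //= n
    return idx


def useful_bytes_per_sector(thread_order, Nv=64, Ni=4, Nj=4, Nh=8,
                            elem_bytes=4, sector_bytes=32, warp_size=32):
    """Average bytes used per sector touched by the first warp.

    thread_order names the dimensions from fastest- to slowest-varying
    across adjacent threads, e.g. "IJVH" or "VIJH".
    """
    sizes = {"V": Nv, "I": Ni, "J": Nj, "H": Nh}
    sectors = set()
    for t in range(warp_size):
        idx = dict(zip(thread_order, unravel(t, [sizes[d] for d in thread_order])))
        # Assumed memory layout: V fastest, then I, J, H.
        offset = idx["V"] + Nv * (idx["I"] + Ni * (idx["J"] + Nj * idx["H"]))
        sectors.add(offset * elem_bytes // sector_bytes)
    return warp_size * elem_bytes / len(sectors)


old_mapping = useful_bytes_per_sector("IJVH")  # adjacent threads differ in i
new_mapping = useful_bytes_per_sector("VIJH")  # adjacent threads share a column
```

Under these assumptions, the (I, J, V, H) ordering scatters a warp over 16 sectors, while the column-first ordering packs the same loads into 4 fully used sectors.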

TODO:
Test with more ClimaAtmos and coupled simulations.

Code follows the style guidelines OR N/A.
Unit tests are included OR N/A.
Code is exercised in an integration test OR N/A.
Documentation has been added/updated OR N/A.

imreddyTeja force-pushed the tr/mem-access-patterns branch from bde8907 to 33a1c4b on December 3, 2025 18:07

imreddyTeja added 3 commits December 3, 2025 10:10

Make gpu solver tests use higher resolution space

05db291

Ensure even distribution of solver threads in low-res cases

feb5b9d

Improve gpu data access pattern

00686ac

imreddyTeja force-pushed the tr/mem-access-patterns branch from 33a1c4b to 00686ac on December 3, 2025 18:10

imreddyTeja requested a review from dennisYatunin December 3, 2025 18:18

Ignore Base.Stacktraces in jet tests

94a9715
