Use bounce buffer ring to optimize local pread #921

Draft
kingcrimsontianyu wants to merge 50 commits into rapidsai:main from kingcrimsontianyu:use-unified-bb-for-local-io

kingcrimsontianyu commented Feb 2, 2026

Related PRs and issues

Depends on #913 #919
Addresses #914

kingcrimsontianyu added the improvement (Improves an existing functionality) and c++ (Affects the C++ API of KvikIO) labels Feb 2, 2026
copy-pr-bot bot commented Feb 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@kingcrimsontianyu
Contributor Author

/ok to test

kingcrimsontianyu changed the title from "[WIP] Use bounce buffer ring to optimize local pread" to "Use bounce buffer ring to optimize local pread" Feb 4, 2026
@kingcrimsontianyu
Contributor Author

/ok to test

kingcrimsontianyu commented Feb 5, 2026

Performance check

Sequential read

  • ipp1-3304

    • main: the pread syscall and the host-to-device copy do not overlap
      [image]
    • This PR, ring with 4 buffers: the pread syscall and the host-to-device copy overlap
      [image]

Columns in the results below:

  • Hot BW: read bandwidth when the data is in the page cache
  • Cold BW: read bandwidth when the data has been cleared from the page cache
  • Cold BW (auto direct I/O): same as Cold BW, except that KVIKIO_AUTO_DIRECT_IO_READ=1 is set, which makes opportunistic use of Direct I/O
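| Branch           | Threads | Hot BW [GiB/s] | Cold BW [GiB/s] | Cold BW (auto direct I/O) [GiB/s] |
|------------------|---------|----------------|-----------------|-----------------------------------|
| main             | 1       | 3770.7933      | 1993.8511       | 4027.8119                         |
| main             | 4       | 13368.3568     | 4225.3567       | 6450.4380                         |
| ring (1 buffer)  | 1       | 4086.5414      | 1885.2542       | 3890.0020                         |
| ring (1 buffer)  | 4       | 13336.1997     | 4280.2705       | 6463.3447                         |
| ring (4 buffers) | 1       | 4165.6537      | 1852.0735       | 4019.2380                         |
| ring (4 buffers) | 4       | 14927.6879     | 4745.0288       | 6466.7356                         |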
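
For readers unfamiliar with the pattern, the overlap of pread with the copy in the sequential-read comparison above can be illustrated with a minimal host-only sketch. This is not the code in this PR and not KvikIO's API: `ring_read` and its parameters are hypothetical names, and a `std::memcpy` running on a worker thread stands in for the asynchronous host-to-device copy. A real implementation would presumably use pinned host buffers with `cudaMemcpyAsync` and stream or event synchronization instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <future>
#include <stdexcept>
#include <vector>

#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (an illustration only, not KvikIO's API): read up to
// `size` bytes from `fd` starting at `offset` into `dst`, staging the data
// through a ring of `num_buffers` bounce buffers of `buffer_size` bytes each.
// ASSUMPTION: std::memcpy on a worker thread stands in for an asynchronous
// host-to-device copy.
std::size_t ring_read(int fd, char* dst, std::size_t size, off_t offset,
                      std::size_t num_buffers, std::size_t buffer_size)
{
  std::vector<std::vector<char>> ring(num_buffers, std::vector<char>(buffer_size));
  std::vector<std::future<void>> pending(num_buffers);  // one in-flight copy per slot

  std::size_t total = 0;
  for (std::size_t i = 0; total < size; ++i) {
    std::size_t const slot = i % num_buffers;

    // Before overwriting a slot, wait for the copy that is still draining it.
    if (pending[slot].valid()) { pending[slot].get(); }

    std::size_t const want = std::min(buffer_size, size - total);
    ssize_t const got =
      ::pread(fd, ring[slot].data(), want, offset + static_cast<off_t>(total));
    if (got < 0) { throw std::runtime_error("pread failed"); }
    if (got == 0) { break; }  // end of file

    // Launch the copy and return immediately, so the next pread (into the
    // next slot) can run while this copy is still in progress.
    char const* src = ring[slot].data();
    char* out       = dst + total;
    pending[slot]   = std::async(std::launch::async, [=] {
      std::memcpy(out, src, static_cast<std::size_t>(got));
    });

    total += static_cast<std::size_t>(got);
  }

  // Drain all copies that are still in flight before returning.
  for (auto& f : pending) {
    if (f.valid()) { f.get(); }
  }
  return total;
}
```

With `num_buffers` set to 1, every pread has to wait for the previous copy to finish, so the two steps are serialized. With more buffers, the read into one slot can proceed while earlier slots are still being copied out. Buffer count and size are tuning parameters in this sketch, and the values used by the PR are not stated here.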

@kingcrimsontianyu
Contributor Author

/ok to test
