Use cuda::stream_ref for stream usage #2372
Conversation
📝 Walkthrough
Replaces usages of rmm::cuda_stream_view with cuda::stream_ref.
Changes: Stream type migration (single cohesive DAG).
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/benchmarks/device_uvector/device_uvector_bench.cu`:
- Line 94: The cudaMemsetAsync call using vec.data(), num_elements and
stream.get() must be wrapped with the RMM_CUDA_TRY macro to surface CUDA errors;
replace the unchecked call cudaMemsetAsync(vec.data(), 0, num_elements *
sizeof(std::int32_t), stream.get()) with
RMM_CUDA_TRY(cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t),
stream.get())) so failures are reported (ensure RMM_CUDA_TRY is available in the
translation unit).
ℹ️ Review info
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6a80af3f-7a8f-4182-886b-2d10727b816d
📒 Files selected for processing (7)
- cpp/benchmarks/cuda_stream_pool/cuda_stream_pool_bench.cpp
- cpp/benchmarks/device_uvector/device_uvector_bench.cu
- cpp/benchmarks/multi_stream_allocations/multi_stream_allocations_bench.cu
- cpp/benchmarks/random_allocations/random_allocations.cpp
- cpp/benchmarks/replay/replay.cpp
- cpp/benchmarks/synchronization/synchronization.cpp
- cpp/benchmarks/synchronization/synchronization.hpp
✅ Files skipped from review due to trivial changes (1)
- cpp/benchmarks/cuda_stream_pool/cuda_stream_pool_bench.cpp
```diff
   auto vec = Vector(num_elements, stream);
   if (zero_init) {
-    cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t), stream.value());
+    cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t), stream.get());
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify async memset call sites and wrapper usage in this file
rg -n -C2 'cudaMemsetAsync\(' cpp/benchmarks/device_uvector/device_uvector_bench.cu
rg -n 'RMM_CUDA_TRY\s*\(\s*cudaMemsetAsync' cpp/benchmarks/device_uvector/device_uvector_bench.cu
```
Repository: rapidsai/rmm
Length of output: 248
Wrap cudaMemsetAsync with RMM_CUDA_TRY to detect CUDA errors.
Line 94 has an unchecked cudaMemsetAsync call that could silently fail. Per coding guidelines, all CUDA API calls must be wrapped with RMM_CUDA_TRY to detect errors early.
Suggested patch
```diff
- cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t), stream.get());
+ RMM_CUDA_TRY(
+   cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t), stream.get()));
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
RMM_CUDA_TRY(
  cudaMemsetAsync(vec.data(), 0, num_elements * sizeof(std::int32_t), stream.get()));
```
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/tests/mr/mr_ref_test_mt_helpers.hpp (1)
Lines 64-65: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Wrap the event synchronizations with RMM_CUDA_TRY.
Line 65 and Line 86 still ignore the result of cudaEventSynchronize. If either synchronization fails, these helpers can mask the CUDA error and make the MT stream-ordering tests harder to trust.
Suggested fix
```diff
- cudaEventSynchronize(event);
+ RMM_CUDA_TRY(cudaEventSynchronize(event));
...
- cudaEventSynchronize(event);
+ RMM_CUDA_TRY(cudaEventSynchronize(event));
```
As per coding guidelines, "Check all CUDA errors with RMM_CUDA_TRY, RMM_EXPECTS, RMM_FAIL macros; unchecked errors in memory operations, synchronization, and device calls cause silent corruption".
Also applies to: 85-86
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mr/mr_ref_test_mt_helpers.hpp` around lines 64-65: The cudaEventSynchronize calls in mr_ref_test_mt_helpers.hpp are unchecked and should be wrapped with RMM_CUDA_TRY to surface CUDA errors; replace occurrences of cudaEventSynchronize(event) (and the second instance later in the file) with RMM_CUDA_TRY(cudaEventSynchronize(event)); ensure you include the RMM header if not already present and keep the synchronization semantics unchanged.

cpp/tests/mr/mr_ref_test.hpp (1)
Lines 245-270: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Use the async MR APIs in test_random_async_allocations.
This helper now threads cuda::stream_ref through the signature, but Line 262 and Line 269 still use allocate_sync/deallocate_sync. That means the RandomAllocations* tests keep exercising the synchronous path and won't catch regressions in stream-ordered allocation/deallocation.
Suggested fix
```diff
   [&generator, &distribution, &ref, stream](allocation& alloc) {
     alloc.size = distribution(generator);
-    EXPECT_NO_THROW(alloc.ptr = ref.allocate_sync(alloc.size, rmm::CUDA_ALLOCATION_ALIGNMENT));
+    EXPECT_NO_THROW(
+      alloc.ptr = ref.allocate(stream, alloc.size, rmm::CUDA_ALLOCATION_ALIGNMENT));
     RMM_CUDA_TRY(cudaStreamSynchronize(stream.get()));
     EXPECT_NE(nullptr, alloc.ptr);
     EXPECT_TRUE(is_properly_aligned(alloc.ptr));
   });
   std::for_each(allocations.begin(), allocations.end(), [stream, &ref](allocation& alloc) {
-    EXPECT_NO_THROW(ref.deallocate_sync(alloc.ptr, alloc.size, rmm::CUDA_ALLOCATION_ALIGNMENT));
+    EXPECT_NO_THROW(
+      ref.deallocate(stream, alloc.ptr, alloc.size, rmm::CUDA_ALLOCATION_ALIGNMENT));
     RMM_CUDA_TRY(cudaStreamSynchronize(stream.get()));
   });
```
As per coding guidelines, "All async memory operations must accept cuda_stream_view parameter and handle stream synchronization before memory is returned to pool or deallocated".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mr/mr_ref_test.hpp` around lines 245 - 270, In test_random_async_allocations replace the synchronous calls allocate_sync/deallocate_sync with the stream-ordered async MR APIs so the helper actually tests async behavior: update the allocation lambda to call the resource's async allocate (the allocate_async variant that accepts the cuda::stream_ref or cuda_stream_view and alignment) and update the deallocate lambda to call the corresponding async deallocate (deallocate_async with stream + alignment), then ensure you still synchronize the provided stream (RMM_CUDA_TRY(cudaStreamSynchronize(stream.get()))) at the same points so the test waits for the async operations to complete; target the function test_random_async_allocations and the lambdas that currently call alloc.ptr = ref.allocate_sync(...) and ref.deallocate_sync(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/include/rmm/device_scalar.hpp`:
- Around line 219-221: The call in set_value_to_zero_async is passing
value_type{0} to _storage.set_element_to_zero_async but that helper expects an
element index (size_type) not a value; update the call in
set_value_to_zero_async to pass element index 0 (i.e. use 0 as the first
argument) so _storage.set_element_to_zero_async(0, stream) is invoked; reference
functions/fields: set_value_to_zero_async, _storage.set_element_to_zero_async,
and value_type to locate and correct the call.
In `@cpp/tests/device_buffer_tests.cu`:
- Around line 60-80: Multiple tests repeat the null-stream construction
(cuda::stream_ref{cudaStream_t{nullptr}} and
rmm::cuda_stream_view{cudaStream_t{nullptr}}); introduce file-scope constants
(e.g., kNullStreamRef and kNullCudaStreamView) and replace each literal with
those constants in tests that construct rmm::device_buffer or query
buff.stream() (references: DeviceBufferTest, TYPED_TESTs EmptyBuffer and
DefaultMemoryResource, local variables buf/buff), ensuring the constants have
the same types and are initialized once at top of the file so all occurrences
reuse them.
ℹ️ Review info
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 40051d54-d640-4c86-8a96-ad99018c6d69
📒 Files selected for processing (23)
- cpp/include/rmm/device_scalar.hpp
- cpp/tests/container_multidevice_tests.cu
- cpp/tests/cuda_stream_tests.cpp
- cpp/tests/device_buffer_tests.cu
- cpp/tests/device_check_resource_adaptor.hpp
- cpp/tests/device_scalar_tests.cpp
- cpp/tests/device_uvector_tests.cpp
- cpp/tests/mock_resource.hpp
- cpp/tests/mr/aligned_mr_tests.cpp
- cpp/tests/mr/arena_mr_tests.cpp
- cpp/tests/mr/cccl_mr_ref_test_allocation.hpp
- cpp/tests/mr/cccl_mr_ref_test_basic.hpp
- cpp/tests/mr/cccl_mr_ref_test_mt.hpp
- cpp/tests/mr/failure_callback_mr_tests.cpp
- cpp/tests/mr/mr_ref_test.hpp
- cpp/tests/mr/mr_ref_test_allocation.hpp
- cpp/tests/mr/mr_ref_test_basic.hpp
- cpp/tests/mr/mr_ref_test_mt.hpp
- cpp/tests/mr/mr_ref_test_mt_helpers.hpp
- cpp/tests/mr/pool_mr_tests.cpp
- cpp/tests/mr/statistics_mr_tests.cpp
- cpp/tests/mr/thrust_allocator_tests.cu
- cpp/tests/mr/tracking_mr_tests.cpp
```cpp
void set_value_to_zero_async(cuda::stream_ref stream)
{
  _storage.set_element_to_zero_async(value_type{0}, stream);
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify the callee signature and this call site side-by-side.
rg -n -C2 'set_element_to_zero_async\s*\(' cpp/include/rmm/device_uvector.hpp cpp/include/rmm/device_scalar.hpp
```
Repository: rapidsai/rmm
Length of output: 1085
Use element index 0 in zeroing call (Line 221).
set_element_to_zero_async expects an element index as its first argument, not a value. Passing value_type{0} is incorrect—the function signature requires size_type element_index (see line 248 in device_uvector.hpp). This will fail for trivially copyable types that are not implicitly convertible to the index type.
Proposed fix
```diff
- _storage.set_element_to_zero_async(value_type{0}, stream);
+ _storage.set_element_to_zero_async(size_type{0}, stream);
```
📝 Committable suggestion
```cpp
void set_value_to_zero_async(cuda::stream_ref stream)
{
  _storage.set_element_to_zero_async(size_type{0}, stream);
}
```
```diff
   auto buf = rmm::device_buffer(10, cuda::stream_ref{cudaStream_t{nullptr}}, ref);
   EXPECT_EQ(buf.size(), 10);
 }

 TYPED_TEST(DeviceBufferTest, EmptyBuffer)
 {
-  rmm::device_buffer buff(0, rmm::cuda_stream_default);
+  rmm::device_buffer buff(0, cuda::stream_ref{cudaStream_t{nullptr}});
   EXPECT_TRUE(buff.is_empty());
 }

 TYPED_TEST(DeviceBufferTest, DefaultMemoryResource)
 {
-  rmm::device_buffer buff(this->size, rmm::cuda_stream_default);
+  rmm::device_buffer buff(this->size, cuda::stream_ref{cudaStream_t{nullptr}});
   EXPECT_NE(nullptr, buff.data());
   EXPECT_EQ(this->size, buff.size());
   EXPECT_EQ(this->size, buff.ssize());
   EXPECT_EQ(this->size, buff.capacity());
   EXPECT_EQ(rmm::device_async_resource_ref{rmm::mr::get_current_device_resource_ref()},
             buff.memory_resource());
-  EXPECT_EQ(rmm::cuda_stream_default, buff.stream());
+  EXPECT_EQ(rmm::cuda_stream_view{cudaStream_t{nullptr}}, buff.stream());
 }
```
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Reduce repeated null-stream literals with shared test constants.
cuda::stream_ref{cudaStream_t{nullptr}} and rmm::cuda_stream_view{cudaStream_t{nullptr}} are repeated throughout this file. Please define shared file-scope constants/helpers and reuse them to keep test updates safer and less noisy.
Proposed refactor pattern
```diff
@@
 namespace testing {
 namespace thrust = THRUST_NS_QUALIFIER;
 }  // namespace testing
 using namespace testing;
+
+namespace {
+auto const k_null_stream_ref  = cuda::stream_ref{cudaStream_t{nullptr}};
+auto const k_null_stream_view = rmm::cuda_stream_view{cudaStream_t{nullptr}};
+}  // namespace
@@
-  auto buf = rmm::device_buffer(10, cuda::stream_ref{cudaStream_t{nullptr}}, ref);
+  auto buf = rmm::device_buffer(10, k_null_stream_ref, ref);
@@
-  rmm::device_buffer buff(0, cuda::stream_ref{cudaStream_t{nullptr}});
+  rmm::device_buffer buff(0, k_null_stream_ref);
@@
-  EXPECT_EQ(rmm::cuda_stream_view{cudaStream_t{nullptr}}, buff.stream());
+  EXPECT_EQ(k_null_stream_view, buff.stream());
```
As per coding guidelines, "Detect significant code duplication (3+ occurrences of similar code); refactor into shared utility functions or base classes."
Summary
- Replaces rmm::cuda_stream_view usage with cuda::stream_ref across containers and memory-resource implementations.
- Retains rmm::cuda_stream_view compatibility APIs while converting at boundaries where needed.
- Updates tests and benchmarks to construct streams via cuda::stream_ref.

Validation
```shell
cmake --build cpp/build/conda/cuda-13.1/release -j$(nproc)
ctest --test-dir cpp/build/conda/cuda-13.1/release -R 'DEVICE_UVECTOR|DEVICE_SCALAR|DEVICE_BUFFER|CONTAINER_MULTIDEVICE' --output-on-failure
ctest --test-dir cpp/build/conda/cuda-13.1/release -R 'DEVICE_BUFFER|DEVICE_UVECTOR|PREFETCH|POOL_MR|FIXED_SIZE|ARENA_MR|THRUST_ALLOCATOR|POLYMORPHIC_ALLOCATOR|STREAM_ADAPTOR|CALLBACK_MR|STATISTICS|TRACKING|LIMITING|ALIGNED' --output-on-failure
git diff --name-only --diff-filter=ACM | xargs pre-commit run --files
```