Migrate Python/Cython bindings from device_memory_resource* to resource_ref / any_resource
Part of #2011. Continues the work scoped in #2209.
Problem
The Python/Cython layer is entirely coupled to device_memory_resource*:
- DeviceMemoryResource stores shared_ptr[device_memory_resource] and exposes
get_mr() -> device_memory_resource*
- All adaptor constructor declarations in librmm/memory_resource.pxd take
device_memory_resource*, even though the C++ headers already accept
device_async_resource_ref
- device_buffer.pxd constructor declarations use device_memory_resource*
- per_device_resource.pxd only declares the pointer-based
get_current_device_resource() / set_current_device_resource() APIs
- 54 total references to device_memory_resource across 7 Cython files
- Zero references to resource_ref, any_resource, or
device_async_resource_ref in any .pxd / .pyx file
This coupling blocks removal of device_memory_resource from C++. By migrating
Python first, subsequent C++ removal becomes a pure C++ change with no
cross-language coordination.
Goal
After this work:
- DeviceMemoryResource stores an any_device_resource
(any_resource<device_accessible>) instead of
shared_ptr[device_memory_resource]
- All Cython .pxd declarations match the actual C++ signatures
(device_async_resource_ref, not device_memory_resource*)
- The Python-side set_per_device_resource calls the *_ref C++ API
- No .pxd or .pyx file references device_memory_resource
- The Python user-facing API is unchanged (backward compatible)
device_memory_resource still exists in C++ and all resources still inherit from
it. We are only cutting the Python-side dependency.
Design
The DeviceMemoryResource base class is retained in Python. Its internal storage
changes from shared_ptr[device_memory_resource] to any_device_resource. Every
concrete resource class (CudaMemoryResource, PoolMemoryResource, etc.)
constructs its C++ resource and stores it as an any_device_resource.
any_resource<device_accessible> is an owning, type-erased, copyable CCCL type
that subsumes the role of both shared_ptr and shared_resource from Python's
perspective. No shared_resource_wrapper or other indirection is needed.
To pass resources into C++ APIs that accept device_async_resource_ref, the
any_resource converts implicitly (it supports conversion to resource_ref).
Cython Limitations
- Cython cannot stack-allocate C++ template classes without a verifiable
nullary constructor. any_device_resource should be declared via a C++ typedef
(e.g., using any_device_resource = cuda::mr::any_resource<cuda::mr::device_accessible>)
and wrapped in Cython as an opaque type, or stored behind unique_ptr if needed.
- Resource refs must be constructed inline at call sites to avoid Cython's
nullary constructor requirement.
Tasks
1. Add CCCL type declarations to Cython .pxd files
Add cdef extern declarations for:
- device_async_resource_ref (from rmm/resource_ref.hpp)
- any_device_resource typedef for
cuda::mr::any_resource<cuda::mr::device_accessible>
Files:
- python/rmm/rmm/librmm/memory_resource.pxd
- python/rmm/rmm/librmm/per_device_resource.pxd
2. Update Cython .pxd declarations to match actual C++ signatures
The C++ adaptor constructors already take device_async_resource_ref, not
device_memory_resource*. Update the Cython declarations to be truthful.
Files and changes:
- python/rmm/rmm/librmm/memory_resource.pxd -- change all adaptor constructor
parameters from device_memory_resource* to device_async_resource_ref
(pool, arena, fixed_size, binning, limiting, logging, statistics, tracking,
failure_callback, prefetch, aligned, thread_safe, callback)
- python/rmm/rmm/librmm/device_buffer.pxd -- change device_buffer
constructor parameters from device_memory_resource* to
device_async_resource_ref
- python/rmm/rmm/librmm/device_uvector.pxd -- update memory_resource()
return type
- python/rmm/rmm/librmm/per_device_resource.pxd -- add *_ref function
declarations (set_per_device_resource_ref,
get_current_device_resource_ref, etc.)
3. Migrate DeviceMemoryResource storage to any_device_resource
Files:
- python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pxd -- change
c_obj from shared_ptr[device_memory_resource] to any_device_resource;
replace get_mr() with a method returning device_async_resource_ref
- python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pyx -- update all
__cinit__ methods, allocate(), deallocate(), and per-device resource
functions
Construction pattern changes from:
self.c_obj.reset(new cuda_memory_resource())
to:
self.c_obj = any_device_resource(cuda_memory_resource())
Allocation changes from:
self.c_obj.get().allocate(stream.view(), nbytes)
to calling allocate through the any_device_resource interface.
4. Update device_buffer.pyx
Pass device_async_resource_ref (obtained from the any_device_resource) to
device_buffer constructors instead of device_memory_resource*.
File: python/rmm/rmm/pylibrmm/device_buffer.pyx
5. Switch per-device resource Python API to *_ref C++ functions
Call set_per_device_resource_ref() / set_current_device_resource_ref()
instead of the pointer-based variants.
File: python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pyx
6. Remove all device_memory_resource references from Python
- Remove device_memory_resource base class declarations from .pxd files
- Remove device_memory_resource cimports
- Remove pointer-based per-device-resource declarations from
per_device_resource.pxd
Validation
- build-rmm-python succeeds
- All Python tests pass (test-rmm-python)
- No .pxd or .pyx file contains device_memory_resource
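The last check can be automated. The following is a minimal sketch, not part of the existing RMM tooling: the helper name find_references and its layout are hypothetical, and it does a plain substring match, so it will also flag comments and longer identifiers that happen to contain device_memory_resource.

```python
from pathlib import Path

NEEDLE = "device_memory_resource"
CYTHON_SUFFIXES = {".pxd", ".pyx"}


def find_references(root, needle=NEEDLE):
    """Map each .pxd/.pyx file under root to the line numbers containing needle.

    Hypothetical helper for illustration; keys are paths relative to root.
    """
    root = Path(root)
    hits = {}
    for path in sorted(root.rglob("*")):
        # Only Cython declaration and implementation files are in scope.
        if path.suffix not in CYTHON_SUFFIXES or not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        matched = [n for n, line in enumerate(lines, start=1) if needle in line]
        if matched:
            hits[path.relative_to(root).as_posix()] = matched
    return hits
```

Calling find_references("python/rmm") from the repository root before the migration should list the files behind the reference count quoted under Problem; after task 6 it should return an empty dict.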
References
- #1500 (original issue: refactor Cython to use resource_ref)