Skip to content

[Java] Adding tests to use CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset#1340

Merged
rapids-bot[bot] merged 4 commits intorapidsai:branch-25.10from
ldematte:java/device-matrix-cagra-index-tests
Sep 25, 2025
Merged

[Java] Adding tests to use CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset#1340
rapids-bot[bot] merged 4 commits intorapidsai:branch-25.10from
ldematte:java/device-matrix-cagra-index-tests

Conversation

@ldematte
Copy link
Copy Markdown
Contributor

When we introduced CuVSDeviceMatrix in #1232, we made it possible to use device-memory-backed dataset as an input for index build: since we accept a CuVSMatrix, and we have correct toTensor implementations for CPU and GPU, and the underlying functions in libcuvs support different memory types and sizes (through DLManagedTensor information), this became supported "naturally".
However, we never tested this explicitly.

This PR adds tests to check and show that using CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset works as intended.

(similar tests for other index types will be added as follow-ups)

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Sep 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

public void close() {
if (mustClose()) {
checkCuVSError(cuvsRMMFree(cuvsResourceHandle, pointer, numBytes), "cuvsRMMFree");
pointer = MemorySegment.NULL;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small enhancement that came up while testing: it protects us from a double close().
I thought it was worth adding it.

Copy link
Copy Markdown
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


for (int n = 0; n < rows; ++n) {
for (int i = 0; i < cols; ++i) {
assertEquals(array1[n][i], array2[n][i], DELTA);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the sort of thing that C++ does nicely. :]

assertSame2dArray would be a function template. After the element type T of array1 is deduced, line#265 would be a call to compareEquals<T>, which would have type-specific specializations.

compareEquals<float> would use DELTA for the epsilon-compare.

Copy link
Copy Markdown
Contributor

@mythrocks mythrocks Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that said, this changed version does remove quite a bit of code repetition. 👏

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the sort of thing that C++ does nicely. :]

Yep.
I have planned another iteration of test cleanup/extension, as I have spotted other places where we are uncovered/where we can improve things.
Let me think if I can further de-duplicate here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me think if I can further de-duplicate here.

I don't think that's necessary. It's pretty sleek right now. We can add polish later, if required.

@mythrocks mythrocks added improvement Improves an existing functionality non-breaking Introduces a non-breaking change testing labels Sep 23, 2025
@mythrocks
Copy link
Copy Markdown
Contributor

/ok to test 8c80760

@mythrocks
Copy link
Copy Markdown
Contributor

/ok to test 00f10f6

@mythrocks mythrocks changed the title [Review][Java] Adding tests to use CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset [Java] Adding tests to use CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset Sep 25, 2025
@mythrocks
Copy link
Copy Markdown
Contributor

/ok to test 841b107

@mythrocks
Copy link
Copy Markdown
Contributor

/merge

@rapids-bot rapids-bot Bot merged commit c55cfd7 into rapidsai:branch-25.10 Sep 25, 2025
84 checks passed
@mythrocks
Copy link
Copy Markdown
Contributor

This change has been merged. Thank you for the contribution, @ldematte.

@ldematte ldematte deleted the java/device-matrix-cagra-index-tests branch September 25, 2025 10:55
enp1s0 pushed a commit to enp1s0/cuvs that referenced this pull request Oct 22, 2025
…as a CagraIndex input dataset (rapidsai#1340)

When we introduced `CuVSDeviceMatrix` in rapidsai#1232, we made it possible to use device-memory-backed dataset as an input for index build: since we accept a `CuVSMatrix`, and we have correct `toTensor` implementations for CPU and GPU, and the underlying functions in libcuvs support different memory types and sizes (through DLManagedTensor information), this became supported "naturally".
However, we never tested this explicitly. 

This PR adds tests to check and show that using CuVSDeviceMatrix (device memory) directly as a CagraIndex input dataset works as intended.

(similar tests for other index types will be added as follow-ups)

Authors:
  - Lorenzo Dematté (https://github.com/ldematte)
  - MithunR (https://github.com/mythrocks)

Approvers:
  - Chris Hegarty (https://github.com/ChrisHegarty)
  - MithunR (https://github.com/mythrocks)

URL: rapidsai#1340
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change testing

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants