Spectral Embedding `nnz_t` by aamijar · Pull Request #1628 · rapidsai/cuvs

aamijar · 2025-12-10T01:21:48Z

Resolves #1243. Depends on rapidsai/raft#2891.

This PR adds an NNZType to the spectral embedding public API that takes a precomputed connectivity graph.
The transform API for the precomputed connectivity graph now uses the COO codepath all the way through the algorithm.

The spectral embedding API that takes a dataset has also been switched to the COO codepath and uses int64_t by default.
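
As a rough illustration of what a separate nnz type buys, the sketch below is a host-only C++ stand-in, not the cuVS or RAFT implementation. `CooGraph` and `degree_from_coo` are hypothetical names introduced here. The only point is that the edge counter can be 64-bit (`NNZType = int64_t`) while row and column indices stay 32-bit, and that a per-row degree sum can be computed directly from COO triplets without converting to CSR first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side COO container (assumption: this is NOT
// raft::device_coo_matrix, just a stand-in with the same three arrays).
template <typename ValueT, typename IndexT, typename NNZType>
struct CooGraph {
  std::vector<IndexT> rows;  // row index of each stored edge
  std::vector<IndexT> cols;  // column index of each stored edge
  std::vector<ValueT> vals;  // weight of each stored edge

  // The number of stored edges is reported in NNZType, independent of IndexT.
  NNZType nnz() const { return static_cast<NNZType>(vals.size()); }
};

// Sums the edge weights of every row directly from the COO triplets.
// The loop counter uses NNZType, so it does not wrap when nnz exceeds INT_MAX.
template <typename ValueT, typename IndexT, typename NNZType>
std::vector<ValueT> degree_from_coo(const CooGraph<ValueT, IndexT, NNZType>& g, IndexT n_rows)
{
  std::vector<ValueT> degree(static_cast<std::size_t>(n_rows), ValueT{0});
  for (NNZType e = 0; e < g.nnz(); ++e) {
    const std::size_t edge = static_cast<std::size_t>(e);
    degree[static_cast<std::size_t>(g.rows[edge])] += g.vals[edge];
  }
  return degree;
}
```

With `CooGraph<float, int, std::int64_t>`, a graph of `n_samples` rows can hold more than 2^31 - 1 edges as long as `n_samples` itself still fits in `int`.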

copy-pr-bot · 2025-12-10T01:21:52Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

cjnolet · 2026-01-06T16:19:02Z

-    coo_to_csr_matrix(handle, n_samples, sym_coo_row_ind.view(), connectivity_graph);
-  auto laplacian =
-    create_laplacian(handle, spectral_embedding_config, csr_matrix_view, diagonal.view());
+  // raft::print_device_vector("connectivity_graph_vals", connectivity_graph.get_elements().data(),


Please don't leave commented out code in your changes

viclafargue

Thanks for working on this.

viclafargue · 2026-01-07T09:54:31Z

  create_connectivity_graph(handle, spectral_embedding_config, dataset, sym_coo_matrix);
  auto csr_matrix_view =
    coo_to_csr_matrix<float>(handle, n_samples, sym_coo_row_ind.view(), sym_coo_matrix.view());
-  auto laplacian =
-    create_laplacian<float>(handle, spectral_embedding_config, csr_matrix_view, diagonal.view());
+  auto laplacian = create_laplacian<float, raft::device_csr_matrix<float, int, int, int>>(
+    handle, spectral_embedding_config, csr_matrix_view, diagonal.view());
  compute_eigenpairs<float>(
-    handle, spectral_embedding_config, n_samples, laplacian, diagonal.view(), embedding);
+    handle, spectral_embedding_config, n_samples, laplacian.view(), diagonal.view(), embedding);
 }


It looks like the connectivity graph in the transform function that takes the dataset as an argument is assumed to have an nnz of type int. Is this intentional? Will it be updated in a follow-up PR?

Good point, I'll try to change it so that it defaults to int64_t.

Addressed in 30ebe8e

Co-authored-by: Victor Lafargue <viclafargue@nvidia.com>

viclafargue

Looks good for the most part, but some checks might be necessary.

Could you review the following for risks of integer overflow?

nnz overflow:
- thrust::tabulate

n_samples * k_search overflow:
- d_indices and d_distances allocation (extents are provided as integers, which may cause an overflow internally before allocation)

n_samples overflow (less likely):
- config.max_iterations

RAFT operators that may use container extents internally for indexing:
- raft::linalg::matrix_vector_op
- raft::matrix::gather
- raft::linalg::unary_op
- raft::matrix::fill

It looks like the coo_to_csr_matrix function is not used anymore. Should we delete it or make it compatible with larger nnz? In this case sym_coo_row_ind would probably have to be of the nnz type.

Also, why do we keep two versions of the function (for the two nnz types)? Is this for legacy support, or are there performance implications? If there are none, we should probably use only the uint64_t nnz in cuML.
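
One way to reason about the `n_samples * k_search` concern is a guard that does the size check in 64-bit arithmetic before any extent is handed to an allocation. The helper below is a hypothetical sketch, not a RAFT or cuVS function, and `extent_fits` is a name made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (assumption: not part of RAFT or cuVS).
// Returns true when n_rows * n_cols is representable in ExtentT.
// The comparison uses division, so the check itself cannot overflow.
template <typename ExtentT>
bool extent_fits(std::int64_t n_rows, std::int64_t n_cols)
{
  static_assert(std::is_signed<ExtentT>::value && sizeof(ExtentT) <= sizeof(std::int64_t),
                "this sketch only handles signed extent types up to 64 bits");
  if (n_rows < 0 || n_cols < 0) { return false; }
  if (n_rows == 0 || n_cols == 0) { return true; }
  const std::int64_t limit = static_cast<std::int64_t>(std::numeric_limits<ExtentT>::max());
  // For positive integers: n_rows * n_cols <= limit  <=>  n_rows <= limit / n_cols.
  return n_rows <= limit / n_cols;
}
```

For example, 200,000,000 samples with 16 neighbors each give 3.2 billion entries, which `extent_fits<int>` rejects and `extent_fits<std::int64_t>` accepts.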

tarang-jain · 2026-01-10T01:38:37Z

 }

-template <typename DataT>
+template <typename DataT, typename A>


Please use a more informative name here instead of A.

Addressed in 3d962a3

tarang-jain · 2026-01-10T01:40:47Z


-CUVS_INST_SPECTRAL_EMBEDDING(float);
-CUVS_INST_SPECTRAL_EMBEDDING(double);
+CUVS_INST_SPECTRAL_EMBEDDING(float, int);


Do we need the int instantiations here? Or can we skip them and stick to int64_t only?

We can keep the int ones to avoid breaking cuml and remove them later.

Yes sounds good.

Tracking here #1695
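
The idea of keeping both instantiations can be sketched with plain explicit template instantiation. Everything below is hypothetical: `sum_edge_weights` and `INST_SUM_EDGE_WEIGHTS` are stand-ins modeled loosely on the `CUVS_INST_SPECTRAL_EMBEDDING(float, int)` line in the diff, whose real macro body is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a function templated on data type and nnz type.
template <typename DataT, typename NNZType>
DataT sum_edge_weights(const DataT* vals, NNZType nnz)
{
  DataT total = DataT{0};
  for (NNZType i = 0; i < nnz; ++i) { total += vals[i]; }
  return total;
}

// Hypothetical instantiation macro: one line per (DataT, NNZType) pair that
// the compiled library should export.
#define INST_SUM_EDGE_WEIGHTS(DataT, NNZType) \
  template DataT sum_edge_weights<DataT, NNZType>(const DataT*, NNZType)

INST_SUM_EDGE_WEIGHTS(float, int);           // kept so existing int callers still link
INST_SUM_EDGE_WEIGHTS(float, std::int64_t);  // 64-bit nnz variant
```

Dropping the `int` line later would remove that symbol from the compiled library, which is why downstream callers need to migrate first.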

aamijar · 2026-01-10T03:18:32Z

Hi @viclafargue, thanks for the review! I have addressed your int overflow concerns in c81cb15.

Yes, we can remove coo_to_csr since it is no longer used. I was keeping it around for debugging purposes. Removed in 54e3133.
Yes, we could instantiate only the int64_t nnz type functions and drop int, but we should keep int for now to avoid breaking cuml.

viclafargue

Thanks for working on this! LGTM

It could be interesting to see if we could handle the edge case of very large n_samples * config.n_components (matrix_vector_op & gather use) maybe as a follow-up PR though.

aamijar · 2026-01-12T22:53:01Z

It could be interesting to see if we could handle the edge case of very large n_samples * config.n_components (matrix_vector_op & gather use) maybe as a follow-up PR though.

Hi @viclafargue, so that would mean we need to change the output embedding to be indexed with int64_t, right? I think we can address that in a follow-up PR. The Lanczos solver currently uses int for the embedding output.

aamijar · 2026-01-13T00:36:35Z

/merge

Resolves #7225, Resolves #6910. Depends on rapidsai/cuvs#1628.

This PR pulls in the int64_t support from cuvs to the spectral embedding cuml cpp api. This api is used during UMAP spectral initialization.

Authors:
- Anupam (https://github.com/aamijar)
- Simon Adorf (https://github.com/csadorf)

Approvers:
- Jinsol Park (https://github.com/jinsolp)
- Divye Gala (https://github.com/divyegala)
- Victor Lafargue (https://github.com/viclafargue)

URL: #7586

spectral-embedding-nnz

e72adcd

github-project-automation Bot added this to Unstructured Data Processing Dec 10, 2025

github-project-automation Bot moved this to Todo in Unstructured Data Processing Dec 10, 2025

aamijar self-assigned this Dec 10, 2025

aamijar moved this from Todo to In Progress in Unstructured Data Processing Dec 10, 2025

aamijar added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Dec 10, 2025

Merge branch 'main' into spectral-embedding-nnz

3ab13ea

cjnolet reviewed Jan 6, 2026

clean up

9e7f2c8

aamijar mentioned this pull request Jan 7, 2026

UMAP with int64_t spectral initialization rapidsai/cuml#7586

Merged

aamijar added 2 commits January 7, 2026 00:46

Merge branch 'main' into spectral-embedding-nnz

81ebfb2

Merge branch 'main' into spectral-embedding-nnz

74087c9

aamijar marked this pull request as ready for review January 7, 2026 06:21

aamijar requested a review from a team as a code owner January 7, 2026 06:21

pin raft

8b653d2

aamijar requested a review from a team as a code owner January 7, 2026 06:23

aamijar commented Jan 7, 2026

Comment thread cpp/cmake/thirdparty/get_raft.cmake Outdated

update year

a7389df

viclafargue reviewed Jan 7, 2026

aamijar and others added 7 commits January 7, 2026 10:51

Update cpp/src/preprocessing/spectral/detail/spectral_embedding.cuh

bf4e1b7

Co-authored-by: Victor Lafargue <viclafargue@nvidia.com>

fix style

1d7dbc5

dataset api int64

30ebe8e

Merge branch 'main' into spectral-embedding-nnz

f29ec0a

remove coo_sort

fb7e26b

Merge branch 'main' into spectral-embedding-nnz

a93ef93

rename

a12e009

viclafargue reviewed Jan 9, 2026

Merge branch 'main' into spectral-embedding-nnz

58cb448

tarang-jain reviewed Jan 10, 2026

aamijar added 2 commits January 10, 2026 03:07

update risks for int overflow

c81cb15

rename type

3d962a3

remove coo_to_csr

54e3133

viclafargue approved these changes Jan 12, 2026

aamijar removed the request for review from a team January 12, 2026 22:09

tarang-jain approved these changes Jan 13, 2026

rapids-bot Bot merged commit 67fe5a0 into rapidsai:main Jan 13, 2026
190 of 193 checks passed

github-project-automation Bot moved this from In Progress to Done in Unstructured Data Processing Jan 13, 2026

aamijar mentioned this pull request Jan 15, 2026

[CI] SpectralEmbedding eigensolver convergence failure in scikit-learn accel tests rapidsai/cuml#7671

Open

1 task
