Fix the launch bounds for nn-descent kernel for 1210 #967

Closed
achirkin wants to merge 7 commits into branch-25.06 from achirkin-backport-nn-descent-launch-bounds

@achirkin
Contributor

@achirkin achirkin commented Jun 2, 2025

A fix to make cuVS build for target architecture 1210.

Backport of #965 from 25.08 to 25.06

@achirkin achirkin self-assigned this Jun 2, 2025
@achirkin achirkin requested a review from a team as a code owner June 2, 2025 06:26
@achirkin achirkin added the bug Something isn't working label Jun 2, 2025
@achirkin achirkin moved this to In Progress in Unstructured Data Processing Jun 2, 2025
@github-actions github-actions Bot added the cpp label Jun 2, 2025
@achirkin achirkin added the non-breaking Introduces a non-breaking change label Jun 2, 2025
@codecov-commenter

codecov-commenter commented Jun 2, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.20%. Comparing base (0a35104) to head (78a1651).
Report is 1 commits behind head on branch-25.06.

Additional details and impacted files
@@              Coverage Diff              @@
##           branch-25.06     #967   +/-   ##
=============================================
  Coverage         83.20%   83.20%           
=============================================
  Files                21       21           
  Lines               131      131           
=============================================
  Hits                109      109           
  Misses               22       22           

Comment thread cpp/src/neighbors/detail/nn_descent.cuh Outdated
Comment thread cpp/src/neighbors/detail/nn_descent.cuh Outdated
Comment thread cpp/src/neighbors/detail/nn_descent.cuh Outdated
@jakirkham
Member

Saw the following test failures in this CI job:

The following tests FAILED:
	  9 - NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST (Failed)
	 10 - NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST (Failed)
	 11 - NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST (Failed)
	 12 - NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST (Failed)

Trying a restart in case this is just flakiness.

@achirkin
Contributor Author

achirkin commented Jun 2, 2025

The error is way too big to be just flakiness:

2025-06-02T15:46:59.1556329Z [ RUN      ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/668
2025-06-02T15:46:59.1556533Z [  2647][15:40:34:623583][info  ] optimizing graph
2025-06-02T15:46:59.1557128Z [  2647][15:40:34:682181][info  ] Graph optimized, creating index
2025-06-02T15:46:59.1557337Z [  2647][15:40:34:910399][info  ] optimizing graph
2025-06-02T15:46:59.1557575Z [  2647][15:40:34:969064][info  ] Graph optimized, creating index
2025-06-02T15:46:59.1557776Z [  2647][15:40:35:281402][info  ] optimizing graph
2025-06-02T15:46:59.1558016Z [  2647][15:40:35:415630][info  ] Graph optimized, creating index
2025-06-02T15:46:59.1558413Z [  2647][15:40:35:421747][info  ] Recall = 0.686000 (686/1000), the error is 1993.3% above the threshold (eps = 0.006000).
2025-06-02T15:46:59.1558899Z /tmp/conda-bld-output/bld/rattler-build_libcuvs/work/cpp/tests/neighbors/ann_cagra/../ann_cagra.cuh:1103: Failure
2025-06-02T15:46:59.1559396Z Value of: eval_neighbours(indices_naive, indices_Cagra, distances_naive, distances_Cagra, ps.n_queries, ps.k, 0.006, min_recall)
2025-06-02T15:46:59.1559921Z   Actual: false (actual recall (0.68600000000000005) is lower than the minimum expected recall (0.98499999999999999); eps = 0.0060000000000000001. )
2025-06-02T15:46:59.1560094Z Expected: true
2025-06-02T15:46:59.1560351Z 
2025-06-02T15:46:59.1560525Z 10000x32, 10
2025-06-02T15:46:59.1560687Z query 0
2025-06-02T15:46:59.1560921Z  indices=[3620,9120,2922,8422,9467,3967,5761,261,9534,4034];
2025-06-02T15:46:59.1561215Z n dist=[8.20325,8.20325,9.65429,9.65429,9.77335,9.77335,10.2873,10.2873,10.7788,10.7788];
2025-06-02T15:46:59.1561508Z c dist=[16.0779,8.20325,23.1528,9.65429,9.77335,19.3681,10.2873,27.6479,10.7788,16.9891];
2025-06-02T15:46:59.1561997Z /tmp/conda-bld-output/bld/rattler-build_libcuvs/work/cpp/tests/neighbors/ann_cagra/../ann_cagra.cuh:1111: Failure
2025-06-02T15:46:59.1562614Z Value of: eval_distances(handle_, database.data(), search_queries.data(), indices_dev.data(), distances_dev.data(), ps.n_rows, ps.dim, ps.n_queries, ps.k, ps.metric, 1.0e-4)
2025-06-02T15:46:59.1562787Z   Actual: false
2025-06-02T15:46:59.1562955Z Expected: true
2025-06-02T15:46:59.1563036Z 
2025-06-02T15:46:59.1564025Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/668, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=PHYSICAL}
2025-06-02T15:46:59.1564190Z  (1030 ms)
2025-06-02T15:46:59.1564570Z [ RUN      ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/669
2025-06-02T15:46:59.1564919Z [  2647][15:40:35:710443][info  ] optimizing graph
2025-06-02T15:46:59.1565165Z [  2647][15:40:35:775068][info  ] Graph optimized, creating index
2025-06-02T15:46:59.1565366Z [  2647][15:40:35:986749][info  ] optimizing graph
2025-06-02T15:46:59.1565609Z [  2647][15:40:36:045152][info  ] Graph optimized, creating index
2025-06-02T15:46:59.1565997Z [  2647][15:40:36:073774][info  ] Recall = 0.761000 (761/1000), the error is 1493.3% above the threshold (eps = 0.006000).
2025-06-02T15:46:59.1566483Z /tmp/conda-bld-output/bld/rattler-build_libcuvs/work/cpp/tests/neighbors/ann_cagra/../ann_cagra.cuh:1103: Failure
2025-06-02T15:46:59.1567308Z Value of: eval_neighbours(indices_naive, indices_Cagra, distances_naive, distances_Cagra, ps.n_queries, ps.k, 0.006, min_recall)
2025-06-02T15:46:59.1567836Z   Actual: false (actual recall (0.76100000000000001) is lower than the minimum expected recall (0.98499999999999999); eps = 0.0060000000000000001. )
2025-06-02T15:46:59.1568014Z Expected: true
2025-06-02T15:46:59.1568095Z 
2025-06-02T15:46:59.1568260Z 10000x32, 10
2025-06-02T15:46:59.1568425Z query 0
2025-06-02T15:46:59.1568657Z  indices=[9120,3620,8422,3967,9467,5877,5944,5761,3048,8548];
2025-06-02T15:46:59.1568953Z n dist=[8.20325,8.20325,9.65429,9.77335,9.77335,9.83969,10.2227,10.2873,10.3589,10.3589];
2025-06-02T15:46:59.1569238Z c dist=[8.20325,16.0779,9.65429,19.3681,9.77335,9.83969,10.2227,10.2873,21.8471,10.3589];
2025-06-02T15:46:59.1569724Z /tmp/conda-bld-output/bld/rattler-build_libcuvs/work/cpp/tests/neighbors/ann_cagra/../ann_cagra.cuh:1111: Failure
2025-06-02T15:46:59.1570347Z Value of: eval_distances(handle_, database.data(), search_queries.data(), indices_dev.data(), distances_dev.data(), ps.n_rows, ps.dim, ps.n_queries, ps.k, ps.metric, 1.0e-4)
2025-06-02T15:46:59.1570520Z   Actual: false
2025-06-02T15:46:59.1570685Z Expected: true
2025-06-02T15:46:59.1570766Z 
2025-06-02T15:46:59.1571733Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/669, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=LOGICAL}
2025-06-02T15:46:59.1571903Z  (651 ms)
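
The percentages in the `Recall = ...` log lines above are consistent with a relative error margin computed against the minimum expected recall. The following is a minimal sketch of that arithmetic, inferred from the logged numbers only. The helper name `recall_error_margin_percent` is hypothetical, and the use of `eps` as a floor on the allowed error is an assumption that these two log lines do not exercise; it is not copied from the cuVS test sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, reconstructed from the log output above;
// not the actual cuVS test code.
//
// Returns the signed margin between the measured recall and the minimum
// expected recall, as a percentage of the allowed error (1 - min_recall).
// A negative value means the recall fell short of min_recall, which the
// log reports as "the error is N% above the threshold".
double recall_error_margin_percent(double actual_recall, double min_recall, double eps)
{
  // Guard against a zero denominator below.
  assert(eps > 0.0);
  // Assumption: the allowed error is floored at eps so that a min_recall
  // very close to 1.0 does not blow up the ratio.
  const double allowed_error = std::max(1.0 - min_recall, eps);
  return (actual_recall - min_recall) / allowed_error * 100.0;
}
```

With `min_recall = 0.985` and `eps = 0.006`, a recall of 0.686 gives a margin of about -1993.3 and a recall of 0.761 gives about -1493.3, matching the magnitudes printed in the two failing runs. A shortfall of roughly 15 to 20 times the allowed error is why this does not look like ordinary test flakiness.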

Full list of failed tests with fp32 data:

2025-06-02T15:46:59.5078220Z [  FAILED  ] 53 tests, listed below:
2025-06-02T15:46:59.5079206Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/668, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=PHYSICAL}
2025-06-02T15:46:59.5079289Z 
2025-06-02T15:46:59.5080245Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/669, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=LOGICAL}
2025-06-02T15:46:59.5080329Z 
2025-06-02T15:46:59.5081343Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/672, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=AUTO, merge_logic=PHYSICAL}
2025-06-02T15:46:59.5081513Z 
2025-06-02T15:46:59.5082511Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/673, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=AUTO, merge_logic=LOGICAL}
2025-06-02T15:46:59.5082600Z 
2025-06-02T15:46:59.5083617Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/704, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5083698Z 
2025-06-02T15:46:59.5084771Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/705, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5084857Z 
2025-06-02T15:46:59.5085860Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/706, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5085950Z 
2025-06-02T15:46:59.5087573Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/707, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5087662Z 
2025-06-02T15:46:59.5088687Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/708, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5088774Z 
2025-06-02T15:46:59.5089768Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/709, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5089849Z 
2025-06-02T15:46:59.5090998Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/716, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5091140Z 
2025-06-02T15:46:59.5092389Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/717, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5092473Z 
2025-06-02T15:46:59.5093719Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/718, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5093800Z 
2025-06-02T15:46:59.5095048Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/719, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5095129Z 
2025-06-02T15:46:59.5096283Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/720, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5096364Z 
2025-06-02T15:46:59.5097828Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/721, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5097909Z 
2025-06-02T15:46:59.5098926Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/728, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5099010Z 
2025-06-02T15:46:59.5100105Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/729, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5100183Z 
2025-06-02T15:46:59.5101197Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/730, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5101287Z 
2025-06-02T15:46:59.5102303Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/731, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5102385Z 
2025-06-02T15:46:59.5103396Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/732, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5103563Z 
2025-06-02T15:46:59.5104566Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/733, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5104650Z 
2025-06-02T15:46:59.5105801Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/740, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5105944Z 
2025-06-02T15:46:59.5107475Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/741, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5107558Z 
2025-06-02T15:46:59.5108623Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/742, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5108704Z 
2025-06-02T15:46:59.5109763Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/743, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5109927Z 
2025-06-02T15:46:59.5110995Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/744, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5111076Z 
2025-06-02T15:46:59.5112123Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_U32/745, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5112204Z 
2025-06-02T15:46:59.5113162Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/668, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=PHYSICAL}
2025-06-02T15:46:59.5113246Z 
2025-06-02T15:46:59.5114271Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/669, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=AUTO, merge_logic=LOGICAL}
2025-06-02T15:46:59.5114350Z 
2025-06-02T15:46:59.5115534Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/672, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=AUTO, merge_logic=PHYSICAL}
2025-06-02T15:46:59.5115615Z 
2025-06-02T15:46:59.5117213Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/673, where GetParam() = {n_queries=100, dataset shape=10000x32, k=10, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=AUTO, merge_logic=LOGICAL}
2025-06-02T15:46:59.5117473Z 
2025-06-02T15:46:59.5118495Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/704, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5118585Z 
2025-06-02T15:46:59.5119591Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/706, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5119669Z 
2025-06-02T15:46:59.5120767Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/708, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5120912Z 
2025-06-02T15:46:59.5121979Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/716, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5122074Z 
2025-06-02T15:46:59.5123124Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/717, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5123211Z 
2025-06-02T15:46:59.5124284Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/718, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5124474Z 
2025-06-02T15:46:59.5125535Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/719, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5125628Z 
2025-06-02T15:46:59.5127024Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/720, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5127276Z 
2025-06-02T15:46:59.5128368Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/721, where GetParam() = {n_queries=100, dataset shape=5000x32, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5128460Z 
2025-06-02T15:46:59.5129729Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/728, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5129812Z 
2025-06-02T15:46:59.5130835Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/729, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5130919Z 
2025-06-02T15:46:59.5131927Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/730, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5132014Z 
2025-06-02T15:46:59.5133014Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/731, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5133095Z 
2025-06-02T15:46:59.5134101Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/732, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5134179Z 
2025-06-02T15:46:59.5135254Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/733, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=L2, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}
2025-06-02T15:46:59.5135397Z 
2025-06-02T15:46:59.5136453Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/740, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=1)}
2025-06-02T15:46:59.5136534Z 
2025-06-02T15:46:59.5137984Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/741, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=1)}
2025-06-02T15:46:59.5138070Z 
2025-06-02T15:46:59.5139132Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/742, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=2)}
2025-06-02T15:46:59.5139297Z 
2025-06-02T15:46:59.5140353Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/743, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=2)}
2025-06-02T15:46:59.5140431Z 
2025-06-02T15:46:59.5141490Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/744, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=PHYSICAL(refine_rate=3)}
2025-06-02T15:46:59.5141571Z 
2025-06-02T15:46:59.5142621Z [  FAILED  ] AnnCagraIndexMergeTest/AnnCagraIndexMergeTestF_U32.AnnCagraIndexMerge_I64/745, where GetParam() = {n_queries=100, dataset shape=5000x64, k=16, auto, max_queries=10, itopk_size=64, search_width=1, metric=InnerProduct, host, build_algo=IVF_PQ, merge_logic=LOGICAL(refine_rate=3)}

I wonder why I didn't catch this while building locally on the same target.

@jakirkham
Member

Let's see if this reproduces outside of these changes: #969

@jakirkham
Member

Was unable to reproduce in a no change PR: #969

Comment thread cpp/src/neighbors/detail/nn_descent.cuh
@cjnolet
Member

cjnolet commented Jun 3, 2025

It looks like this change might not be the cause of the cagra merge test failure. We're seeing it in the nightly CI also. See here

@cjnolet
Member

cjnolet commented Jun 4, 2025

Fixed in #974. Closing.

@cjnolet cjnolet closed this Jun 4, 2025
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Unstructured Data Processing Jun 4, 2025

Labels

bug (Something isn't working), cpp, non-breaking (Introduces a non-breaking change)

Projects

Status: Done

6 participants