[BUG] DBSCAN crashes on data with a large number of rows #5393

@albert-cwkuo

Describe the bug
When DBSCAN.fit_predict is given data x with a large number of rows, it crashes immediately with the following error:

RuntimeError: CUDA error encountered at: file=/project/python/_skbuild/linux-x86_64-3.8/cmake-build/_deps/raft-src/cpp/include/raft/spatial/knn/detail/epsilon_neighborhood.cuh line=200: call='cudaGetLastError()', Reason=cudaErrorInvalidConfiguration:invalid configuration argument
Obtained 64 stack frames
#0 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x38) [0x7f7dc633dd28]
#1 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft10cuda_errorC1ERKSs+0x38) [0x7f7dc633e518]
#2 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN4raft7spatial3knn6detail21epsUnexpL2SqNeighImplIflLi4EEEvPbPT0_PKT_S9_S5_S5_S5_S7_P11CUstream_st+0x37f) [0x7f7dc640771f]
#3 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(+0x20067d2) [0x7f7dc63da7d2]
#4 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN2ML6Dbscan3runIflLb0EEEmRKN4raft8handle_tEPKT_T0_S9_S9_S9_S6_S9_PS9_SA_iiiPvmP11CUstream_stNS2_8distance12DistanceTypeE+0x662) [0x7f7dc6421412]
#5 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/internals/../libcuml++.so(_ZN2ML6Dbscan13dbscanFitImplIflLb0EEEvRKN4raft8handle_tEPT_T0_S8_S6_S8_NS2_8distance12DistanceTypeEPS8_SB_mP11CUstream_sti+0x1336) [0x7f7dc64255e6]
#6 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x2a1a8) [0x7f7ed16b61a8]
#7 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#8 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#9 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#10 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#11 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x25bad) [0x7f7ed16b1bad]
#12 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x27d04) [0x7f7ed16b3d04]
#13 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#14 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#15 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#16 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#17 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x25bad) [0x7f7ed16b1bad]
#18 in /nethome/ckuo45/anaconda3/envs/know-aug/lib/python3.8/site-packages/cuml/cluster/dbscan.cpython-38-x86_64-linux-gnu.so(+0x2883f) [0x7f7ed16b483f]
#19 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x272) [0x4ed992]
#20 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#21 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#22 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e9674]
#23 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x4d48) [0x4cce88]
#24 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#25 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#26 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCode+0x1b) [0x56d93b]
#27 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x571390]
#28 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4dbb17]
#29 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#30 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#31 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1d68) [0x4c9ea8]
#32 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#33 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1d68) [0x4c9ea8]
#34 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4f568b]
#35 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e747d]
#36 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#37 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#38 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#39 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#40 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#41 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#42 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e9674]
#43 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1725) [0x4c9865]
#44 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#45 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#46 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#47 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#48 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x106) [0x4db216]
#49 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0xa3d) [0x4c8b7d]
#50 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#51 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x19c) [0x4db2ac]
#52 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x4e95e7]
#53 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyObject_Call+0x5e) [0x4ed77e]
#54 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x1efa) [0x4ca03a]
#55 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#56 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyFunction_Vectorcall+0x19c) [0x4db2ac]
#57 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalFrameDefault+0x907) [0x4c8a47]
#58 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(_PyEval_EvalCodeWithName+0x1f5) [0x4c6fe5]
#59 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCodeEx+0x39) [0x4c6de9]
#60 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python(PyEval_EvalCode+0x1b) [0x56d93b]
#61 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x58cc71]
#62 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x586a2f]
#63 in /nethome/ckuo45/anaconda3/envs/know-aug/bin/python() [0x590072]

Steps/Code to reproduce bug
Here is a code snippet that reproduces the bug, with x having 5M rows:

import numpy as np
from cuml import DBSCAN
# 5M rows x 768 float32 features
x = np.random.random((5000000, 768)).astype(np.float32)
dbscan = DBSCAN(min_samples=1, eps=5.0)
# Crashes here with cudaErrorInvalidConfiguration
labels = dbscan.fit_predict(x)
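
For anyone triaging, here is a rough sketch of the scale of this input, using only the shape and dtype from the snippet above. The variable names are illustrative, and nothing here is claimed to be the cause of the crash; it only shows that the array is about 15 GB and that its element count is larger than a signed 32-bit integer can hold.

```python
# Back-of-the-envelope sizes for the reproduction input.
# Shape and dtype are taken from the snippet; variable names are illustrative only.
n_rows = 5_000_000
n_cols = 768
bytes_per_float32 = 4

n_elements = n_rows * n_cols
input_bytes = n_elements * bytes_per_float32
int32_max = 2**31 - 1  # largest value of a signed 32-bit integer

print(f"elements in x: {n_elements:,}")
print(f"size of x as float32: {input_bytes / 1e9:.2f} GB")
print(f"element count exceeds int32 range: {n_elements > int32_max}")
```

Lowering n_rows in the reproduction and re-running may help narrow down the row count at which the error first appears.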

Expected behavior
The last line, labels = dbscan.fit_predict(x), should complete and return cluster labels. Instead, it crashes immediately.

Environment details:

  • Environment location: local PC
  • Linux Distro/Architecture: Ubuntu 16.04 amd64
  • GPU Model/Driver: A40 Nvidia GPU, Driver Version: 460.91.03
  • CUDA: 11.2
  • Method of cuDF & cuML install: pip

Additional context
This issue seems related to the already resolved issue #1753.
