Summary
While experimenting with IVFFlat indexes in Python, I noticed what I believe to be a bug in the GPU implementation.
In short, if you pass an index object as the quantizer to the GpuIndexIVFFlat constructor and that
object later goes out of scope or is explicitly deleted with del, Faiss fails when the index is used afterwards: an assertion error in train, or a SIGSEGV in add and search.
The CPU implementation does not show this behavior: it keeps working even after the coarse quantizer has been deleted.
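A possible explanation, stated here as an assumption and not verified against the Faiss sources: the CPU wrapper keeps a Python-side reference to the quantizer, while the GPU wrapper only stores the raw pointer. The sketch below does not use Faiss at all. `Quantizer`, `OwningIndex` and `BorrowingIndex` are hypothetical stand-ins, and a weak reference plays the role of a raw C++ pointer, purely to illustrate the lifetime difference.

```python
import gc
import weakref


class Quantizer:
    """Hypothetical stand-in for a coarse quantizer object."""

    def __init__(self, d):
        self.d = d


class OwningIndex:
    """Holds a strong reference, so the quantizer outlives the caller's name."""

    def __init__(self, quantizer):
        self.referenced_objects = [quantizer]     # strong reference keeps it alive
        self._quantizer = weakref.ref(quantizer)  # used only to observe liveness

    def quantizer_alive(self):
        return self._quantizer() is not None


class BorrowingIndex:
    """Holds only a weak reference, standing in for a raw pointer."""

    def __init__(self, quantizer):
        self._quantizer = weakref.ref(quantizer)

    def quantizer_alive(self):
        return self._quantizer() is not None


q1 = Quantizer(10)
owning = OwningIndex(q1)
del q1          # the index still holds a strong reference
gc.collect()

q2 = Quantizer(10)
borrowing = BorrowingIndex(q2)
del q2          # last strong reference is gone, the object is freed
gc.collect()

print(owning.quantizer_alive())     # True
print(borrowing.quantizer_alive())  # False
```

In the real GPU case there is no liveness check, so the second situation would show up as undefined field values or a crash instead of a clean `False`.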
Platform
OS: Ubuntu 22.04
Faiss version: 1.9.0. I first discovered the problem in a custom fork based on 1.7.4, so the issue is probably older.
Installed from: anaconda
Faiss compilation options:
Running on:
Interface:
Reproduction instructions
CPU case that works robustly
import numpy as np
import faiss
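dim = 10
nv = 1000
nlist = 1  # number of inverted lists (coarse clusters)
db = np.random.rand(nv, dim).astype(np.float32)  # Faiss operates on float32
idx_coarse = faiss.IndexFlat(dim, faiss.METRIC_L2)
idx = faiss.IndexIVFFlat(idx_coarse, dim, nlist, faiss.METRIC_L2)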
del idx_coarse  # delete the coarse quantizer
idx.train(db)  # no problem
# del idx_coarse  # if we delete here instead, also no problem
idx.add(db)
# del idx_coarse  # if we delete here instead, also no problem
idx.search(db, 1)
On the GPU, stuff breaks:
import numpy as np
import faiss
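dim = 10
nv = 1000
nlist = 1  # number of inverted lists (coarse clusters)
db = np.random.rand(nv, dim).astype(np.float32)  # Faiss operates on float32
res = faiss.StandardGpuResources()
idx_coarse = faiss.GpuIndexFlat(res, dim, faiss.METRIC_L2)
idx = faiss.GpuIndexIVFFlat(res, idx_coarse, dim, nlist, faiss.METRIC_L2)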
del idx_coarse  # deletion site (1)
# BOOM (1): assertion error. The consistency check that the quantizer's `d`
# equals the index's `d` fails because the quantizer's `d` field holds undefined values.
idx.train(db)
# del idx_coarse  # deletion site (2), if we delete here instead
idx.add(db)  # BOOM (2): SIGSEGV
# del idx_coarse  # deletion site (3), if we delete here instead
idx.search(db, 1)  # BOOM (3): SIGSEGV
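Until this is fixed, a possible workaround is to keep the quantizer referenced for as long as the index exists. The helper below is a minimal sketch: `keep_alive` and the `_kept_alive` attribute are hypothetical names, not part of Faiss. It assumes the index object accepts arbitrary Python attributes, and I have not verified that it avoids the crash in every case.

```python
def keep_alive(owner, *dependencies):
    """Attach dependencies to owner so they live at least as long as owner does.

    Hypothetical helper, not a Faiss API: it only stores strong references
    in a list on the owner object.
    """
    refs = getattr(owner, "_kept_alive", None)
    if refs is None:
        refs = []
        owner._kept_alive = refs
    refs.extend(dependencies)
    return owner


# Intended use with the GPU reproduction (not executed here):
#   keep_alive(idx, idx_coarse)   # right after constructing idx
#   del idx_coarse                # idx still holds a reference to the quantizer
```

Simply keeping `idx_coarse` in a variable with the same lifetime as `idx` achieves the same thing without a helper.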