Hello,
In Python 3.12.5, with faiss-cpu 1.11.0.post1, I have the following scheme:
import faiss
import numpy as np

# Generate some embeds with SBERT...
dim = 384
count_cells = 10  # number of IVF cells (the nlist argument)
coarse_quantizer = faiss.IndexFlatL2(dim)
ivf_index = faiss.IndexIVFFlat(coarse_quantizer, dim, count_cells)
train_embeds = embeds[:5000]
ivf_index.train(train_embeds)
index = faiss.IndexIDMap2(ivf_index)
Afterwards, I have a loop where I add a few embeds each time like this:
# embeds - a list of sbert embeds of dim 384, embeds_ids - their unique ids
self.index.add_with_ids(embeds, embeds_ids)
The ids are definitely unique: I iterate over the embeds and use each embed's position as its id.
Once the index reaches a certain size, I start removing embeds by ids - not in a sliding window fashion.
I added some safety code as advised by an LLM, although note that my code works fine with a naive index, i.e. faiss.IndexIDMap2(faiss.IndexFlatL2(dim)), by just calling remove_ids with the array of ids.
ids64 = np.ascontiguousarray(np.asarray(embeds_ids, dtype=np.int64))
sel = faiss.IDSelectorBatch(ids64.size, faiss.swig_ptr(ids64))
before = self.index.ntotal
removed = int(self.index.remove_ids(sel))
after = self.index.ntotal
assert after == before - len(embeds_ids), (before, after)
return removed
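The assertion in the snippet above assumes that every id passed in is unique and currently present in the index. A minimal sketch of one way to double-check that assumption independently of FAISS is to mirror the expected ids in a plain Python set. `LiveIdTracker` is a hypothetical helper name introduced here for illustration; it is not part of FAISS and not part of my actual code.

```python
class LiveIdTracker:
    """Hypothetical helper (not part of FAISS): mirrors the ids expected to be in the index."""

    def __init__(self):
        self._live = set()

    def add(self, ids):
        """Record ids that are about to be passed to add_with_ids."""
        ids = [int(i) for i in ids]
        if len(set(ids)) != len(ids):
            raise ValueError("duplicate ids within one add batch")
        already = self._live.intersection(ids)
        if already:
            raise ValueError(f"ids already in index: {sorted(already)}")
        self._live.update(ids)

    def remove(self, ids):
        """Return how many ids a removal call should report, then forget them."""
        wanted = {int(i) for i in ids}
        missing = wanted - self._live
        if missing:
            raise KeyError(f"ids not in index: {sorted(missing)}")
        self._live -= wanted
        return len(wanted)

    def __len__(self):
        return len(self._live)


# Usage: call add() next to add_with_ids and remove() next to remove_ids,
# then compare the returned count with what remove_ids reports.
tracker = LiveIdTracker()
tracker.add([0, 1, 2, 3])
expected_removed = tracker.remove([1, 3])
```

If the tracker never raises and its counts match what `remove_ids` returns until the crash, that would suggest the ids themselves are not the problem.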
This function successfully removes a single ID on each of three calls, passing my assertion every time, and then hits the following assertion failure (btw - throwing an exception would be nice, unless there's some good reason to use an assertion instead).
Faiss assertion 'j == index->ntotal' failed in virtual size_t __cdecl faiss::IndexIDMapTemplate<faiss::Index>::remove_ids(const IDSelector &) [IndexT = faiss::Index] at D:\a\faiss-wheels\faiss-wheels\faiss\faiss\IndexIDMap.cpp:197
I've been troubleshooting for quite some time, and I suspect there might be a bug - since I think (hope) my usage is fine.
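For readers unfamiliar with the failing check: the assertion text `j == index->ntotal` suggests that the ID-map wrapper compacts its own list of external ids and then verifies that the number of surviving entries matches the wrapped index's count. The following is only a toy model of that kind of bookkeeping in pure Python, written from the assertion message alone. It is an assumption about the shape of the check, not the FAISS source, and `compact_id_map` is a hypothetical name.

```python
def compact_id_map(id_map, ids_to_remove, inner_ntotal_after):
    """Toy model: drop removed ids from the wrapper's map, then check the counts agree.

    id_map             -- external ids held by the wrapper, in insertion order
    ids_to_remove      -- external ids selected for removal
    inner_ntotal_after -- vector count the wrapped index reports after its own removal
    """
    removed = set(ids_to_remove)
    kept = [ext_id for ext_id in id_map if ext_id not in removed]
    j = len(kept)
    # Analogue of the failing check: wrapper and wrapped index must agree on the count.
    if j != inner_ntotal_after:
        raise AssertionError(f"j == ntotal violated: j={j}, inner ntotal={inner_ntotal_after}")
    return kept


# Consistent case: both sides dropped exactly one vector.
consistent = compact_id_map([10, 11, 12, 13], [11], inner_ntotal_after=3)

# Inconsistent case: the wrapped index dropped a different number of vectors
# than the wrapper expected, so the check fails.
try:
    compact_id_map([10, 11, 12, 13], [11], inner_ntotal_after=2)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

Under this reading, the failure would mean the wrapper and the underlying IVF index disagreed about how many vectors were removed, which is why I suspect the combination of `IndexIDMap2` and `IndexIVFFlat` rather than my ids.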