Hi, I'm currently testing HNSW with the SQ8 scalar quantizer. Since my dataset is quite large and can't be processed with the original number of reducers (and we don't want to increase the number of reducers because that would also hurt search performance), I tried splitting the embeddings into several batches and then encoding and training them batch by batch. However, recall dropped a lot with batching. I just want to check: is batch encoding a feasible approach? Or should we encode the whole set of embeddings together rather than splitting it into batches?
Thank you.
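For what it's worth, here is a toy sketch (not the FAISS implementation, just hypothetical helper functions) of why *training* the scalar quantizer per batch can hurt recall, while *encoding/adding* per batch with globally trained parameters is harmless: SQ8 training learns a per-dataset range, so per-batch training gives each batch its own codebook and the resulting codes are no longer comparable across batches.

```python
# Hypothetical minimal uniform SQ8, to illustrate the train-once / add-in-batches
# distinction. Names (train_sq8, encode_sq8) are made up for this sketch.

def train_sq8(vectors):
    """'Training' for SQ8: learn the value range (min and scale) from the data."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    scale = (hi - lo) / 255.0 or 1.0  # guard against a zero-width range
    return lo, scale

def encode_sq8(vectors, lo, scale):
    """Encoding: map each float to an 8-bit code using the trained parameters."""
    return [[round((x - lo) / scale) for x in v] for v in vectors]

batch1 = [[0.0, 1.0], [0.2, 0.8]]
batch2 = [[5.0, 9.0], [6.0, 8.0]]
full = batch1 + batch2

# Train once on the whole dataset, then encode batch by batch: all codes share
# one codebook, so distances between codes stay consistent.
lo, scale = train_sq8(full)
global_codes = encode_sq8(batch1, lo, scale) + encode_sq8(batch2, lo, scale)

# Train per batch: each batch gets its own lo/scale, so the same code value
# means a different float in different batches, and distances become skewed.
lo1, s1 = train_sq8(batch1)
lo2, s2 = train_sq8(batch2)
per_batch_codes = encode_sq8(batch1, lo1, s1) + encode_sq8(batch2, lo2, s2)
```

So one workaround to check (under the assumption that the recall drop comes from per-batch training) is to call `train()` once on the full set or a representative sample, then `add()` the embeddings in batches as memory allows.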