Can faiss batch sa_encode embeddings? #4192

@Luciferre

Description

Hi, I'm currently testing HNSW with the SQ8 scalar quantizer. My dataset is quite large and can't be processed with the original number of reducers, and we don't want to increase the number of reducers because that would also affect search performance. So I tested splitting the embeddings into several batches and then training and encoding each batch separately. However, recall dropped a lot with this batched approach. Is batch encoding feasible at all, or should the whole set of embeddings be encoded together rather than split into batches?

Thank you.
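For illustration, here is a minimal pure-Python sketch of the difference between the two batching strategies the question touches on. It is not Faiss code: `train_ranges` and `encode` are hypothetical helpers that implement a simplified per-dimension min/max 8-bit quantizer, standing in for whatever `train` learns and `sa_encode` applies. Under that assumption, training once and encoding in batches gives the same codes as encoding everything at once, while retraining on every batch gives each batch its own value ranges, so the codes are no longer comparable across batches.

```python
import random


def train_ranges(vectors):
    """Learn a per-dimension (min, max) range from a training sample."""
    dims = len(vectors[0])
    lo = [min(v[j] for v in vectors) for j in range(dims)]
    hi = [max(v[j] for v in vectors) for j in range(dims)]
    return lo, hi


def encode(vectors, ranges):
    """Map each component to an integer code in 0..255 using fixed ranges."""
    lo, hi = ranges
    codes = []
    for v in vectors:
        code = []
        for x, a, b in zip(v, lo, hi):
            span = (b - a) or 1.0                    # guard against a zero-width range
            t = min(max((x - a) / span, 0.0), 1.0)   # clamp values outside the range
            code.append(int(round(t * 255)))
        codes.append(code)
    return codes


# Synthetic data, for illustration only.
random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(1000)]
batches = [data[i:i + 250] for i in range(0, len(data), 250)]

# Strategy A: train once (here on all data; a representative sample is the
# usual substitute when the data is too large), then encode batch by batch.
ranges = train_ranges(data)
all_at_once = encode(data, ranges)
batched = [c for b in batches for c in encode(b, ranges)]
same_codes = (batched == all_at_once)

# Strategy B: retrain on every batch. Each batch gets its own ranges,
# so the same code value means different things in different batches.
per_batch = [c for b in batches for c in encode(b, train_ranges(b))]
differs = (per_batch != all_at_once)

print("train once, encode in batches matches full encode:", same_codes)
print("per-batch training changes the codes:", differs)
```

If the real pipeline follows strategy B, that alone could plausibly explain a recall drop, since all codes in one index are decoded with a single set of quantizer parameters. This is only a possible explanation under the assumptions above, not a confirmed diagnosis.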
