Description
Disclosure + Credit: I work at Amazon, and this idea was suggested by a colleague familiar with vector search (thanks Karthik!)
For quantized vector fields, HNSW graphs are initially built (during indexing) with original (unquantized) vectors, but later (during merging) with quantized vectors (see ref).
Would a graph built using original (unquantized) vectors every time (both indexing + merge) be higher quality, and better to search?
This is a tradeoff between increased indexing time (quantized computations are cheaper) and potentially better recall + latency at search time.
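To get a rough feel for how much quantization can perturb neighbor selection (the step that decides graph edges), here is a minimal sketch. It is not Lucene's quantizer or HNSW code: `quantize`, `top_k`, and `neighbor_overlap` are hypothetical helpers, the quantizer is a toy min/max scalar quantizer (7 bits by default), and the search is brute-force squared Euclidean distance. It only measures how often the exact top-k neighbors of a vector match the top-k computed from quantized vectors.

```python
import random


def quantize(vec, lo, hi, bits=7):
    """Toy scalar quantizer: map floats in [lo, hi] to ints in [0, 2**bits - 1].

    Assumes hi > lo. This is an illustration, not Lucene's implementation.
    """
    levels = (1 << bits) - 1
    scale = levels / (hi - lo)
    return [round((min(max(x, lo), hi) - lo) * scale) for x in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(i, vectors, k):
    """Indices of the k nearest vectors to vectors[i], excluding i itself."""
    others = [j for j in range(len(vectors)) if j != i]
    others.sort(key=lambda j: sq_dist(vectors[i], vectors[j]))
    return others[:k]


def neighbor_overlap(vectors, k=10, bits=7):
    """Fraction of exact top-k neighbors also found using quantized vectors."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    quantized = [quantize(v, lo, hi, bits) for v in vectors]
    shared = 0
    for i in range(len(vectors)):
        exact = set(top_k(i, vectors, k))
        approx = set(top_k(i, quantized, k))
        shared += len(exact & approx)
    return shared / (k * len(vectors))
```

Calling `neighbor_overlap` on a list of random vectors (for example, 200 vectors of dimension 16 drawn with `random.gauss`) returns a value in [0, 1]; anything below 1.0 means some candidate neighbors differ between the two representations. Whether such differences translate into measurably worse graphs is exactly the open question here, and would need a real benchmark to answer.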