
Auto-byte quantize #35702

Open

bratseth wants to merge 2 commits into master from bratseth/byte-quantize
Conversation

@bratseth
Member

This adds automagic byte quantization to the HuggingFace embedder.

Maybe we should check for L2 normalization, or require normalization=true to do this?

/cc @thomasht86

@bratseth bratseth requested a review from arnej27959 January 27, 2026 13:20
@arnej27959
Member

In the non-normalize case it’s probably more useful to scale with 127.0/max(abs(x)) and assume angular distance.
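A minimal numpy sketch of that scaling, for illustration only (the helper name is hypothetical, not Vespa's actual API):

```python
import numpy as np

def quantize_max_abs(x: np.ndarray) -> np.ndarray:
    """Scale by 127.0/max(abs(x)) so the largest-magnitude component
    maps to +/-127, then round to int8. The direction of the vector is
    preserved, so angular (cosine) distance is approximately preserved.
    (Hypothetical helper, not Vespa's embedder implementation.)"""
    scale = 127.0 / np.max(np.abs(x))
    return np.clip(np.round(x * scale), -128, 127).astype(np.int8)

v = np.array([0.3, -1.5, 0.75])
print(quantize_max_abs(v))
```

Because the scale adapts per vector, this avoids collapsing small-magnitude embeddings to zero, at the cost of making absolute magnitudes incomparable across vectors.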

@arnej27959
Member

And the normalize case is very likely to end with almost all values 0.

@arnej27959
Member

I was a bit too pessimistic; checking Cohere embeddings quantized with (L2 normalize)*127, I got:

  • 23% of the resulting values are 0
  • 86% are in the range [-4, 4]
  • 99% are in the range [-10, 10]
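For intuition, the same (L2 normalize)*127 quantization can be sketched on synthetic data. Random Gaussian vectors are a stand-in for real embeddings here, so the percentages will not match the Cohere numbers above, but they show the same effect: most mass lands near zero at this scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for real embeddings: 100 random 1024-dim Gaussian vectors,
# L2-normalized so each component is roughly N(0, 1/1024).
x = rng.standard_normal((100, 1024))
x /= np.linalg.norm(x, axis=1, keepdims=True)

# Quantize: (L2-normalized value) * 127, rounded to int8.
q = np.round(x * 127).astype(np.int8)

print(f"{np.mean(q == 0):.0%} zero")
print(f"{np.mean(np.abs(q) <= 4):.0%} in [-4, 4]")
print(f"{np.mean(np.abs(q) <= 10):.0%} in [-10, 10]")
```

With 1024 dimensions each component has a standard deviation of about 1/32, so after scaling by 127 most values fall within a few units of zero; only a small fraction of the int8 range is actually used.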

