Scalar Quantizer to support signed int8 vectors

# Summary

Today there are three 8-bit quantizers, `QT_8bit_direct`, `QT_8bit_uniform` and `QT_8bit`, but all three support only uint8 vectors with values in [0, 255]. If we ingest signed 8-bit vectors (values in [-128, 127]) with these quantizers, the values are cast to uint8 during encoding, which changes the sign and magnitude of every value outside the uint8 range. Some customers want to use models like [Cohere Embed](https://cohere.com/blog/int8-binary-embeddings) that generate signed int8 embeddings in [-128, 127]. To support such use cases we need a new signed 8-bit scalar quantizer.
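To make the failure mode concrete, here is a minimal sketch of what happens when signed values pass through a uint8-only storage path. `store_as_uint8` is a hypothetical helper written for this illustration, not the library's encoder, and it assumes the input is already held as `int8_t`.

```cpp
#include <cstdint>
#include <vector>

// Illustration only (hypothetical helper, not the library's code path):
// store signed int8 components in uint8 slots with a plain cast.
std::vector<std::uint8_t> store_as_uint8(const std::vector<std::int8_t>& v) {
    std::vector<std::uint8_t> out;
    out.reserve(v.size());
    for (std::int8_t x : v) {
        // Conversion to an unsigned type wraps modulo 256,
        // so every negative value turns into a large positive one.
        out.push_back(static_cast<std::uint8_t>(x));
    }
    return out;
}
```

With this cast, `-1` is stored as `255` and `-128` as `128`, while non-negative values are unchanged, so distances computed on the stored codes no longer reflect the original vectors.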

# Solution
To solve this problem, we can add a new signed 8-bit quantizer, similar to `QT_8bit_direct`, that adds `128` to each dimension of the vector during encoding to bring it into the uint8 range so that it can be stored in `uint8_t* code`. During decoding, when the components are reconstructed, `128` is subtracted from each dimension to recover the original signed int8 vector before the distance is computed.
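The offset scheme described above could be sketched roughly as follows. The function names `encode_signed_8bit` and `decode_signed_8bit` are hypothetical, and the sketch assumes the components arrive as `float` values that are already integral and within [-128, 127]; it is not the actual quantizer implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical encoder: shift each component by +128 so that
// [-128, 127] maps onto [0, 255] and fits in a uint8 code.
// Assumes x[i] is integral and already within [-128, 127].
inline void encode_signed_8bit(const float* x, std::uint8_t* code, std::size_t d) {
    for (std::size_t i = 0; i < d; i++) {
        code[i] = static_cast<std::uint8_t>(static_cast<int>(x[i]) + 128);
    }
}

// Hypothetical decoder: undo the shift to recover the signed value
// before it is used in a distance computation.
inline void decode_signed_8bit(const std::uint8_t* code, float* x, std::size_t d) {
    for (std::size_t i = 0; i < d; i++) {
        x[i] = static_cast<float>(static_cast<int>(code[i]) - 128);
    }
}
```

Under these assumptions the round trip is lossless: `-128` encodes to `0`, `0` to `128`, and `127` to `255`, and decoding returns the original values. Because the same constant is added to every dimension, the difference between two encoded components equals the difference between the original components.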


Scalar Quantizer to support signed int8 vectors #3488
