Summary
As of today, we have three different 8-bit quantizers, QT_8bit_direct, QT_8bit_uniform and QT_8bit, but all three support only uint8 vectors with values in [0, 255]. If we try to ingest signed 8-bit vectors (values in [-128, 127]) using these quantizers, the values are cast to uint8 during encoding, which changes the sign and magnitude of any value outside the uint8 range. Some customers want to use models such as Cohere Embed, which generate signed int8 embeddings in [-128, 127]. To support such use cases, we need a new signed 8-bit scalar quantizer.
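To illustrate the problem, here is a minimal sketch of what a plain cast does to a signed 8-bit component. The helper name `cast_to_uint8` is hypothetical and not part of the library. It takes an `int8_t` input to keep the conversion well defined, which is an assumption of this example; the existing quantizers operate on their own input types and may behave differently in detail.

```cpp
#include <cstdint>

// Hypothetical helper (not library API): converts a signed 8-bit
// component to uint8_t the way a plain cast would. The conversion is
// modulo 256, so negative inputs wrap around to large positive codes.
inline uint8_t cast_to_uint8(int8_t v) {
    return static_cast<uint8_t>(v);
}

// Non-negative inputs survive unchanged:  127 -> 127,  0 -> 0
// Negative inputs wrap around:             -1 -> 255, -128 -> 128
```

After the cast, -1 is stored as 255 and therefore looks larger than 127, so distances computed on the stored codes no longer reflect the original signed values.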
Solution
To solve this problem, we can add a new signed 8-bit quantizer, similar to QT_8bit_direct, that adds 128 to each dimension of the vector during encoding to bring it into the uint8 range so that it can be stored in the uint8_t* code. Similarly, during decoding, or while reconstructing the components, 128 is subtracted from each dimension to recover the original signed int8 vector before computing the distance.
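The shift described above could look roughly like the following. This is a sketch under stated assumptions, not the proposed implementation: the names `encode_component`, `decode_component`, `encode_vector` and `decode_vector` are hypothetical, and the components are assumed to arrive as floats holding integral values in [-128, 127].

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers, not library API.

// Encode: shift a signed value in [-128, 127] into [0, 255].
// Assumes x holds an integral value inside that range.
inline uint8_t encode_component(float x) {
    return static_cast<uint8_t>(static_cast<int>(x) + 128);
}

// Decode: undo the shift to recover the signed value.
inline float decode_component(uint8_t c) {
    return static_cast<float>(static_cast<int>(c) - 128);
}

// Encode every dimension of a vector into one byte each.
inline std::vector<uint8_t> encode_vector(const std::vector<float>& x) {
    std::vector<uint8_t> code(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        code[i] = encode_component(x[i]);
    }
    return code;
}

// Reconstruct the signed components from the stored bytes.
inline std::vector<float> decode_vector(const std::vector<uint8_t>& code) {
    std::vector<float> x(code.size());
    for (std::size_t i = 0; i < code.size(); ++i) {
        x[i] = decode_component(code[i]);
    }
    return x;
}
```

Because the same constant is added to every dimension, the mapping is one-to-one over the whole signed range, and decoding before the distance computation recovers the original values exactly.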