InnerProduct in NN Descent is calculated by reinterpret_cast<const uint8_t*> the fp16 data pointer.
Then, based on the data_dim of the int8/uint8 data it calculates the distance like below;
// data_n1 and data_n2 are uint8_t*
for (int d = 0; d < data_dim; d++) {
s_distances[i] += __popc(static_cast<uint32_t>(data_n1[d] ^ data_n2[d]) & 0xff);
}
This can be improved by checking for the divisibility of data_dim. If data is divisible by 2, we can do something like this
Then, based on the data_dim of the int8/uint8 data it calculates the distance like below;
// data_n1 and data_n2 are half*. Say data_dim %2 == 0
for (int d = 0; d < data_dim/2; d++) {
s_distances[i] += __popc(static_cast<uint32_t>(data_n1[d] ^ data_n2[d]) & 0xffff);
}
can do the same for when data_dim%4 == 0
InnerProduct in NN Descent is calculated by
reinterpret_cast<const uint8_t*>the fp16 data pointer.Then, based on the
data_dimof the int8/uint8 data it calculates the distance like below;This can be improved by checking for the divisibility of
data_dim. If data is divisible by 2, we can do something like thisThen, based on the
data_dimof the int8/uint8 data it calculates the distance like below;can do the same for when
data_dim%4 == 0