FP16 API for CAGRA and IVF-PQ#264
Conversation
|
We also need to measure the impact of binary size and compile time for adding these new types to the public API. We can't keep increasing without figuring out ways we can consolidate what's there. These two thjngs are the number 1 complaint from users currently. (It's not just this PR. This is also holding up the half precision bfknn and RBC PRs). |
|
Linking #110 |
|
The kernels are already compiled and included in It was a mistake during porting the code from RAFT, that the public API was not defined for fp16, therefore I labelled this PR as a bugfix. |
achirkin
left a comment
There was a problem hiding this comment.
Thanks Tamas for the PR! The changes are straightforward and it looks good to me as-is, yet a few nitpicks below (if the time permits).
tfeher
left a comment
There was a problem hiding this comment.
Thanks Artem for the review, I have addressed the issues.
|
Sorry @tfeher, I understand that there were some bits which were not exposed in the prior port, but this still doesn't change the increase in binary size. We need to address this before we continue to merge changes that increase it. I propose we look at things that can be consolidated and maybe try using JIT for some of these things. We have to take a step back and fix this at the source. |
|
Binary size: 679 -> 661 MB A quick check with bench-ann shows the FP16 benchmarks seem to work. I've checked that CAGRA is a little bit faster on FP16 vs FP32 on the glove dataset. |
…his change breaking the code
|
/merge |
This PR adds public API to CAGRA and IVF-PQ ANN search using FP16 input data.
Note that the fp16 kernels are already compiled in libcuvs. This PR just adds the missing public API declarations into the C++ headers, and restores the instantiations of public API functions.
This PR partially fixes #144 (the IVF-Flat API is not yet added here).