MarcusDunn
left a comment
This looks great! Thanks for the PR.
My only question is: why did you make initialized_logits public?
I was trying to match the original llama.cpp repo as closely as possible. Specifically, batch decode treats pooling type none differently from the other pooling types. I initially started by modifying the embedding example to match the original repo (since reranking in llama.cpp is built into examples/embedding.cpp). It's actually not used in reranking (since pooling is set to rank), but does it make sense to leave it public so we could use it with pooling type none if required?
Sorry for the late reply! I would prefer we keep it private unless it is required for a feature to work.
c72a971 to d789cac
My bad, finally got around to it. Updated now: initialized_logits is no longer public.
The parent llama.cpp repo recently added support for reranking.
This PR
Validated against all examples in llama.cpp and BGE-reranker-v2-m3, and all rerank scores match.
Please review, make edits, and feel free to commit. Thank you for the awesome repo!