8 changes: 7 additions & 1 deletion doc/source/models/model_abilities/embed.rst
@@ -123,4 +123,10 @@ Does Embeddings API provides integration method for LangChain?
-----------------------------------------------------------------------------------

Yes, you can refer to the related sections in LangChain's respective official Xinference documentation.
Here is the link: `Text Embedding Models: Xinference <https://python.langchain.com/docs/integrations/text_embedding/xinference>`_
Here is the link: `Text Embedding Models: Xinference <https://python.langchain.com/docs/integrations/text_embedding/xinference>`_


Does the Embeddings API support hybrid models?
-----------------------------------------------------------------------------------

Yes. Deploy the model with ``flag`` as the engine, then call the Embeddings API with the extra parameter ``return_sparse=True``, which returns sparse vectors.
12 changes: 10 additions & 2 deletions xinference/model/rerank/sentence_transformers/core.py
@@ -81,6 +81,7 @@ def load(self):
if (
self.model_family.type == "normal"
and "qwen3" not in self.model_family.model_name.lower()
and "jina-reranker-v3" not in self.model_family.model_name.lower()
):
try:
import sentence_transformers
@@ -109,7 +110,10 @@ def load(self):
)
if self._use_fp16:
self._model.model.half()
elif "qwen3" in self.model_family.model_name.lower():
elif (
"qwen3" in self.model_family.model_name.lower()
or "jina-reranker-v3" in self.model_family.model_name.lower()
):
# qwen3-reranker
# now we use transformers
# TODO: support engines for rerank models
@@ -225,6 +229,7 @@ def rerank(
if (
self.model_family.type == "normal"
and "qwen3" not in self.model_family.model_name.lower()
and "jina-reranker-v3" not in self.model_family.model_name.lower()
):
logger.debug("Passing processed sentences: %s", sentence_combinations)
similarity_scores = self._model.predict(
@@ -235,7 +240,10 @@ def rerank(
).cpu()
if similarity_scores.dtype == torch.bfloat16:
similarity_scores = similarity_scores.float()
elif "qwen3" in self.model_family.model_name.lower():
elif (
"qwen3" in self.model_family.model_name.lower()
or "jina-reranker-v3" in self.model_family.model_name.lower()
):

def format_instruction(instruction, query, doc):
if instruction is None:
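Review note: the same model-name check now appears four times across `load` and `rerank`, once per branch in each method. A minimal sketch of how it could be factored into one place follows. The helper names `uses_transformers_path` and `uses_cross_encoder_path` and the tuple `_TRANSFORMERS_RERANKERS` are hypothetical and not part of this PR; only the two substrings and the `"normal"` type check are taken from the diff.

```python
# Hypothetical helpers (not in the PR): centralize the name-based dispatch
# that the diff repeats in both load() and rerank().

# Substrings taken from the diff; models whose lowercased name contains one
# of these are routed to the transformers branch.
_TRANSFORMERS_RERANKERS = ("qwen3", "jina-reranker-v3")


def uses_transformers_path(model_name: str) -> bool:
    """Return True if the lowercased model name contains a listed substring."""
    lowered = model_name.lower()
    return any(marker in lowered for marker in _TRANSFORMERS_RERANKERS)


def uses_cross_encoder_path(model_type: str, model_name: str) -> bool:
    """Mirror the diff's first branch: type is "normal" and no marker matches."""
    return model_type == "normal" and not uses_transformers_path(model_name)
```

With helpers like these, adding a further model to the transformers branch would mean editing one tuple rather than four conditions.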