
[Bug]: KeyError on logprobs with MistralTokenizer #17421

@tjohnson31415

Description

Your current environment

I confirmed that it still fails on current main at the time of writing. See below for reproduction instructions:

🐛 Describe the bug

We found an edge case that causes requests to error out when using the MistralTokenizer.

Example serving command using the MistralTokenizer (with a model that uses the Tekken tokenizer):

vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer-mode mistral --config-format mistral --load-format mistral

Simplified reproduction case:

curl -s -X POST \
  -H "Content-Type: application/json" \
  "http://localhost:8000/v1/chat/completions" \
  --data-binary @- << _EOF
  {
   "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
   "logprobs": true,
   "top_logprobs": 2,
   "messages": [
      {
          "role": "user",
          "content":  " "
      }
   ],
   "guided_json": {"properties": {}}
  }
_EOF

The relevant part of the stack trace:

  ...
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 477, in create_chat_completion
    generator = await handler.create_chat_completion(request, raw_request)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 267, in create_chat_completion
    return await self.chat_completion_full_generator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 925, in chat_completion_full_generator
    logprobs = self._create_chat_logprobs(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 1145, in _create_chat_logprobs
    step_token = step_top_logprobs[token_id]
                 ~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 2

From my investigation, this occurs when all of the top logprobs are special tokens. In that case, with the MistralTokenizer, decoded_tokens ends up being an empty list, which produces an empty dict in logprobs that is then accessed via step_top_logprobs. When special tokens are skipped (which is required and the default for the MistralTokenizer), convert_ids_list_to_tokens can return a list shorter than the list of input token IDs.
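
For illustration, here is a minimal, self-contained Python sketch of the length mismatch. The variable names mirror the traceback, but the pairing logic is a simplified stand-in for vLLM's actual code:

# Hypothetical top-logprob token IDs for one decoding step; suppose
# all three happen to be special tokens for this tokenizer.
token_ids = [1, 2, 3]

# With special tokens skipped, a decode helper like
# convert_ids_list_to_tokens can return fewer strings than it was
# given IDs -- here, none at all.
decoded_tokens = []

# zip() silently truncates to the shorter input, so the mapping
# from token ID to decoded string comes out empty.
step_top_logprobs = dict(zip(token_ids, decoded_tokens))
print(step_top_logprobs)  # {}

# Looking up the sampled token's ID then fails, matching the traceback.
sampled_token_id = 2
try:
    step_top_logprobs[sampled_token_id]
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 2

A guard along these lines, e.g. checking len(decoded_tokens) against len(token_ids) before pairing and substituting a placeholder for skipped special tokens, would avoid the KeyError, though where such a fix best belongs in vLLM is a separate question.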

