### Your current environment
I confirmed that it still fails on current main at the time of writing. See below for reproduction instructions:
### 🐛 Describe the bug
We found an edge case that causes requests to error out when requesting logprobs with the MistralTokenizer.

Example serving command using the MistralTokenizer (with a model that uses the Tekken tokenizer):
```shell
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer-mode mistral --config-format mistral --load-format mistral
```
Simplified reproduction case:
```shell
curl -s -X POST \
  -H "Content-Type: application/json" \
  "http://localhost:8000/v1/chat/completions" \
  --data-binary @- << _EOF
{
  "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
  "logprobs": true,
  "top_logprobs": 2,
  "messages": [
    {
      "role": "user",
      "content": " "
    }
  ],
  "guided_json": {"properties": {}}
}
_EOF
```
The relevant part of the stack trace:
```text
...
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 477, in create_chat_completion
    generator = await handler.create_chat_completion(request, raw_request)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 267, in create_chat_completion
    return await self.chat_completion_full_generator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 925, in chat_completion_full_generator
    logprobs = self._create_chat_logprobs(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/my-vllm/lib64/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 1145, in _create_chat_logprobs
    step_token = step_top_logprobs[token_id]
                 ~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 2
```
From my investigation, this occurs when all of the top logprobs are special tokens. In that case, with the MistralTokenizer, `decoded_tokens` ends up being an empty list, resulting in an empty dict in `logprobs` that is then accessed via `step_top_logprobs`. When skipping special tokens (which is required/default for the MistralTokenizer), `convert_ids_list_to_tokens` can return a list shorter than the list of input token ids.
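A minimal sketch of the failure mode (plain Python, not vLLM's actual code; the `dict(zip(...))` pairing only illustrates how a too-short decoded list yields an empty dict):

```python
# Suppose both top-logprob tokens for a step are special tokens.
token_ids = [1, 2]

# MistralTokenizer skips special tokens when decoding, so the
# decoded list comes back shorter than the id list -- here, empty.
decoded_tokens = []

# Pairing the ids with the short decoded list silently drops entries,
# leaving an empty logprobs dict for this step.
step_top_logprobs = dict(zip(token_ids, decoded_tokens))

# The later lookup then fails, matching the stack trace above.
step_token = step_top_logprobs[2]  # KeyError: 2
```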
### Before submitting a new issue...