[Bug]: schema field becomes None in Responses API when stream=True #26288

@WoutDeRijck

Description

Your current environment

vllm: 0.10.2
openai: 1.108.0

🐛 Describe the bug

1. Client Request Arrives

stream = await client.responses.create(
    model=model,
    input=formatted_prompt,
    text={"format": {"name": "schema_ner", "schema": json_schema, "type": "json_schema", "strict": True}},
    stream=True,
)

2. FastAPI/Pydantic Parses the Request

At vllm/entrypoints/openai/api_server.py:516:

async def create_responses(request: ResponsesRequest, raw_request: Request):

The JSON is parsed into a ResponsesRequest object where:

  • text field is type Optional[ResponseTextConfig] (from OpenAI library)
  • Inside that, format is type ResponseFormatTextConfig (a Union type)
  • When type="json_schema", it becomes ResponseFormatTextJSONSchemaConfig

3. Pydantic Field Alias Mapping

The OpenAI library defines the schema field with an alias:

# openai/types/responses/response_format_text_json_schema_config.py
from pydantic import Field as FieldInfo  # the openai package imports pydantic.Field under this name

class ResponseFormatTextJSONSchemaConfig(BaseModel):
    schema_: Dict[str, object] = FieldInfo(alias="schema")  # Note: the field name has a trailing underscore; the alias doesn't

Pydantic's behavior:

  • Sees JSON key: "schema": {...}
  • Because of alias="schema", maps it to the Python field: schema_
  • So request.text.format.schema_ is populated correctly ✅
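This alias mapping can be sketched with a minimal stand-in model (plain Pydantic v2; `JSONSchemaConfig` is a hypothetical name, not the real OpenAI class):

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


class JSONSchemaConfig(BaseModel):
    """Hypothetical stand-in for ResponseFormatTextJSONSchemaConfig."""

    name: str
    type: str = "json_schema"
    # Python field is schema_ (avoids clashing with BaseModel internals);
    # the JSON wire key is the alias "schema".
    schema_: Optional[Dict[str, Any]] = Field(default=None, alias="schema")


# Incoming JSON uses the alias "schema" -> Pydantic populates schema_
cfg = JSONSchemaConfig.model_validate(
    {"name": "schema_ner", "schema": {"type": "object"}, "type": "json_schema"}
)
print(cfg.schema_)  # {'type': 'object'}
```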

4. Sampling Parameters Created Successfully

At vllm/entrypoints/openai/serving_responses.py:354:

sampling_params = request.to_sampling_params(default_max_tokens, self.default_sampling_params)

Inside to_sampling_params() at vllm/entrypoints/openai/protocol.py:400-408:

if self.text is not None and self.text.format is not None:
    response_format = self.text.format
    if (
        response_format.type == "json_schema"
        and response_format.schema_ is not None  # ✅ Works correctly
    ):
        structured_outputs = StructuredOutputsParams(
            json=response_format.schema_
        )

This works fine in both streaming and non-streaming modes.

5. 🐛 THE BUG: Streaming Creates Initial Response

At vllm/entrypoints/openai/serving_responses.py:1822-1830:

initial_response = ResponsesResponse.from_request(
    request,
    sampling_params,
    model_name=model_name,
    created_time=created_time,
    output=[],
    status="in_progress",
    usage=None,
).model_dump()  # ❌ MISSING by_alias=True

6. model_dump() Without by_alias=True Breaks Field Aliases

Before (Python object):

request.text.format = ResponseFormatTextJSONSchemaConfig(
    name="schema_ner",
    schema_={...},  # Python field name with underscore
    type="json_schema",
    strict=True
)

After model_dump() without by_alias=True:

{
    "format": {
        "name": "schema_ner",
        "schema_": {...},  # ❌ Serialized as "schema_" (field name) not "schema" (alias)
        "type": "json_schema", 
        "strict": True
    }
}

The dict now has "schema_" instead of "schema"!
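The key flip is easy to reproduce with the same kind of stand-in model (hypothetical names, plain Pydantic v2 defaults):

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


class JSONSchemaConfig(BaseModel):
    """Hypothetical stand-in for ResponseFormatTextJSONSchemaConfig."""

    name: str
    type: str = "json_schema"
    schema_: Optional[Dict[str, Any]] = Field(default=None, alias="schema")


cfg = JSONSchemaConfig.model_validate(
    {"name": "schema_ner", "schema": {"type": "object"}}
)

# Default model_dump() emits the Python field name, not the alias:
dumped = cfg.model_dump()
print(sorted(dumped))  # ['name', 'schema_', 'type']
```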

7. Dict Used in Streaming Events

At vllm/entrypoints/openai/serving_responses.py:1831-1836:

yield _increment_sequence_number_and_return(
    ResponseCreatedEvent(
        type="response.created",
        sequence_number=-1,
        response=initial_response,  # This dict has wrong key "schema_"
    )
)

8. Event Serialized to SSE Stream

At vllm/entrypoints/openai/api_server.py:500:

event_data = f"event: {event_type}\ndata: {event.model_dump_json(indent=None)}\n\n"

The JSON has the wrong key:

{
  "text": {
    "format": {
      "schema_": {...},
      "name": "schema_ner",
      "type": "json_schema"
    }
  }
}

9. Validation Fails

When the dict is processed/validated, Pydantic expects:

  • JSON key: "schema" → maps to field: schema_

But receives:

  • JSON key: "schema_" → doesn't match the alias!

Pydantic treats the schema field as missing and silently falls back to its default of None, which produces the error.
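Under plain Pydantic v2 defaults (populate_by_name unset), a hypothetical stand-in model shows this silent fallback, with no validation error raised:

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


class JSONSchemaConfig(BaseModel):
    """Hypothetical stand-in for ResponseFormatTextJSONSchemaConfig."""

    name: str
    type: str = "json_schema"
    schema_: Optional[Dict[str, Any]] = Field(default=None, alias="schema")


# The streamed payload carries the wrong key "schema_".
payload = {"name": "schema_ner", "schema_": {"type": "object"}, "type": "json_schema"}

# "schema_" does not match the validation alias "schema", so it is
# treated as an unknown extra key and ignored; the field quietly
# falls back to its default of None.
cfg = JSONSchemaConfig.model_validate(payload)
print(cfg.schema_)  # None
```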

Why Only in Streaming Mode?

Non-streaming mode doesn't call model_dump() on the initial response at this point in the code flow, so the issue doesn't surface.

The Fix

Location 1: vllm/entrypoints/openai/serving_responses.py:1830

).model_dump(by_alias=True)  # Add by_alias=True

Location 2: vllm/entrypoints/openai/serving_responses.py:1879

response=final_response.model_dump(by_alias=True),  # Add by_alias=True

With by_alias=True, the serialization correctly uses field aliases:

{
    "format": {
        "name": "schema_ner",
        "schema": {...},  # ✅ Correct! Uses the alias "schema" not field name "schema_"
        "type": "json_schema",
        "strict": True
    }
}
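With the same stand-in model as above (hypothetical, plain Pydantic v2), the fix round-trips cleanly: the dumped dict validates back into a model with the schema intact.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


class JSONSchemaConfig(BaseModel):
    """Hypothetical stand-in for ResponseFormatTextJSONSchemaConfig."""

    name: str
    type: str = "json_schema"
    schema_: Optional[Dict[str, Any]] = Field(default=None, alias="schema")


cfg = JSONSchemaConfig.model_validate(
    {"name": "schema_ner", "schema": {"type": "object"}}
)

# by_alias=True restores the wire key "schema" ...
dumped = cfg.model_dump(by_alias=True)
print("schema" in dumped)  # True

# ... so re-validating the dumped dict keeps the schema populated.
round_trip = JSONSchemaConfig.model_validate(dumped)
print(round_trip.schema_)  # {'type': 'object'}
```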
