Feature Request: Support OpenAI Responses API (/v1/responses) in llama.cpp server #19138
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Description:
The llama.cpp OpenAI-compatible server currently supports endpoints like /v1/chat/completions, but does not support the newer OpenAI Responses API (/v1/responses).
Many OpenAI client SDKs and tools are moving toward the unified Responses API, which supports structured outputs, tool calls, and multimodal responses in a single endpoint. Lack of /v1/responses support makes it harder to use llama.cpp as a drop-in replacement for OpenAI backends without an additional proxy or translation layer.
Requested feature:
Add native support for the /v1/responses endpoint in llama-server, aligned as closely as possible with OpenAI’s Responses API request/response format.
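For reference, a minimal non-streaming Responses API exchange looks roughly like the sketch below. Field names (`input`, `instructions`, `output`, `output_text`) follow OpenAI's published format, but this is an approximation for illustration, not a normative spec, and the model name is hypothetical:

```python
# Sketch of the request/response shapes llama-server would need to handle
# for POST /v1/responses (non-streaming case).

request = {
    "model": "local-model",                 # hypothetical local model name
    "input": "Say hello in one word.",
    "instructions": "You are a terse assistant.",
}

# A minimal completed response body:
response = {
    "id": "resp_123",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Hello"}
            ],
        }
    ],
}

def output_text(resp: dict) -> str:
    """Concatenate all output_text parts, mirroring the convenience
    accessor the OpenAI SDKs expose for Responses objects."""
    return "".join(
        part["text"]
        for item in resp.get("output", [])
        if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "output_text"
    )

print(output_text(response))
```

The nested `output` array (rather than a flat `choices` list) is the main structural difference from `/v1/chat/completions` that the server would need to emit.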
Benefits:
- Improved compatibility with modern OpenAI SDKs
- Easier migration from OpenAI APIs to llama.cpp
- Reduced need for custom proxy or request rewriting layers
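To make the translation-layer point concrete: until native support lands, users have to run a small proxy that rewrites Responses-style requests into the chat-completions format llama-server already understands. A simplified sketch of that mapping (function name and field handling are illustrative assumptions; it covers only plain string input, with no tools or streaming):

```python
def responses_to_chat_completions(req: dict) -> dict:
    """Map a /v1/responses request body onto an equivalent
    /v1/chat/completions body. Simplified sketch: string `input` only,
    no tool calls, no multimodal parts, no streaming."""
    messages = []
    if req.get("instructions"):
        # `instructions` plays the role of a system message.
        messages.append({"role": "system", "content": req["instructions"]})
    if isinstance(req.get("input"), str):
        messages.append({"role": "user", "content": req["input"]})
    out = {"model": req["model"], "messages": messages}
    # Responses uses `max_output_tokens`; chat/completions uses `max_tokens`.
    if "max_output_tokens" in req:
        out["max_tokens"] = req["max_output_tokens"]
    return out

body = responses_to_chat_completions({
    "model": "local-model",
    "instructions": "Be brief.",
    "input": "Hi",
    "max_output_tokens": 64,
})
print(body)
```

Native `/v1/responses` support would make this kind of shim unnecessary.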
Thanks for the great work on llama.cpp!
Motivation
I want to use ClaudeCode with llama-server offline, but llama-server does not currently support the necessary endpoints.
Possible Implementation
No response