Add MCP server endpoints #1453
**New documentation file** (100 lines added):
# MCP protocol support

`mistralrs-server` can serve **MCP (Model Context Protocol)** traffic next to the regular OpenAI-compatible HTTP interface.
MCP is an open, tool-based protocol that lets clients interact with models through structured *tool calls* instead of free-form HTTP routes.
Under the hood the server uses [`rust-mcp-sdk`](https://crates.io/crates/rust-mcp-sdk) and exposes a single tool called **`chat`** that mirrors the behaviour of the `/v1/chat/completions` endpoint.

---

## 1. Building

Support for MCP is compiled in by default because the workspace enables the `server` and `hyper-server` features of `rust-mcp-sdk`.
When compiling the `mistralrs-server` crate outside the workspace, enable the `mcp-server` Cargo feature manually:

```bash
cargo build -p mistralrs-server --release --features "mcp-server"
```
## 2. Running

Start the normal HTTP server and add the `--mcp-port` flag to spin up an MCP server on a separate port:

```bash
# --port:     OpenAI-compatible HTTP API
# --mcp-port: MCP protocol endpoint (SSE over HTTP)
./target/release/mistralrs-server \
  --port 1234 \
  --mcp-port 4321 \
  plain -m mistralai/Mistral-7B-Instruct-v0.3
```

*`--mcp-port` is independent of `--port`: the HTTP and MCP servers run on separate ports, and you can omit `--port` when you only need MCP.*

The server prints an extra line such as

```
MCP - listening on http://0.0.0.0:4321
```
## 3. Capabilities announced to clients

At start-up the MCP handler advertises the following `InitializeResult` (abridged):

```jsonc
{
  "server_info": { "name": "mistralrs", "version": "<crate-version>" },
  "protocol_version": "2025-03-26", // latest spec version from rust-mcp-sdk
  "instructions": "use tool 'chat'",
  "capabilities": {
    "tools": {}
  }
}
```
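For orientation, here is a minimal sketch of how such an `InitializeResult` could be assembled with the `rust-mcp-sdk` schema types that the handler imports (`Implementation`, `ServerCapabilities`, `ServerCapabilitiesTools`, `LATEST_PROTOCOL_VERSION`). The exact field layout is an assumption based on the `rust-mcp-schema` 2025-03-26 types; the PR's actual construction may differ:

```rust
use rust_mcp_sdk::schema::{
    Implementation, InitializeResult, ServerCapabilities, ServerCapabilitiesTools,
    LATEST_PROTOCOL_VERSION,
};

// Hedged sketch: field names and Default impls are assumed from the
// rust-mcp-schema 2025-03-26 types; check your crate version before relying
// on this.
fn server_details() -> InitializeResult {
    InitializeResult {
        server_info: Implementation {
            name: "mistralrs".to_string(),
            version: env!("CARGO_PKG_VERSION").to_string(),
        },
        protocol_version: LATEST_PROTOCOL_VERSION.to_string(),
        instructions: Some("use tool 'chat'".to_string()),
        capabilities: ServerCapabilities {
            // Advertise only the `tools` capability, matching the JSON above.
            tools: Some(ServerCapabilitiesTools { list_changed: None }),
            ..Default::default()
        },
        meta: None,
    }
}
```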
Only one tool is currently exposed:

| Tool   | Description                                       |
|--------|---------------------------------------------------|
| `chat` | Wraps the OpenAI `/v1/chat/completions` endpoint. |

## 4. Calling the `chat` tool

Clients send a [`CallToolRequest`](https://docs.rs/rust-mcp-schema/latest/rust_mcp_schema/struct.CallToolRequest.html) event where `params.name` is `"chat"` and `params.arguments` contains a standard MCP [`CreateMessageRequest`](https://docs.rs/rust-mcp-schema/latest/rust_mcp_schema/struct.CreateMessageRequest.html).
Example request (sent as SSE `POST /mcp/stream` or via the convenience helpers in `rust-mcp-sdk`):

```jsonc
{
  "kind": "callToolRequest",
  "id": "123",
  "params": {
    "name": "chat",
    "arguments": {
      "model": "mistralai/Mistral-7B-Instruct-v0.3",
      "messages": [
        { "role": "user", "content": "Explain Rust ownership." }
      ]
    }
  }
}
```
The response is a `CallToolResult` event whose `content` array contains a single `TextContent` item with the assistant response.

```jsonc
{
  "kind": "callToolResult",
  "id": "123",
  "content": [
    { "type": "text", "text": "Rust’s ownership system ..." }
  ]
}
```
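As a rough illustration, a client could send the request above from Rust with a plain HTTP POST. The `/mcp/stream` path and port 4321 come from the examples earlier in this document; everything else here (the `reqwest`-based transport, the one-shot read of the body) is an assumption for illustration, not the SDK's official client API:

```rust
// Hypothetical client sketch: POST the CallToolRequest payload shown above
// to the MCP port and print whatever comes back. Real MCP clients should use
// the rust-mcp-sdk helpers, which also handle the SSE framing.
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let payload = json!({
        "kind": "callToolRequest",
        "id": "123",
        "params": {
            "name": "chat",
            "arguments": {
                "model": "mistralai/Mistral-7B-Instruct-v0.3",
                "messages": [
                    { "role": "user", "content": "Explain Rust ownership." }
                ]
            }
        }
    });

    let body = reqwest::Client::new()
        .post("http://0.0.0.0:4321/mcp/stream") // port from the example above
        .json(&payload)
        .send()
        .await?
        .text()
        .await?;
    println!("{body}");
    Ok(())
}
```

On success, the body carries the `callToolResult` event shown above.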
Error cases are mapped to `CallToolError` with `is_error = true`.

## 5. Limitations & future work

- Only synchronous, single-shot requests are supported right now.
- Streaming responses (`partialCallToolResult`) are not yet implemented.
- No authentication layer is provided; run the MCP port behind a reverse proxy if you need auth.

Contributions to extend MCP coverage (streaming, more tools, auth hooks) are welcome!
**New file: `mistralrs-server/src/mcp_server.rs`** (95 lines added; the excerpt below is truncated):
```rust
use async_trait::async_trait;
use rust_mcp_sdk::{
    mcp_server::{hyper_server, HyperServerOptions, ServerHandler},
    schema::{
        schema_utils::CallToolError, CallToolRequest, CallToolResult, CallToolResultContentItem,
        Implementation, InitializeResult, ServerCapabilities, ServerCapabilitiesTools, TextContent,
        LATEST_PROTOCOL_VERSION,
    },
};
use std::io;

use mistralrs_server_core::{
    chat_completion::{create_response_channel, parse_request},
    types::SharedMistralRsState,
};

/// Handler that bridges MCP tool calls to the shared mistral.rs engine state.
pub struct MistralMcpHandler {
    pub state: SharedMistralRsState,
}

#[async_trait]
impl ServerHandler for MistralMcpHandler {
    async fn handle_call_tool_request(
        &self,
        request: CallToolRequest,
        _runtime: &dyn rust_mcp_sdk::McpServer,
    ) -> std::result::Result<CallToolResult, CallToolError> {
        // Only the `chat` tool is exposed; reject anything else.
        if request.params.name != "chat" {
            return Err(CallToolError::unknown_tool(request.params.name));
        }
        let args = request.params.arguments.into();
        let req: rust_mcp_sdk::schema::CreateMessageRequest =
            serde_json::from_value(args).map_err(|e| CallToolError::new(io::Error::other(e)))?;
        // Translate to ChatCompletionRequest via a JSON round-trip (see the
        // review comments below for a suggested direct conversion).
        let chat_req: mistralrs_server_core::openai::ChatCompletionRequest =
            serde_json::from_value(serde_json::to_value(req).unwrap())
                .map_err(CallToolError::new)?;
        // … (remainder of the handler is not shown in this diff excerpt)
```
Suggested change from review (adds a TODO marking the JSON round-trip for replacement with a direct conversion):

```rust
let args = request.params.arguments.into();
let req: rust_mcp_sdk::schema::CreateMessageRequest =
    serde_json::from_value(args).map_err(|e| CallToolError::new(io::Error::other(e)))?;
// TODO: Implement direct conversion from CreateMessageRequest to ChatCompletionRequest
// to avoid the overhead of JSON serialization/deserialization
let chat_req: mistralrs_server_core::openai::ChatCompletionRequest =
    serde_json::from_value(serde_json::to_value(req).unwrap())
        .map_err(CallToolError::new)?;
```
🤖 Prompt for AI Agents

In mistralrs-server/src/mcp_server.rs around lines 32 to 37, the code
inefficiently converts CreateMessageRequest to ChatCompletionRequest by
serializing to JSON and then deserializing. To fix this, implement a direct
conversion, such as a `From` or `TryFrom` impl between these types, or manually
map fields from CreateMessageRequest to ChatCompletionRequest without
intermediate JSON serialization.
🛠️ Refactor suggestion
Review the response handling for completeness and structure preservation.

The current implementation has potential issues:

- It only receives one response, which might miss subsequent streaming responses.
- Joining multiple choice contents with newlines loses the original response structure.

Consider preserving the response structure or documenting the expected behavior.
🏁 Script executed:

```bash
#!/bin/bash
# Check if streaming responses are used elsewhere in the codebase
rg -A 5 "Response::Chunk|is_streaming.*true" --type rust
```

Length of output: 10753
Ensure mcp_server handles streaming responses and preserves choice structure.

The current match in mistralrs-server/src/mcp_server.rs (lines 46–66) only handles `Response::Done` and treats any other variant (including streaming chunks) as an error. Since the core and examples use `Response::Chunk` for streaming, this will break any tool calls configured for streaming:

- Streaming tool responses will hit the `Some(_)` arm and return a "no response" error.
- Joining all choice contents with `"\n"` flattens multiple messages and loses per-choice metadata.

Please update this handler to either:

- Accumulate `Response::Chunk` variants (e.g. buffer incoming `chunk.choices` until a terminal signal) and then emit a single `CallToolResult`, as in the sketch after this list, or
- Forward each chunk as it arrives (e.g. streaming `CallToolResultContentItem::TextContent`), or
- Clearly document that mcp_server does not support streaming tool responses and ensure the engine never emits chunks for this endpoint.
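A minimal, self-contained sketch of the first option. `Response`, `Chunk`, and `Delta` here are simplified stand-ins for the real mistralrs types (which carry per-choice metadata); only the buffering idea is intended to transfer:

```rust
// Stand-in types: simplified versions of the engine's streaming responses.
enum Response {
    Chunk(Chunk),  // incremental streaming output
    Done(String),  // terminal response for non-streaming runs
}

struct Chunk {
    choices: Vec<Delta>,
}

struct Delta {
    content: Option<String>,
}

/// Buffer chunk contents until the stream ends, then return one string that
/// could back a single CallToolResult text item.
fn accumulate(responses: impl IntoIterator<Item = Response>) -> String {
    let mut buffer = String::new();
    for resp in responses {
        match resp {
            Response::Chunk(chunk) => {
                for delta in chunk.choices {
                    if let Some(text) = delta.content {
                        buffer.push_str(&text);
                    }
                }
            }
            // Non-streaming runs deliver everything in the terminal response.
            Response::Done(text) => {
                if buffer.is_empty() {
                    buffer = text;
                }
            }
        }
    }
    buffer
}

fn main() {
    let out = accumulate([
        Response::Chunk(Chunk { choices: vec![Delta { content: Some("Rust ".into()) }] }),
        Response::Chunk(Chunk { choices: vec![Delta { content: Some("ownership...".into()) }] }),
        Response::Done(String::new()),
    ]);
    assert_eq!(out, "Rust ownership...");
}
```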
💡 Verification agent
Consider documenting the purpose of the date-based feature flag.

The feature flag `"2025_03_26"` appears to be date-based, which could make future maintenance challenging. Consider adding a comment explaining what this feature represents, or check whether a more semantic feature name is available.

🌐 Web query:

💡 Result:
The "2025_03_26" feature flag in the
rust-mcp-sdkversion 0.4.2 likely corresponds to a specific version of the Model Context Protocol (MCP) schema. In the Rust MCP ecosystem, feature flags are often used to select different versions of the MCP schema, allowing developers to work with various schema versions as needed.For instance, the
rust-mcp-schemacrate provides multiple schema versions, including "2024_11_05" and "draft". To use a specific schema version, you can enable the corresponding feature in yourCargo.tomlfile. While the "2025_03_26" version isn't explicitly listed in the available versions, it's possible that this feature flag is intended to select that specific schema version.To utilize the "2025_03_26" schema version, you would typically add the following to your
Cargo.toml:This configuration instructs Cargo to compile the
rust-mcp-schemacrate with the "2025_03_26" feature enabled, thereby selecting that specific schema version.It's important to note that the availability of the "2025_03_26" schema version depends on its inclusion in the
rust-mcp-schemacrate. If this version isn't listed among the available features, it may not be supported or released yet. In such cases, you might need to consult the official MCP documentation or the maintainers of therust-mcp-schemacrate for further information.For more details on the
rust-mcp-schemacrate and its available schema versions, you can visit the GitHub repository: (github.com)Citations:
Document the MCP schema version feature flag.

The feature flag `"2025_03_26"` in `rust-mcp-sdk = { version = "0.4.2", … }` selects the Model Context Protocol schema v2025-03-26. To improve maintainability:

- Add a comment explaining which schema version the flag selects.
- Verify that the `rust-mcp-schema` crate exposes a `2025_03_26` feature; if it doesn't, coordinate with its maintainers or choose an available schema version.