Commit 3a9d560 (parent 10c878d)

fix: update responses limitations doc to track latest state

1 file changed: docs/docs/providers/openai_responses_limitations.mdx (90 additions & 64 deletions)
@@ -5,41 +5,12 @@ sidebar_label: Limitations of Responses API
 sidebar_position: 1
 ---
 
-## Unresolved Issues
+## Issues
 
 This document outlines known limitations and inconsistencies between Llama Stack's Responses API and OpenAI's Responses API. The comparison reflects the state of OpenAI's API as of October 6, 2025 (OpenAI client version `openai==1.107`).
 
 See the OpenAI [changelog](https://platform.openai.com/docs/changelog) for details of any new functionality that has been added since that date. Links to issues are included so readers can check status, post comments, and/or subscribe for updates on any limitations of specific interest to them. We would also love feedback on any use cases you try that do not work, to help prioritize the pieces left to implement.
 
 Please open new issues in the [meta-llama/llama-stack](https://github.com/meta-llama/llama-stack) GitHub repository with details of anything that does not work and does not already have an open issue.
 
-### Instructions
-
-**Status:** Partial Implementation + Work in Progress
-
-**Issue:** [#3566](https://github.com/llamastack/llama-stack/issues/3566)
-
-In Llama Stack, the instructions parameter is already implemented for creating a response, but it is not yet included in the output response object.
-
----
-
-### Streaming
-
-**Status:** Partial Implementation
-
-**Issue:** [#2364](https://github.com/llamastack/llama-stack/issues/2364)
-
-Streaming functionality for the Responses API is partially implemented and does work to some extent, but some streaming response objects that would be needed for full compatibility are still missing.
-
---
-
-### Prompt Templates
-
-**Status:** Partial Implementation
-
-**Issue:** [#3321](https://github.com/llamastack/llama-stack/issues/3321)
-
-OpenAI's platform supports [templated prompts using a structured language](https://platform.openai.com/docs/guides/text?api-mode=responses#reusable-prompts). These templates can be stored server-side for organizational sharing. This feature is under development for Llama Stack.
-
----
-
 ### Web-search tool compatibility
 
 **Status:** Partial Implementation
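To make the web-search compatibility question above concrete, here is a minimal sketch of a Responses request that enables the built-in web-search tool. The model name and query are illustrative, and the tool `type` string follows OpenAI's documentation (some API versions use `web_search_preview`); the example only builds and inspects the payload rather than calling a server.

```python
# Sketch of a Responses API request enabling the built-in web-search tool.
# The payload shape follows OpenAI's Responses API; whether a given Llama Stack
# deployment accepts it is the compatibility question tracked above.
payload = {
    "model": "gpt-4o-mini",  # illustrative model name
    "input": "What changed in the latest Llama Stack release?",
    "tools": [{"type": "web_search"}],
}

# A client would POST this to /v1/responses; here we only inspect the shape.
tool_types = [tool["type"] for tool in payload["tools"]]
print(tool_types)  # ['web_search']
```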
@@ -111,19 +82,9 @@ In OpenAI's API, the `tool_choice` parameter allows you to set restrictions or r
 
 **Status:** Not Implemented
 
-OpenAI's platform allows users to track agentic users using a safety identifier passed with each response. When requests violate moderation or safety rules, account holders are alerted and automated actions can be taken. This capability is not currently available in Llama Stack.
-
----
-
-### Connectors
-
-**Status:** Not Implemented
-
-Connectors are MCP servers maintained and managed by the Responses API provider. OpenAI has documented their connectors at [https://platform.openai.com/docs/guides/tools-connectors-mcp](https://platform.openai.com/docs/guides/tools-connectors-mcp).
+**Issue:** [#4381](https://github.com/llamastack/llama-stack/issues/4381)
 
-**Open Questions:**
-- Should Llama Stack include built-in support for some, all, or none of OpenAI's connectors?
-- Should there be a mechanism for administrators to add custom connectors via `config.yaml` or an API?
+OpenAI's platform allows account holders to track agentic end users via a safety identifier passed with each response. When requests violate moderation or safety rules, account holders are alerted and automated actions can be taken. This capability is not currently available in Llama Stack.
 
 ---
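The safety-identifier capability discussed in this hunk amounts to one extra request field. A hedged sketch of the payload shape follows: the `safety_identifier` parameter name is taken from OpenAI's documentation, the model name is illustrative, and hashing the raw user id is just one way to avoid sending PII.

```python
import hashlib

def build_request(user_id: str, prompt: str) -> dict:
    """Build a Responses payload carrying a stable, non-PII safety identifier."""
    safety_id = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:32]
    return {
        "model": "gpt-4o-mini",          # illustrative
        "input": prompt,
        "safety_identifier": safety_id,  # parameter name per OpenAI's API docs
    }

req = build_request("user-1234", "Hello")
print(len(req["safety_identifier"]))  # 32
```

The same end user always maps to the same identifier, so abuse reports can be tied back to one account without exposing the raw id.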

@@ -156,16 +117,6 @@ It enables users to also get logprobs for alternative tokens.
 
 ---
 
-### Max Tool Calls
-
-**Status:** Not Implemented
-
-**Issue:** [#3563](https://github.com/llamastack/llama-stack/issues/3563)
-
-The Responses API can accept a `max_tool_calls` parameter that limits the number of tool calls allowed to be executed for a given response. This feature needs full implementation and documentation.
-
----
-
 ### Max Output Tokens
 
 **Status:** Not Implemented
@@ -186,16 +137,6 @@ The return object from a call to Responses includes a field for indicating why a
 
 ---
 
-### Metadata
-
-**Status:** Not Implemented
-
-**Issue:** [#3564](https://github.com/llamastack/llama-stack/issues/3564)
-
-Metadata allows you to attach additional information to a response for your own reference and tracking. It is not implemented in Llama Stack.
-
----
-
 ### Background
 
 **Status:** Not Implemented
@@ -249,6 +190,8 @@ Sampling allows MCP tools to query the generative AI model. See the [MCP specifi
 - If not, is there a reasonable way to make that work within the API as is? Or would the API need to change?
 - Does this work in Llama Stack?
 
+---
+
 ### Prompt Caching
 
 **Status:** Unknown
@@ -262,14 +205,66 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp
 
 ---
 
+## Coming Soon
+
+---
+
 ### Parallel Tool Calls
 
-**Status:** Rumored Issue
+**Status:** In Progress
+
+Align Llama Stack's Responses API `parallel_tool_calls` behavior with OpenAI's and harden the implementation with tests.
 
-There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed.
+---
+
+### Connectors
+
+**Status:** In Progress
+
+**Issue:** [#4061](https://github.com/llamastack/llama-stack/issues/4061)
+
+Connectors are MCP servers maintained and managed by the Responses API provider. OpenAI has documented their connectors at [https://platform.openai.com/docs/guides/tools-connectors-mcp](https://platform.openai.com/docs/guides/tools-connectors-mcp).
+
+**Open Questions:**
+- Should Llama Stack include built-in support for some, all, or none of OpenAI's connectors?
+- Should there be a mechanism for administrators to add custom connectors via `config.yaml` or an API?
+
+---
+
+### Server-Side Telemetry
+
+**Status:** Merged [Planned 0.4.z]
+
+**Issue:** [#3806](https://github.com/llamastack/llama-stack/issues/3806)
+
+Support OpenTelemetry as the preferred way to instrument Llama Stack.
+
+**Remaining Issues:**
+- Some data needs to be converted to follow the OpenTelemetry semantic conventions for generative AI data.
+
+---
+
+### Max Tool Calls
+
+**Status:** Merged [Planned 0.4.z]
+
+**Issue:** [#3563](https://github.com/llamastack/llama-stack/issues/3563)
+
+The Responses API can accept a `max_tool_calls` parameter that limits the number of tool calls allowed to be executed for a given response.
+
+---
+
+### Metadata
+
+**Status:** Merged [Planned 0.4.z]
+
+**Issue:** [#3564](https://github.com/llamastack/llama-stack/issues/3564)
+
+Metadata allows you to attach additional information to a response for your own reference and tracking.
 
 ---
 
+
 ## Resolved Issues
 
 The following limitations have been addressed in recent releases:
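Several of the "Coming Soon" items in the hunk above (`parallel_tool_calls`, `max_tool_calls`, `metadata`) are plain request parameters once merged. A hedged sketch of how they combine in a single Responses payload; the model and values are illustrative, and the 16-pair metadata cap is OpenAI's documented limit, not something verified against Llama Stack.

```python
# Sketch: one Responses payload exercising the three "Coming Soon" parameters.
payload = {
    "model": "gpt-4o-mini",             # illustrative
    "input": "Look up the weather in two cities.",
    "parallel_tool_calls": True,        # let the model emit tool calls in parallel
    "max_tool_calls": 3,                # cap tool executions for this response
    "metadata": {"run_id": "abc-123"},  # free-form key/value tracking data
}

# OpenAI documents a limit of 16 metadata key/value pairs; guard before sending.
assert len(payload["metadata"]) <= 16
print(payload["max_tool_calls"])  # 3
```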
@@ -297,3 +292,34 @@ The `require_approval` parameter for MCP tools in the Responses API now works co
 **Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API)
 
 MCP tools now correctly handle array-type arguments in both the Agent API and Responses API.
+
+---
+
+### Streaming
+
+**Status:** ✅ Resolved
+
+**Issue:** [#2364](https://github.com/llamastack/llama-stack/issues/2364)
+
+Streaming functionality for the Responses API is feature-complete and released.
+
+---
+
+### Prompt Templates
+
+**Status:** ✅ Resolved
+
+**Issue:** [#3321](https://github.com/llamastack/llama-stack/issues/3321)
+
+OpenAI's platform supports [templated prompts using a structured language](https://platform.openai.com/docs/guides/text?api-mode=responses#reusable-prompts). These templates can be stored server-side for organizational sharing.
+
+---
+
+### Instructions
+
+**Status:** ✅ Resolved
+
+**Issue:** [#3566](https://github.com/llamastack/llama-stack/issues/3566)
+
+The Responses API request and response objects now support the *instructions* field.
+
+---
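Since Streaming and Instructions are now marked resolved, a short sketch of consuming a Responses stream may help readers verify their own deployments. The event type names follow OpenAI's streaming protocol; to stay self-contained the example drives the loop with stub events instead of a live server.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Minimal stand-in for a Responses streaming event."""
    type: str
    delta: str = ""

def collect_text(events) -> str:
    """Accumulate assistant text from a stream of Responses events."""
    chunks = []
    for ev in events:
        if ev.type == "response.output_text.delta":
            chunks.append(ev.delta)
        elif ev.type == "response.completed":
            break
    return "".join(chunks)

stream = [
    Event("response.output_text.delta", "Hel"),
    Event("response.output_text.delta", "lo"),
    Event("response.completed"),
]
print(collect_text(stream))  # Hello
```

Against a live server, the same loop would iterate the generator returned by `client.responses.create(..., stream=True)`.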
