feat: Reuse Codex CLI OAuth tokens for ChatGPT backend LLM calls #693

ilblackdragon merged 30 commits into nearai:staging from
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances IronClaw's authentication flexibility by allowing it to utilize existing OpenAI Codex CLI OAuth tokens. This means users with an active ChatGPT subscription can integrate IronClaw without additional API key setup, as IronClaw can interact directly with ChatGPT's private backend API. The change introduces new modules for credential management and a dedicated LLM provider for the ChatGPT Responses API, making the system more convenient for a broader user base.
Code Review
This pull request introduces a great new feature to reuse Codex CLI's OAuth tokens, which simplifies setup for many users. The implementation is well-structured, with new modules for authentication and the custom LLM provider. I've found a couple of areas for improvement. There's a high-severity bug in the credential loading logic that could lead to using the wrong LLM provider, and a medium-severity performance issue related to reqwest::Client instantiation. My detailed comments include code suggestions to address these points. Overall, this is a solid contribution.
```rust
if !is_chatgpt {
    if let Some(key) = auth.openai_api_key.filter(|k| !k.is_empty()) {
        tracing::info!("Loaded API key from Codex auth.json (API key mode)");
        return Some(CodexCredentials {
            token: key,
            is_chatgpt_mode: false,
        });
    }
}
```
The current logic has a bug where if auth_mode is "apiKey" but the key is missing, it incorrectly falls through to check for a ChatGPT token. This can lead to returning credentials with is_chatgpt_mode: true, causing the wrong LLM provider to be used.
To fix this, we should explicitly prevent the fallback when auth_mode is set. If auth_mode is "apiKey" and no key is found, the function should return None.
```rust
if !is_chatgpt {
    if let Some(key) = auth.openai_api_key.filter(|k| !k.is_empty()) {
        tracing::info!("Loaded API key from Codex auth.json (API key mode)");
        return Some(CodexCredentials {
            token: key,
            is_chatgpt_mode: false,
        });
    }
    // If auth_mode was explicitly `apiKey`, do not fall back to checking for a token.
    if auth.auth_mode.is_some() {
        return None;
    }
}
```
src/llm/codex_chatgpt.rs (Outdated)
```rust
async fn fetch_default_model(base_url: &str, api_key: &str) -> Option<String> {
    let url = format!("{base_url}/models?client_version=0.1.0");
    let client = Client::new();
```
Creating a new reqwest::Client for a single request is inefficient. The client manages a connection pool and is designed to be reused. This function is called from with_auto_model, which then calls Self::new, creating another client. This results in two clients being created to instantiate one provider.
To improve this, the client should be created once in with_auto_model, passed as an argument to this function, and then moved into the CodexChatGptProvider instance. This will require modifying with_auto_model as well.
```diff
-async fn fetch_default_model(base_url: &str, api_key: &str) -> Option<String> {
+async fn fetch_default_model(client: &Client, base_url: &str, api_key: &str) -> Option<String> {
     let url = format!("{base_url}/models?client_version=0.1.0");
-    let client = Client::new();
```
References

- While the general rule advises against reqwest::Client reuse for one-time or infrequent operations (where the benefit is negligible and the churn high), this specific scenario involves creating two clients for a single provider instantiation, indicating a clear inefficiency where the benefit of reuse is not negligible.
When LLM_USE_CODEX_AUTH=true, IronClaw reads the Codex CLI's auth.json (default ~/.codex/auth.json) and extracts the API key or OAuth access token. This lets IronClaw piggyback on a Codex login without implementing its own OAuth flow.

New env vars:
- LLM_USE_CODEX_AUTH: enable Codex auth fallback (default: false)
- CODEX_AUTH_PATH: override path to auth.json
Switch base_url to chatgpt.com/backend-api/codex when auth.json contains ChatGPT OAuth tokens. The access_token is a JWT that only works against the private ChatGPT backend, not the public OpenAI API. Refactored codex_auth.rs to return CodexCredentials (token + is_chatgpt_mode) instead of just a string key.
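The mode switch above can be sketched as a small helper. This is an illustrative simplification, not the actual IronClaw code: the real CodexCredentials carries more fields, and the non-ChatGPT URL shown here is the standard public OpenAI endpoint, assumed for the apiKey case.

```rust
// Illustrative sketch: credential mode decides which backend the token works against.
struct CodexCredentials {
    token: String,
    is_chatgpt_mode: bool,
}

// ChatGPT OAuth JWTs only work against the private backend, so the base
// URL must follow the credential mode.
fn base_url_for(creds: &CodexCredentials) -> &'static str {
    if creds.is_chatgpt_mode {
        "https://chatgpt.com/backend-api/codex"
    } else {
        "https://api.openai.com/v1" // assumed public-API default
    }
}
```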
When LLM_USE_CODEX_AUTH=true, Codex credentials are now loaded before checking env vars or the secrets store overlay. Previously the secrets store key (injected during onboarding) would shadow the Codex token.
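The lookup order described above can be sketched as follows. Function and parameter names are illustrative, not the actual IronClaw config API.

```rust
// Minimal sketch of credential priority: Codex auth.json wins when the
// fallback is enabled, then the env var, then the secrets-store overlay.
fn pick_api_key(
    use_codex_auth: bool,
    codex_token: Option<String>,
    env_key: Option<String>,
    secrets_store_key: Option<String>,
) -> Option<String> {
    if use_codex_auth {
        if let Some(token) = codex_token {
            return Some(token);
        }
    }
    // Previously the secrets-store key could shadow the Codex token;
    // with this ordering it only applies when Codex auth yields nothing.
    env_key.or(secrets_store_key)
}
```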
- New CodexChatGptProvider speaks the Responses API protocol
- Auto-detects model from /models endpoint (gpt-4o -> gpt-5.2-codex)
- Adds store=false (required by ChatGPT backend)
- Error handling with timeout for HTTP 400 responses
- Message format translation: Chat Completions -> Responses API
- SSE response parsing for text, tool calls, and usage stats
- 7 unit tests for message conversion and SSE parsing
The Responses API sends function_call_arguments.delta events with item_id (e.g. fc_...) not call_id (e.g. call_...). The parser now keys pending tool calls by item_id from output_item.added and tracks call_id separately for result matching.
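The keying scheme described above can be sketched with plain structs. Type and method names here are hypothetical, chosen for illustration; only the event names (output_item.added, function_call_arguments.delta) come from the commit message.

```rust
use std::collections::HashMap;

// Hypothetical parser state: pending tool calls are keyed by the
// Responses API `item_id` (fc_...), while the `call_id` (call_...) is
// kept alongside for matching tool results later.
#[derive(Default)]
struct PendingCall {
    call_id: String,
    name: String,
    arguments: String,
}

#[derive(Default)]
struct SseState {
    pending: HashMap<String, PendingCall>, // item_id -> pending call
}

impl SseState {
    // output_item.added: register the call under its item_id.
    fn on_item_added(&mut self, item_id: &str, call_id: &str, name: &str) {
        self.pending.insert(
            item_id.to_string(),
            PendingCall {
                call_id: call_id.to_string(),
                name: name.to_string(),
                arguments: String::new(),
            },
        );
    }

    // function_call_arguments.delta: append by item_id, not call_id.
    fn on_arguments_delta(&mut self, item_id: &str, delta: &str) {
        if let Some(call) = self.pending.get_mut(item_id) {
            call.arguments.push_str(delta);
        }
    }
}
```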
gpt-5.2-codex fills optional tool parameters with empty strings (e.g. timestamp: ""), which IronClaw's tool validation rejects. Strip them before passing to tool execution.
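The stripping step can be sketched as below. For illustration the arguments are assumed flattened into a string map; the real code presumably operates on JSON values, and the function name mirrors the strip_empty_string_values helper the PR mentions.

```rust
use std::collections::BTreeMap;

// Drop entries like `timestamp: ""` that the model fabricates for
// optional parameters and that strict tool validation would reject.
fn strip_empty_string_values(args: &mut BTreeMap<String, String>) {
    args.retain(|_, v| !v.is_empty());
}
```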
When auth_mode is explicitly 'apiKey' but the key is missing/empty, do not fall through to check for a ChatGPT access_token. This prevents returning credentials with is_chatgpt_mode: true and routing to the wrong LLM provider.
… calls

Create Client once in with_auto_model, pass &Client to fetch_default_model, and move it into the provider struct. Eliminates the redundant Client::new() that wasted a connection pool.
The /models endpoint gates newer models behind client_version. Version 0.1.0 only returns up to gpt-5.2-codex, while 1.0.0+ also returns gpt-5.3-codex and gpt-5.4.
Fetch the full model list from /models endpoint. If LLM_MODEL is set, validate it against the supported list and warn with available models if not found. If LLM_MODEL is not set, auto-detect the highest-priority model. Also bumps client_version to 1.0.0 to unlock gpt-5.3/5.4.
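The selection rule can be sketched as follows. Names are illustrative; one plausible reading of "warn with available models" is that the user's choice is still honored after the warning, since the commit says the configured model takes priority.

```rust
// Sketch: configured LLM_MODEL wins; if it's not in the /models list we
// warn but keep it. With no LLM_MODEL, auto-detect the highest-priority
// entry (assumed here to be first in the list).
fn choose_model(configured: Option<&str>, available: &[&str]) -> Option<String> {
    match configured {
        Some(m) => {
            if !available.contains(&m) {
                eprintln!("LLM_MODEL {m} not in supported list: {available:?}");
            }
            Some(m.to_string())
        }
        None => available.first().map(|s| s.to_string()),
    }
}
```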
zmanian left a comment
Architecturally sound -- correctly identifies that Codex ChatGPT tokens need a dedicated provider (different wire format, /responses SSE endpoint vs /chat/completions). Good test coverage (14 tests). No new dependencies.
Issues to address:
1. No timeout on fetch_available_models (must fix). The model discovery request has no timeout. If chatgpt.com is slow or hangs, startup blocks indefinitely. Add a timeout (e.g., 10 seconds).
2. Undocumented private API (must document). The ChatGPT backend endpoint (chatgpt.com/backend-api/codex) is a private, undocumented API. Using subscriber OAuth tokens from a third-party app may violate the token's intended scope or OpenAI's ToS. Add a prominent warning (both in docs and as a log message on first use).
3. No token refresh handling (should fix). OAuth access tokens expire but the PR only reads access_token, not refresh_token. Either implement refresh or document that users need to re-run codex --login when the token expires.
4. block_in_place for async model detection (should fix). Consider making create_registry_provider async or doing model detection lazily on first call.
5. Multimodal content silently dropped (should fix). message_to_input_items only reads msg.content (text). If content_parts contains images, they're silently dropped. At minimum log a warning.
6. Module placement (minor). codex_auth.rs at crate root -- should be under src/llm/ since it's only used by LLM config/provider code.
Prevents startup from blocking indefinitely if chatgpt.com is slow or unreachable. Uses reqwest per-request timeout.
The chatgpt.com/backend-api/codex endpoint is private and undocumented. Add warning in module docs and a runtime log on first use to inform users of potential ToS implications.
On HTTP 401, if a refresh_token is available, the provider now automatically refreshes the access token via auth.openai.com/oauth/token (same protocol as Codex CLI) and retries the request once. Refreshed tokens are persisted back to auth.json.

Changes:
- codex_auth: read refresh_token, add refresh_access_token() and persist_refreshed_tokens()
- codex_chatgpt: RwLock for api_key, 401 detection + retry in send_request, send_http_request helper
- config/llm: thread refresh_token/auth_path through RegistryProviderConfig
- llm/mod: pass refresh params to with_auto_model
Model is no longer resolved during provider construction. Instead, resolve_model() uses tokio::sync::OnceCell to lazily fetch from /models on the first LLM call. This eliminates the block_in_place + block_on workaround in create_codex_chatgpt_from_registry.

- with_auto_model (async) -> with_lazy_model (sync constructor)
- resolve_model() added with OnceCell-based lazy init
- build_request_body takes model as parameter
- model_name() returns resolved or configured_model as fallback
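The lazy-resolution shape can be sketched with the standard library's OnceLock standing in for tokio::sync::OnceCell (the PR's version is async). Struct and method names are simplified for illustration.

```rust
use std::sync::OnceLock;

// Sketch: the expensive model lookup runs at most once, on first use,
// not during provider construction.
struct Provider {
    configured_model: Option<String>,
    resolved: OnceLock<String>,
}

impl Provider {
    fn resolve_model(&self, fetch_default: impl FnOnce() -> String) -> &str {
        self.resolved.get_or_init(|| {
            // A user-configured model short-circuits the fetch entirely.
            self.configured_model.clone().unwrap_or_else(fetch_default)
        })
    }
}
```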
message_to_input_items now checks content_parts for user messages. ContentPart::Text maps to input_text and ContentPart::ImageUrl maps to input_image, matching the Responses API format used by Codex CLI. Falls back to plain text when content_parts is empty. Also updates client_version to 0.111.0 for /models endpoint. Adds test: test_message_conversion_user_with_image
codex_auth is only used by the LLM layer (codex_chatgpt provider and config/llm). Moving it under src/llm/ reflects its actual scope.

- Remove pub mod codex_auth from lib.rs
- Add pub mod codex_auth to llm/mod.rs
- Update imports: super::codex_auth, crate::llm::codex_auth
Thanks for the thorough review! All 6 issues have been addressed in the latest push:

1. No timeout on fetch_available_models (fixed)
2. Undocumented private API (documented)
3. No token refresh handling (fixed — passive refresh). Commit:
4. block_in_place for async model detection (fixed)
5. Multimodal content silently dropped (fixed — full support)
6. Module placement (fixed)

All 16 tests pass (7 codex_auth + 9 codex_chatgpt).
zmanian left a comment
Re-review of PR #693 (17 commits, 1232 additions)
The 6 issues from the initial review have all been addressed. The rework is solid: lazy model detection via OnceCell, 401-based token refresh with auth.json persistence, multimodal content support, ToS warning, module moved to src/llm/. Good work.
However, this re-review found several new issues that need fixing before merge.
Issues to fix
1. refresh_token stored as plain String, not SecretString (must fix)
In RegistryProviderConfig, refresh_token is Option<String>. In CodexCredentials, both token and refresh_token are plain String. The access token gets wrapped in SecretString at the config boundary, but the refresh token never does. This means:
- The refresh token will appear in debug logs if RegistryProviderConfig is ever debug-printed
- It persists in memory as a cleartext String with no zeroize-on-drop
Both refresh_token in RegistryProviderConfig and in CodexCredentials should be SecretString (or at minimum Option<SecretString>). The CodexChatGptProvider::refresh_token field should follow suit.
2. No timeout on send_http_request for LLM responses (must fix)
send_http_request() (line ~879 of codex_chatgpt.rs) posts to /responses with no timeout. The fetch_available_models correctly has a 10-second timeout, but the actual LLM request -- which reads an SSE stream that could hang indefinitely -- has none. The reqwest::Client is constructed with Client::new() which has no default timeout.
Add a configurable timeout (respect the project's LLM_REQUEST_TIMEOUT_SECS setting, defaulting to 120s) on the HTTP client or on send_http_request.
3. Race condition in token refresh (should fix)
In send_request(), on a 401:
1. Read lock to get api_key
2. Send request, get 401
3. Call codex_auth::refresh_access_token(rt, ...)
4. Write lock to update api_key
5. Retry
If two concurrent requests both get a 401 simultaneously, both will attempt to refresh the token in parallel. The second refresh may fail (refresh tokens are often single-use) or produce a different token, creating a race. The NEAR AI provider handles this by checking if the token has already been refreshed (compare the token that failed with the current stored token) before attempting refresh. Consider adding similar guard logic.
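The guard described above can be sketched with a std Mutex standing in for the async primitives. Names are illustrative: before refreshing, compare the token that just got a 401 with the currently stored one; if they differ, another request already refreshed it.

```rust
use std::sync::Mutex;

struct TokenStore {
    token: Mutex<String>,
}

impl TokenStore {
    // Refresh only if the failed token is still the stored one; otherwise
    // reuse the token a concurrent request already obtained.
    fn refresh_if_stale(&self, failed_token: &str, refresh: impl FnOnce() -> String) -> String {
        let mut tok = self.token.lock().expect("lock poisoned");
        if *tok == failed_token {
            *tok = refresh();
        }
        tok.clone()
    }
}
```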
4. reqwest::Client not reused (minor, Gemini also flagged this)
Client::new() is called in the constructor and again in refresh_access_token(). The refresh function creates a throwaway client for each refresh. Pass the provider's self.client to the refresh function, or make refresh_access_token accept a &Client parameter.
5. auth.json persistence is not atomic (should fix)
persist_refreshed_tokens() does read_to_string -> modify -> write. If the process crashes mid-write, auth.json is corrupted. Use write-to-temp-then-rename (atomic write pattern). The tempfile crate is already a dependency. Example: write to a sibling temp file, then std::fs::rename().
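The write-to-temp-then-rename pattern can be sketched with the standard library alone (the review also suggests the tempfile crate, which additionally randomizes the temp name). The function name here is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

// The new contents become visible only via an atomic rename, so a crash
// mid-write can at worst leave a stale temp file, never a corrupted
// auth.json.
fn write_atomically(path: &Path, contents: &str) -> io::Result<()> {
    // Sibling temp file so the rename stays on one filesystem.
    let tmp = path.with_extension("json.tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)
}
```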
6. SSE parsing reads entire response body into memory (minor)
resp.text().await buffers the entire SSE stream. For normal responses this is fine, but a misbehaving server could send unbounded data. Consider adding a body size limit via resp.bytes() with a cap, or using streaming SSE parsing. Low priority since this is behind an opt-in flag.
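The body-size cap can be sketched with Read::take, using a Cursor to stand in for the response body. The limit constant is an illustrative assumption, not a value from the PR.

```rust
use std::io::Read;

const MAX_SSE_BODY_BYTES: u64 = 8 * 1024 * 1024; // illustrative cap

// Stop reading after a fixed byte budget so a misbehaving server cannot
// make us buffer unbounded data.
fn read_body_capped(body: impl Read) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    body.take(MAX_SSE_BODY_BYTES).read_to_end(&mut buf)?;
    Ok(buf)
}
```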
7. No CLAUDE.md / module spec update (must fix per project rules)
Per the project's LLM module docs (src/llm/CLAUDE.md), new providers must be documented there. The file map table, provider selection table, and provider chain sections need updating to mention codex_chatgpt. Also update .env.example with LLM_USE_CODEX_AUTH documentation.
Overlap with PR #744 (OpenAI Codex provider)
PR #744 adds a first-class LLM_BACKEND=openai_codex with its own OAuth device-code login flow, session persistence, and token-refreshing decorator. Both PRs target the same endpoint (chatgpt.com/backend-api/codex) and implement similar SSE parsing for the Responses API.
Key differences:
- This PR (#693) piggybacks on Codex CLI's auth.json -- zero IronClaw-side OAuth flow
- PR #744 implements a full device-code OAuth flow within IronClaw itself
These are complementary but will conflict heavily on merge. The SSE parsing, message conversion, and model detection logic are near-identical. Whoever merges second will need to deduplicate. Recommend coordinating: either merge one first and rebase the other, or extract the shared Responses API client into a common module that both can use.
What looks good
- All 6 original review items addressed correctly
- 16 tests (7 codex_auth + 9 codex_chatgpt) with good edge case coverage
- No .unwrap()/.expect() in production code
- Lazy model detection via OnceCell avoids blocking during provider setup
- store: false correctly set (ChatGPT backend requirement)
- strip_empty_string_values is a pragmatic fix for a gpt-5.2-codex quirk
- Gemini's apiKey-mode fallback bug was caught and fixed with a test
- Module correctly placed in src/llm/
- Feature is opt-in (LLM_USE_CODEX_AUTH=true), no breaking changes
Verdict: REQUEST_CHANGES
Items 1 (SecretString for refresh_token), 2 (LLM request timeout), and 7 (CLAUDE.md update) are required. Items 3 (race condition) and 5 (atomic write) are strongly recommended. Items 4 and 6 are nice-to-haves.
zmanian left a comment
All 6 previous review items addressed (timeout on model fetch, ToS warning, 401-based refresh, lazy resolution, multimodal content, module location). However, new issues found:
- refresh_token stored as plain String -- should be SecretString for zeroize-on-drop. The project uses secrecy::SecretString extensively for sensitive values.
- No timeout on LLM HTTP requests -- send_http_request() has no timeout. Model fetch has one, but the actual LLM call doesn't.
- No CLAUDE.md update -- project rules require updating src/llm/CLAUDE.md and .env.example for new providers.
Strongly recommended:
- Race condition in concurrent token refresh (two 401s could both refresh, second may fail with single-use refresh tokens)
- Non-atomic auth.json write (crash mid-write corrupts the file; use write-to-temp-then-rename)
Note: Overlaps significantly with PR #744 -- both target the same endpoint with near-identical SSE parsing. Should coordinate to extract shared Responses API client.
Thanks for the detailed re-review. The issues you called out have now been addressed on the latest revision. Fixed:
Validation run locally:
The latest branch head includes the follow-up fixes for client reuse and streaming SSE parsing.
zmanian left a comment
Re-review #3: All previous feedback addressed
This is a thorough rework. All 13 issues from the two prior reviews have been addressed:
First review (6/6 fixed)
- Timeout on model fetch -- 10s timeout added
- Private API warning -- doc comment + runtime tracing::warn! on construction
- Token refresh -- full 401-based refresh with refresh_access_token(), persists to auth.json
- block_in_place -- replaced with OnceCell lazy model resolution
- Multimodal content -- content_parts mapped to input_text / input_image
- Module placement -- moved to src/llm/
Second review (7/7 fixed)
- SecretString for refresh_token -- SecretString used throughout: CodexCredentials, RegistryProviderConfig, CodexChatGptProvider, CodexTokens, RefreshResponse
- LLM request timeout -- send_http_request takes a configurable Duration, respects LLM_REQUEST_TIMEOUT_SECS
- Race condition in token refresh -- refresh_lock: Mutex<()> + token comparison guard (same pattern as the NEAR AI provider)
- Client reuse for refresh -- refresh_access_token now takes a &reqwest::Client parameter, reuses the provider's client
- Atomic auth.json write -- write-to-temp-then-rename pattern with cleanup on rename failure
- SSE streaming -- now uses the eventsource-stream crate for incremental parsing with a per-event idle timeout (no full-body buffering)
- CLAUDE.md update -- src/llm/CLAUDE.md file map and provider sections updated, .env.example updated
New observations (non-blocking)
1. pub mod visibility (style, non-blocking). codex_chatgpt is only used within src/llm/mod.rs -- could be pub(crate) instead of pub. Same applies to codex_auth if it's only used from crate::config::llm and llm/mod.rs. Not a blocker.
2. eventsource-stream dependency (acceptable). New dependency eventsource-stream = "0.2". Lightweight crate (2 transitive deps: nom and memchr, both already in the tree). Reasonable for proper SSE parsing vs hand-rolling it.
3. client_version=0.111.0 hardcoded (noted). The /models endpoint uses a hardcoded client version string. If OpenAI changes version gating, this will need updating. Fine for now since it's behind an opt-in flag.
4. Overlap with PR #744 (coordination note). Both PRs target the same ChatGPT backend endpoint with similar SSE parsing logic. Whichever merges second should deduplicate into a shared Responses API client module.
What looks good
- Zero .unwrap()/.expect() in production code
- 16 tests with solid edge case coverage (apiKey fallback bug, image content, SSE streaming, empty string stripping)
- All sensitive values use SecretString with the secrecy crate (zeroize-on-drop)
- Entirely opt-in via LLM_USE_CODEX_AUTH=true, no breaking changes
- Clean separation: codex_auth.rs (credential loading/refresh) vs codex_chatgpt.rs (Responses API provider)
- store: false correctly enforced for ChatGPT backend
- Lazy model detection avoids blocking provider setup
Approved.
All follow-up fixes are in, and the latest branch head is now stable. The previous approval appears to have been dismissed after the later sync/check-trigger commit. Could you please take another quick look and re-approve if everything still looks good?
zmanian left a comment
Review: feat: Reuse Codex CLI OAuth tokens for ChatGPT backend LLM calls
Overall this is well-structured and follows the existing LLM provider patterns well. The token refresh handling with mutex-based race prevention is solid. However, there are security and code quality issues that should be addressed before merge.
Security Issues (Must Fix)
1. File permissions not set after writing auth.json
persist_refreshed_tokens() in codex_auth.rs writes refreshed OAuth tokens to auth.json via std::fs::write() + std::fs::rename() but never sets file permissions. This file contains bearer tokens and refresh tokens. On Unix systems, the file inherits the umask default (often 0644), making it world-readable.
After the rename, set permissions to 0600:
```rust
#[cfg(unix)]
{
    use std::os::unix::fs::PermissionsExt;
    let _ = std::fs::set_permissions(path, std::fs::Permissions::from_mode(0o600));
}
```

2. RefreshRequest derives Debug with plaintext refresh token
RefreshRequest (line ~254) has #[derive(Debug, Serialize)] but contains refresh_token: &'a str as a plain string. If anything in the error/retry path debug-prints this struct, the refresh token leaks to logs. Either:
- Remove the Debug derive from RefreshRequest, or
- Wrap the field and implement a custom Debug that redacts it
This is defense-in-depth -- the struct isn't currently debug-printed, but the derive makes it easy to accidentally do so.
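The second option can be sketched as below. The Serialize derive is omitted to keep the sketch stdlib-only, and the struct is a simplified stand-in for the PR's RefreshRequest.

```rust
use std::fmt;

struct RefreshRequest<'a> {
    refresh_token: &'a str,
}

// Hand-written Debug so a stray `{:?}` can never leak the token.
impl fmt::Debug for RefreshRequest<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("RefreshRequest")
            .field("refresh_token", &"[REDACTED]")
            .finish()
    }
}
```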
Code Quality Issues
3. Module visibility inconsistency
codex_chatgpt is declared pub mod but anthropic_oauth (which serves an analogous role) is mod. Since codex_chatgpt is only consumed within llm/mod.rs, it should be mod codex_chatgpt (or pub(crate) mod at most) for consistency with the existing pattern.
Similarly, codex_auth needs to be pub (or pub(crate)) since it's called from config/llm.rs via crate::llm::codex_auth::, but explicitly marking it pub(crate) would be more intentional.
4. Unrelated change bundled in PR
The main.rs change (#[cfg(unix)] on ChannelSecretUpdater import) is unrelated to Codex auth. Consider splitting it out or at minimum noting it in the PR description.
Minor / Non-blocking
5. default_codex_auth_path() uses dirs::home_dir().unwrap_or_else(|| PathBuf::from(".")) which falls back to CWD silently. This is fine for a best-effort path, but a tracing::warn! when home_dir() returns None would help debugging on unusual systems.
6. No token expiry check before use. The access token (a JWT) is used as-is without checking its exp claim. A proactive check could avoid one round-trip 401 + refresh cycle. This is non-blocking since the refresh-on-401 flow handles it, but would be a nice optimization.
7. store: false assertion in tests is good. The explicit test that store: false is set in the request body is a nice touch for ensuring ChatGPT backend compliance.
What Looks Good
- Token refresh race condition handling via the refresh_lock mutex with an "already refreshed" check is well done
- SecretString usage for all token storage
- SSE parsing with idle timeout prevents silent hangs
- strip_empty_string_values addresses a real model quirk pragmatically
- Lazy model resolution avoids blocking during provider setup
- Good test coverage (14 tests across both modules)
- Clear warning about private/undocumented API usage
- Error handling consistently uses LlmError variants with context
Fixed in this follow-up:
Also included as a small supporting cleanup:
zmanian left a comment
Thanks for addressing the review feedback. All three must-fix issues are resolved:
- File permissions: set_auth_file_permissions() now sets 0600 on Unix after write
- RefreshRequest: Debug derive removed, only Serialize remains
- Module visibility: codex_chatgpt is now mod, codex_auth is pub(crate) mod
The unrelated main.rs change (ChannelSecretUpdater import move) is still present but it's cosmetic and non-blocking.
LGTM
…rai#693)

* feat: add Codex auth.json token reuse for LLM authentication
* fix: handle ChatGPT auth mode correctly
* fix: Codex auth takes highest priority over secrets store
* feat: Responses API provider for ChatGPT backend
* fix: SSE parser uses item_id instead of call_id for tool call deltas
* fix: strip empty string values from tool call arguments
* fix: prevent apiKey mode fallback to ChatGPT token
* refactor: reuse single reqwest::Client across model discovery and LLM calls
* fix: bump client_version to 1.0.0 to unlock gpt-5.3-codex and gpt-5.4
* feat: user-configured LLM_MODEL takes priority over auto-detection
* fix: add 10s timeout to model discovery HTTP request
* docs: add private API warning for ChatGPT backend endpoint
* feat: implement OAuth 401 token refresh for Codex ChatGPT provider
* refactor: lazy model detection via OnceCell, remove block_in_place
* feat: support multimodal content (images) in Codex ChatGPT provider
* refactor: move codex_auth module from src/ to src/llm/
* Fix codex provider style issues
* Use SecretString throughout codex auth refresh flow
* Use SecretString for codex access tokens
* Reuse provider client for codex token refresh
* Stream Codex SSE responses incrementally
* Fix Windows clippy and SQLite test linkage
* Trigger checks after regression skip label
* Tighten codex auth module handling
Summary
This PR enables IronClaw to reuse OAuth authentication tokens from the Codex CLI's auth.json file, allowing users with an active OpenAI/ChatGPT subscription to use IronClaw without needing a separate API key. When enabled, IronClaw communicates directly with ChatGPT's private Responses API backend (chatgpt.com/backend-api/codex) using the subscriber's JWT token.

Motivation
Currently, IronClaw requires users to configure a separate LLM API key (e.g., from OpenAI Platform, Anthropic, or a third-party proxy). Users who already have a ChatGPT Plus/Pro/Team subscription and use the Codex CLI have valid OAuth tokens stored locally, but cannot leverage them with IronClaw. This PR bridges that gap, making it easy for ChatGPT subscribers to get started with IronClaw without additional API key setup.
Changes
New Files
src/codex_auth.rs — Reads and parses Codex CLI's ~/.codex/auth.json, supporting both apiKey and chatgpt authentication modes. Returns CodexCredentials with the token, mode flag, and correct base URL.

src/llm/codex_chatgpt.rs — New LlmProvider implementation (CodexChatGptProvider) that speaks the OpenAI Responses API protocol (POST /responses with SSE streaming), which is different from the standard Chat Completions API. Includes:
- model auto-detection from the /models endpoint (e.g., gpt-5.2-codex)
- translation from the ChatMessage format to Responses API input items
- the store: false flag required by the ChatGPT backend

Modified Files

src/config/llm.rs — Added is_codex_chatgpt field to RegistryProviderConfig. When LLM_USE_CODEX_AUTH=true, Codex credentials are loaded with highest priority (over env vars and secrets store). In ChatGPT mode, the base URL is automatically overridden to the private backend endpoint.

src/llm/mod.rs — Added mod codex_chatgpt and dispatch logic in create_registry_provider to route to the new Responses API provider when is_codex_chatgpt is true.

src/lib.rs — Registered the codex_auth module.

Usage
The feature requires that the user has previously authenticated with the Codex CLI (codex --login), which stores the OAuth token in ~/.codex/auth.json.

Compatibility

- Opt-in only via LLM_USE_CODEX_AUTH=true.
- With LLM_USE_CODEX_AUTH=false (default), all existing LLM provider paths remain completely unchanged.
- The is_codex_chatgpt flag only affects provider selection when Codex ChatGPT mode credentials are detected; all other provider protocols (OpenAI-compatible, Anthropic, Ollama) are unaffected.
- Plain API keys in auth.json (apiKey mode) are also supported and route through the existing OpenAI-compatible provider.
- Unit tests in codex_chatgpt.rs covering, among others, the instructions field and a store: false assertion
- Unit tests in codex_auth.rs covering auth.json parsing
- Tested against the gpt-5.2-codex model

Background
This PR builds on the observation made during #384 (feat(setup): Anthropic OAuth onboarding), where Codex OAuth support was initially attempted but then intentionally removed with:

This PR solves that exact problem by implementing a dedicated CodexChatGptProvider that speaks the Responses API protocol natively (POST /responses with SSE streaming), rather than trying to use the Codex OAuth token as a standard OpenAI API key. This approach correctly handles the different wire format and endpoint.