feat(wren-ai-service): add support for new GPT-5 models in configuration files (ai-env-changed) #1875
Conversation
> Caution: Review failed. The pull request is closed.

Walkthrough

This change adds three new GPT-5 model variants—gpt-5, gpt-5-mini, and gpt-5-nano—to multiple configuration files and user selection logic. It also updates dependency versions in the Python service to ensure compatibility with the new models, with specific adjustments for …
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Launcher
    participant Config
    participant AIService
    User->>Launcher: Start launch command
    Launcher->>User: Prompt for generation model selection
    User->>Launcher: Selects "gpt-5" (or mini/nano)
    Launcher->>Config: Validate model selection
    Config-->>Launcher: Model is valid (new GPT-5 variant)
    Launcher->>AIService: Passes model config for generation
    AIService-->>Launcher: Uses new GPT-5 model parameters
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 3
♻️ Duplicate comments (1)
wren-ai-service/tools/config/config.full.yaml (1)
27-44: Repeat of GPT-5 temperature/alias concerns
Please apply the same fixes here to keep behaviour consistent across all shipped config templates.
🧹 Nitpick comments (1)
docker/config.example.yaml (1)
45-45: Trailing spaces flagged by yamllint
Line 45 has stray whitespace → CI lint failure.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
wren-ai-service/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
- deployment/kustomizations/base/cm.yaml (1 hunks)
- docker/config.example.yaml (1 hunks)
- wren-ai-service/pyproject.toml (2 hunks)
- wren-ai-service/tools/config/config.example.yaml (1 hunks)
- wren-ai-service/tools/config/config.full.yaml (1 hunks)
- wren-launcher/commands/launch.go (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: cyyeh
PR: Canner/WrenAI#1763
File: wren-ai-service/src/pipelines/generation/semantics_description.py:31-33
Timestamp: 2025-06-20T02:37:21.292Z
Learning: In the wren-ai-service codebase, when adding new fields like "alias" to the output of functions that use Pydantic models for validation, the user prefers not to update the corresponding Pydantic model definitions to include these new fields.
Applied to files:
- wren-ai-service/tools/config/config.full.yaml
- deployment/kustomizations/base/cm.yaml
- wren-ai-service/tools/config/config.example.yaml
- wren-ai-service/pyproject.toml
📚 Learning: 2025-01-27T08:55:44.919Z
Learnt from: cyyeh
PR: Canner/WrenAI#1236
File: wren-ai-service/src/providers/llm/litellm.py:49-49
Timestamp: 2025-01-27T08:55:44.919Z
Learning: In the LitellmLLMProvider class, model parameters (`self._model_kwargs`) are intentionally designed to take precedence over runtime parameters (`generation_kwargs`), which differs from other LLM providers. This is the intended behavior specific to this provider.
Applied to files:
- wren-ai-service/tools/config/config.full.yaml
- deployment/kustomizations/base/cm.yaml
- wren-ai-service/tools/config/config.example.yaml
📚 Learning: 2025-02-12T22:05:37.109Z
Learnt from: cyyeh
PR: Canner/WrenAI#1293
File: wren-ai-service/pyproject.toml:38-38
Timestamp: 2025-02-12T22:05:37.109Z
Learning: The qdrant-client package version in wren-ai-service must match the Qdrant server version (1.11.0) to maintain compatibility.
Applied to files:
wren-ai-service/pyproject.toml
🪛 YAMLlint (1.37.1)
docker/config.example.yaml
[error] 45-45: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
wren-ai-service/pyproject.toml (2)
13-13: Verify uvicorn 0.29.x compatibility with Python 3.12
uvicorn 0.29.0 was released prior to the final 3.12.0 cut-off and had a few asyncio/typing glitches that were patched in 0.29.1+. Because the service targets python = ">=3.12.*", please make sure the exact minor chosen (0.29.0 vs 0.29.1/0.29.2) is tested under 3.12; otherwise bump to the latest 0.29.* to avoid runtime warnings.
36-37: No action needed: dependency constraints are satisfied
The caret (^1.34.34) spec for boto3 allows patch-level updates and your lockfile shows:
- boto3==1.34.34
- botocore==1.34.162
Both satisfy botocore>=1.34.0,<1.35 (pulled in by haystack-ai and qdrant-haystack), and there are no mismatched versions. Just keep your caret spec and ensure wren-ai-service/poetry.lock remains committed.
wren-launcher/commands/launch.go (2)
110-113: UI prompt uses short model names that don’t exist in config files
The select list returns “gpt-5”, “gpt-5-mini”, etc., but the generated YAML expects the date-suffixed identifiers gpt-5-2025-08-07, … Unless PrepareConfigFileForOpenAI maps the short names → full names, users will hit “model not found” at runtime. Confirm the mapping or update the prompt to display the exact identifiers.
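If the launcher does need such a mapping, here is a minimal sketch of what it could look like; modelAliases and canonicalModel are hypothetical names for illustration, not the actual PrepareConfigFileForOpenAI implementation:

```go
package main

import "fmt"

// modelAliases maps the short names shown in the launcher prompt to the
// date-suffixed identifiers the generated YAML expects. The identifiers are
// taken from this PR's config files; the map itself is a hypothetical helper.
var modelAliases = map[string]string{
	"gpt-5":      "gpt-5-2025-08-07",
	"gpt-5-mini": "gpt-5-mini-2025-08-07",
	"gpt-5-nano": "gpt-5-nano-2025-08-07",
}

// canonicalModel resolves a user-selected short name to its full identifier,
// falling back to the input unchanged so exact date-suffixed names keep working.
func canonicalModel(selected string) string {
	if full, ok := modelAliases[selected]; ok {
		return full
	}
	return selected
}

func main() {
	fmt.Println(canonicalModel("gpt-5-mini")) // prints gpt-5-mini-2025-08-07
}
```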
392-395: Keep validation map in sync with prompt & config
Good catch adding the new keys. Once the mapping issue above is settled, ensure the same canonical identifiers are used here to avoid silent mismatches.
deployment/kustomizations/base/cm.yaml (1)
91-95: Ignore seed placement recommendation
The LitellmLLMProvider treats any top-level parameters (including seed) as part of its model_kwargs, so your seed: 0 will be picked up and honored exactly as intended. No change to the ConfigMap is needed here.
Likely an incorrect or invalid review comment.
deployment/kustomizations/base/cm.yaml

```yaml
- model: gpt-5-nano-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-mini-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-2025-08-07
```
💡 Verification agent
🧩 Analysis chain
Add temperature to keep generation behaviour consistent
All existing GPT-4.1 entries explicitly set temperature: 0 inside kwargs.
Omitting it for the three new GPT-5 variants means the provider will fall back to its own default (often ≈ 0.7), changing determinism and potentially breaking tests that assume zero-temperature generation.
```diff
   kwargs:
     max_tokens: 4096
     n: 1
     seed: 0
+    temperature: 0
```

Please confirm whether non-zero temperature is intended; if not, the above patch aligns behaviour with the existing models.
🏁 Script executed:

```bash
#!/bin/bash
set -e
echo "Existing GPT-4 entries (showing 2 lines before and 5 after each match):"
rg -n "model: gpt-4" -B2 -A5 deployment/kustomizations/base/cm.yaml
echo -e "\nNew GPT-5 entries (showing 2 lines before and 5 after each match):"
rg -n "model: gpt-5" -B2 -A5 deployment/kustomizations/base/cm.yaml
```

Length of output: 1516
Add explicit temperature: 0 to new GPT-5 entries for deterministic behavior
The existing GPT-4.1 variants all include

```yaml
kwargs:
  …
  seed: 0
  temperature: 0
```

but the new GPT-5 models omit temperature, causing the provider to fall back to its default (≈0.7). To keep generation behavior consistent and tests from breaking, please add:
```diff
@@ deployment/kustomizations/base/cm.yaml
   kwargs:
     max_tokens: 4096
     n: 1
     seed: 0
+    temperature: 0
```

Applies to lines 78–90.
🤖 Prompt for AI Agents
In deployment/kustomizations/base/cm.yaml lines 78 to 90, the new GPT-5 model
entries are missing the explicit temperature: 0 setting under kwargs, which
causes non-deterministic behavior. To fix this, add temperature: 0 alongside
seed: 0 in the kwargs section for each GPT-5 model entry to ensure consistent
and deterministic generation behavior.
docker/config.example.yaml
```yaml
- model: gpt-5-nano-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-mini-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
```
🛠️ Refactor suggestion
temperature missing → non-deterministic GPT-5 responses
All existing GPT-4.1 entries hard-code temperature: 0; the new GPT-5 models omit it, so Litellm/OpenAI will default to ~1.0. That yields very different, non-reproducible answers and breaks tests or embeddings that assume deterministic output.
Add temperature: 0 (or another explicit value) unless randomness is intended.
```diff
     seed: 0
+    temperature: 0
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- model: gpt-5-nano-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
    temperature: 0
- model: gpt-5-mini-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
    temperature: 0
- model: gpt-5-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
    temperature: 0
```
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 45-45: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In docker/config.example.yaml between lines 27 and 45, the new GPT-5 model
entries lack the temperature setting, causing non-deterministic outputs due to
the default temperature of ~1.0. To fix this, add the line "temperature: 0"
under each GPT-5 model's kwargs section to ensure deterministic responses
consistent with the existing GPT-4.1 entries.
wren-ai-service/tools/config/config.example.yaml

```yaml
- model: gpt-5-nano-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-mini-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
- model: gpt-5-2025-08-07
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
```
💡 Verification agent
🧩 Analysis chain
Inconsistent defaults & missing temperature for GPT-5 models
Same issue as in docker/config.example.yaml: omitting temperature leads to divergent behaviour. Mirror the GPT-4.1 rows or justify the change.
Also, note that only the GPT-4.1-nano row has alias: default; switching to GPT-5 via wren-launcher will not update the alias and the pipelines will still resolve litellm_llm.default → GPT-4.1-nano. Consider adding an alias or updating PrepareConfigFileForOpenAI.
🏁 Script executed:

```bash
#!/bin/bash
set -e
# Locate all config.example.yaml files
echo "Config files in repo:"
fd --type f config.example.yaml
echo
# Show GPT-4.1 entries in wren-ai-service config
echo "=== GPT-4.1 entries in wren-ai-service/tools/config/config.example.yaml ==="
rg -n "model: gpt-4.1" -A5 wren-ai-service/tools/config/config.example.yaml
echo
# Show alias lines in wren-ai-service config
echo "=== alias entries in wren-ai-service/tools/config/config.example.yaml ==="
rg -n "alias:" -A2 wren-ai-service/tools/config/config.example.yaml || echo "No alias found"
echo
# Show temperature settings in wren-ai-service config
echo "=== temperature occurrences in wren-ai-service/tools/config/config.example.yaml ==="
rg -n "temperature" -n wren-ai-service/tools/config/config.example.yaml || echo "No temperature settings found"
echo
# Show GPT-5 entries in wren-ai-service config (for context)
echo "=== GPT-5 entries in wren-ai-service/tools/config/config.example.yaml ==="
rg -n "model: gpt-5" -A4 wren-ai-service/tools/config/config.example.yaml
echo
# Locate docker config example file
DOCKER_CFG=$(fd -g "docker/config.example.yaml")
echo "docker config file: $DOCKER_CFG"
echo
# Preview first 50 lines of docker/config.example.yaml
head -n 50 "$DOCKER_CFG"
```

Length of output: 2373
🏁 Script executed:

```bash
#!/bin/bash
set -e
echo "=== First 50 lines of docker/config.example.yaml ==="
head -n 50 docker/config.example.yaml || { echo "Failed to read docker/config.example.yaml"; exit 1; }
```

Length of output: 1145
Ensure consistent temperature and default alias for GPT-5 models
The GPT-5 entries currently omit temperature (falling back to provider defaults) and lack an alias: default, which causes divergent behavior and prevents wren-launcher from switching the default. Please mirror the GPT-4.1 blocks or justify this change:
- wren-ai-service/tools/config/config.example.yaml
  • Under the three GPT-5 models (gpt-5-nano-2025-08-07, gpt-5-mini-2025-08-07, gpt-5-2025-08-07), add temperature: 0
  • Add alias: default to the gpt-5-nano-2025-08-07 block so that litellm_llm.default no longer resolves to the old GPT-4.1-nano model.
- docker/config.example.yaml
  Apply the same updates for the GPT-5 entries: include temperature: 0 and set alias: default on the gpt-5-nano row.
If there’s an intentional reason to drop these defaults (e.g., a new default temperature or alias scheme), please document it or adjust PrepareConfigFileForOpenAI accordingly.
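To make the requested end state concrete, here is a sketch of how the gpt-5-nano block could look with both suggestions applied; it merges this review's temperature: 0 and alias: default proposals and is not the committed configuration:

```yaml
# Hypothetical merged result for the gpt-5-nano entry, combining the
# temperature and alias suggestions from this review.
- model: gpt-5-nano-2025-08-07
  alias: default          # lets litellm_llm.default resolve to the GPT-5 nano model
  context_window_size: 1000000
  kwargs:
    max_tokens: 4096
    n: 1
    seed: 0
    temperature: 0        # deterministic output, matching the GPT-4.1 entries
```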
🤖 Prompt for AI Agents
In wren-ai-service/tools/config/config.example.yaml between lines 27 and 44, the
GPT-5 model entries lack the temperature setting and a default alias, causing
inconsistent behavior and preventing wren-launcher from switching defaults. Add
temperature: 0 under each GPT-5 model's configuration and add alias: default to
the gpt-5-nano-2025-08-07 block to align with GPT-4.1 blocks. Also, apply the
same changes to docker/config.example.yaml for consistency. If these defaults
are intentionally omitted, document the reason or update
PrepareConfigFileForOpenAI accordingly.