@shige shige commented Dec 9, 2025

Summary

  • Rename Anthropic model IDs from date-suffixed format to dot notation for consistency with claude-opus-4.5
    • claude-sonnet-4-5-20250929 → claude-sonnet-4.5
    • claude-haiku-4-5-20251001 → claude-haiku-4.5
  • Update registry keys accordingly (anthropic/claude-sonnet-4.5, anthropic/claude-haiku-4.5)
  • Add legacy model IDs to conversion layer for backward compatibility with existing stored data
[Screenshots: main vs. This PR]

Note

Standardizes Anthropic Claude IDs to dot notation (e.g., claude-sonnet-4.5, claude-haiku-4.5) across codebase, updates registries/pricing/UI, and adds legacy fallbacks and conversion mappings.

  • Language Model Core:
    • Update AnthropicLanguageModelId enum and fallback parsing to accept dot notation (claude-sonnet-4.5, claude-haiku-4.5) and normalize legacy variants.
    • Adjust model definitions (models) to use new IDs.
    • Update pricing keys in anthropicTokenPricing.
  • Registry:
    • Rename registry entries to anthropic/claude-sonnet-4.5 and anthropic/claude-haiku-4.5.
  • SDK Transform:
    • Switch case handling to new Anthropic IDs in transform-giselle-to-ai-sdk.
  • Conversion Layer:
    • Map text-generation IDs to content-generation IDs and back using dot notation; add legacy IDs to supported conversions.
  • UI:
    • Update default/toolbar recommended Anthropic models to new IDs.
    • Refresh example/loading fixtures to use claude-sonnet-4.5.
  • Tests/Fixtures:
    • Revise tests to validate new enum values and fallbacks; update node fixtures and round-trip expectations.

Written by Cursor Bugbot for commit 0d9a6a0. This will update automatically on new commits.

Summary by CodeRabbit

  • Chores

    • Standardized Anthropic Claude model identifiers to a simplified 4.5 format (e.g., claude-sonnet-4.5, claude-haiku-4.5) across the app, registry, pricing, and fixtures.
    • Updated tiered model recommendations to surface the new 4.5 variants for free and paid users.
  • Tests

    • Adjusted tests to expect the normalized 4.5 model identifiers for consistency.

- claude-sonnet-4-5-20250929 → claude-sonnet-4.5
- claude-haiku-4-5-20251001 → claude-haiku-4.5
- Update fallback regex patterns for backward compatibility
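
The fallback patterns mentioned in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in packages/language-model/src/anthropic.ts: the function name `normalizeAnthropicModelId` is hypothetical, the patterns are adapted from the review diff quoted further down this thread, and returning `undefined` for unrecognized input is an assumption (the PR only states that default fallbacks were adjusted).

```typescript
// Hypothetical helper: maps legacy or date-suffixed Anthropic IDs to dot notation.
type NormalizedId = "claude-sonnet-4.5" | "claude-haiku-4.5";

function normalizeAnthropicModelId(v: string): NormalizedId | undefined {
  // Accepts "claude-sonnet-4.5", "claude-sonnet-4-5", and suffixed variants
  // such as "claude-sonnet-4-5-20250929".
  if (/^claude-sonnet-4[.-]5(?:-.+)?$/.test(v)) return "claude-sonnet-4.5";
  if (/^claude-haiku-4[.-]5(?:-.+)?$/.test(v)) return "claude-haiku-4.5";
  // Older Sonnet families fall back to the current Sonnet model.
  if (/^claude-(?:4-sonnet|3-7-sonnet|3-5-sonnet|3-sonnet)-/.test(v)) {
    return "claude-sonnet-4.5";
  }
  // Older Haiku family falls back to the current Haiku model.
  if (/^claude-3-5-haiku-/.test(v)) return "claude-haiku-4.5";
  // Assumption: unknown IDs are left for the caller to handle.
  return undefined;
}
```

Because the dot-notation IDs also match the first two patterns, normalizing an already-normalized ID is a no-op.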
@shige shige self-assigned this Dec 9, 2025
Copilot AI review requested due to automatic review settings December 9, 2025 12:26
changeset-bot bot commented Dec 9, 2025

⚠️ No Changeset found

Latest commit: 0d9a6a0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

💥 An error occurred when fetching the changed packages and changesets in this PR
Some errors occurred when validating the changesets config:
The package or glob expression "giselles-ai" is specified in the `ignore` option but it is not found in the project. You may have misspelled the package name or provided an invalid glob expression. Note that glob expressions must be defined according to https://www.npmjs.com/package/micromatch.

giselles-ai bot commented Dec 9, 2025

Finished running flow.

Step 1
🟢 On Pull Request Opened (Status: Success, Updated: Dec 9, 2025 12:26pm)
Step 2
🟢 Manual QA (Status: Success, Updated: Dec 9, 2025 12:28pm)
🟢 Prompt for AI Agents (Status: Success, Updated: Dec 9, 2025 12:28pm)
Step 3
🟢 Create a Comment for PR (Status: Success, Updated: Dec 9, 2025 12:31pm)
Step 4
🟢 Create Pull Request Comment (Status: Success, Updated: Dec 9, 2025 12:31pm)

vercel bot commented Dec 9, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| giselle | Ready | Preview | Comment | Dec 9, 2025 0:47am |
| ui | Ready | Preview | Comment | Dec 9, 2025 0:47am |


coderabbitai bot commented Dec 9, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This PR standardizes Anthropic Claude model identifiers from date-specific strings (e.g., "claude-sonnet-4-5-20250929") to generic semantic versions (e.g., "claude-sonnet-4.5") across registries, normalization logic, UI defaults/recommendations, pricing keys, fixtures, converters, and tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model Configuration & UI: apps/studio.giselles.ai/app/(main)/stage/tasks/[taskId]/loading.tsx, internal-packages/workflow-designer-ui/src/editor/properties-panel/text-generation-node-properties-panel/model/model-defaults.ts, internal-packages/workflow-designer-ui/src/editor/tool/toolbar/toolbar.tsx | Replaced dated Anthropic model IDs with normalized 4.5 identifiers in loading config, default model data, and toolbar recommendations. |
| Model Registry & Normalization: packages/language-model-registry/src/anthropic.ts, packages/language-model/src/anthropic.ts | Renamed exported model keys and IDs to claude-*-4.5; updated enum values and regex normalization to recognize and map legacy/date-specific variants to the new 4.5 identifiers; default fallbacks adjusted. |
| Model Transformation & Migration: packages/giselle/src/generations/v2/language-model/transform-giselle-to-ai-sdk.ts, packages/node-registry/src/node-conversion.ts | Updated case labels and mapping logic to handle new 4.5 labels and map legacy/date variants to normalized content-generation IDs; expanded legacy ID union types and bidirectional conversion cases. |
| Pricing & Fixtures: packages/language-model/src/costs/model-prices.ts, packages/node-registry/src/__fixtures__/node-conversion/nodes.ts | Renamed anthropic pricing map keys and test fixture LLM IDs to use claude-*-4.5; price entries unchanged. |
| Tests: packages/language-model/src/anthropic.test.ts, packages/node-registry/src/node-conversion.test.ts | Updated test expectations and assertions to reflect normalized 4.5 identifiers and consistent normalization behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Files span multiple layers (registry, normalization, UI, pricing, conversion, tests) but follow a consistent identifier-renaming and normalization pattern.
  • Pay extra attention to:
    • packages/language-model/src/anthropic.ts — enum, regex normalization, and default fallback behavior.
    • packages/node-registry/src/node-conversion.ts — bidirectional mapping completeness and type updates.
    • packages/language-model/src/costs/model-prices.ts — pricing key renames to ensure lookups still resolve.
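
One way to guard the pricing lookup during a rename like this is to let the type checker require a pricing entry for every model ID. The sketch below is illustrative only: `ModelId`, `TokenPrice`, `pricing`, and `priceFor` are hypothetical names, and the zero values are placeholders, not real prices or the contents of anthropicTokenPricing.

```typescript
// Hypothetical ID union; the real enum lives in packages/language-model.
type ModelId = "claude-opus-4.5" | "claude-sonnet-4.5" | "claude-haiku-4.5";

interface TokenPrice {
  input: number; // placeholder value, not a real price
  output: number; // placeholder value, not a real price
}

// Record<ModelId, ...> turns a missing or misspelled key into a compile-time error.
const pricing: Record<ModelId, TokenPrice> = {
  "claude-opus-4.5": { input: 0, output: 0 },
  "claude-sonnet-4.5": { input: 0, output: 0 },
  "claude-haiku-4.5": { input: 0, output: 0 },
};

// Runtime lookup for IDs that arrive as plain strings (for example from storage).
function priceFor(id: string): TokenPrice | undefined {
  return Object.prototype.hasOwnProperty.call(pricing, id)
    ? pricing[id as ModelId]
    : undefined;
}
```

With keys renamed to dot notation, a date-suffixed ID no longer resolves in such a table unless it is normalized before the lookup, which is why this file is worth a careful look.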

Possibly related PRs

  • Backport anthropic models #2393: Overlapping changes that also adjust Anthropic model ID normalization, toolbar recommendations, pricing entries, and node-conversion mappings to adopt the 4.5 Claude variants.

Poem

🐰 I hopped through code with tiny paws,
Switched old dates for tidy dots because,
Claude dons four-point-five so neat,
Models prance in orderly fleet,
A carrot cheer for normalized laws 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: renaming Anthropic model IDs from date-suffixed format (e.g., claude-sonnet-4-5-20250929) to dot notation (e.g., claude-sonnet-4.5) for consistency. |
| Description check | ✅ Passed | The PR description is comprehensive and follows the template structure with clear Summary, Changes, visual comparisons, and detailed implementation notes. |

Copilot AI left a comment

Pull request overview

This PR updates Anthropic model identifiers from date-based suffixes to dot notation (e.g., claude-sonnet-4-5-20250929 → claude-sonnet-4.5) for consistency with the existing claude-opus-4.5 naming convention.

Key changes:

  • Updated model IDs across registry, conversion, and configuration files
  • Added legacy model IDs to the conversion layer for backward compatibility
  • Updated test expectations and fixtures to reflect new model naming

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/node-registry/src/node-conversion.ts | Added legacy model IDs to conversion mapping and updated registry keys |
| packages/node-registry/src/node-conversion.test.ts | Updated test expectations for new model ID format |
| packages/node-registry/src/__fixtures__/node-conversion/nodes.ts | Updated fixture data to use new model ID |
| packages/language-model/src/costs/model-prices.ts | Updated pricing table keys to use dot notation |
| packages/language-model/src/anthropic.ts | Updated enum values, fallback logic, and model definitions |
| packages/language-model/src/anthropic.test.ts | Updated all test cases to expect new model IDs |
| packages/language-model-registry/src/anthropic.ts | Updated registry keys and removed trailing spaces in descriptions |
| packages/giselle/src/generations/v2/language-model/transform-giselle-to-ai-sdk.ts | Updated case labels in switch statement |
| internal-packages/workflow-designer-ui/src/editor/tool/toolbar/toolbar.tsx | Updated hardcoded model ID arrays |
| internal-packages/workflow-designer-ui/src/editor/properties-panel/text-generation-node-properties-panel/model/model-defaults.ts | Updated default model ID |
| apps/studio.giselles.ai/app/(main)/stage/tasks/[taskId]/loading.tsx | Updated dummy data model ID |

qodo-merge-for-open-source bot commented Dec 9, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified. No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-merge-for-open-source bot commented Dec 9, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
General
Centralize model ID normalization logic

Refactor the
convertTextGenerationLanguageModelIdToContentGenerationLanguageModelId function
to use the existing AnthropicLanguageModelId.parse() utility for normalizing
model IDs, thereby centralizing logic and reducing duplication.

packages/node-registry/src/node-conversion.ts [27-39]

 function convertTextGenerationLanguageModelIdToContentGenerationLanguageModelId(
 	from: TextGenerationModelIdWithLegacy,
 ): LanguageModelId {
+	// Normalize Anthropic model IDs first to simplify the switch statement
+	if (from.startsWith("claude-")) {
+		from = AnthropicLanguageModelId.parse(from);
+	}
+
 	switch (from) {
 		case "claude-haiku-4.5":
-		case "claude-haiku-4-5-20251001":
 			return "anthropic/claude-haiku-4.5";
 		case "claude-opus-4.5":
-		case "claude-opus-4-1-20250805":
 			return "anthropic/claude-opus-4.5";
 		case "claude-sonnet-4.5":
-		case "claude-sonnet-4-5-20250929":
 			return "anthropic/claude-sonnet-4.5";
 ...

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

Why: This is a valuable suggestion that improves maintainability by centralizing the model ID normalization logic, leveraging an existing utility function (AnthropicLanguageModelId.parse) to reduce code duplication.

Medium
Consolidate regex for model ID fallbacks

Consolidate multiple regular expression checks for "sonnet" model ID fallbacks
into a single, more comprehensive regex to improve code clarity and
maintainability.

packages/language-model/src/anthropic.ts [40-60]

-		if (/^claude-sonnet-4[.-]5(?:-.+)?$/.test(v)) {
-			return "claude-sonnet-4.5";
-		}
-		if (/^claude-4-sonnet-/.test(v)) {
-			return "claude-sonnet-4.5";
-		}
-		if (/^claude-3-7-sonnet-/.test(v)) {
+		if (/^claude-(?:sonnet-4[.-]5(?:-.+)?|4-sonnet-|3-7-sonnet-|3-5-sonnet-|3-sonnet-)/.test(v)) {
 			return "claude-sonnet-4.5";
 		}
 		if (/^claude-haiku-4[.-]5(?:-.+)?$/.test(v)) {
 			return "claude-haiku-4.5";
 		}
 		if (/^claude-3-5-haiku-/.test(v)) {
 			return "claude-haiku-4.5";
 		}
-		if (/^claude-3-5-sonnet-/.test(v)) {
-			return "claude-sonnet-4.5";
-		}
-		if (/^claude-3-sonnet-/.test(v)) {
-			return "claude-sonnet-4.5";
-		}
Suggestion importance[1-10]: 5

Why: The suggestion correctly identifies an opportunity to refactor multiple if statements with similar regex patterns into a single, more concise one, which improves code readability and maintainability.

Low
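
When weighing this suggestion, note that the consolidated pattern drops the `$` anchor from its first alternative, so it is not strictly equivalent to the separate checks. The sketch below compares the two on a few inputs. The helper names are hypothetical, the regexes are copied from the diff above, and the last sample is a made-up ID chosen only to expose the difference.

```typescript
// The Sonnet checks as they appear before the suggested change.
const separatePatterns: RegExp[] = [
  /^claude-sonnet-4[.-]5(?:-.+)?$/,
  /^claude-4-sonnet-/,
  /^claude-3-7-sonnet-/,
  /^claude-3-5-sonnet-/,
  /^claude-3-sonnet-/,
];

// The single pattern proposed in the suggestion.
const consolidatedPattern =
  /^claude-(?:sonnet-4[.-]5(?:-.+)?|4-sonnet-|3-7-sonnet-|3-5-sonnet-|3-sonnet-)/;

function matchesSeparate(v: string): boolean {
  return separatePatterns.some((pattern) => pattern.test(v));
}

function matchesConsolidated(v: string): boolean {
  return consolidatedPattern.test(v);
}

// Returns the inputs on which the two approaches disagree.
function disagreements(ids: string[]): string[] {
  return ids.filter((v) => matchesSeparate(v) !== matchesConsolidated(v));
}

const samples = [
  "claude-sonnet-4.5",
  "claude-sonnet-4-5-20250929",
  "claude-3-7-sonnet-latest",
  "claude-sonnet-4.50", // made-up ID, not a real model
];

console.log(disagreements(samples));
```

Only the made-up ID is reported: the anchored original rejects it, while the consolidated pattern accepts any string that merely starts with `claude-sonnet-4.5`. Whether that matters depends on which IDs can reach this code path.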

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines +76 to +80
case "anthropic/claude-haiku-4.5":
return "claude-haiku-4.5";
case "anthropic/claude-opus-4.5":
return "claude-opus-4.5";
case "anthropic/claude-sonnet-4-5":
return "claude-sonnet-4-5-20250929";
case "anthropic/claude-sonnet-4.5":

P1: Preserve legacy Anthropic IDs in content->text conversion

The conversion back to text-generation nodes now matches only the dot-notation Anthropic IDs (anthropic/claude-haiku-4.5, anthropic/claude-opus-4.5, anthropic/claude-sonnet-4.5). Content-generation nodes saved before this rename still carry the old registry IDs (anthropic/claude-haiku-4-5, anthropic/claude-sonnet-4-5, etc.), and those will now hit the default branch and throw Unknown language model id, breaking the backward compatibility the summary promises. Add the legacy anthropic/claude-*-4-5 cases alongside the new ones so existing stored nodes continue to load.

Reply from shige (Member, Author):

The registry IDs (anthropic/claude-*) are not persisted to storage. Stored data uses the text-generation format (claude-sonnet-4-5-20250929, etc.), which is handled by LegacyAnthropicModelId in the text→content conversion.

The content→text conversion only needs to handle current registry IDs since content-generation nodes are created in memory with the current registry, not loaded from legacy storage.
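
The asymmetry described in this reply can be sketched as two small functions. This is an illustration under assumptions, not the code in node-conversion.ts: the type and function names are shortened stand-ins, only the Sonnet and Haiku IDs are covered, and throwing on unknown IDs mirrors the "Unknown language model id" behavior mentioned in the review comment.

```typescript
type TextGenerationId = "claude-sonnet-4.5" | "claude-haiku-4.5";
type LegacyTextGenerationId =
  | "claude-sonnet-4-5-20250929"
  | "claude-haiku-4-5-20251001";
type RegistryId = "anthropic/claude-sonnet-4.5" | "anthropic/claude-haiku-4.5";

// text -> content: accepts legacy IDs because stored data may still contain them.
function textToContent(
  from: TextGenerationId | LegacyTextGenerationId,
): RegistryId {
  switch (from) {
    case "claude-sonnet-4.5":
    case "claude-sonnet-4-5-20250929":
      return "anthropic/claude-sonnet-4.5";
    case "claude-haiku-4.5":
    case "claude-haiku-4-5-20251001":
      return "anthropic/claude-haiku-4.5";
    default:
      throw new Error(`Unknown language model id: ${String(from)}`);
  }
}

// content -> text: handles current registry IDs only, on the stated premise
// that registry IDs are never persisted.
function contentToText(from: RegistryId): TextGenerationId {
  switch (from) {
    case "anthropic/claude-sonnet-4.5":
      return "claude-sonnet-4.5";
    case "anthropic/claude-haiku-4.5":
      return "claude-haiku-4.5";
    default:
      throw new Error(`Unknown language model id: ${String(from)}`);
  }
}
```

A round trip that starts from a legacy ID ends on the dot-notation ID, so a node migrates the first time it passes through both conversions.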

giselles-ai bot commented Dec 9, 2025

🔍 QA Testing Assistant by Giselle

📋 Manual QA Checklist

Based on the changes in this PR, here are the key areas to test manually:

  • Model Selection UI (Paid User): Log in as a paid user, navigate to the Workflow Editor, and verify that "Claude Sonnet 4.5" and "Claude Opus 4.5" are listed in the Anthropic model dropdown without date suffixes.
  • Model Selection UI (Free User): Log in as a free user, navigate to the Workflow Editor, and verify that "Claude Haiku 4.5" is listed in the Anthropic model dropdown without date suffixes.
  • Default Model in New Node: Add a new "Text Generation" node, set the provider to "Anthropic", and confirm that the "Model" field defaults to "Claude Haiku 4.5".
  • New Workflow Generation (Sonnet): Create a workflow with a "Text Generation" node set to "Claude Sonnet 4.5". Provide a prompt and run the workflow to confirm successful generation.
  • New Workflow Generation (Haiku): Create a workflow with a "Text Generation" node set to "Claude Haiku 4.5". Provide a prompt and run the workflow to confirm successful generation.
  • Backward Compatibility (Load Old Sonnet Workflow): Load a previously saved workflow using the old Sonnet ID (claude-sonnet-4-5-20250929). Verify it loads without errors and the model dropdown displays "Claude Sonnet 4.5".
  • Backward Compatibility (Execute Old Sonnet Workflow): Execute the loaded old Sonnet workflow to confirm successful generation.
  • Backward Compatibility (Load Old Haiku Workflow): Load a previously saved workflow using the old Haiku ID (claude-haiku-4-5-20251001). Verify it loads without errors and the model dropdown displays "Claude Haiku 4.5".
  • Backward Compatibility (Execute Old Haiku Workflow): Execute the loaded old Haiku workflow to confirm successful generation.
  • Save and Re-load Migrated Workflow: Load a workflow with an old model ID, save it, then reload and re-open the workflow to ensure persistence of the migration.
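
The tier split exercised by the first three checklist items could be expressed roughly as below. This is a guess at the shape, not the code in toolbar.tsx or model-defaults.ts: `Tier`, `recommendedAnthropicModels`, and `defaultAnthropicModel` are hypothetical names, and the free/paid assignment is taken only from the checklist wording above.

```typescript
type Tier = "free" | "paid";
type AnthropicModelId =
  | "claude-opus-4.5"
  | "claude-sonnet-4.5"
  | "claude-haiku-4.5";

// Assumption: recommendations are a static list per tier, keyed by dot-notation ID.
const recommendedByTier: Record<Tier, AnthropicModelId[]> = {
  free: ["claude-haiku-4.5"],
  paid: ["claude-sonnet-4.5", "claude-opus-4.5"],
};

// Assumption: new Anthropic text-generation nodes default to Haiku 4.5.
const defaultAnthropicModel: AnthropicModelId = "claude-haiku-4.5";

function recommendedAnthropicModels(tier: Tier): AnthropicModelId[] {
  // Return a copy so callers cannot mutate the shared table.
  return [...recommendedByTier[tier]];
}
```

Typing the lists with the ID union means a leftover date-suffixed string in such a table would fail to compile rather than surface during manual QA.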

✨ Prompt for AI Agents

Use the following prompts with Cursor or Claude Code to automate E2E testing:

📝 E2E Test Generation Prompt

## **Prompt for AI Agent: Generate Playwright E2E Tests**

**Objective:** Based on the provided context, generate a comprehensive suite of E2E tests using Playwright and TypeScript for the "Giselle AI Studio" application. The tests must validate the recent changes to Anthropic model ID naming conventions and ensure both new functionality and backward compatibility are working as expected.

### 1. Context Summary

The Pull Request refactors the internal IDs for two Anthropic language models to use a consistent dot notation:
*   `claude-sonnet-4-5-20250929` is now `claude-sonnet-4.5`
*   `claude-haiku-4-5-20251001` is now `claude-haiku-4.5`

A critical part of this change is the implementation of a conversion/translation layer. This layer ensures that any existing workflows or data stored with the **old** date-suffixed model IDs will be correctly interpreted and mapped to the **new** dot-notation IDs.

**Key User Flows Affected:**
*   Creating a new "Text Generation" node in the workflow designer.
*   Selecting a language model from the model selection dropdown/toolbar for a text generation node.
*   Loading a previously saved workflow that was created using the old Anthropic model IDs.
*   Executing a workflow that uses any of the renamed Anthropic models.

**Critical Paths to Test:**
1.  **UI Verification:** The model selection UI must display the new, user-friendly model names ("Claude Sonnet 4.5", "Claude Haiku 4.5").
2.  **Forward Path:** Creating and running a new workflow with the newly named models must function correctly.
3.  **Backward Compatibility:** Loading an existing workflow saved with the old, date-suffixed model IDs must work seamlessly, with the UI correctly reflecting the corresponding new model. The workflow must also execute successfully.

### 2. Test Scenarios

Create tests covering the following scenarios. Group them within `test.describe()` blocks for clarity (e.g., "Anthropic Model - New Workflow Creation", "Anthropic Model - Backward Compatibility").

**Scenario Group 1: New Workflow Creation & Execution (Happy Path)**
*   **Test 1.1:** Navigate to the workflow designer, create a new "Text Generation" node. Open the model selector and verify that "Claude Sonnet 4.5" and "Claude Haiku 4.5" are present in the list.
*   **Test 1.2:** Select "Claude Sonnet 4.5" for the text generation node. Add a simple prompt. Run the workflow and assert that a generation is successfully produced.
*   **Test 1.3:** Create another workflow, this time selecting "Claude Haiku 4.5". Add a simple prompt, run the workflow, and assert a successful generation.

**Scenario Group 2: Backward Compatibility for Existing Workflows**
*   **Test 2.1 (Sonnet Backward Compatibility):**
    *   **Setup:** Mock the API response for loading a workflow to return a workflow object that uses the *old* Sonnet ID: `claude-sonnet-4-5-20250929`.
    *   **Action:** Load this mocked workflow.
    *   **Assertion 1:** Verify that the model selector UI for the text generation node correctly displays "Claude Sonnet 4.5".
    *   **Assertion 2:** Run the workflow and assert that the generation completes successfully, proving the backend conversion layer worked.

*   **Test 2.2 (Haiku Backward Compatibility):**
    *   **Setup:** Mock the API response for loading a workflow to return a workflow object using the *old* Haiku ID: `claude-haiku-4-5-20251001`.
    *   **Action:** Load this mocked workflow.
    *   **Assertion 1:** Verify that the model selector UI displays "Claude Haiku 4.5".
    *   **Assertion 2:** Run the workflow and assert a successful generation.

**Scenario Group 3: Regression Testing**
*   **Test 3.1:** Verify that the existing "Claude Opus 4.5" model remains unaffected. A user should be able to select it and run a workflow.
*   **Test 3.2:** Verify that a model from a different provider (e.g., an OpenAI or Google model if available in the UI) is selectable and its functionality is not impacted by these changes.

### 3. Playwright Implementation Instructions

*   **File Structure:** Create a new test file named `tests/e2e/anthropic-models.spec.ts`.
*   **Selectors:**
    *   Use `data-testid` attributes where possible for robust selectors. If not available, use ARIA roles and visible text.
    *   To open the model selector, you might look for a button with text like "Model" or a specific `data-testid`.
    *   To select a model, use `page.getByRole('option', { name: 'Claude Sonnet 4.5' })` or a similar accessible selector.
*   **User Interactions:**
    *   Use `page.goto('/stage')` or the relevant URL to access the workflow designer.
    *   Simulate drag-and-drop or clicks to create a "Text Generation" node.
    *   Use `await page.click(...)` and `await page.fill(...)` for standard interactions.
    *   Use `await expect(...).toBeVisible()` for assertions.
*   **Mocking for Backward Compatibility (Crucial):**
    *   Use Playwright's network interception (`page.route`) to mock the API endpoint that fetches the workflow data. This is essential for the backward compatibility tests.

    ```typescript
    // Example for Test 2.1
    await page.route('**/api/v1/tasks/*', (route) => {
      const oldWorkflowData = {
        // ... other workflow properties
        nodes: [{
          id: 'text-node-1',
          type: 'textGeneration',
          content: {
            llm: {
              provider: 'anthropic',
              id: 'claude-sonnet-4-5-20250929', // <-- OLD ID
              configurations: { temperature: 0.7 }
            },
            prompt: 'This is a test prompt.'
          }
        }]
      };
      route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(oldWorkflowData),
      });
    });

    // Now, navigate to the page that triggers this API call
    await page.goto('/stage/tasks/some-mocked-task-id');
    ```

*   **Assertions:**
    *   Assert that the correct model name is visible in the UI.
    *   After selecting a model, inspect the component's state or the underlying select element's value if possible to confirm the new ID (e.g., `anthropic/claude-sonnet-4.5`).
    *   For workflow execution, assert that a success message appears or that the output text area is populated. Use a generous timeout for waiting on generations. `await expect(page.getByText('Generation complete')).toBeVisible({ timeout: 30000 });`

### 4. MCP Integration Guidelines

*(This section is optional and provides general guidance.)*

*   **Command Structure:** Assume the test suite is run via a command-line interface. The generated tests should be runnable with standard Playwright commands.
    *   Example: `npx playwright test tests/e2e/anthropic-models.spec.ts --project=chromium`
*   **Environment Configuration:**
    *   The tests should assume that environment variables like `BASE_URL` and any necessary authentication tokens are configured in a `.env` file and loaded in the global setup.
    *   No hardcoded URLs or secrets in the test files. Use `process.env.BASE_URL`.

### 5. CI-Ready Code Requirements

*   **Test Organization:**
    *   Use `test.describe('Feature: Anthropic Model Renaming', () => { ... });` to group all related tests.
    *   Use nested `describe` blocks for each scenario group.
    *   Use `beforeEach` hooks to handle repetitive setup, such as navigating to the main page and logging in (if required). This ensures test isolation.
*   **Naming Conventions:**
    *   Test file: `anthropic-models.spec.ts`
    *   Test titles: Use clear, descriptive names, e.g., `test('should load a workflow with legacy sonnet ID and display the new model name', async ({ page }) => { ... });`
*   **Error Handling and Stability:**
    *   Avoid `page.waitForTimeout()`. Use web-first assertions and locators that auto-wait (e.g., `expect(locator).toBeVisible()`).
    *   Ensure each test cleans up after itself or relies on the `beforeEach` hook for a clean state. Tests must be runnable independently and in parallel.
*   **Code Quality:**
    *   Use `async/await` syntax correctly.
    *   Add comments explaining complex parts, especially the API mocking setup.
    *   Format the code using Prettier or a similar formatter.



 expect(result.content.llm?.provider).toBe("anthropic");
-expect(result.content.llm?.id).toBe("claude-sonnet-4-5-20250929");
+expect(result.content.llm?.id).toBe("claude-sonnet-4.5");

Bug: Test expectation not updated for new model ID

The test expects the old model ID anthropic/claude-sonnet-4-5 but the fixture anthropicClaudeSonnet was updated to use claude-sonnet-4.5, and the conversion function now returns anthropic/claude-sonnet-4.5. This test will fail because the expectation wasn't updated to match the new dot notation ID format.

Reply from shige (Member, Author):

The test is correct. Both the fixture and the expected value use the new dot notation:

  • Fixture: claude-sonnet-4.5
  • Expected: anthropic/claude-sonnet-4.5

The conversion function maps claude-sonnet-4.5 → anthropic/claude-sonnet-4.5, and tests pass successfully.

shige added 2 commits December 9, 2025 21:42
- Rename registry keys to use dot notation (claude-sonnet-4.5, claude-haiku-4.5)
- Add legacy model IDs to conversion mapping for backward compatibility
- Update fixtures and tests

shige commented Dec 9, 2025

Thank you! 🚀

@shige shige merged commit da2eec1 into main Dec 9, 2025
13 checks passed
@shige shige deleted the rename-anthropic-models branch December 9, 2025 23:00