Fix e2e tests & playwright comment job #2392

Merged
wwwillchen merged 1 commit into dyad-sh:main from wwwillchen:fix-e2etests
Jan 30, 2026

@wwwillchen
Collaborator

@wwwillchen wwwillchen commented Jan 29, 2026

#skip-bb



Summary by cubic

Fixes failing Next.js e2e tests by selecting the correct chat mode and restarting after upgrades, and improves the Playwright PR comment with clearer run/update commands. Also increases local test timeout to reduce flaky failures.

  • Bug Fixes

    • Selects “build” chat mode in Next.js tests.
    • Adds restart after upgrade in select component test.
    • Updates snapshot to expect the “next” template ID.
    • Raises local timeout to 75s to reduce flakes.
  • New Features

    • Playwright PR comment now includes copy-paste commands to run and update snapshots for each failed test.
    • Uses -g "pattern" with proper escaping; groups many commands in a collapsible section.
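The per-test commands and collapsible grouping described above could be produced roughly as follows. This is a minimal sketch, not the code from scripts/generate-playwright-summary.js: the function name `formatFailureCommands`, the `{ title, runCommand, updateCommand }` shape, and the threshold of 3 are assumptions made for illustration.

```javascript
// Build the fence marker programmatically so this example can itself sit in a fenced block.
const FENCE = "`".repeat(3);

// failures: array of { title, runCommand, updateCommand } (hypothetical shape).
// threshold: hypothetical cut-off above which the commands are folded away.
function formatFailureCommands(failures, threshold = 3) {
  const blocks = failures.map((failure) =>
    [
      `**${failure.title}**`,
      `${FENCE}sh`,
      "# Run the test",
      failure.runCommand,
      "# Update its snapshots",
      failure.updateCommand,
      FENCE,
    ].join("\n")
  );
  const body = blocks.join("\n\n");

  // A short list is shown inline; a long one goes inside <details> so the
  // PR comment stays readable.
  if (failures.length <= threshold) {
    return body;
  }
  return [
    "<details>",
    `<summary>Commands for ${failures.length} failed tests</summary>`,
    "",
    body,
    "",
    "</details>",
  ].join("\n");
}
```

The blank lines around the body inside `<details>` matter on GitHub, which otherwise does not render the enclosed markdown.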

Written for commit 5124cf2. Summary will update on new commits.

@gemini-code-assist
Contributor

Summary of Changes

Hello @wwwillchen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability and developer experience around the end-to-end testing suite. It addresses flakiness in existing tests by introducing more explicit steps and increases test timeouts. Additionally, it improves the utility of the Playwright summary generation script by providing more comprehensive commands for debugging and maintaining tests.

Highlights

  • E2E Test Stability Improvements: Several end-to-end tests have been updated to improve their reliability and prevent flakiness, including adding explicit chat mode selections and restart actions.
  • Playwright Configuration Update: The default test timeout for Playwright tests in non-CI environments has been increased to provide more buffer for test execution.
  • Enhanced Playwright Summary Script: The script responsible for generating Playwright test summaries has been refactored to provide both 'run test' and 'update snapshots' commands for failed tests, significantly improving developer workflow.

@github-actions
Contributor

🔍 Multi-Agent Code Review

3 independent reviewers analyzed this PR. Found 1 issue with consensus.

Summary

| Severity | Count |
| --- | --- |
| 🔴 HIGH | 0 |
| 🟡 MEDIUM | 1 |
| 🟢 LOW | 0 |

Issues to Address

| Severity | File | Issue |
| --- | --- | --- |
| 🟡 MEDIUM | e2e-tests/select_component.spec.ts:145 | Inconsistent test setup - selectChatMode("build") called after template selection |

🟢 Low Priority Issues (2 items - no consensus at MEDIUM+)
  • Missing comment for clickRestart - e2e-tests/select_component.spec.ts:127 (2/3 agents, both LOW)
  • Shell escaping incomplete - scripts/generate-playwright-summary.js:78 (1/3 agents, LOW)

See inline comments for details.


Generated by multi-agent consensus review (3 agents, 2+ agreement required)


await po.goToHubAndSelectTemplate("Next.js Template");

await po.selectChatMode("build");

🟡 MEDIUM - Inconsistent test setup pattern (2/3 reviewers)

selectChatMode("build") is called after goToHubAndSelectTemplate, but the template may auto-select a chat mode. This pattern is also used in template-create-nextjs.spec.ts, so it's at least consistent across tests.

Considerations:

  • The order may matter if the template auto-selects a different chat mode
  • Consider whether selecting the chat mode before navigating would be more robust
  • If this is intentional (overriding the template's default), a brief comment would clarify the intent

Flagged by 2/3 independent reviewers (1 MEDIUM, 1 LOW)

@greptile-apps
Contributor

greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

Fixes E2E test flakiness and improves CI/CD tooling for Playwright test reporting.

Key Changes:

  • Added explicit selectChatMode("build") calls in Next.js template tests to ensure correct chat mode state, preventing test failures from mode mismatches
  • Added clickRestart() after version upgrade in component selection test to ensure application state is properly reset before continuing
  • Updated test snapshot to reflect explicit chat mode selection (no longer capturing mode change as part of template selection)
  • Increased local test timeout from 45s to 75s to reduce timeout-related test failures during development
  • Enhanced Playwright comment bot to show both "run test" and "update snapshots" commands separately for each failed test, improving developer experience

Confidence Score: 5/5

  • Safe to merge - test fixes and tooling improvements with no production code changes
  • All changes are scoped to E2E tests and CI tooling. The fixes address test flakiness by making state management more explicit, and the timeout increase is reasonable for local development. No application code or IPC handlers were modified.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| e2e-tests/select_component.spec.ts | Added clickRestart() call after version check and explicit chat mode selection for Next.js test to ensure proper test state |
| e2e-tests/template-create-nextjs.spec.ts | Added explicit selectChatMode("build") after template selection to ensure correct chat mode for test execution |
| playwright.config.ts | Increased local test timeout from 45s to 75s to accommodate slower test execution times |
| scripts/generate-playwright-summary.js | Improved PR comment formatting: now provides separate run and update commands for each failed test, better usability |
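
The local timeout change listed above can be pictured as a small environment switch. This is a sketch under assumptions, not the contents of playwright.config.ts: the helper name `resolveTestTimeout` and the CI value are hypothetical, and only the 75s local value comes from this PR. The result would be passed as the `timeout` option (per-test timeout in milliseconds) of the Playwright config.

```javascript
// Local value from this PR: raised from 45s to 75s.
const LOCAL_TIMEOUT_MS = 75000;
// Hypothetical placeholder: the PR does not state the CI timeout.
const CI_TIMEOUT_MS = 180000;

// Pick the per-test timeout based on whether the CI environment variable is set.
function resolveTestTimeout(env = process.env) {
  return env.CI ? CI_TIMEOUT_MS : LOCAL_TIMEOUT_MS;
}
```

Keeping the two values as named constants makes it clear that only the non-CI branch is affected by this change.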

Sequence Diagram

sequenceDiagram
    participant Test as E2E Test
    participant PO as Page Object
    participant App as Dyad App
    participant Preview as Preview Panel
    
    Note over Test,Preview: Template Selection & Chat Mode Setup
    Test->>PO: goToHubAndSelectTemplate("Next.js Template")
    PO->>App: Select Next.js template
    App-->>PO: Template loaded
    Test->>PO: selectChatMode("build")
    PO->>App: Set chat mode to "build"
    App-->>PO: Chat mode updated
    
    Note over Test,Preview: Test Execution Flow
    Test->>PO: sendPrompt("tc=basic")
    PO->>App: Send test command
    App-->>PO: Command processed
    
    Note over Test,Preview: Component Selection Test
    Test->>PO: clickTogglePreviewPanel()
    PO->>App: Toggle preview
    App->>Preview: Show preview panel
    Test->>PO: clickPreviewPickElement()
    PO->>Preview: Enable element picker
    Test->>Preview: Click element in iframe
    Preview-->>App: Component selected
    
    Note over Test,Preview: Version Upgrade Flow (select_component.spec.ts)
    Test->>PO: clickAppUpgradeButton()
    PO->>App: Trigger upgrade
    App-->>PO: Upgrade complete, Version 2 visible
    Test->>PO: clickRestart()
    PO->>App: Restart application
    App-->>PO: Restarted with new version


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves end-to-end test stability by adding a restart after app upgrades, explicitly setting chat mode, and increasing local test timeouts. It also enhances the Playwright summary script for easier debugging. However, a significant security vulnerability was found in the command generation logic of the summary script, where unescaped test titles could lead to command injection on a developer's machine. Addressing this vulnerability is crucial. Additionally, there's an opportunity to further simplify the command generation logic.

  // Escape special characters in testName for the grep pattern
  const escapedTestName = testName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
- return `npm run e2e e2e-tests/${specFile} -- --g="${escapedTestName}" --update-snapshots`;
+ return `npm run e2e e2e-tests/${specFile} -- -g "${escapedTestName}"`;

security-high high

Critical Security Vulnerability: The generated shell command is vulnerable to command injection because testName and specFile are not properly escaped for the shell. An attacker could craft a malicious test title (e.g., containing backticks or double quotes) that executes arbitrary commands when a developer copy-pastes the suggested command. To fix this, you must escape shell-sensitive characters. Additionally, for improved maintainability and robustness, consider simplifying this by using the fullTitle directly in the grep pattern, which would also allow you to remove the parseTestTitle function.

Suggested change
- return `npm run e2e e2e-tests/${specFile} -- -g "${escapedTestName}"`;
+ return "npm run e2e \"e2e-tests/" + specFile.replace(/["\x60\\$]/g, "\\$&") + "\" -- -g \"" + escapedTestName.replace(/["\x60]/g, "\\$&") + "\"";
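
An alternative to escaping individual characters inside double quotes is to wrap each argument in single quotes, where a POSIX shell performs no expansion, so backticks, `$`, and double quotes are inert and only the single quote itself needs handling. The sketch below illustrates that approach; it is not the fix that was merged, the helper names `shellQuote` and `buildRunCommand` are hypothetical, and it assumes the command is pasted into a POSIX shell (it would not suit Windows cmd.exe).

```javascript
// Escape regex metacharacters so the title is matched literally by the -g pattern
// (same character class as the snippet under review).
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Quote a value for a POSIX shell: wrap it in single quotes and replace each
// embedded single quote with '\'' (close quote, escaped quote, reopen quote).
function shellQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the copy-paste command for one failed test.
function buildRunCommand(specFile, testName) {
  const specPath = shellQuote("e2e-tests/" + specFile);
  const pattern = shellQuote(escapeRegex(testName));
  return `npm run e2e ${specPath} -- -g ${pattern}`;
}
```

Regex escaping and shell quoting are applied as two separate layers here: the first protects the grep pattern, the second protects the shell.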


@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 5 files

@wwwillchen wwwillchen merged commit eb909bb into dyad-sh:main Jan 30, 2026
6 of 15 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 30, 2026