Fix e2e tests & playwright comment job by wwwillchen · Pull Request #2392 · dyad-sh/dyad

wwwillchen · 2026-01-29T18:50:01Z

#skip-bb

Summary by cubic

Fixes failing Next.js e2e tests by selecting the correct chat mode and restarting after upgrades, and improves the Playwright PR comment with clearer run/update commands. Also increases local test timeout to reduce flaky failures.

Bug Fixes
- Selects “build” chat mode in Next.js tests.
- Adds restart after upgrade in select component test.
- Updates snapshot to expect the “next” template ID.
- Raises local timeout to 75s to reduce flakes.

New Features
- Playwright PR comment now includes copy-paste commands to run and update snapshots for each failed test.
- Uses -g "pattern" with proper escaping; groups many commands in a collapsible section.

Written for commit 5124cf2. Summary will update on new commits.
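
The comment format described in the summary above (a run command and an update-snapshots command per failed test, collapsed when there are many) could be sketched roughly as follows. This is an illustration only, not the code in `scripts/generate-playwright-summary.js`: the function name `renderFailedTestCommands`, the input shape, and the `COLLAPSE_THRESHOLD` value are all assumptions. Only the `--update-snapshots` flag comes from the PR's diff.

```javascript
// Hypothetical threshold: above this many failures, wrap the commands in <details>.
const COLLAPSE_THRESHOLD = 3;

// Built at runtime so this example does not embed a literal Markdown fence.
const FENCE = "`".repeat(3);

// failures: array of { title, runCommand } (assumed shape, already shell-escaped).
function renderFailedTestCommands(failures, threshold = COLLAPSE_THRESHOLD) {
  const blocks = failures.map(({ title, runCommand }) =>
    [
      `**${title}**`,
      `${FENCE}bash`,
      "# Run the test",
      runCommand,
      "# Update snapshots",
      `${runCommand} --update-snapshots`,
      FENCE,
    ].join("\n"),
  );
  const body = blocks.join("\n\n");

  // Few failures: show the commands inline.
  if (failures.length <= threshold) return body;

  // Many failures: group them in a collapsible section.
  return [
    "<details>",
    `<summary>Commands for ${failures.length} failed tests</summary>`,
    "",
    body,
    "",
    "</details>",
  ].join("\n");
}
```

Keeping the run command and the update command as separate lines lets a developer reproduce a failure first and only then decide whether the snapshot should change.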

gemini-code-assist · 2026-01-29T18:50:20Z

Summary of Changes

Hello @wwwillchen, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability and developer experience around the end-to-end testing suite. It addresses flakiness in existing tests by introducing more explicit steps and increases test timeouts. Additionally, it improves the utility of the Playwright summary generation script by providing more comprehensive commands for debugging and maintaining tests.

Highlights

- E2E Test Stability Improvements: Several end-to-end tests have been updated to improve their reliability and prevent flakiness, including adding explicit chat mode selections and restart actions.
- Playwright Configuration Update: The default test timeout for Playwright tests in non-CI environments has been increased to provide more buffer for test execution.
- Enhanced Playwright Summary Script: The script responsible for generating Playwright test summaries has been refactored to provide both 'run test' and 'update snapshots' commands for failed tests, significantly improving developer workflow.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

github-actions · 2026-01-29T18:52:25Z

🔍 Multi-Agent Code Review

3 independent reviewers analyzed this PR. Found 1 issue with consensus.

Summary

| Severity | Count |
| --- | --- |
| 🔴 HIGH | 0 |
| 🟡 MEDIUM | 1 |
| 🟢 LOW | 0 |

Issues to Address

| Severity | File | Issue |
| --- | --- | --- |
| 🟡 MEDIUM | `e2e-tests/select_component.spec.ts:145` | Inconsistent test setup - `selectChatMode("build")` called after template selection |

🟢 Low Priority Issues (2 items - no consensus at MEDIUM+)

- Missing comment for `clickRestart` - `e2e-tests/select_component.spec.ts:127` (2/3 agents, both LOW)
- Shell escaping incomplete - `scripts/generate-playwright-summary.js:78` (1/3 agents, LOW)

See inline comments for details.

Generated by multi-agent consensus review (3 agents, 2+ agreement required)

github-actions · 2026-01-29T18:52:34Z

e2e-tests/select_component.spec.ts


  await po.goToHubAndSelectTemplate("Next.js Template");
-
+  await po.selectChatMode("build");


🟡 MEDIUM - Inconsistent test setup pattern (2/3 reviewers)

selectChatMode("build") is called after goToHubAndSelectTemplate, but the template may auto-select a chat mode. This pattern is also used in template-create-nextjs.spec.ts, so it's at least consistent across tests.

Considerations:

- The order may matter if the template auto-selects a different chat mode.
- Consider whether selecting the chat mode before navigating would be more robust.
- If this is intentional (overriding the template's default), a brief comment would clarify the intent.

Flagged by 2/3 independent reviewers (1 MEDIUM, 1 LOW)

greptile-apps · 2026-01-29T18:52:41Z

Greptile Overview

Greptile Summary

Fixes E2E test flakiness and improves CI/CD tooling for Playwright test reporting.

Key Changes:

- Added explicit `selectChatMode("build")` calls in Next.js template tests to ensure correct chat mode state, preventing test failures from mode mismatches.
- Added `clickRestart()` after version upgrade in component selection test to ensure application state is properly reset before continuing.
- Updated test snapshot to reflect explicit chat mode selection (no longer capturing mode change as part of template selection).
- Increased local test timeout from 45s to 75s to reduce timeout-related test failures during development.
- Enhanced Playwright comment bot to show both "run test" and "update snapshots" commands separately for each failed test, improving developer experience.

Confidence Score: 5/5

- Safe to merge: test fixes and tooling improvements with no production code changes.
- All changes are scoped to E2E tests and CI tooling. The fixes address test flakiness by making state management more explicit, and the timeout increase is reasonable for local development. No application code or IPC handlers were modified.
- No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `e2e-tests/select_component.spec.ts` | Added `clickRestart()` call after version check and explicit chat mode selection for Next.js test to ensure proper test state |
| `e2e-tests/template-create-nextjs.spec.ts` | Added explicit `selectChatMode("build")` after template selection to ensure correct chat mode for test execution |
| `playwright.config.ts` | Increased local test timeout from 45s to 75s to accommodate slower test execution times |
| `scripts/generate-playwright-summary.js` | Improved PR comment formatting: now provides separate run and update commands for each failed test, better usability |
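
As a rough illustration of the `playwright.config.ts` change, a timeout that differs between local and CI runs might look like the sketch below. Only the 75s local value comes from this PR. The CI value, the `resolveTestTimeout` helper, and the use of the `CI` environment variable to detect CI are assumptions made for the example. Playwright's `timeout` option is expressed in milliseconds.

```javascript
const LOCAL_TIMEOUT_MS = 75_000; // 75s, the new local value stated in this PR
const CI_TIMEOUT_MS = 120_000; // hypothetical CI value, not taken from the PR

// Pick the per-test timeout based on whether the CI environment variable is set.
function resolveTestTimeout(env = process.env) {
  return env.CI ? CI_TIMEOUT_MS : LOCAL_TIMEOUT_MS;
}

// In a real Playwright config file this object would be the exported config.
const config = {
  timeout: resolveTestTimeout(),
};
```

Raising only the local value keeps CI behavior unchanged while giving slower developer machines more room.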

Sequence Diagram

sequenceDiagram
    participant Test as E2E Test
    participant PO as Page Object
    participant App as Dyad App
    participant Preview as Preview Panel
    
    Note over Test,Preview: Template Selection & Chat Mode Setup
    Test->>PO: goToHubAndSelectTemplate("Next.js Template")
    PO->>App: Select Next.js template
    App-->>PO: Template loaded
    Test->>PO: selectChatMode("build")
    PO->>App: Set chat mode to "build"
    App-->>PO: Chat mode updated
    
    Note over Test,Preview: Test Execution Flow
    Test->>PO: sendPrompt("tc=basic")
    PO->>App: Send test command
    App-->>PO: Command processed
    
    Note over Test,Preview: Component Selection Test
    Test->>PO: clickTogglePreviewPanel()
    PO->>App: Toggle preview
    App->>Preview: Show preview panel
    Test->>PO: clickPreviewPickElement()
    PO->>Preview: Enable element picker
    Test->>Preview: Click element in iframe
    Preview-->>App: Component selected
    
    Note over Test,Preview: Version Upgrade Flow (select_component.spec.ts)
    Test->>PO: clickAppUpgradeButton()
    PO->>App: Trigger upgrade
    App-->>PO: Upgrade complete, Version 2 visible
    Test->>PO: clickRestart()
    PO->>App: Restart application
    App-->>PO: Restarted with new version

gemini-code-assist

Code Review

This pull request improves end-to-end test stability by adding a restart after app upgrades, explicitly setting chat mode, and increasing local test timeouts. It also enhances the Playwright summary script for easier debugging. However, a significant security vulnerability was found in the command generation logic of the summary script, where unescaped test titles could lead to command injection on a developer's machine. Addressing this vulnerability is crucial. Additionally, there's an opportunity to further simplify the command generation logic.

gemini-code-assist · 2026-01-29T18:53:08Z

scripts/generate-playwright-summary.js

  // Escape special characters in testName for the grep pattern
  const escapedTestName = testName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
-  return `npm run e2e e2e-tests/${specFile} -- --g="${escapedTestName}" --update-snapshots`;
+  return `npm run e2e e2e-tests/${specFile} -- -g "${escapedTestName}"`;


Critical Security Vulnerability: The generated shell command is vulnerable to command injection because testName and specFile are not properly escaped for the shell. An attacker could craft a malicious test title (e.g., containing backticks or double quotes) that executes arbitrary commands when a developer copy-pastes the suggested command. To fix this, you must escape shell-sensitive characters. Additionally, for improved maintainability and robustness, consider simplifying this by using the fullTitle directly in the grep pattern, which would also allow you to remove the parseTestTitle function.

Suggested change

- return `npm run e2e e2e-tests/${specFile} -- -g "${escapedTestName}"`;
+ return "npm run e2e \"e2e-tests/" + specFile.replace(/["\x60\\$]/g, "\\$&") + "\" -- -g \"" + escapedTestName.replace(/["\x60]/g, "\\$&") + "\"";
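
An alternative to escaping individual characters inside double quotes is to wrap each argument in single quotes, where a POSIX shell treats every character literally. The sketch below is not the reviewer's suggestion or the merged code. The helper names `escapeRegExp`, `shellQuote`, and `buildRunCommand` are hypothetical, and the approach assumes the command is pasted into a POSIX-style shell such as bash or zsh, not `cmd.exe` or PowerShell.

```javascript
// Escape regex metacharacters so the test title matches literally under `-g`
// (same character class as the script's existing escaping).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// POSIX single-quote quoting: everything between '...' is literal, so backticks,
// `$`, double quotes, and backslashes need no escaping. An embedded single quote
// is written by closing the quote, adding an escaped quote, and reopening: '\''
function shellQuote(arg) {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Build the copy-paste command with both arguments quoted for the shell.
function buildRunCommand(specFile, testName) {
  const spec = shellQuote(`e2e-tests/${specFile}`);
  const pattern = shellQuote(escapeRegExp(testName));
  return `npm run e2e ${spec} -- -g ${pattern}`;
}
```

One reason to prefer single quotes here: inside double quotes the shell still consumes a backslash placed before `$`, a backtick, a double quote, or another backslash, which can silently strip the regex escapes that were added for the grep pattern.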

cubic-dev-ai

No issues found across 5 files

Fix e2e tests & playwright comment job

5124cf2

wwwillchen temporarily deployed to ai-bots January 29, 2026 18:50 — with GitHub Actions Inactive

wwwillchen merged commit eb909bb into dyad-sh:main Jan 30, 2026
6 of 15 checks passed

github-actions bot locked and limited conversation to collaborators Jan 30, 2026

wwwillchen merged 1 commit into dyad-sh:main from wwwillchen:fix-e2etests
