Add self-hosted upgrade test script with Playwright browser verification #220

@schjonhaug

Problem

When releasing new versions to Umbrel/Start9 via git tags, there's no automated way to verify that upgrading from an older version preserves all data (wallets, contacts, transactions) and that everything continues to work. This is currently a manual process.

Solution

Create an automated upgrade test script that:

  1. Checks out an old tagged version in a git worktree
  2. Sets up a complete self-hosted environment
  3. Creates test data (wallets, contacts, transactions)
  4. Upgrades to HEAD (master)
  5. Verifies all data survived the upgrade
  6. Runs new transactions to confirm end-to-end functionality
  7. Verifies everything in Chrome via Playwright

Files to Create

  • scripts/test-upgrade.sh — main bash orchestrator (~450 lines)
  • scripts/playwright/playwright.config.ts — Playwright config
  • scripts/playwright/package.json — minimal deps (Playwright only)
  • scripts/playwright/tests/upgrade-test.spec.ts — browser verification tests

Playwright lives in scripts/playwright/ (not in frontend/) to keep it isolated as a test-only tool.

Script Flow

Phase 1: Setup

  • Parse --from-tag <tag> argument (default: latest tag via git describe --tags --abbrev=0)
  • Validate the tag exists
  • Create a git worktree: git worktree add <tmpdir>/canary-upgrade-test <tag>
  • Copy .env.example.self-hosted templates into the worktree's backend/.env and frontend/.env.local
  • Kill any existing processes on ports 3000/3001
  • Run echo y | ./dev.sh reset, then ./dev.sh start, from HEAD's scripts/ directory (HEAD's docker-compose includes the ntfy container that older versions lack)
  • Install Playwright deps: cd scripts/playwright && npm install && npx playwright install chromium
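
A minimal sketch of the argument handling and worktree setup described above. The function names (parse_from_tag, resolve_tag, add_worktree) are hypothetical and not taken from the repository; only the git commands and the canary-upgrade-test directory name come from the plan.

```shell
# parse_from_tag: print the value given to --from-tag, or nothing if absent.
parse_from_tag() {
  local tag=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --from-tag)
        if [ "$#" -lt 2 ]; then
          echo "error: --from-tag requires a value" >&2
          return 2
        fi
        tag="$2"
        shift 2
        ;;
      *)
        echo "error: unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
  printf '%s' "$tag"
}

# resolve_tag: default to the latest tag, then confirm the tag exists.
resolve_tag() {
  local tag="$1"
  if [ -z "$tag" ]; then
    tag="$(git describe --tags --abbrev=0)" || return 1
  fi
  if ! git rev-parse --verify --quiet "refs/tags/$tag" >/dev/null; then
    echo "error: tag '$tag' does not exist" >&2
    return 1
  fi
  printf '%s\n' "$tag"
}

# add_worktree: check the tag out into a scratch directory and print its path.
add_worktree() {
  local tag="$1" tmpdir
  tmpdir="$(mktemp -d)" || return 1
  git worktree add "$tmpdir/canary-upgrade-test" "$tag" >&2 || return 1
  printf '%s\n' "$tmpdir/canary-upgrade-test"
}
```

Typical wiring would be TAG="$(resolve_tag "$(parse_from_tag "$@")")" followed by WORKTREE="$(add_worktree "$TAG")". Because the tag is not a branch, git worktree add leaves the worktree on a detached HEAD.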

Phase 2: Build & Start Old Version

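  • Build first (cd $WORKTREE/backend && cargo build), then launch cargo run & as its own command and store $! as the PID. Backgrounding the whole && chain would make $! point at a wrapping subshell instead
  • Poll GET /api/wallets until responsive (timeout 120s)
  • Install first (cd $WORKTREE/frontend && pnpm install), then launch pnpm dev & as its own command and store $! as the PID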
  • Poll http://localhost:3001 until responsive
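
The polling steps could share one generic helper, sketched below. wait_until and POLL_INTERVAL are hypothetical names. The backend port 3000 is an assumption inferred from the ports listed in Phase 1, and the 120s frontend timeout is assumed to match the backend's.

```shell
# wait_until TIMEOUT_SECONDS CMD [ARGS...]
# Re-run CMD every POLL_INTERVAL seconds (default 2) until it succeeds.
# Returns 0 on success, 1 once roughly TIMEOUT_SECONDS of sleeping has passed.
wait_until() {
  local timeout_s="$1" interval="${POLL_INTERVAL:-2}" waited=0
  shift
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Assumed URLs: backend on 3000, frontend on 3001 (see the plan's port list).
wait_for_backend() {
  wait_until 120 curl -fsS "${BACKEND_URL:-http://localhost:3000}/api/wallets"
}

wait_for_frontend() {
  wait_until 120 curl -fsS "${FRONTEND_URL:-http://localhost:3001}"
}
```

The timeout counts only time spent sleeping, so a slow probe command can stretch the real wall-clock wait beyond the nominal value.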

Phase 3: Create Test Data

  • Run old version's wallet creation from $WORKTREE/scripts/:
    • Detect whether dev.sh has create-wallets (v1.2–v1.3.1) or init (HEAD-style) and run accordingly
    • Pipe echo y | to handle interactive prompts
  • Poll GET /api/wallets until funded wallets appear (balance_total > 0)
  • Record pre-upgrade snapshot: wallet checksums, names, balances
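
One way to detect the wallet-creation subcommand is to look for its case label in the old dev.sh. This is a sketch only: it assumes dev.sh dispatches subcommands through a case statement with labels at the start of a line, which the plan does not confirm, and detect_wallet_cmd / create_wallets are hypothetical names.

```shell
# detect_wallet_cmd DEV_SH: print "create-wallets" or "init", whichever
# subcommand the given dev.sh appears to define as a case label.
detect_wallet_cmd() {
  local dev_sh="$1"
  if grep -Eq '^[[:space:]]*create-wallets[)|]' "$dev_sh"; then
    echo "create-wallets"
  elif grep -Eq '^[[:space:]]*init[)|]' "$dev_sh"; then
    echo "init"
  else
    echo "error: no wallet-creation subcommand found in $dev_sh" >&2
    return 1
  fi
}

# create_wallets WORKTREE: run the detected subcommand from the old scripts/ dir.
create_wallets() {
  local worktree="$1" cmd
  cmd="$(detect_wallet_cmd "$worktree/scripts/dev.sh")" || return 1
  # Pipe "y" to answer the interactive prompt, as the plan describes.
  (cd "$worktree/scripts" && echo y | ./dev.sh "$cmd")
}
```

Checking create-wallets first means a script that happens to define both labels is treated as the older style.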

Phase 4: Add ntfy Contacts

  • Generate unique topic: canary-upgrade-test-<timestamp>
  • For the first funded wallet, POST /api/wallets/{checksum}/contacts with ntfy notification method
  • Verify contact creation via GET /api/wallets/{checksum}/detail
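
A sketch of the contact-creation call, using the request body shape listed under API Endpoints Used. The contact name "upgrade-test", the helper names, and the BACKEND_URL default are assumptions made for illustration.

```shell
# new_ntfy_topic: unique topic of the form canary-upgrade-test-<unix timestamp>.
new_ntfy_topic() {
  printf 'canary-upgrade-test-%s\n' "$(date +%s)"
}

# contact_payload NAME TOPIC: build the JSON request body.
# Assumes NAME and TOPIC contain no characters that need JSON escaping.
contact_payload() {
  local name="$1" topic="$2"
  printf '{"name": "%s", "notification_methods": [{"provider_type": "ntfy", "notification_target": "%s"}]}\n' \
    "$name" "$topic"
}

# add_ntfy_contact CHECKSUM TOPIC: POST the contact to the wallet.
add_ntfy_contact() {
  local checksum="$1" topic="$2"
  curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    -d "$(contact_payload "upgrade-test" "$topic")" \
    "${BACKEND_URL:-http://localhost:3000}/api/wallets/$checksum/contacts"
}
```

With curl -f, an HTTP error status makes the call return non-zero, so the orchestrator can treat a rejected request as a failed check.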

Phase 5: Transaction Tests (Old Version)

  • Use direct docker exec bitcoin-cli commands (version-independent):
    • Discover first funded Bitcoin Core wallet name via listwallets
    • Send 0.01 BTC from miner to that wallet
  • Poll wallet detail API until pending tx appears
  • Mine 1 block via generatetoaddress
  • Poll until tx status is confirmed
  • Record transaction IDs and counts as pre-upgrade snapshot
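
The bitcoin-cli wrappers might look like the sketch below. The container name "bitcoin", the miner wallet name "miner", and the -regtest flag are assumptions about the dev environment; adjust them to whatever docker-compose actually defines. Discovery of the funded wallet via listwallets is left out here.

```shell
BITCOIN_CONTAINER="${BITCOIN_CONTAINER:-bitcoin}"   # assumed container name
MINER_WALLET="${MINER_WALLET:-miner}"               # assumed miner wallet name

# btc ARGS...: run bitcoin-cli inside the Bitcoin Core container.
btc() {
  docker exec "$BITCOIN_CONTAINER" bitcoin-cli -regtest "$@"
}

# btc_wallet WALLET ARGS...: same, scoped to one Bitcoin Core wallet.
btc_wallet() {
  local wallet="$1"
  shift
  btc -rpcwallet="$wallet" "$@"
}

# send_from_miner TARGET_WALLET AMOUNT: prints the resulting txid.
send_from_miner() {
  local target="$1" amount="$2" addr
  addr="$(btc_wallet "$target" getnewaddress)" || return 1
  btc_wallet "$MINER_WALLET" sendtoaddress "$addr" "$amount"
}

# mine_block: confirm pending transactions by mining one block to the miner.
mine_block() {
  local addr
  addr="$(btc_wallet "$MINER_WALLET" getnewaddress)" || return 1
  btc generatetoaddress 1 "$addr" >/dev/null
}
```

For example, TXID="$(send_from_miner alice 0.01)" followed by mine_block covers the send and confirm steps, with the API polls in between.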

Phase 5b: Browser Verification (Old Version)

  • Run Playwright tests against the old version's frontend:
    WALLET_CHECKSUM=<checksum> NTFY_TOPIC=<topic> npx playwright test --grep @pre-upgrade
    
  • Checks: wallets page loads, wallet names visible, click into funded wallet, transactions visible, contacts section shows ntfy contact
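
A possible shape for the run_playwright() helper that wraps the command above. The argument order and the PLAYWRIGHT_DIR variable are assumptions; the environment variable names and the --grep tags come from the plan.

```shell
PLAYWRIGHT_DIR="${PLAYWRIGHT_DIR:-scripts/playwright}"

# run_playwright TAG CHECKSUM TOPIC
# Runs the specs whose titles match TAG (e.g. @pre-upgrade) and reports the result.
run_playwright() {
  local tag="$1" checksum="$2" topic="$3"
  if (cd "$PLAYWRIGHT_DIR" &&
      WALLET_CHECKSUM="$checksum" NTFY_TOPIC="$topic" \
        npx playwright test --grep "$tag"); then
    echo "PASS: Playwright $tag"
  else
    echo "FAIL: Playwright $tag"
    return 1
  fi
}
```

The subshell keeps the cd local, so the orchestrator's working directory is unchanged after each run.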

Phase 6: Upgrade to HEAD

  • Kill backend and frontend
  • cd $WORKTREE && git checkout $(git -C $REPO_ROOT rev-parse HEAD) (detached HEAD at master)
  • Copy HEAD's .env.example.self-hosted templates (in case format changed)
  • Rebuild backend and reinstall frontend deps
  • Wait for both to come up
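
The stop-and-switch part of this phase could be sketched as follows. stop_pid and upgrade_worktree are hypothetical names, and the sketch assumes the backend and frontend were started as background jobs of the same shell, so wait can reap them.

```shell
# stop_pid PID: send SIGTERM and wait for the process to exit.
# An empty or already-dead PID is treated as success.
stop_pid() {
  local pid="$1"
  [ -n "$pid" ] || return 0
  kill "$pid" 2>/dev/null || return 0
  wait "$pid" 2>/dev/null || true
  return 0
}

# upgrade_worktree REPO_ROOT WORKTREE: move the worktree to the main repo's
# HEAD commit (detached) and refresh the env files from HEAD's templates.
upgrade_worktree() {
  local repo_root="$1" worktree="$2" head_sha
  head_sha="$(git -C "$repo_root" rev-parse HEAD)" || return 1
  git -C "$worktree" checkout --detach "$head_sha" || return 1
  cp "$worktree/backend/.env.example.self-hosted" "$worktree/backend/.env" &&
    cp "$worktree/frontend/.env.example.self-hosted" "$worktree/frontend/.env.local"
}
```

Untracked files such as database files stay in place across the checkout, which is what lets the new code start against the old data.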

Phase 7: Post-Upgrade Data Verification (API)

  • Wallets: Same count, same checksums, same names
  • Contacts: Still present on each wallet (via detail endpoint)
  • Transactions: Pre-upgrade txids still exist with correct status
  • Balances: Funded wallets still have balance > 0
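
The wallet and transaction checks reduce to comparing two text snapshots, sketched below. The helper names and the one-record-per-line snapshot format are assumptions; snapshot_wallets additionally assumes jq is installed and that the backend listens on port 3000.

```shell
# snapshot_wallets: one "checksum<TAB>name" line per wallet, sorted.
snapshot_wallets() {
  curl -fsS "${BACKEND_URL:-http://localhost:3000}/api/wallets" |
    jq -r '.wallets[] | [.checksum, .name] | @tsv' | sort
}

# same_wallets PRE_FILE POST_FILE: succeeds only if both snapshots are
# byte-identical, i.e. same count, same checksums, same names.
same_wallets() {
  cmp -s "$1" "$2"
}

# missing_txids PRE_FILE POST_FILE: print every txid listed in PRE_FILE
# that does not appear as a whole line in POST_FILE.
missing_txids() {
  grep -Fxv -f "$2" "$1" || true
}
```

An empty result from missing_txids means every pre-upgrade transaction survived; any output names the txids that were lost. New post-upgrade transactions do not cause a failure, since only the pre-upgrade set is checked.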

Phase 7b: Post-Upgrade Browser Verification

  • Run Playwright tests:
    WALLET_CHECKSUM=<checksum> NTFY_TOPIC=<topic> npx playwright test --grep @post-upgrade
    
  • Same checks as 5b — verifies UI shows all preserved data after upgrade

Phase 8: Post-Upgrade New Transactions

  • Send 0.02 BTC from miner to wallet, poll for pending tx
  • Mine block, poll for confirmed tx
  • Check that notification_status array is populated on the new tx
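
The notification_status check could be done with jq against the wallet detail response, as in this sketch. It assumes each entry in transactions carries a txid field next to notification_status; the plan names only the latter, so treat the field name as a guess. tx_has_notifications is a hypothetical helper.

```shell
# tx_has_notifications TXID
# Reads a wallet-detail JSON document on stdin. Succeeds only if a transaction
# with that txid exists and its notification_status array is non-empty.
tx_has_notifications() {
  jq -e --arg txid "$1" \
    '[.transactions[] | select(.txid == $txid) | .notification_status | length > 0] | any' \
    >/dev/null
}
```

Usage would be along the lines of curl -fsS "$BACKEND_URL/api/wallets/$CHECKSUM/detail" | tx_has_notifications "$TXID". A missing transaction and an empty array both count as failure.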

Phase 9: Cleanup & Report

  • Kill backend/frontend, remove worktree
  • Print summary: API checks passed/failed, Playwright results, overall exit code
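
A sketch of the structured-output helpers and the final summary. The function names match the helper list in this plan, but the counters, output format, and print_summary are assumptions.

```shell
PASS_COUNT=0
FAIL_COUNT=0

# log_phase TITLE: print a phase banner.
log_phase() {
  printf '\n=== %s ===\n' "$1"
}

# check_pass / check_fail MESSAGE: record and print one check result.
check_pass() {
  PASS_COUNT=$((PASS_COUNT + 1))
  printf '  [PASS] %s\n' "$1"
}

check_fail() {
  FAIL_COUNT=$((FAIL_COUNT + 1))
  printf '  [FAIL] %s\n' "$1"
}

# print_summary: report totals. Its return status (0 only when nothing
# failed) can be used directly as the script's exit code.
print_summary() {
  printf 'API checks: %d passed, %d failed\n' "$PASS_COUNT" "$FAIL_COUNT"
  [ "$FAIL_COUNT" -eq 0 ]
}
```

The check helpers must be called directly rather than inside $(...), since a command substitution runs in a subshell and would discard the counter updates.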

Key Design Decisions

  1. Bitcoin CLI directly (not dev.sh) for transactions — avoids wallet name incompatibilities across versions (alice/bob at v1.3.1 vs segwit-desc/legacy-desc at HEAD)
  2. HEAD's docker-compose for infrastructure — includes ntfy container that older versions lack
  3. Old version's dev.sh for wallet creation — matches what the old backend expects
  4. Detached HEAD checkout for upgrade — preserves database files in worktree while updating code
  5. Public ntfy.sh for notifications — just push to a generated topic, no local verification
  6. Playwright in scripts/playwright/ — isolated from frontend's test setup, dedicated to upgrade verification

Playwright Test Design

Selectors (no data-testid attributes exist)

  • Wallet names: page.getByText('wallet-name')
  • Balance label: page.getByText('Balance')
  • Contacts section: page.getByText('Contacts')
  • ntfy topic: page.getByText(NTFY_TOPIC)
  • Transaction rows: table rows in the transactions section

Environment Variables (passed from bash script)

  • WALLET_CHECKSUM — checksum of the first funded wallet
  • NTFY_TOPIC — the ntfy topic name added as a contact
  • EXPECTED_WALLET_COUNT — number of wallets to expect

Config

  • baseURL: 'http://localhost:3001'
  • browserName: 'chromium'
  • timeout: 30000
  • retries: 0

Helper Functions (bash script)

Function                                  Purpose
btc() / btc_wallet()                      docker exec wrappers for bitcoin-cli
wait_for_backend() / wait_for_frontend()  poll with timeout
poll_wallets_synced()                     wait for wallet sync (balance_total > 0)
poll_wallet_for_tx()                      wait for specific txid in wallet detail
run_playwright()                          run Playwright with env vars, report results
log_phase() / check_pass() / check_fail() structured output
cleanup()                                 trap handler for process/worktree cleanup

API Endpoints Used

  • GET /api/wallets → {"timestamp": u64, "wallets": [{"checksum", "name", "balance_total", ...}]}
  • GET /api/wallets/{checksum}/detail → {"wallet": {...}, "transactions": [...], "contacts": [...], "balance_alerts": [...]}
  • POST /api/wallets/{checksum}/contacts → body: {"name": "...", "notification_methods": [{"provider_type": "ntfy", "notification_target": "..."}]}

Usage

# Upgrade from latest tag to HEAD (default)
./scripts/test-upgrade.sh

# Upgrade from a specific tag
./scripts/test-upgrade.sh --from-tag v1.2.0

Relevant Code References

  • scripts/dev.sh — existing dev script with wallet creation, Docker management
  • scripts/docker-compose.yml — Docker infrastructure (Bitcoin Core + Fulcrum + ntfy)
  • backend/.env.example.self-hosted — self-hosted backend config template
  • frontend/.env.example.self-hosted — self-hosted frontend config template
  • backend/src/handlers/contact.rs — contact creation handler
  • backend/src/models/requests.rs — API request types (CreateContactWithMethodsRequest)
  • backend/src/metadata/types.rs — response types (WalletMetadata, TransactionWithWallet, Contact)
  • frontend/src/app/wallets/page.tsx — wallets list page
  • frontend/src/app/wallets/[checksum]/page.tsx — wallet detail page
  • frontend/src/components/wallet-contacts-list.tsx — contacts list component
  • frontend/src/components/transactions.tsx — transactions component
