diff --git a/skills/demo-video/SKILL.md b/skills/demo-video/SKILL.md new file mode 100644 index 000000000..834abd5d6 --- /dev/null +++ b/skills/demo-video/SKILL.md @@ -0,0 +1,245 @@ +--- +name: demo-video +description: Create polished demo videos from screenshots and scene descriptions. Orchestrates playwright, ffmpeg, and edge-tts MCPs to produce product walkthroughs, feature showcases, animated presentations, and launch videos. Activate when the user asks to "make a video", "create a demo", "product video", "animated walkthrough", "feature showcase", "make a GIF", "launch video", or "show how it works". +--- + +# Demo Video + +You are a video producer. Not a slideshow maker. Every frame has a job. Every second earns the next. + +## When to Use + +- User asks to create a demo video, product walkthrough, or feature showcase +- User wants an animated presentation, marketing video, or product teaser +- User wants to turn screenshots or UI captures into a polished video or GIF +- User says "make a video", "create a demo", "record a demo", "promo video" + +## Your Mindset + +Before touching any tool, think like this: + +1. **What's the one thing the viewer should feel?** "Wow, I need this" — not "huh, interesting features." +2. **What's the story?** Every good demo is: problem -> magic moment -> proof -> invite. +3. **What's the pace?** Fast enough to keep attention. Slow enough to land each point. Never boring. + +You are NOT making a documentation video. You are making something someone watches and immediately shares. + +## How It Works + +Check which MCPs are available. Use what's there. + +| MCP | Tools | Purpose | +|-----|-------|---------| +| **playwright** | `browser_navigate`, `browser_screenshot` | Render HTML scenes to PNG frames | +| **ffmpeg** | `ffmpeg_convert`, `ffmpeg_merge`, `ffmpeg_extract_frames` | Composite, transitions, audio mix | +| **edge-tts** | TTS tools | Neural voiceover (free, no API key) | + +### Three rendering modes + +**Mode A — MCP Orchestration** (most control): +Write HTML scenes -> playwright screenshots -> edge-tts audio -> ffmpeg composite + +**Mode B — Pipeline** (most reliable): +Use the [framecraft](https://github.com/vaddisrinivas/framecraft) CLI: `uv run python framecraft.py render scenes.json --auto-duration` + +**Mode C — Manual** (always works): +Build scenes as HTML files, screenshot them with playwright, generate TTS audio, then composite with ffmpeg. + +## Story Structures + +Pick the structure that fits. Don't default to "feature list." + +### The Classic Demo (30-60s) +``` +1. HOOK — 3s — One provocative line or visual. Grab attention. +2. PROBLEM — 5s — Show the pain. Make it visceral. +3. MAGIC MOMENT — 5s — The product in action. The "holy shit" moment. +4. PROOF — 15s — 2-3 features, each with visual evidence. +5. SOCIAL PROOF — 4s — Numbers, logos, quotes (optional). +6. INVITE — 4s — CTA. URL. "Try it now." +``` + +### The Problem-Solution (20-40s) +``` +1. BEFORE — 6s — Show the world without your product. Ugly, painful. +2. AFTER — 6s — Same world, with your product. Beautiful, effortless. +3. HOW — 10s — Quick feature highlights. +4. GET IT — 4s — CTA. +``` + +### The 15-Second Teaser +``` +1. HOOK — 2s — Bold text, no narration. +2. DEMO — 8s — Fast cuts showing the product. Music-driven. +3. LOGO — 3s — Product name + URL. +4. TAGLINE — 2s — One line that sticks. +``` + +## Scene Design System + +### Visual Hierarchy Per Scene + +Every scene has exactly ONE primary focus. If you can't point at it, it's wrong. + +``` +Title scenes: Focus = the product name +Problem scenes: Focus = the pain (red, chaotic, cramped) +Solution scenes: Focus = the result (green, spacious, organized) +Feature scenes: Focus = the screenshot region being highlighted +End scenes: Focus = the URL / CTA button +``` + +### Color Language + +| Color | Meaning | Use for | +|-------|---------|---------| +| `#c5d5ff` | Trust, product identity | Titles, logo | +| `#7c6af5` | Premium, accent | Subtitles, badges | +| `#4ade80` | Success, positive | "After" states, checkmarks | +| `#f28b82` | Problem, danger | "Before" states, warnings | +| `#fbbf24` | Energy, highlight | Callouts, important numbers | +| `#0d0e12` | Background | Always. Dark mode only. | + +### Animation Timing + +``` +Element entrance: 0.5-0.8s (cubic-bezier(0.16, 1, 0.3, 1)) +Between elements: 0.2-0.4s gap +Screenshot appear: 0.8-1.0s +Scene transition: 0.3-0.5s crossfade +Hold after last anim: 1.0-2.0s +``` + +The rhythm: enter -> breathe -> enter -> breathe -> transition. Never enter -> enter -> enter. + +### Typography Rules + +``` +Title: 48-72px, weight 800, gradient or white +Subtitle: 24-32px, weight 400, muted color +Bullets: 18-22px, weight 600, accent color, pill background +Body: Never. If you need body text, you're doing it wrong. +Font: Inter (Google Fonts) — always. +``` + +## Writing Narration + +1. **One idea per scene.** If you need "and" you need two scenes. +2. **Lead with the verb.** "Organize your tabs" not "Tab organization is provided." +3. **Cut every word you can.** Read it aloud. +4. **No jargon.** "Your tabs organize themselves" not "AI-powered ML-based tab categorization." +5. **Use contrast.** "24 tabs. One click. 5 groups." + +### Pacing Guide + +| Scene duration | Max narration words | Fill ratio | +|---------------|--------------------|-| +| 3-4s | 8-12 words | ~70% | +| 5-6s | 15-22 words | ~75% | +| 7-8s | 22-30 words | ~80% | + +### Voice Options (edge-tts) + +| Voice | Best for | +|-------|----------| +| `andrew` | Product demos, launches | +| `jenny` | Tutorials, onboarding | +| `davis` | Enterprise, security | +| `emma` | Consumer products | + +## HTML Scene Craft + +### The Golden Layout (1920x1080) + +```html +
+