imjasonh/marptalk

Marptalk

Automated narrated Marp presentations with Google Cloud Text-to-Speech

See it in action! 👀

Marptalk can also generate videos using ffmpeg, with sound and subtitles -- check it out!

Why?

I believe a good slide presentation is more than the slides on the screen: it is also what is being said at the same time. Generating both together, and being able to see and iterate on both together, produces better presentations.

Since Marp presentations are just text, the first draft can be easily generated by an LLM, then edited by a human.

Using an AI agent, you can request a high-level presentation topic and get a full first draft in minutes, including the spoken content.

This can be useful for generating presentations to learn a new topic, with visual guides and spoken narration.

You could also use these tools to generate a presentation in another language, localizing it for an international audience.

Quick Start

  1. Install dependencies:

    npm install
  2. Set up Google Cloud authentication:

    # Option 1: Use gcloud CLI (recommended)
    gcloud auth application-default login
    
    # Option 2: Use service account key file
    export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-key.json"
  3. Run the demo:

    npm run demo
  4. Open the result:

    open dist/index.html

Usage

Basic Usage

node src/generate.js <input.md> [options]

Options

  • -o, --output <dir>: Output directory (default: dist)
  • --voice <voice>: GCP TTS voice name (default: en-US-Journey-D)
  • --language <code>: Language code (default: en-US)
  • --key-file <path>: Path to Google Cloud service account key file
  • --max-slides <num>: Maximum number of slides to process (useful for testing)
  • --generate-tts: Generate TTS audio files (default: true; pass --no-generate-tts to reuse existing audio)
  • --generate-srt: Generate SRT subtitle file for video captions (default: false)
  • --generate-chapters: Generate YouTube chapter markers file (default: false)
  • --srt-filename <name>: Custom filename for SRT file (default: subtitles.srt)
  • --chapters-filename <name>: Custom filename for chapters file (default: chapters.txt)
  • --generate-video: Generate MP4 video of the presentation with synchronized audio (default: false)
  • --video-filename <name>: Custom filename for video file (default: presentation.mp4)
  • --video-width <pixels>: Video width in pixels (default: 1920)
  • --video-height <pixels>: Video height in pixels (default: 1080)
  • --video-fps <number>: Video frames per second (default: 30)
  • --video-subtitles <mode>: Subtitle mode: "off", "soft" (toggleable), or "hard" (burned-in) (default: soft)

Popular GCP Voices

English (en-US):

  • en-US-Journey-D - Natural, conversational (default)
  • en-US-Journey-F - Warm, friendly female voice
  • en-US-Neural2-A - Professional male voice
  • en-US-Neural2-C - Clear female voice
  • en-US-Neural2-J - Energetic male voice

Other Languages:

  • es-ES-Neural2-A - Spanish (Spain)
  • fr-FR-Neural2-A - French
  • de-DE-Neural2-B - German
  • ja-JP-Neural2-B - Japanese
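
The --voice and --language options map naturally onto a Text-to-Speech synthesis request. The following is a minimal sketch under stated assumptions, not Marptalk's actual code: buildTtsRequest and languageFromVoice are hypothetical helpers, the request shape follows the synthesizeSpeech request of the @google-cloud/text-to-speech Node.js client, and MP3 output is an assumption.

```javascript
// Hypothetical helper: derive the language code from a GCP voice name,
// e.g. "en-US-Journey-D" -> "en-US". Assumes the "<lang>-<REGION>-..."
// naming pattern used by the voices listed above.
function languageFromVoice(voice) {
  const match = /^([a-z]{2,3}-[A-Z]{2})-/.exec(voice);
  if (!match) throw new Error(`Unrecognized voice name: ${voice}`);
  return match[1];
}

// Hypothetical helper: build a synthesizeSpeech request for one speaker note.
// Rejects mismatched voice/language pairs before any API call is made.
function buildTtsRequest(text, voice = "en-US-Journey-D", language = "en-US") {
  if (languageFromVoice(voice) !== language) {
    throw new Error(`Language ${language} does not match voice ${voice}`);
  }
  return {
    input: { text },
    voice: { languageCode: language, name: voice },
    audioConfig: { audioEncoding: "MP3" }, // assumption: MP3 output
  };
}

console.log(buildTtsRequest("Hello from slide one."));
```

With the client library installed and credentials configured, a request like this would be passed to synthesizeSpeech on a TextToSpeechClient instance, and the audio bytes read from the audioContent field of the response.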

Creating Presentations

1. Write your Marp presentation

Create a .md file with Marp frontmatter and speaker notes:

---
marp: true
theme: default
---

# My Slide Title

Some slide content here.

<!--
This is the speaker note that will be converted to audio.
Make sure to write natural, conversational text that sounds
good when spoken aloud.
-->

---

# Second Slide

More content...

<!--
Another speaker note for the second slide.
-->

2. Generate the presentation

# Basic generation
node src/generate.js my-presentation.md

# With custom voice and language
node src/generate.js my-presentation.md --voice en-US-Journey-F --language en-US

# Test with first 3 slides only
node src/generate.js my-presentation.md --max-slides 3

# Skip TTS generation (use existing audio files, saves money)
node src/generate.js my-presentation.md --no-generate-tts

# Generate with SRT subtitles and YouTube chapters
node src/generate.js my-presentation.md --generate-srt --generate-chapters

# Generate complete video with subtitles and chapters
node src/generate.js my-presentation.md --generate-video --generate-srt --generate-chapters

# Quick video generation using existing audio
node src/generate.js my-presentation.md --no-generate-tts --generate-video --generate-srt

# Custom subtitle and chapter filenames
node src/generate.js my-presentation.md --generate-srt --generate-chapters --srt-filename "my-captions.srt" --chapters-filename "my-chapters.txt"

3. Present

Open dist/index.html in your browser and use the controls:

  • ▶ Start: Begin automated playback
  • ⏸ Pause: Pause/resume presentation
  • ⏹ Stop: Stop and return to manual mode
  • 🔊 Sound: Toggle audio on/off

Keyboard Shortcuts

  • Space: Start/pause presentation
  • Escape: Stop presentation
  • M: Toggle mute
  • Arrow keys: Navigate slides (when stopped)

How It Works

Marptalk uses a multi-stage pipeline:

  1. Stage A: Extract speaker notes from Marp markdown
  2. Stage B: Generate audio files using Google Cloud TTS
  3. Stage C: Create self-playing HTML presentation
  4. Stage D (Optional): Generate SRT subtitles, YouTube chapters, and MP4 videos
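
Stage A can be pictured with a small sketch. This is not how Marptalk does it (the troubleshooting section suggests it relies on the Marp CLI to extract notes); extractSpeakerNotes is a hypothetical function that only illustrates the idea, with simplifying assumptions spelled out in the comments.

```javascript
// Split a Marp document into slides and pull out the speaker notes.
// Simplifying assumptions: front matter is the leading "---" ... "---" block,
// slides are separated by lines containing only "---", and every HTML comment
// on a slide is a speaker note (Marp directives in comments are not handled).
function extractSpeakerNotes(markdown) {
  const body = markdown.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n/, "");
  return body.split(/^---[ \t]*$/m).map((slide, index) => {
    const notes = [...slide.matchAll(/<!--([\s\S]*?)-->/g)]
      .map((m) => m[1].trim().replace(/\s+/g, " "))
      .join(" ");
    // Use the first heading as the slide title, if there is one.
    const heading = /^#+\s+(.*)$/m.exec(slide);
    const title = heading ? heading[1].trim() : `Slide ${index + 1}`;
    return { index: index + 1, title, notes };
  });
}

const sample = [
  "---", "marp: true", "theme: default", "---", "",
  "# My Slide Title", "", "Some slide content here.", "",
  "<!--", "This is the speaker note", "for slide one.", "-->", "",
  "---", "", "# Second Slide", "", "<!-- Another speaker note. -->",
].join("\n");

console.log(extractSpeakerNotes(sample));
```

Each returned entry carries the text that Stage B would send to TTS and the title that chapter generation could reuse.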

Subtitle and Chapter Generation

Marptalk can generate additional outputs to help with video publishing:

SRT Subtitles

Generate standard SRT subtitle files for video captions:

node src/generate.js my-presentation.md --generate-srt

The SRT file includes:

  • Sequential numbering
  • Precise timestamps (HH:MM:SS,mmm format)
  • Speaker notes as caption text
  • Automatic duration calculation based on word count (150 words/minute)

Example SRT output:

1
00:00:00,000 --> 00:00:13,599
Welcome to Marptalk, a revolutionary system for creating automated narrated presentations.

2
00:00:13,599 --> 00:00:38,399
Traditional presentation narration faces several challenges...
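
The timestamp format and the word-count estimate can be sketched as follows. The function names are hypothetical and this is an illustration rather than Marptalk's implementation; it assumes one cue per slide, placed back to back, at the 150 words/minute rate mentioned above.

```javascript
// Format seconds as an SRT timestamp: HH:MM:SS,mmm.
function srtTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Estimate speaking time in seconds from the word count.
function estimateDuration(text, wordsPerMinute = 150) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return (words * 60) / wordsPerMinute;
}

// Build SRT text from an array of speaker notes, one cue per slide.
function buildSrt(notes) {
  let start = 0;
  return notes
    .map((text, i) => {
      const end = start + estimateDuration(text);
      const cue = `${i + 1}\n${srtTimestamp(start)} --> ${srtTimestamp(end)}\n${text}\n`;
      start = end;
      return cue;
    })
    .join("\n");
}

console.log(buildSrt(["one two three four five", "six seven eight nine ten"]));
```

When audio files exist, measured durations would presumably be a better source for cue boundaries than the word-count estimate.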

YouTube Chapter Markers

Generate YouTube-compatible chapter markers:

node src/generate.js my-presentation.md --generate-chapters

The chapters file includes:

  • Timestamps in YouTube format (M:SS or H:MM:SS)
  • Slide titles as chapter names
  • Ready to copy-paste into YouTube video descriptions

Example chapters output:

0:00 - Welcome to Marptalk
0:13 - The Problem
0:38 - The Marptalk Solution
1:01 - Key Features
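
A sketch of how such a file could be produced from slide titles and durations. chapterTimestamp and buildChapters are hypothetical names; the first two durations are taken from the SRT example above and the third is made up for illustration. Truncating to whole seconds (rather than rounding) is an assumption that happens to agree with the 0:13 entry shown above.

```javascript
// Format seconds as a YouTube chapter timestamp:
// M:SS below one hour, H:MM:SS from one hour on.
function chapterTimestamp(seconds) {
  const total = Math.floor(seconds);
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = String(total % 60).padStart(2, "0");
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${s}` : `${m}:${s}`;
}

// Each chapter starts where the previous slide's narration ends.
function buildChapters(slides) {
  let start = 0;
  return slides
    .map(({ title, duration }) => {
      const line = `${chapterTimestamp(start)} - ${title}`;
      start += duration;
      return line;
    })
    .join("\n");
}

console.log(buildChapters([
  { title: "Welcome to Marptalk", duration: 13.599 },
  { title: "The Problem", duration: 24.8 },
  { title: "The Marptalk Solution", duration: 23 }, // illustrative duration
]));
```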

Combined Usage

Generate both SRT and chapters together:

node src/generate.js my-presentation.md --generate-srt --generate-chapters

Custom Filenames

Specify custom output filenames:

node src/generate.js my-presentation.md \
  --generate-srt --srt-filename "my-captions.srt" \
  --generate-chapters --chapters-filename "my-chapters.txt"

Uploading to YouTube

  1. Upload your video file to YouTube
  2. In YouTube Studio, go to "Subtitles" and upload the generated .srt file
  3. Copy the contents of the chapters file and paste into your video description
  4. Save changes - your video now has both captions and chapter markers!

Video Generation

Generate complete MP4 videos of your presentations with synchronized audio, subtitles, and perfect timing:

# Basic video generation
node src/generate.js my-presentation.md --generate-video

# Video with soft subtitles and chapters (subtitles can be toggled)
node src/generate.js my-presentation.md --generate-video --generate-srt --generate-chapters

# Video without subtitles
node src/generate.js my-presentation.md --generate-video --video-subtitles off

# Video with hard (burned-in) subtitles (always visible)
node src/generate.js my-presentation.md --generate-video --generate-srt --video-subtitles hard

# Custom video settings
node src/generate.js my-presentation.md \
  --generate-video \
  --video-filename "my-presentation.mp4" \
  --video-width 1920 \
  --video-height 1080 \
  --video-subtitles soft

How Video Generation Works

Marptalk uses an efficient static slide approach:

  1. Slide Capture: Takes high-quality screenshots of each slide
  2. Audio Analysis: Analyzes each slide's audio file to get precise durations
  3. Video Assembly: Uses FFmpeg to combine slides with perfect timing
  4. Audio Sync: Each slide displays for exactly its audio duration
  5. Subtitle Integration: Embeds SRT subtitles directly into the video
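
One way to express "each slide displays for exactly its audio duration" is an FFmpeg concat-demuxer list. The sketch below is an assumption about what a file like slides.txt might contain, not a copy of Marptalk's output; buildConcatList and the frame file names are hypothetical.

```javascript
// Build an FFmpeg concat-demuxer list that shows each slide image for the
// length of its audio. The last image is listed a second time without a
// duration, a commonly used workaround for the final duration being ignored.
function buildConcatList(frames) {
  const lines = [];
  for (const { file, duration } of frames) {
    lines.push(`file '${file}'`, `duration ${duration.toFixed(3)}`);
  }
  if (frames.length > 0) {
    lines.push(`file '${frames[frames.length - 1].file}'`);
  }
  return lines.join("\n") + "\n";
}

// Hypothetical frame names; durations would come from the audio analysis step.
console.log(buildConcatList([
  { file: "video_frames/slide_001.png", duration: 15.528 },
  { file: "video_frames/slide_002.png", duration: 28.008 },
]));
```

A list like this could then be combined with the audio track using something along the lines of ffmpeg -f concat -safe 0 -i slides.txt -i combined_audio.wav -c:v libx264 -pix_fmt yuv420p -c:a aac presentation.mp4, though the exact invocation Marptalk uses may differ.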

Video Features

  • Perfect Synchronization: Each slide shows for its exact audio duration
  • High Quality: 1920×1080 H.264 video with AAC audio
  • Flexible Subtitles: Choose between soft (toggleable), hard (burned-in), or no subtitles
  • Efficient Processing: Static slides, not screen recording
  • Fast Generation: Typical presentation renders in under 30 seconds

Output Files

Generated files:

  • presentation.mp4 - Complete MP4 video with audio and subtitles
  • video_info.json - Recording metadata (durations, slide count, etc.)
  • video_frames/ - Individual slide screenshots
  • video_frame_*.jpg - Sample frames for verification
  • combined_audio.wav - Combined audio track
  • slides.txt - FFmpeg timing configuration

Subtitle Modes

  • soft (default): Subtitles are embedded as a separate stream that viewers can turn on/off
  • hard: Subtitles are burned into the video and always visible
  • off: No subtitles are included in the video

Recommendation: Use soft subtitles for most cases as they provide accessibility while allowing viewers to control visibility.
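
In FFmpeg terms, the three modes usually correspond to different arguments. The following sketch shows one plausible mapping; subtitleArgs is a hypothetical helper, and whether Marptalk passes exactly these arguments is an assumption.

```javascript
// Hypothetical helper: FFmpeg arguments for each --video-subtitles mode.
// "inputs" are extra input arguments, "output" are extra output arguments.
function subtitleArgs(mode, srtPath = "subtitles.srt") {
  switch (mode) {
    case "off":
      return { inputs: [], output: [] };
    case "soft":
      // Mux the SRT as a separate MP4 text stream (mov_text) that players
      // can switch on and off.
      return { inputs: ["-i", srtPath], output: ["-c:s", "mov_text"] };
    case "hard":
      // Draw the captions into the picture with the "subtitles" video filter.
      return { inputs: [], output: ["-vf", `subtitles=${srtPath}`] };
    default:
      throw new Error(`Unknown subtitle mode: ${mode}`);
  }
}

for (const mode of ["off", "soft", "hard"]) {
  console.log(mode, subtitleArgs(mode));
}
```

Hard subtitles require re-encoding the video and an FFmpeg build with libass, which is another reason soft subtitles make a sensible default.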

Video Requirements

  • FFmpeg: Must be installed and available in PATH
  • Chrome/Chromium: For slide screenshot capture
  • Audio Files: Generated by previous stages

Video Example Output

{
  "slides": 7,
  "actualDuration": 163.728,
  "slideDurations": [15.528, 28.008, 23.856, 19.488, 28.824, 22.488, 25.536],
  "dimensions": { "width": 1920, "height": 1080 },
  "method": "static-slides"
}
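
The metadata is internally consistent: the seven slide durations add up to actualDuration. A small sanity check along these lines could be run after generation; checkVideoInfo and its tolerance are hypothetical, and only the fields shown above are assumed to exist.

```javascript
// Values copied from the example video_info.json above.
const info = {
  slides: 7,
  actualDuration: 163.728,
  slideDurations: [15.528, 28.008, 23.856, 19.488, 28.824, 22.488, 25.536],
  dimensions: { width: 1920, height: 1080 },
  method: "static-slides",
};

// Compare the slide count and the summed durations against the metadata.
// The tolerance absorbs floating-point and container rounding differences.
function checkVideoInfo(videoInfo, toleranceSeconds = 0.05) {
  const total = videoInfo.slideDurations.reduce((sum, d) => sum + d, 0);
  return {
    slideCountMatches: videoInfo.slideDurations.length === videoInfo.slides,
    durationMatches: Math.abs(total - videoInfo.actualDuration) <= toleranceSeconds,
    total,
  };
}

console.log(checkVideoInfo(info));
```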

Uploading Generated Videos

Your generated MP4 files are ready for direct upload to:

  • YouTube: With embedded subtitles and chapter markers
  • Vimeo: Professional quality with captions
  • Social Media: Optimized format for most platforms
  • Learning Platforms: Compatible with learning management systems (LMS)

Requirements

  • Node.js 16+
  • Google Cloud account with Text-to-Speech API enabled
  • Modern web browser
  • Chrome/Chromium browser (for video generation)
  • FFmpeg (for video generation) - Install with:
    • macOS: brew install ffmpeg
    • Ubuntu/Debian: sudo apt-get install ffmpeg
    • Windows: Download from ffmpeg.org

Tips

  • Write speaker notes in a conversational tone
  • Keep notes concise but informative
  • Test different voices to find what works best
  • Use --no-generate-tts for faster iteration when updating slides/styles
  • Use the pause/resume controls during live presentations
  • The system handles browser autoplay restrictions gracefully
  • Soft subtitles are recommended for accessibility and viewer choice

Troubleshooting

Audio doesn't play

  • Check browser autoplay settings
  • Try clicking the start button instead of letting it auto-start
  • Verify audio files were generated in dist/audio/
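
For background, browsers that block autoplay reject the promise returned by play() until the user interacts with the page, which is why clicking the start button helps. A minimal sketch of detecting that case follows; tryAutoplay is a hypothetical helper, not part of Marptalk, and the mock object stands in for an HTMLAudioElement so the snippet also runs under Node.js.

```javascript
// Try to start playback and report whether the browser allowed it.
// `audio` is anything with a play() method that returns a promise,
// such as an HTMLAudioElement in a modern browser.
async function tryAutoplay(audio) {
  try {
    await audio.play();
    return true;
  } catch (err) {
    // Browsers reject with NotAllowedError when a user gesture is required.
    if (err && err.name === "NotAllowedError") return false;
    throw err;
  }
}

// Mock that behaves like a browser permitting playback.
tryAutoplay({ play: async () => {} }).then((ok) => {
  console.log("autoplay allowed:", ok);
});
```

When the result is false, a page would typically wait for a click on the start control and call play() again from that handler.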

Authentication errors

  • Verify gcloud auth application-default login worked
  • Check that Text-to-Speech API is enabled in your GCP project
  • Ensure billing is set up (free tier available)

Voice/Language errors

  • Check GCP TTS documentation for supported voices
  • Verify language code matches voice name (e.g., en-US with en-US-Journey-D)

Missing speaker notes

  • Make sure notes are enclosed in <!-- --> comments
  • Verify the Marp CLI extracted notes to .temp/
