AI-powered tool to automatically generate engaging YouTube Shorts from long-form videos. Uses GPT-4o-mini and Whisper to extract highlights, add subtitles, and crop videos vertically for social media.
- 🎬 Flexible Input: Supports both YouTube URLs and local video files
- 🎤 GPU-Accelerated Transcription: CUDA-enabled Whisper for fast speech-to-text
- 🤖 AI Highlight Selection: GPT-4o-mini automatically finds the most engaging 2-minute segments
- ✅ Interactive Approval: Review and approve/regenerate selections with 15-second auto-approve timeout
- 📝 Auto Subtitles: Stylized captions with Franklin Gothic font burned into video
- 🎯 Smart Cropping:
  - Face videos: Static face-centered crop (no jerky movement)
  - Screen recordings: Half-width display with smooth motion tracking (1 shift/second max)
- 📱 Vertical Format: Perfect 9:16 aspect ratio for TikTok/YouTube Shorts/Instagram Reels
- ⚙️ Automation Ready: CLI arguments, auto-quality selection, timeout-based approvals
- 🔄 Concurrent Execution: Unique session IDs allow multiple instances to run simultaneously
- 📦 Clean Output: Slugified filenames (e.g., `my-video-title_short.mp4`) and automatic temp file cleanup
Want better results without the setup? The AI Clipping API offers improved clip selection, faster processing, and no dependencies to manage.
- Python 3.10+
- FFmpeg with development headers
- NVIDIA GPU with CUDA support (optional, but recommended for faster transcription)
- ImageMagick (for subtitle rendering)
- OpenAI API key
- Clone the repository:

  ```bash
  git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
  cd AI-Youtube-Shorts-Generator
  ```
- Install system dependencies:

  Ubuntu/Debian:

  ```bash
  sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev libopus-dev \
    libvpx-dev pkg-config libsrtp2-dev imagemagick
  ```

  macOS:

  ```bash
  brew install ffmpeg imagemagick
  ```

  Windows:
  - Install FFmpeg and add to PATH
  - Install ImageMagick
- Fix ImageMagick security policy (Linux only, required for subtitles):

  ```bash
  sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
  ```
- Create and activate virtual environment:

  ```bash
  python3.10 -m venv venv
  source venv/bin/activate
  ```
- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:

  Create a `.env` file in the project root:

  ```
  OPENAI_API=your_openai_api_key_here
  ```
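To sanity-check that the key is being picked up, here is a minimal sketch assuming the python-dotenv package (the loading code is illustrative, not necessarily the project's exact implementation):

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads the .env file from the project root
api_key = os.getenv("OPENAI_API")
if not api_key:
    raise RuntimeError("OPENAI_API is not set - check your .env file")
print("API key loaded")
```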
If you don't have an NVIDIA GPU, see INSTALL_CPU.md for CPU-only setup instructions.
```bash
# Build and run with Docker Compose
docker-compose up --build

# Or build manually
docker build -t ai-shorts-generator .
docker run -v $(pwd)/.env:/app/.env -v $(pwd)/videos:/app/videos ai-shorts-generator
```

Run interactively:

```bash
./run.sh
# Then enter YouTube URL when prompted
# You'll be able to select video resolution (5s timeout, auto-selects highest)
```

Pass a YouTube URL directly:

```bash
./run.sh "https://youtu.be/VIDEO_ID"
```

Or process a local video file:

```bash
./run.sh "/path/to/your/video.mp4"
```

For batch processing, create a urls.txt file with one URL per line, then:

```bash
# Process all URLs sequentially with auto-approve
xargs -a urls.txt -I{} ./run.sh --auto-approve {}
```

Or without auto-approve (will prompt for each):

```bash
xargs -a urls.txt -I{} ./run.sh {}
```

When downloading from YouTube, you'll see:
```
Available video streams:
0. Resolution: 1080p, Size: 45.2 MB, Type: Adaptive
1. Resolution: 720p, Size: 28.1 MB, Type: Adaptive
2. Resolution: 480p, Size: 15.3 MB, Type: Adaptive

Select resolution number (0-2) or wait 5s for auto-select...
Auto-selecting highest quality in 5 seconds...
```
- Enter a number to select that resolution immediately
- Wait 5 seconds to auto-select highest quality (1080p)
- Invalid input falls back to highest quality
- Download/Load: Fetches from YouTube or loads local file
- Resolution Selection: Choose video quality (5s timeout, auto-selects highest)
- Extract Audio: Converts to WAV format
- Transcribe: GPU-accelerated Whisper transcription (~30s for 5min video; see the sketch below)
- AI Analysis: GPT-4o-mini selects most engaging 2-minute segment
- Interactive Approval: Review selection, regenerate if needed, or auto-approve in 15s
- Extract Clip: Crops selected timeframe
- Smart Crop:
  - Detects faces → static face-centered vertical crop
  - No faces → half-width screen recording with motion tracking
- Add Subtitles: Burns Franklin Gothic captions with blue text/black outline
- Combine Audio: Merges audio track with final video
- Cleanup: Removes all temporary files
Output: `{video-title}_{session-id}_short.mp4` with slugified filename and unique identifier
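As a rough sketch of the transcription step above, assuming the openai-whisper package (the model size and audio path are placeholders, not the project's actual choices):

```python
import whisper

# Load Whisper onto the GPU; "base" is a placeholder model size
model = whisper.load_model("base", device="cuda")

# Transcribe the extracted WAV into timestamped segments
result = model.transcribe("audio.wav")  # placeholder path
for segment in result["segments"]:
    print(f"{segment['start']:.1f}s - {segment['end']:.1f}s: {segment['text']}")
```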
After AI selects a highlight, you'll see:
```
============================================================
SELECTED SEGMENT DETAILS:
Time: 68s - 187s (119s duration)
============================================================

Options:
[Enter/y] Approve and continue
[r] Regenerate selection
[n] Cancel

Auto-approving in 15 seconds if no input...
```
- Press Enter or `y` to approve
- Press `r` to regenerate a different selection (can repeat multiple times)
- Press `n` to cancel
- Wait 15 seconds to auto-approve (perfect for automation)
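A minimal sketch of how a timeout-based prompt like this can be implemented (Unix-only, using `select` on stdin; illustrative, not necessarily the project's exact code):

```python
import select
import sys

def prompt_with_timeout(timeout: float = 15.0, default: str = "y") -> str:
    """Read one line from stdin, returning a default if the timeout expires."""
    ready, _, _ = select.select([sys.stdin], [], [], timeout)
    if ready:
        return sys.stdin.readline().strip().lower() or default
    return default  # no input in time: auto-approve

choice = prompt_with_timeout()
if choice == "y":
    print("Approved, continuing...")
elif choice == "r":
    print("Regenerating selection...")
else:
    print("Cancelled")
```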
Edit `Components/Subtitles.py` - search for `TextClip`:
- Font: `font='Franklin-Gothic'` (requires Franklin Gothic installed, or change to any system font)
- Size: `fontsize=80`
- Color: `color='#2699ff'` (blue)
- Outline: `stroke_color='black'`, `stroke_width=2`
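As a sketch of how these parameters fit together in a MoviePy `TextClip` (assuming MoviePy 1.x; the caption text is a placeholder):

```python
from moviepy.editor import TextClip

# Styled caption using the parameters above; rendering requires ImageMagick
caption = TextClip(
    "YOUR CAPTION TEXT",       # placeholder text
    font="Franklin-Gothic",    # any font name from `convert -list font`
    fontsize=80,
    color="#2699ff",           # blue fill
    stroke_color="black",      # outline color
    stroke_width=2,
)
```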
To list available fonts:

```bash
convert -list font | grep -i "font:"
```

Edit `Components/LanguageTasks.py`:
- Prompt: Modify the `system` variable to adjust what's "interesting, useful, surprising, controversial, or thought-provoking"
- Model: Change `model="gpt-4o-mini"` in the `ChatOpenAI()` call
- Temperature: Adjust `temperature=1.0` (higher = more creative)
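A minimal sketch of these settings, assuming the langchain-openai package (the import path varies by LangChain version, and the prompt and transcript are placeholders):

```python
from langchain_openai import ChatOpenAI

# Higher temperature = more creative (and less repeatable) selections
llm = ChatOpenAI(model="gpt-4o-mini", temperature=1.0)

system = ("Find the most engaging 2-minute segment: interesting, useful, "
          "surprising, controversial, or thought-provoking.")  # illustrative prompt
transcript_text = "..."  # placeholder: the Whisper transcript
response = llm.invoke([("system", system), ("human", transcript_text)])
print(response.content)
```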
Edit `Components/FaceCrop.py` - search for `use_motion_tracking`:
- Update frequency: `update_interval = int(fps)` - currently 1 shift/second
- Smoothing: `0.90 * smoothed_x + 0.10 * target_x` - 90% previous, 10% new
- Motion threshold: `motion_threshold = 2.0`
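A sketch of how these three knobs interact (illustrative only; the surrounding loop and the `detect_content_center` helper are hypothetical, the variable names follow the bullets above):

```python
fps = 30.0                    # placeholder frame rate
frames = []                   # placeholder: decoded video frames
update_interval = int(fps)    # frames between updates: 1 shift/second
motion_threshold = 2.0        # ignore movements below this many pixels
smoothed_x = 0.0              # running crop-window center

def detect_content_center(frame) -> float:
    """Hypothetical stand-in for the project's content detector."""
    return 0.0

for frame_index, frame in enumerate(frames):
    if frame_index % update_interval == 0:
        target_x = detect_content_center(frame)
        if abs(target_x - smoothed_x) > motion_threshold:
            # 90% previous position, 10% new target: slow, smooth drift
            smoothed_x = 0.90 * smoothed_x + 0.10 * target_x
```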
Edit `Components/FaceCrop.py` - search for `detectMultiScale`:
- Sensitivity: `minNeighbors=8` - Higher = fewer false positives
- Minimum size: `minSize=(30, 30)` - Minimum face size in pixels
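For context, a minimal OpenCV face-detection call using these parameters (assuming opencv-python; the frame is a placeholder and `scaleFactor` is a common value, not quoted above):

```python
import cv2
import numpy as np

# Haar cascade shipped with opencv-python
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # placeholder frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,     # assumption: a common default
    minNeighbors=8,      # higher = fewer false positives
    minSize=(30, 30),    # minimum face size in pixels
)
for (x, y, w, h) in faces:
    print(f"face at ({x}, {y}) size {w}x{h}")
```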
Edit `Components/Subtitles.py` - search for `write_videofile`:
- Bitrate: `bitrate='3000k'`
- Preset: `preset='medium'` (options: ultrafast, fast, medium, slow, veryslow)
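A sketch of these options on a MoviePy export (assuming MoviePy 1.x; the input clip and output name are placeholders):

```python
from moviepy.editor import VideoFileClip

clip = VideoFileClip("input.mp4")  # placeholder input
clip.write_videofile(
    "output_short.mp4",
    bitrate="3000k",   # higher bitrate = better quality, larger file
    preset="medium",   # x264 speed/size trade-off (ultrafast ... veryslow)
)
```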
Final videos are named: `{video-title}_{session-id}_short.mp4`
Example: `my-awesome-video_a1b2c3d4_short.mp4`
- Slugified title: Lowercase, hyphens instead of spaces
- Session ID: 8-character unique identifier for traceability
- Resolution: Matches source video height (720p → 404x720, 1080p → 607x1080)
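A minimal sketch of how such a filename can be produced (illustrative; the project's actual slugify logic may differ):

```python
import re
import uuid

def slugify(title: str) -> str:
    """Lowercase; collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

title = "My Awesome Video"          # placeholder title
session_id = uuid.uuid4().hex[:8]   # 8-character unique identifier
print(f"{slugify(title)}_{session_id}_short.mp4")
# e.g. my-awesome-video_a1b2c3d4_short.mp4
```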
Run multiple instances simultaneously:

```bash
./run.sh "https://youtu.be/VIDEO1" &
./run.sh "https://youtu.be/VIDEO2" &
./run.sh "/path/to/video3.mp4" &
```

Each instance gets a unique session ID and temporary files, preventing conflicts.
```bash
# Verify CUDA libraries
export LD_LIBRARY_PATH=$(find $(pwd)/venv/lib/python3.10/site-packages/nvidia -name "lib" -type d | paste -sd ":" -)
```

The run.sh script handles this automatically.
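To confirm the GPU is actually visible to Whisper's backend, a quick check (assuming PyTorch, which openai-whisper depends on):

```python
import torch

# True means Whisper can run transcription on the GPU
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```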
Ensure ImageMagick policy allows file operations:

```bash
grep 'pattern="@\*"' /etc/ImageMagick-6/policy.xml
# Should show: rights="read|write"
```

If face detection misbehaves, note:
- Video needs visible faces in first 30 frames
- For screen recordings, automatic motion tracking applies
- Low-resolution videos may have less reliable detection
The AI Clipping API uses an improved algorithm that produces higher-quality clips with better highlight detection.
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
