A CLI tool for creating videos with translations and audio narration.
Download the latest release for your platform from the Releases page.
# Linux/macOS
chmod +x gocreator-*
sudo mv gocreator-* /usr/local/bin/gocreator
# Verify installation
gocreator --help
Alternatively, install with Go:
go install github.com/Napolitain/gocreator/cmd/gocreator@latest
- Automated video creation from slides and text
- Video input support - Use video clips as "slides" that keep their own duration, in addition to static images
- Google Slides API integration - Fetch slides and speaker notes directly from Google Slides
- Multi-language support with AI-powered translation
- Text-to-speech audio generation
- Intelligent caching to reduce API costs
- Parallel processing for better performance
New to GoCreator? Check out the examples/ directory for a hands-on tutorial:
cd examples/getting-started
gocreator create --lang en --langs-out en,fr,es
See the Getting Started Example for detailed instructions.
Create a data directory in your project with:
- data/slides/ - Directory containing slide images (PNG, JPEG) or video clips (MP4, MOV, AVI, MKV, WEBM)
- data/texts.txt - Text file with slide narrations separated by -
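As a rough illustration of the texts.txt layout, the Go sketch below splits a file's contents into one narration per slide. It assumes the separator is a line containing only `-`; the function name `splitNarrations` is hypothetical and not taken from the GoCreator code base.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNarrations splits the contents of a texts file into one narration per
// slide. Assumption: slides are separated by a line containing only "-".
// Empty segments are kept so that narrations stay aligned with slide order.
func splitNarrations(content string) []string {
	var narrations []string
	var current []string
	flush := func() {
		narrations = append(narrations, strings.TrimSpace(strings.Join(current, "\n")))
		current = current[:0]
	}
	for _, line := range strings.Split(content, "\n") {
		if strings.TrimSpace(line) == "-" {
			flush()
			continue
		}
		current = append(current, line)
	}
	flush()
	return narrations
}

func main() {
	content := "Welcome to the demo.\n-\nThis slide shows the setup.\n-\nThanks for watching.\n"
	for i, n := range splitNarrations(content) {
		fmt.Printf("slide %d: %s\n", i+1, n)
	}
}
```

With this layout, the first narration belongs to the first slide in data/slides/, the second to the next one, and so on.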
Then run:
gocreator create --lang en --langs-out en,fr,es
How it works:
- Image slides: Duration is determined by the TTS audio length
- Video slides: Duration is determined by the video length, with TTS audio aligned at the beginning
- You can mix images and videos in the same presentation
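The duration rules above can be sketched in a few lines of Go. This is only an illustration: the `SlideKind` type and `segmentDuration` function are hypothetical names, and the source does not say what happens when the narration is longer than a video clip, so that case is left open here.

```go
package main

import (
	"fmt"
	"time"
)

// SlideKind distinguishes static images from video clips (hypothetical type).
type SlideKind int

const (
	ImageSlide SlideKind = iota
	VideoSlide
)

// segmentDuration applies the stated rules: an image slide lasts as long as
// its TTS audio, while a video slide keeps the length of the clip itself.
func segmentDuration(kind SlideKind, audioLen, videoLen time.Duration) time.Duration {
	if kind == VideoSlide {
		return videoLen
	}
	return audioLen
}

func main() {
	fmt.Println(segmentDuration(ImageSlide, 7*time.Second, 0))              // 7s
	fmt.Println(segmentDuration(VideoSlide, 7*time.Second, 12*time.Second)) // 12s
}
```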
To use the Google Slides API, you need to:
- Set up a Google Cloud project and enable the Google Slides API
- Create service account credentials and download the JSON file
- Share your presentation with the service account email
- Set the environment variable:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
- Run with Google Slides:
gocreator create --google-slides YOUR_PRESENTATION_ID --lang en --langs-out en,fr,es
The presentation ID can be found in the Google Slides URL:
https://docs.google.com/presentation/d/[PRESENTATION_ID]/edit
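If you prefer to extract the ID programmatically, a minimal sketch using only the Go standard library could look like this. The helper name `presentationID` and the sample ID are made up for illustration; the only thing taken from the text above is the `/presentation/d/<ID>/edit` URL shape.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// presentationID extracts the ID from a URL of the form
// https://docs.google.com/presentation/d/<ID>/edit (hypothetical helper).
func presentationID(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Expected path segments: presentation, d, <ID>, optionally followed by "edit".
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 3 || parts[0] != "presentation" || parts[1] != "d" || parts[2] == "" {
		return "", errors.New("not a Google Slides presentation URL")
	}
	return parts[2], nil
}

func main() {
	id, err := presentationID("https://docs.google.com/presentation/d/abc123XYZ/edit")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // abc123XYZ
}
```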
How it works:
- Slides are downloaded as images from your Google Slides presentation
- Speaker notes from each slide are used as the narration text
- Videos are generated with audio in multiple languages
- All content is cached for efficient re-generation
See GOOGLE_SLIDES_GUIDE.md for detailed setup instructions and troubleshooting.
This project uses Calendar Versioning (CalVer) with the format YYYY-MM-DD.
Each release is tagged with the date it was created (e.g., 2025-01-15). This makes it easy to:
- Know when a version was released
- Track the age of your installation
- Plan upgrades based on release frequency
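Because the tag is a plain date, the age of an installation can be computed with the standard `time` package. The sketch below is illustrative only: `releaseAge` is a hypothetical helper, and `exampleNow` is a fixed "current" date chosen so the output is reproducible.

```go
package main

import (
	"fmt"
	"time"
)

// calVerLayout is Go's reference-time layout for YYYY-MM-DD.
const calVerLayout = "2006-01-02"

// exampleNow is a fixed stand-in for time.Now(), used only for this example.
var exampleNow = time.Date(2025, time.February, 14, 0, 0, 0, 0, time.UTC)

// releaseAge parses a CalVer tag and returns how long ago it was released
// relative to now (hypothetical helper).
func releaseAge(tag string, now time.Time) (time.Duration, error) {
	released, err := time.Parse(calVerLayout, tag)
	if err != nil {
		return 0, fmt.Errorf("invalid CalVer tag %q: %w", tag, err)
	}
	return now.Sub(released), nil
}

func main() {
	age, err := releaseAge("2025-01-15", exampleNow)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("released %d days ago\n", int(age.Hours()/24)) // released 30 days ago
}
```

Since the format is fixed-width and zero-padded, tags also sort chronologically when compared as plain strings.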
The project follows clean architecture principles with clear separation of concerns:
- CLI Layer (internal/cli/)
  - Command-line interface and user interaction
  - Minimal business logic
- Service Layer (internal/services/)
  - Business logic and orchestration
  - VideoCreator orchestrates the entire video creation workflow
  - Individual services handle specific concerns (text, audio, video, translation)
- Adapter Layer (internal/adapters/)
  - External API integrations (OpenAI)
  - Wraps third-party clients with our interfaces
- Interface Layer (internal/interfaces/)
  - Defines contracts between layers
  - Enables dependency injection and testing
All services follow dependency injection principles:
// Services receive dependencies through constructors
textService := services.NewTextService(fs, logger)
audioService := services.NewAudioService(fs, openaiClient, textService, logger)
// VideoCreator depends on interfaces, not concrete types
creator := services.NewVideoCreator(
	fs,
	textService,  // interfaces.TextProcessor
	translation,  // interfaces.Translator
	audioService, // interfaces.AudioGenerator
	videoService, // interfaces.VideoGenerator
	slideService, // interfaces.SlideLoader
	logger,       // interfaces.Logger
)
This design enables:
- Easy testing with mocks
- Swapping implementations without changing code
- Clear dependency graph
- Better maintainability
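To make the "easy testing with mocks" point concrete, here is a small self-contained sketch. The `Translator` interface below is a hypothetical stand-in for interfaces.Translator (the real method set may differ), and `fakeTranslator` and `translateAll` are illustrative names rather than GoCreator code.

```go
package main

import (
	"context"
	"fmt"
)

// Translator is a hypothetical stand-in for interfaces.Translator;
// the real interface in GoCreator may declare different methods.
type Translator interface {
	Translate(ctx context.Context, text, targetLang string) (string, error)
}

// fakeTranslator returns canned output and counts calls, so no API is contacted.
type fakeTranslator struct{ calls int }

func (f *fakeTranslator) Translate(_ context.Context, text, targetLang string) (string, error) {
	f.calls++
	return "[" + targetLang + "] " + text, nil
}

// translateAll depends only on the interface, so a real adapter or a fake
// can be passed in without changing this function.
func translateAll(ctx context.Context, tr Translator, texts []string, lang string) ([]string, error) {
	out := make([]string, 0, len(texts))
	for _, text := range texts {
		translated, err := tr.Translate(ctx, text, lang)
		if err != nil {
			return nil, fmt.Errorf("translate %q: %w", text, err)
		}
		out = append(out, translated)
	}
	return out, nil
}

func main() {
	fake := &fakeTranslator{}
	out, err := translateAll(context.Background(), fake, []string{"Hello", "Goodbye"}, "fr")
	fmt.Println(out, err, fake.calls)
}
```

Swapping `fakeTranslator` for an adapter backed by a real API requires no change to `translateAll`, which is the property the list above describes.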
The project includes comprehensive unit tests with mocked dependencies:
# Run all tests
go test ./...
# Run tests with coverage
go test -cover ./...
# Run specific test package
go test ./internal/services/...
# Run benchmark tests
go test -bench=. ./internal/services/ -run=^$
# Run benchmarks with memory stats
go test -bench=. -benchmem ./internal/services/ -run=^$
GoCreator includes a performance testing tool that measures cache performance and API latency and reports end-to-end metrics:
# Build the performance testing tool
go build -o perftest ./cmd/perftest/
# Run in simulation mode (no API key needed)
./perftest
# Run with real OpenAI API (requires OPENAI_API_KEY)
export OPENAI_API_KEY="your-key"
./perftest
The tool generates markdown-formatted performance tables showing:
- Operation timings with and without cache
- Cache hit rates and counts
- End-to-end latency measurements
- Performance improvement factors
See cmd/perftest/README.md for detailed documentation.
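For readers unfamiliar with the reported metrics, the sketch below shows how a cache hit rate and an improvement factor are conventionally computed. It is not the perftest implementation; the type and function names are hypothetical and the numbers in `main` are made-up inputs, not measurements.

```go
package main

import "fmt"

// cacheStats holds raw hit and miss counts (hypothetical type).
type cacheStats struct {
	Hits, Misses int
}

// HitRate returns hits / (hits + misses), or 0 when nothing was looked up.
func (s cacheStats) HitRate() float64 {
	total := s.Hits + s.Misses
	if total == 0 {
		return 0
	}
	return float64(s.Hits) / float64(total)
}

// speedup returns how many times faster the cached run was than the
// uncached one. It returns 0 if the cached timing is not positive.
func speedup(uncachedMs, cachedMs float64) float64 {
	if cachedMs <= 0 {
		return 0
	}
	return uncachedMs / cachedMs
}

func main() {
	// Made-up example inputs, for illustration only.
	stats := cacheStats{Hits: 9, Misses: 1}
	fmt.Printf("hit rate: %.0f%%\n", stats.HitRate()*100)
	fmt.Printf("speedup: %.1fx\n", speedup(1200, 40))
}
```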
- TextService: Load, Save, Hash, and hash file operations
- AudioService: Audio generation with cache validation
- TranslationService: Single and batch translations
- CacheService: Set, Get, Delete, Clear, Expiration
- VideoCreator: Full workflow orchestration with mocked services
- Benchmarks: Performance tests for all core operations with cache scenarios
All external dependencies (filesystem, OpenAI API) are mocked for isolated unit testing.
GoCreator uses a multi-layered caching strategy:
- Translation Cache - Saves API costs by caching translations
- Audio Cache - Reuses generated audio with hash validation
- Video Segment Cache - Caches intermediate video segments
- In-Memory Cache - Runtime caching with TTL support
See CACHE_POLICY.md for detailed documentation.
- Cost Reduction: Avoids redundant OpenAI API calls
- Performance: Reuses expensive computations
- Reliability: Works offline for previously processed content
- Debugging: Easier to inspect cached intermediate results
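The idea behind an in-memory cache with TTL support can be sketched with the standard library alone. GoCreator itself lists github.com/patrickmn/go-cache as a dependency for this purpose, so the code below is a simplified stand-in rather than the project's implementation; `ttlCache` and `fakeClock` are hypothetical names, and the clock is injected only to make expiry easy to demonstrate.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeClock is a controllable clock used to demonstrate expiry deterministically.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

type entry struct {
	value     string
	expiresAt time.Time
}

// ttlCache is a minimal in-memory cache with per-entry expiry.
type ttlCache struct {
	mu    sync.Mutex
	items map[string]entry
	now   func() time.Time
}

func newTTLCache(now func() time.Time) *ttlCache {
	return &ttlCache{items: make(map[string]entry), now: now}
}

// Set stores value under key until ttl has elapsed.
func (c *ttlCache) Set(key, value string, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expiresAt: c.now().Add(ttl)}
}

// Get returns the value if present and not expired; expired entries are removed.
func (c *ttlCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expiresAt) {
		delete(c.items, key)
		return "", false
	}
	return e.value, true
}

func main() {
	clk := &fakeClock{t: time.Date(2025, time.January, 15, 12, 0, 0, 0, time.UTC)}
	c := newTTLCache(clk.Now)
	c.Set("hello|fr", "bonjour", time.Minute)
	_, ok := c.Get("hello|fr")
	fmt.Println(ok) // true
	clk.Advance(2 * time.Minute)
	_, ok = c.Get("hello|fr")
	fmt.Println(ok) // false
}
```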
gocreator/
├── cmd/
│   └── gocreator/          # Main entry point
├── internal/
│   ├── adapters/           # External API adapters
│   ├── cli/                # CLI commands
│   ├── interfaces/         # Interface definitions
│   ├── mocks/              # Mock implementations for testing
│   └── services/           # Business logic
│       ├── audio.go        # Audio generation
│       ├── cache.go        # Cache management
│       ├── creator.go      # Main orchestrator
│       ├── slide.go        # Slide loading
│       ├── text.go         # Text processing
│       ├── translation.go  # Translation service
│       ├── video.go        # Video generation
│       └── *_test.go       # Unit tests
├── CACHE_POLICY.md         # Cache strategy documentation
└── go.mod
To add a new service:
- Define the interface in internal/interfaces/interfaces.go
- Implement the service in internal/services/
- Create a mock in internal/mocks/ for testing
- Write comprehensive unit tests
- Update VideoCreator to use the new service
Example:
// 1. Define interface
type SubtitleGenerator interface {
	Generate(ctx context.Context, texts []string, outputPath string) error
}
// 2. Implement service
type SubtitleService struct {
	fs     afero.Fs
	logger interfaces.Logger
}
func NewSubtitleService(fs afero.Fs, logger interfaces.Logger) *SubtitleService {
	return &SubtitleService{fs: fs, logger: logger}
}
// 3. Create mock (in internal/mocks/)
type MockSubtitleGenerator struct {
	mock.Mock
}
// 4. Write tests
func TestSubtitleService_Generate(t *testing.T) {
	// ...
}
- github.com/spf13/cobra - CLI framework
- github.com/spf13/afero - Filesystem abstraction (enables easy testing)
- github.com/openai/openai-go - OpenAI API client
- github.com/patrickmn/go-cache - In-memory cache with expiration
- github.com/stretchr/testify - Testing framework with mocking support
- Interfaces over Concrete Types: Services depend on interfaces for flexibility
- Dependency Injection: All dependencies passed through constructors
- Single Responsibility: Each service has one clear purpose
- Comprehensive Testing: Mock external dependencies for unit tests
- Error Handling: Clear error messages with context
- Unit Tests: Test each service in isolation with mocks
- Table-Driven Tests: Use test tables for multiple scenarios
- Mock External APIs: Never call real APIs in unit tests
- Test Edge Cases: Empty inputs, errors, concurrent operations
- Hash-Based Validation: Use content hashes to detect changes
- Layered Caching: Multiple cache levels for different concerns
- Cache Invalidation: Automatic invalidation on content changes
- Documented Policy: Clear documentation of cache behavior
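Hash-based validation can be illustrated with a short sketch: store a hash next to each cached artifact and regenerate only when the hash of the current input differs. The choice of SHA-256 is an assumption (the hash function GoCreator uses is not stated here), and `contentHash` and `needsRegeneration` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash returns a hex-encoded SHA-256 digest of the text.
// Assumption: SHA-256 is used here only for illustration.
func contentHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// needsRegeneration reports whether a cached artifact is stale: either no
// hash was stored, or the stored hash no longer matches the current text.
func needsRegeneration(text, storedHash string) bool {
	return storedHash == "" || storedHash != contentHash(text)
}

func main() {
	stored := contentHash("Welcome to the demo.")
	fmt.Println(needsRegeneration("Welcome to the demo.", stored)) // false
	fmt.Println(needsRegeneration("Welcome to the demo!", stored)) // true
	fmt.Println(needsRegeneration("Welcome to the demo.", ""))     // true
}
```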
Potential enhancements:
- Integration Tests: End-to-end tests with test fixtures
- Performance Benchmarks: Benchmark critical paths
- Metrics & Monitoring: Track cache hit rates, API costs
- Configuration File: YAML/JSON config instead of CLI flags
- Plugin System: Extensible architecture for custom processors
- Parallel Language Processing: Process multiple languages concurrently
- Resume Support: Resume interrupted video creation
- Quality Profiles: Different quality/speed tradeoffs
When contributing:
- Follow existing code structure and patterns
- Add unit tests for all new code
- Update documentation for user-facing changes
- Keep functions small and focused
- Use meaningful variable and function names
- Add comments for complex logic only
See LICENSE file for details.