Optimized inference engine for GigaAM ASR models with CUDA acceleration, batch processing, and unlimited audio length support.
- 🚀 CUDA Acceleration: 5-13x faster than CPU with mixed precision inference (sketched just after this list)
- 📦 Batch Processing: Process multiple audio chunks simultaneously for maximum GPU utilization
- ♾️ Unlimited Audio Length: Automatic chunking with overlap - no 25-second limit
- 🎯 Simple API: Single `transcribe()` method handles any audio length
- ⚡ High Throughput: Process up to 262 seconds of audio per second (262x realtime on GPU)
- 🔧 Zero Extra Dependencies: No VAD or additional models required
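Part of the CUDA speedup comes from mixed precision. The snippet below is only an illustration of the general PyTorch pattern, using a stand-in `torch.nn.Linear` model; faster-gigaam applies this internally, so you never write it yourself:

```python
import torch

# Stand-in model and input -- NOT faster-gigaam's actual network
net = torch.nn.Linear(80, 34).cuda()
features = torch.randn(8, 80, device="cuda")

# inference_mode skips autograd; autocast runs eligible ops in float16
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    logits = net(features)
print(logits.dtype)  # torch.float16
```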
Benchmark results on real audio (`v3_ctc` model):
Test Hardware:
- CPU: Intel Core i9-9900K
- RAM: 32GB
- GPU: NVIDIA RTX 3080 (10GB VRAM)
- CUDA: 12.8
Short audio (under the 25-second limit):

| Implementation | Device | Batch Size | Time (s) | Speedup vs Original CPU |
|---|---|---|---|---|
| Original GigaAM | CPU | N/A | 0.537 | 1.00x (baseline) |
| Original GigaAM | CUDA | N/A | 0.122 | 4.40x |
| faster-gigaam | CPU | 8 | 0.513 | 1.05x |
| faster-gigaam | CUDA | 8 | 0.111 | 4.82x |
Long audio (over the 25-second limit):

| Implementation | Device | Batch Size | Time (s) | Throughput |
|---|---|---|---|---|
| Original GigaAM | CPU | N/A | ❌ Failed | Requires `transcribe_longform` |
| Original GigaAM | CUDA | N/A | ❌ Failed | Requires `transcribe_longform` |
| faster-gigaam | CPU | 8 | 3.248 | 21.7x realtime |
| faster-gigaam | CUDA | 8 | 0.269 | 262x realtime |
Key Advantage: faster-gigaam handles long audio out of the box, while the original GigaAM requires additional dependencies (`pyannote.audio`, a HuggingFace token) and a separate `transcribe_longform` method.
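For intuition, here is a minimal sketch of overlap-based chunking. The chunk size, overlap, sample rate, and the `split_into_chunks` helper are illustrative assumptions, not faster-gigaam's actual internals:

```python
# Illustrative sketch only -- assumed values: 25 s chunks, 2 s overlap, 16 kHz audio
CHUNK_SEC, OVERLAP_SEC, SAMPLE_RATE = 25, 2, 16000

def split_into_chunks(waveform):
    """Yield (offset_seconds, chunk) windows that overlap at the edges."""
    size = CHUNK_SEC * SAMPLE_RATE
    step = (CHUNK_SEC - OVERLAP_SEC) * SAMPLE_RATE
    for start in range(0, max(len(waveform) - OVERLAP_SEC * SAMPLE_RATE, 1), step):
        yield start / SAMPLE_RATE, waveform[start:start + size]
```

The overlapping windows can then be batched onto the GPU and the overlapping transcriptions merged back into one segment list.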
Note: PyTorch is not included as a dependency to allow you to choose between CPU and CUDA versions. Install PyTorch first according to your hardware.
Tip: For faster installations, consider using the UV package manager instead of pip.
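If you don't have UV yet, it can itself be installed with pip:

```bash
pip install uv
```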
Step 1 - Install PyTorch. Choose the appropriate version for your system:
With CUDA (for GPU acceleration):
```bash
# CUDA 12.8 (recommended)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128

# Or with UV (faster package manager)
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```

CPU only:

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
```

See the PyTorch installation guide for more options.
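After installing, a quick way to confirm PyTorch can see your GPU:

```python
import torch

print(torch.cuda.is_available())  # True when a usable CUDA GPU is detected
print(torch.version.cuda)         # CUDA version the wheel was built against, e.g. "12.8"
```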
Step 2 - Install GigaAM:

```bash
pip install git+https://github.com/salute-developers/GigaAM.git
```

Step 3 - Install faster-gigaam:

```bash
# Clone the repository
git clone https://github.com/yourusername/faster-gigaam.git
cd faster-gigaam

# Install in development mode
pip install -e .
```

Requirements:
- Python >= 3.10 (tested with Python 3.13)
- PyTorch >= 2.0.0 (installed in Step 1)
- GigaAM (installed in Step 2)
Recommended:
- UV package manager for faster installations
```python
from faster_gigaam import FastGigaAM
# Initialize model with GPU and batch processing
model = FastGigaAM(
    model_name="v3_ctc",
    device="cuda",
    batch_size=8
)
# Transcribe audio (any length!)
segments, info = model.transcribe("long_audio.wav")
# Print results
for segment in segments:
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")
# Print performance info
print(f"\nProcessed {info.duration:.1f}s audio in {info.processing_time:.2f}s")
print(f"Throughput: {info.duration / info.processing_time:.1f}x realtime")# Use CPU if CUDA is not available
model = FastGigaAM(
    model_name="v3_ctc",
    device="cpu",
    batch_size=4
)
segments, info = model.transcribe("audio.wav")
```
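For scripts that must run on machines with or without a GPU, you can pick the device at runtime; this sketch just combines the constructor arguments already shown above:

```python
import torch

from faster_gigaam import FastGigaAM

# Fall back to CPU automatically when no CUDA device is present
device = "cuda" if torch.cuda.is_available() else "cpu"
model = FastGigaAM(
    model_name="v3_ctc",
    device=device,
    batch_size=8 if device == "cuda" else 4,
)
segments, info = model.transcribe("audio.wav")
```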
```python
# Process multiple files efficiently
audio_files = ["file1.wav", "file2.wav", "file3.wav"]
results = model.transcribe_batch(audio_files)
for i, (segments, info) in enumerate(results):
print(f"\nFile {i+1}:")
for segment in segments:
print(f" [{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")- Examples: See
examples/folder for more usage examples - Benchmarks: See
BENCHMARK_RESULTS.mdfor detailed performance analysis - API Reference: See docstrings in
faster_gigaam/fast_gigaam.py
All GigaAM ASR models are supported:
- `v3_ctc` - CTC-based model
- `v3_e2e_rnnt` - RNN-T based model
- Other GigaAM model variants
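Switching models is just a constructor argument, e.g. for the RNN-T variant:

```python
from faster_gigaam import FastGigaAM

# Same API as the examples above, only model_name changes
model = FastGigaAM(model_name="v3_e2e_rnnt", device="cuda", batch_size=8)
segments, info = model.transcribe("audio.wav")
```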
This is a pet project created for personal use and shared with the community.
- ✅ Fully functional and tested
- ⚠️ Updates will be sporadic based on my needs
- 🤝 Issues and PRs welcome, but no guarantees on response time
- 🍴 Feel free to fork if you need active maintenance
MIT License - same as GigaAM
Built on top of the excellent GigaAM project by Salute Developers.