spark-tts

SparkTTS

SparkTTS is a high-quality text-to-speech synthesis model that generates natural-sounding speech. This container packages the SparkTTS model, optimized for Jetson devices, and offers both standard TTS functionality and zero-shot voice cloning.

System Requirements

Memory: Requires at least 5GB of available RAM
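
The memory requirement can be checked before launching the container. The following is a minimal sketch, not part of the container itself: it reads MemAvailable from /proc/meminfo on the host and compares it with the 5GB figure above. The helper name has_enough_memory and the constant REQUIRED_KB are hypothetical, and reading 5GB as 5 GiB is an assumption.

```shell
# Assumption: the "5GB" requirement is interpreted as 5 GiB, expressed here in KiB.
REQUIRED_KB=$((5 * 1024 * 1024))

# Hypothetical helper: succeeds when the given amount (in KiB) meets the requirement.
has_enough_memory() {
    [ "$1" -ge "$REQUIRED_KB" ]
}

# The Linux kernel reports MemAvailable in KiB; fall back to 0 if it cannot be read.
available_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null || echo 0)
available_kb=${available_kb:-0}

if has_enough_memory "$available_kb"; then
    echo "OK: ${available_kb} kB available"
else
    echo "Warning: only ${available_kb} kB available, SparkTTS may run out of memory"
fi
```

On a Jetson with unified memory, other GPU workloads count against the same pool, so it may be worth stopping them before running this check.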

Features

Natural-sounding speech synthesis
Adjustable pitch and speed
Gender selection for standard TTS
Zero-shot voice cloning from audio samples

Usage Examples

When using jetson-containers run, the generated audio files are automatically saved in the jetson-containers/data/audio/tts/spark-tts/ directory on your host system, and models are cached in jetson-containers/data/models/huggingface/.

Standard Text-to-Speech (CLI)

Generate speech from text with customizable parameters:

jetson-containers run $(autotag spark-tts) \
    --pitch "moderate" \
    --speed "moderate" \
    --gender "female" \
    --text "The quick brown fox jumps over the lazy dog"

Available options:

--pitch: "very_low", "low", "moderate", "high", "very_high"
--speed: "very_low", "low", "moderate", "high", "very_high"
--gender: "female", "male"
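
Because the options only accept the fixed values listed above, a small wrapper can catch typos before the container starts. This is a sketch under assumptions: the function names is_valid_level, is_valid_gender, and spark_tts_say are hypothetical and not part of jetson-containers or SparkTTS, and the container's own handling of invalid values is not described here.

```shell
# Accepts the five level names listed for --pitch and --speed.
is_valid_level() {
    case "$1" in
        very_low|low|moderate|high|very_high) return 0 ;;
        *) return 1 ;;
    esac
}

# Accepts the two values listed for --gender.
is_valid_gender() {
    case "$1" in
        female|male) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: validate the arguments, then run the documented command.
# Usage: spark_tts_say PITCH SPEED GENDER TEXT
spark_tts_say() {
    pitch="$1"; speed="$2"; gender="$3"; text="$4"
    is_valid_level "$pitch"   || { echo "invalid --pitch: $pitch" >&2; return 2; }
    is_valid_level "$speed"   || { echo "invalid --speed: $speed" >&2; return 2; }
    is_valid_gender "$gender" || { echo "invalid --gender: $gender" >&2; return 2; }
    jetson-containers run $(autotag spark-tts) \
        --pitch "$pitch" --speed "$speed" --gender "$gender" --text "$text"
}
```

For example, spark_tts_say high moderate male "Hello from Jetson" runs the container, while spark_tts_say fast moderate male "Hello" stops with an error before anything is launched.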

Zero-shot Voice Cloning (CLI)

Clone a voice from a sample audio file. The audio file must be accessible inside the container, so place it under the jetson-containers/data directory, which is mounted at /data in the container:

jetson-containers run $(autotag spark-tts) \
    --prompt_speech_path "/data/audio/sample.wav" \
    --prompt_text "This is a sample prompt text that matches the audio sample..." \
    --speed "moderate" \
    --text "Hi, this is a test of voice cloning with Spark TTS!"
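
Getting the reference clip into place can be scripted. The sketch below copies a host file under the data directory and prints the path it should have inside the container. It assumes the host data directory is mounted at /data, as the example path above suggests; the function name stage_prompt_audio and the audio/ subdirectory layout are hypothetical choices.

```shell
# Hypothetical helper: copy a reference clip into the data directory and
# print the path it is expected to have inside the container.
# Usage: stage_prompt_audio SOURCE_FILE [HOST_DATA_DIR]
stage_prompt_audio() {
    src="$1"
    # Assumption: run from the directory that contains the jetson-containers checkout.
    data_dir="${2:-jetson-containers/data}"
    mkdir -p "$data_dir/audio" || return 1
    cp "$src" "$data_dir/audio/" || return 1
    # Assumption: the host data directory appears as /data inside the container.
    echo "/data/audio/$(basename "$src")"
}
```

The printed path can then be passed to --prompt_speech_path, for example by capturing it with prompt=$(stage_prompt_audio ~/voice.wav) first.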

Output Location

When using jetson-containers run, the following directories are automatically mounted and accessible:

Audio output: jetson-containers/data/audio/tts/spark-tts/
Model cache: jetson-containers/data/models/huggingface/

The generated audio files will be saved with timestamped filenames like 20250325230742.wav.
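
Since the filenames are timestamps, the most recent output can be located by name alone. This is a minimal sketch that assumes the names follow a fixed-width YYYYMMDDHHMMSS pattern, as the example suggests, so that alphabetical order matches chronological order; the function name latest_output is hypothetical.

```shell
# Hypothetical helper: print the newest generated .wav file in the output directory.
# Usage: latest_output [OUTPUT_DIR]
latest_output() {
    out_dir="${1:-jetson-containers/data/audio/tts/spark-tts}"
    latest=""
    # Globs expand in sorted order, so the last match has the latest timestamp
    # (assuming fixed-width YYYYMMDDHHMMSS filenames).
    for f in "$out_dir"/*.wav; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        latest="$f"
    done
    [ -n "$latest" ] && echo "$latest"
}
```

The function prints nothing and returns a non-zero status when the directory holds no .wav files, so it can be used in a condition such as: if wav=$(latest_output); then aplay "$wav"; fi (aplay is only an example player and may not be installed).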

Model Source

This container uses the SparkTTS model from Hugging Face: Spark-TTS by SparkAudio

CONTAINERS

`spark-tts`
Requires	`L4T ['>=36.1.0']`
Dependencies	`build-essential` `pip_cache:cu126` `cuda:12.6` `cudnn` `python` `numpy` `cmake` `onnx` `pytorch:2.8` `torchaudio` `torchvision` `huggingface_hub` `rust` `transformers`
Dockerfile	`Dockerfile`
Notes	Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens – https://github.com/SparkAudio/Spark-TTS

RUN CONTAINER

To start the container, you can use jetson-containers run and autotag, or manually put together a docker run command:

# automatically pull or build a compatible container image
jetson-containers run $(autotag spark-tts)

# or if using 'docker run' (specify image and mounts/etc)
sudo docker run --runtime nvidia -it --rm --network=host spark-tts:36.4.0

Note: jetson-containers run forwards arguments to docker run with some defaults added (such as --runtime nvidia, mounting a /data cache, and detecting devices).
Note: autotag finds a container image that's compatible with your version of JetPack/L4T, either locally, pulled from a registry, or by building it.

To mount your own directories into the container, use the -v or --volume flags:

jetson-containers run -v /path/on/host:/path/in/container $(autotag spark-tts)

To launch the container running a command, as opposed to an interactive shell:

jetson-containers run $(autotag spark-tts) my_app --abc xyz

You can pass any options to it that you would to docker run, and it'll print out the full command that it constructs before executing it.

BUILD CONTAINER

If you use autotag as shown above, it'll ask to build the container for you if needed. To manually build it, first do the system setup, then run:

jetson-containers build spark-tts

The dependencies listed above will be built into the container, and it'll be tested during the build. Run jetson-containers build with --help for build options.
