Repository: https://github.com/ALERTua/stt_ukrainian_docker
GitHub Docker Registry: https://github.com/ALERTua/stt_ukrainian_docker/pkgs/container/stt_ukrainian_docker
Docker Hub: https://hub.docker.com/r/alertua/stt_ukrainian_docker
Docker image for a Gradio app that transcribes Ukrainian speech to text.
Used together with https://github.com/ALERTua/stt-ukrainian-api to provide an OpenAI-compatible STT API endpoint, e.g. for Home Assistant.
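As a rough sketch of how such an endpoint is typically called, assuming the companion API follows the OpenAI audio transcription convention (the host, port, and audio file below are placeholders, not taken from this project):

```bash
# Hypothetical call to an OpenAI-compatible transcription endpoint.
# The host, port, and audio file are placeholders; consult the
# stt-ukrainian-api documentation for the real address and required fields.
curl -s http://localhost:8000/v1/audio/transcriptions \
  -F "file=@speech_uk.wav"
```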
I'll try to keep this image based on the most modern and effective STT model available. It is currently based on: Yehor/w2v-bert-uk-v2.1
The recommended way to run the image is with the docker-compose.yml from the repository.
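For reference, a minimal docker-compose.yml in the spirit of the one in the repository might look like this; the port, volume, and image are taken from the docker run example below, while the service name and restart policy are my assumptions:

```yaml
# Minimal sketch; the docker-compose.yml shipped with the repository is authoritative.
services:
  stt_ukrainian:
    image: ghcr.io/alertua/stt_ukrainian_docker:latest
    container_name: stt_ukrainian
    restart: unless-stopped               # assumption, adjust as needed
    ports:
      - "7860:7860"                       # Gradio Web UI
    volumes:
      - ./docker_volumes/stt/data:/data   # models, caches, and venv live here
```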
Or run directly with Docker:

```bash
docker run -d \
  -p 7860:7860 \
  -v ./docker_volumes/stt/data:/data \
  --name stt_ukrainian \
  ghcr.io/alertua/stt_ukrainian_docker:latest
```

You can access the Gradio Web UI at http://{container_ip}:7860
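To check that the container came up (the first start can take a while, see the notes below), you can follow its logs and probe the mapped port; this assumes the port mapping and container name from the command above:

```bash
# Follow the container logs; on the first start the models are downloaded
# and the prerequisites get installed here.
docker logs -f stt_ukrainian

# Once the app is up, the Gradio Web UI should respond on the mapped port.
curl -I http://localhost:7860
```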
After the first run the data directory will look like this:
- `.cache` - contains models downloaded from the HuggingFace Hub (~2.4 GB)
- `uv_cache` - cache for installing prerequisites (~7.2 GB)
- `venv` - the working virtual environment (~7.4 GB)
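To check how much space these actually take on the host, assuming the volume path from the docker run example above:

```bash
# Per-directory sizes of the mounted data volume.
du -sh ./docker_volumes/stt/data/*
```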
- The `latest` tag uses ~3 GiB of RAM.
  - Make this use less RAM
- The first start is slow, as the models are downloaded and the prerequisites get installed.
- If you need a specific `torch` version, you can install it inside the running container, e.g. torch for my GTX 1080 Ti:

  ```bash
  cd /data
  source venv/bin/activate
  uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118 --force-reinstall
  ```

  Then restart the container.

  You can also run these commands outside the container, from within the mounted virtual environment on the host.
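For convenience, a rough one-liner equivalent run from the host, assuming the container name from the docker run example above and that bash is available inside the image:

```bash
# Reinstall torch/torchaudio inside the running container's venv, then restart it.
docker exec -it stt_ukrainian bash -c \
  "cd /data && source venv/bin/activate && uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118 --force-reinstall"
docker restart stt_ukrainian
```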