GoSubAI is an AI-powered subtitle translator built in Go.
You can define custom prompts, select the output format, and choose which LLM to run it on. It even lets you adjust the color scheme and display both the original and translated subtitles side by side.
I originally built it as a tool for language learning: customizable prompts make it easy to guide the translation style. For instance, you can ask for more literal translations.
- Configurable Automation: Define your own prompts to process subtitles.
- Context-aware: Write prompts that take the previous subtitle line into account for a more coherent translation.
- Modularity: Choose your preferred AI model.
- Remote and local: Run the LLM either on your local machine or on a remote host.
- SRT support: Easily import & export subtitles to SRT format.
It was built over a few days but is designed to be modular and extendable, allowing for future enhancements and integrations.
You can easily set up and run this project entirely locally.
Below is a before/after example of running GoSubAI with a sample configuration:
Before using GoSubAI, ensure you have the following installed:
- Golang: Download and install Golang from the official website.
- Ollama: Download and install Ollama from Ollama's website.
To run Ollama in server mode, use the following command:
ollama serve
To produce subtitles, you need two things: a configuration file that describes the processing to run on the subtitles, and a SubRip (SRT) file containing the subtitles themselves.
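For readers unfamiliar with the SubRip layout, here is a minimal sketch of reading and writing SRT cues in Go. It is not GoSubAI's own parser: the `Cue` type and the `parseSRT` and `formatSRT` helpers are hypothetical names, and the sample text is made up for illustration. It assumes well-formed input where cues are separated by blank lines.

```go
package main

import (
	"fmt"
	"strings"
)

// Cue is a hypothetical type for one SRT entry; GoSubAI's own types may differ.
type Cue struct {
	Index  string // sequence number line, kept as text
	Timing string // e.g. "00:00:01,000 --> 00:00:03,500"
	Text   string // one or more lines of subtitle text
}

// parseSRT splits SRT content into cues. Blocks are separated by blank lines;
// the first line is the index, the second the timing, the rest the text.
func parseSRT(content string) []Cue {
	content = strings.ReplaceAll(content, "\r\n", "\n")
	var cues []Cue
	for _, block := range strings.Split(content, "\n\n") {
		lines := strings.Split(strings.TrimSpace(block), "\n")
		if len(lines) < 3 {
			continue // skip empty or malformed blocks
		}
		cues = append(cues, Cue{
			Index:  strings.TrimSpace(lines[0]),
			Timing: strings.TrimSpace(lines[1]),
			Text:   strings.Join(lines[2:], "\n"),
		})
	}
	return cues
}

// formatSRT writes cues back out in SRT layout.
func formatSRT(cues []Cue) string {
	var b strings.Builder
	for _, c := range cues {
		fmt.Fprintf(&b, "%s\n%s\n%s\n\n", c.Index, c.Timing, c.Text)
	}
	return b.String()
}

func main() {
	// Made-up sample content, not taken from the repository's data file.
	sample := "1\n00:00:01,000 --> 00:00:03,500\nHvor blir det av penga?\n\n" +
		"2\n00:00:04,000 --> 00:00:06,000\nJeg vet ikke.\n"
	for _, c := range parseSRT(sample) {
		fmt.Printf("%s [%s] %s\n", c.Index, c.Timing, c.Text)
	}
}
```

Keeping the index and timing lines as opaque strings means a translation step only needs to touch `Text`, so the timing survives the round trip unchanged.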
Try running this example command from the root of the repository:
go run ./cmd/GoSubAI generate -config ./config/translate_to_eng.json -input ./data/HVOR_BLIR_DET_AV_PENGA.srt
You can also try this setup, which includes context in the translation and often leads to better results:
go run ./cmd/GoSubAI generate -config ./config/translate_to_eng_with_context.json -input ./data/HVOR_BLIR_DET_AV_PENGA.srt
To run unit tests, use the following command:
go clean -testcache; go test ./...
The following configuration will include the original text as well as the translated text:
{
"HostUrl": "default",
"Model": "mistral",
"PropertyName": "translated_text",
"SystemPrompt": "You are a subtitle translation assistant. Your only task is to translate subtitles into the target language specified by the user. Subtitles may contain incomplete sentences-when that happens, translate them literally without trying to complete or alter their meaning. Always keep the translation faithful to the original text and do not add explanations or extra words.",
"Prompt": "Translate this to english: '{TEXT}'",
"Template": "{TEXT}\n----\n{GENERATED_TEXT}",
"Debug": true
}
If you only want to include the translation, change the Template entry to "Template": "{GENERATED_TEXT}".
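To make the placeholders concrete, the sketch below loads a trimmed-down configuration and fills in `{TEXT}` and `{GENERATED_TEXT}`. It is an illustration only: the `Config` struct and the `buildPrompt` and `renderOutput` helpers are hypothetical names, and plain string replacement is an assumption about how the placeholders are substituted, not a description of GoSubAI's internals.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config mirrors the JSON keys shown above. The struct itself is an
// illustrative assumption, not GoSubAI's actual type.
type Config struct {
	HostUrl      string
	Model        string
	PropertyName string
	SystemPrompt string
	Prompt       string
	Template     string
	Debug        bool
}

// buildPrompt substitutes the subtitle text into the Prompt.
func buildPrompt(cfg Config, text string) string {
	return strings.NewReplacer("{TEXT}", text).Replace(cfg.Prompt)
}

// renderOutput substitutes the original and generated text into the Template.
func renderOutput(cfg Config, text, generated string) string {
	return strings.NewReplacer(
		"{TEXT}", text,
		"{GENERATED_TEXT}", generated,
	).Replace(cfg.Template)
}

func main() {
	raw := `{"Prompt": "Translate this to english: '{TEXT}'",
	         "Template": "{TEXT}\n----\n{GENERATED_TEXT}"}`
	var cfg Config
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		panic(err)
	}
	// The subtitle line and its translation are made-up sample values.
	fmt.Println(buildPrompt(cfg, "Jeg vet ikke."))
	fmt.Println(renderOutput(cfg, "Jeg vet ikke.", "I don't know."))
}
```

With the two-part template, each subtitle shows the original line, a `----` separator, and the translation. With "Template": "{GENERATED_TEXT}", only the translation remains.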
For some subtitles, it can be useful to include context from the previous line. Here’s an example configuration that does this:
{
"HostUrl": "default",
"Model": "llama3.2",
"PropertyName": "translated_text",
"SystemPrompt": "You are a subtitle translation assistant. Translate subtitles into the target language specified by the user.\n\n- Translate the text literally.\n- Do not add, remove, or guess words.\n- If the text is incomplete, translate it as-is.\n\n⚠️ Very important: The tag ##TAG## must always be copied exactly, in the same position.\n- Never remove it.\n- Never move it.\n- Never translate it.\n- Output is invalid if ##TAG## is missing or changed.\n\n### Examples\n\nInput: \"Bonjour##TAG##comment tu vas aujourd'hui?\"\nOutput: \"Hello##TAG##how are you today?\"\n\nInput: \"##TAG##Oui, j'arrive.\"\nOutput: \"##TAG##Yes, I'm coming.\"\n\nInput: \"Non##TAG##pas du tout.\"\nOutput: \"No##TAG##not at all.\"",
"Prompt": "Translate this to english: '{PREVIOUS_TEXT} ##TAG## {TEXT}'",
"Template": "{TEXT}\n----\n{GENERATED_TEXT}",
"Regex": "##TAG##(.*)",
"RegexRetryLimit": 25,
"Debug": true
}