|
| 1 | +# AI Localization |
| 2 | + |
| 3 | +Crowdin offers a set of tools to help you localize your project with AI. These tools are designed to help you save time and effort on localization tasks, making the process more efficient and cost-effective. |
| 4 | + |
| 5 | +The Crowdin GitHub Action automates the localization workflow, but AI translation is only as good as the context it receives. Make sure the AI has the information it needs about your project, starting with the source files. |
| 6 | + |
| 7 | +## Preparing Your Project |
| 8 | + |
| 9 | +Crowdin integrates with top AI providers, including OpenAI, Google Gemini, Microsoft Azure OpenAI, DeepSeek, xAI, and more, allowing you to leverage advanced AI-powered translations that consider additional context at different levels. |
| 10 | + |
| 11 | +To get started, you will need to add the AI Provider and Prompt to your Crowdin profile or organization settings. |
| 12 | + |
| 13 | +> [!TIP] |
| 14 | +> Visit the [Crowdin AI](https://support.crowdin.com/crowdin-ai/) page to learn more about the AI providers and how to set up AI in your Crowdin account. |
| 15 | +
|
| 16 | +After setting up the AI provider and Prompt, store their IDs as GitHub Actions secrets: create the `PROVIDER_ID` and `PROMPT_ID` secrets in _Repository settings_ -> _Secrets and variables_ -> _Actions_ -> _Repository secrets_. Also create secrets for the Personal Access Token and the Crowdin Project ID: `CROWDIN_PERSONAL_TOKEN` and `CROWDIN_PROJECT_ID`. Read more about [Personal Access Tokens](https://support.crowdin.com/account-settings/#personal-access-tokens/). |
| 17 | + |
| 18 | +You should now have the following secrets configured for your repository: |
| 19 | + |
| 20 | +- `PROVIDER_ID` - AI Provider ID |
| 21 | +- `PROMPT_ID` - Prompt ID |
| 22 | +- `CROWDIN_PERSONAL_TOKEN` - Crowdin Personal Access Token |
| 23 | +- `CROWDIN_PROJECT_ID` - Crowdin Project ID |
| 24 | + |
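As a quick sanity check, a workflow step could verify that all four values are present before calling Crowdin. The sketch below is illustrative only: `missing_vars` is a hypothetical helper, not part of the Crowdin CLI or the GitHub Action, and it assumes the secrets have been mapped to same-named environment variables for the step (for example, through an `env:` block).

```bash
# Hypothetical helper: print the names of unset or empty environment variables.
# Returns a non-zero status if at least one name is missing.
missing_vars() {
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "$missing"
    return 1
  fi
}
```

A step could then run `missing_vars PROVIDER_ID PROMPT_ID CROWDIN_PERSONAL_TOKEN CROWDIN_PROJECT_ID || exit 1` so that the job stops early with the list of missing names instead of failing later inside a Crowdin call.
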
| 25 | +## Automatic AI Pre-Translation |
| 26 | + |
| 27 | +The basic workflow for AI localization includes the following steps: |
| 28 | + |
| 29 | +- Source files are uploaded to Crowdin. |
| 30 | +- Automatic pre-translation is performed using the AI provider. |
| 31 | +- Translations are downloaded. |
| 32 | +- Translations are pushed to the repository and a pull request is created. |
| 33 | + |
| 34 | +Here is an example of how to set up the automatic AI pre-translation workflow using the Crowdin GitHub Action: |
| 35 | + |
| 36 | +```yaml |
| 37 | +name: Crowdin Pre-translate with AI |
| 38 | + |
| 39 | +on: |
| 40 | +  push: |
| 41 | +    branches: [ main ] |
| 42 | +  workflow_dispatch: |
| 43 | + |
| 44 | +jobs: |
| 45 | +  crowdin-process: |
| 46 | +    runs-on: ubuntu-latest |
| 47 | + |
| 48 | +    steps: |
| 49 | +      - uses: actions/checkout@v4 |
| 50 | + |
| 51 | +      - name: Upload Sources to Crowdin |
| 52 | +        uses: crowdin/github-action@v2 |
| 53 | +        with: |
| 54 | +          upload_sources: true |
| 55 | +          upload_translations: false |
| 56 | +          download_translations: false |
| 57 | +          create_pull_request: false |
| 58 | +          push_translations: false |
| 59 | +        env: |
| 60 | +          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} |
| 61 | +          CROWDIN_PROJECT_ID: ${{ secrets.CROWDIN_PROJECT_ID }} |
| 62 | +          CROWDIN_PERSONAL_TOKEN: ${{ secrets.CROWDIN_PERSONAL_TOKEN }} |
| 63 | + |
| 64 | +      - name: Pre-translate with AI |
| 65 | +        uses: crowdin/github-action@v2 |
| 66 | +        with: |
| 67 | +          command: 'pre-translate' |
| 68 | +          command_args: '--method ai --ai-prompt=${{ secrets.PROMPT_ID }}' |
| 69 | +        env: |
| 70 | +          CROWDIN_PROJECT_ID: ${{ secrets.CROWDIN_PROJECT_ID }} |
| 71 | +          CROWDIN_PERSONAL_TOKEN: ${{ secrets.CROWDIN_PERSONAL_TOKEN }} |
| 72 | + |
| 73 | +      - name: Download Translations from Crowdin |
| 74 | +        uses: crowdin/github-action@v2 |
| 75 | +        with: |
| 76 | +          upload_sources: false |
| 77 | +          upload_translations: false |
| 78 | +          download_translations: true |
| 79 | +          localization_branch_name: l10n_crowdin_ai_translations |
| 80 | +          create_pull_request: true |
| 81 | +          pull_request_title: 'New Crowdin AI Translations' |
| 82 | +          pull_request_body: 'New translations generated with Crowdin AI pre-translation' |
| 83 | +          pull_request_base_branch_name: 'main' |
| 84 | +        env: |
| 85 | +          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} |
| 86 | +          CROWDIN_PROJECT_ID: ${{ secrets.CROWDIN_PROJECT_ID }} |
| 87 | +          CROWDIN_PERSONAL_TOKEN: ${{ secrets.CROWDIN_PERSONAL_TOKEN }} |
| 88 | +``` |
| 89 | +
|
| 90 | +## Providing Context for AI |
| 91 | +
|
| 92 | +Context is crucial for accurate, high-quality translations. With enhanced context, AI can produce production-quality translations that were previously only possible with human input. |
| 93 | +
|
| 94 | +> [!IMPORTANT] |
| 95 | +> Experiments have shown that LLM + AI Extracted Context can improve the translation quality by up to 75% and LLM + AI Extracted Context + Screenshots + Crowdin AI Tools can improve the translation quality by up to 95%. |
| 96 | +
|
| 97 | +### Harvesting Context with Crowdin Context Harvester CLI |
| 98 | +
|
| 99 | +To ensure that AI translations are accurate and contextually relevant, you need to provide the AI with the necessary context. |
| 100 | +
|
| 101 | +Crowdin allows you to provide various levels of context to the AI, including the options available during prompt configuration (glossary terms, TM suggestions, previous and next strings, file context, screenshots, and more). It's highly recommended that you provide the AI with as much context as possible to improve the quality of the translations. |
| 102 | +
|
| 103 | +You can use the [Crowdin Context Harvester CLI](https://store.crowdin.com/crowdin-context-harvester-cli) in your CI/CD pipeline to automate the context extraction process. The Context Harvester CLI is designed to simplify the process of extracting context for Crowdin strings from your code. Using Large Language Models (LLMs), it automatically analyzes your project code to find out how each key is used. This information is extremely useful for the human linguists or AI that will be translating your project keys, and is likely to improve the quality of the translation. |
| 104 | +
|
| 105 | +First, install the Crowdin Context Harvester CLI and configure it with your Crowdin project: |
| 106 | +
|
| 107 | +```bash |
| 108 | +npm i -g crowdin-context-harvester |
| 109 | +crowdin-context-harvester configure |
| 110 | +``` |
| 111 | + |
| 112 | +You'll be asked to enter all the necessary information, such as your Crowdin Personal Access Token, Project ID, and other details. |
| 113 | + |
| 114 | +For example: |
| 115 | + |
| 116 | +```bash |
| 117 | +crowdin-context-harvester configure |
| 118 | +? What Crowdin product do you use? Crowdin.com |
| 119 | +? Crowdin Personal API token (with Project, AI scopes): __your_personal_token_ |
| 120 | +? Crowdin project: Test Project |
| 121 | +? AI provider: Crowdin AI Provider |
| 122 | +? Crowdin AI provider (you should have the OpenAI provider configured in Crowdin): Open AI |
| 123 | +? AI model (newest models with largest context window are preferred): gpt-4 |
| 124 | +? Model context window size in tokens: 128000 |
| 125 | +? Model maximum output tokens count: 16384 |
| 126 | +? Check if the code contains the key or the text of the string before sending it to the AI model |
| 127 | +(recommended if you have thousands of keys to avoid chunking and improve speed).: I use keys in the code |
| 128 | +? Custom prompt file. "-" to read from STDIN (optional): |
| 129 | +? Local files (glob pattern): **/*.* |
| 130 | +? Ignore local files (glob pattern). Make sure to exclude unnecessary files to avoid unnecessary AI API calls: /**/node_modules/** |
| 131 | +? Crowdin files (glob pattern e.g. **/*.*).: **/*.* |
| 132 | +? CroQL query (optional): |
| 133 | +? Output: Terminal (dry run) |
| 134 | + |
| 135 | +You can now execute the harvest command by running: |
| 136 | + |
| 137 | +crowdin-context-harvester harvest --token="__your_personal_token_" --project=11111 --ai="crowdin" --crowdinAiId=2222 --model="gpt-4" --localFiles="**/*.*" --localIgnore="/**/node_modules/**" --crowdinFiles="**/*.*" --contextWindowSize="128000" --maxOutputTokens="16384" --screen="keys" --output="terminal" |
| 138 | +``` |
| 139 | +
|
| 140 | +Once you have configured the Crowdin Context Harvester CLI, you can use the generated `harvest` command in your CI/CD pipeline to extract context from your project files and provide it to the AI for better translations. |
| 141 | +
|
| 142 | +Then, add the following steps to your GitHub Actions workflow to extract the context and provide it to the AI: |
| 143 | +
|
| 144 | +```yaml |
| 145 | +# Upload sources step |
| 146 | +
|
| 147 | +- uses: actions/setup-node@v4 |
| 148 | +  with: |
| 149 | +    node-version: '20' |
| 150 | +
|
| 151 | +- name: Extract Context for AI |
| 152 | +  run: | |
| 153 | +    npm i -g crowdin-context-harvester |
| 154 | +    crowdin-context-harvester harvest \ |
| 155 | +      --ai="crowdin" \ |
| 156 | +      --crowdinAiId="${{ secrets.PROVIDER_ID }}" \ |
| 157 | +      --model="gpt-4" \ |
| 158 | +      --localFiles="**/*.*" \ |
| 159 | +      --localIgnore="/**/node_modules/**" \ |
| 160 | +      --crowdinFiles="**/*.*" \ |
| 161 | +      --contextWindowSize="128000" \ |
| 162 | +      --maxOutputTokens="16384" \ |
| 163 | +      --screen="keys" \ |
| 164 | +      --output="csv" --csvFile="crowdin-context.csv" |
| 165 | +
|
| 166 | +- name: Upload Context |
| 167 | +  run: | |
| 168 | +    crowdin-context-harvester upload \ |
| 169 | +      --token="${{ secrets.CROWDIN_PERSONAL_TOKEN }}" \ |
| 170 | +      --project="${{ secrets.CROWDIN_PROJECT_ID }}" \ |
| 171 | +      --csvFile="crowdin-context.csv" |
| 172 | +
|
| 173 | +# Pre-translate with AI step |
| 174 | +# Download translations step |
| 175 | +``` |
| 176 | +
|
| 177 | +> [!CAUTION] |
| 178 | +> Do not hard-code the Personal Access Token or Project ID in the workflow file. Store them in GitHub Actions secrets instead. The CLI will use them automatically when they are exposed to the step as environment variables (for example, through an `env:` block). |
| 179 | +
|
| 180 | +### Automated Screenshots |
| 181 | +
|
| 182 | +Crowdin also allows you to provide screenshots to the AI to help it better understand the context. Depending on the type of project, Crowdin offers a few different ways to automate the screenshot generation process: |
| 183 | +
|
| 184 | +- [For Web Projects](https://support.crowdin.com/developer/automating-screenshot-management/) |
| 185 | +- [For Android Projects](https://crowdin.github.io/mobile-sdk-android/guides/screenshots-automation) |
| 186 | +- [For iOS Projects](https://crowdin.github.io/mobile-sdk-ios/guides/screenshots-automation) |
| 187 | +
|
| 188 | +Integrate automated screenshot generation into your CI/CD pipeline to give AI the context it needs for better translations. |