Make GoogleTranslator multiprocessing-safe #1543

Merged
jmartin-tech merged 1 commit into NVIDIA:main from motlaharsh0909-lgtm:fix-googletranslator-pickling
Jan 14, 2026

motlaharsh0909-lgtm (Contributor) commented Dec 26, 2025

Tell us what this change does

GoogleTranslator stores a live Google Cloud Translation client on the instance, which cannot be pickled when parallel_attempts > 1 and causes a PicklingError during multiprocessing startup.

This change prevents the client from being pickled and allows it to be recreated per worker process, following the same pattern used by RivaTranslator.

Fixes #1515
Related issue: #1515
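
The approach described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `FakeClient` and `SketchTranslator` are hypothetical stand-ins for the Google Cloud Translation client and the translator class, and only the `client` attribute and the `__getstate__`/`__setstate__` hooks come from the discussion.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a live API client that cannot be pickled."""

    def __init__(self):
        # A lock makes the object unpicklable, much like a live network client.
        self._lock = threading.Lock()


class SketchTranslator:
    """Hypothetical translator that leaves its client out of the pickle."""

    def __init__(self):
        self.target_lang = "ar"
        self.client = FakeClient()

    def __getstate__(self):
        # Copy the instance state and replace the live client with None.
        state = dict(self.__dict__)
        state["client"] = None
        return state

    def __setstate__(self, state):
        # Restore plain attributes; the client must be recreated in the worker.
        self.__dict__.update(state)
        self.client = None


translator = SketchTranslator()
# multiprocessing performs a similar round trip when it hands objects to workers.
restored = pickle.loads(pickle.dumps(translator))
print(restored.target_lang, restored.client)
```

Without the two hooks, `pickle.dumps(translator)` would fail on the lock held by the stand-in client.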

Verification
Supporting configuration such as generator configuration file
run:
  target_lang: ar
  parallel_attempts: 4

langproviders:
  - language: en,ar
    model_type: remote.GoogleTranslator

Command used for verification
python -m garak --config test_google_translate.yaml --model_type openai --model_name gpt-4o

Run the tests and ensure they pass
python -m pytest tests/

Verify the thing does what it should

Garak starts successfully with parallel_attempts > 1

GoogleTranslator is initialized per worker process

No PicklingError occurs during multiprocessing startup

Probes are queued and begin execution

Verify the thing does not do what it should not

The Google Cloud Translation client is not pickled

Multiprocessing worker initialization does not fail

Additional information

No hardware-specific requirements

Uses Google Cloud Translation API for verification

No new dependencies introduced

github-actions bot (Contributor) commented Dec 26, 2025

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

GoogleTranslator stores a live Google Cloud translation client on the
instance, which cannot be pickled when parallel_attempts > 1 and causes
PicklingError during multiprocessing startup.

This change prevents the client from being pickled and allows it to be
recreated per worker process, following the same pattern used by
RivaTranslator.

Signed-off-by: Harsh Motla <[email protected]>
@motlaharsh0909-lgtm motlaharsh0909-lgtm force-pushed the fix-googletranslator-pickling branch from 7092f34 to 97d5644 Compare December 26, 2025 12:21
motlaharsh0909-lgtm (Contributor Author) commented

recheck

motlaharsh0909-lgtm (Contributor Author) commented

@cla-assistant check

leondz (Collaborator) commented Dec 26, 2025

@motlaharsh0909-lgtm "You can sign the DCO by just posting a Pull Request Comment same as the below format.

I have read the DCO Document and I hereby sign the DCO"

motlaharsh0909-lgtm (Contributor Author) commented

I have read the DCO Document and I hereby sign the DCO

motlaharsh0909-lgtm (Contributor Author) commented

recheck

github-actions bot added a commit that referenced this pull request Dec 26, 2025
jmartin-tech (Collaborator) left a comment

Testing of this is queued; the guard suggestion offered is precautionary.

        self._tested = True

    def _translate(self, text: str) -> str:
        retry = 5

Currently only the primary process is expected to access langproviders, meaning client should always be set during a call to translate. This guard, however, would ensure the client is restored if access were to occur on an object that had passed through pickle:

Suggested change
        retry = 5
        if not self.client:
            self._load_langprovider()
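
As a rough sketch of how this guard could behave after a pickle round trip (an illustration only, not the project's implementation): `GuardedTranslator`, `load_count`, and the use of `object()` as a client are hypothetical, while `_load_langprovider`, `_translate`, and `client` are the names used in the review.

```python
import pickle


class GuardedTranslator:
    """Hypothetical translator that reloads its client on demand."""

    def __init__(self):
        self.client = None
        self.load_count = 0  # illustration only: counts client creations
        self._load_langprovider()

    def _load_langprovider(self):
        # Assumption: the real method builds the API client; object() stands in.
        self.client = object()
        self.load_count += 1

    def __getstate__(self):
        state = dict(self.__dict__)
        state["client"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.client = None

    def _translate(self, text: str) -> str:
        # Guard from the review: restore the client if pickling dropped it.
        if not self.client:
            self._load_langprovider()
        # Placeholder for the real API call, which would use self.client.
        return text


restored = pickle.loads(pickle.dumps(GuardedTranslator()))
client_before = restored.client
translated = restored._translate("hello")
print(client_before, restored.client is not None)
```

The guard costs one truthiness check per call and only triggers a reload when the instance arrived without a client.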

jmartin-tech (Collaborator) left a comment

Testing shows the need to clear one more attribute:

    cls(buf, protocol).dump(obj)
TypeError: cannot pickle 'module' object

Also some minor formatting asks.
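
The failure quoted above can be reproduced in isolation. The sketch below is an assumption-laden illustration: `HoldsModule` is a hypothetical class, and the standard-library `json` module stands in for whatever module object the translator keeps as an attribute.

```python
import json
import pickle


class HoldsModule:
    """Hypothetical object that keeps an imported module as an attribute."""

    def __init__(self):
        # Stand-in for an instance attribute that holds a module object.
        self.helper = json


try:
    pickle.dumps(HoldsModule())
    error_message = None
except TypeError as exc:
    # Module objects are not picklable, so dumping the instance fails.
    error_message = str(exc)

print(error_message)
```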

Comment on lines +172 to +174
    def __getstate__(self):
        state = dict(self.__dict__)
        state["client"] = None

Testing shows that the ftfy attribute also cannot be pickled:

Suggested change
-    def __getstate__(self):
-        state = dict(self.__dict__)
-        state["client"] = None
+    def __getstate__(self):
+        state = dict(self.__dict__)
+        state["client"] = None
+        state["ftfy"] = None
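
One way the suggestion above might fit together with restoring the module afterwards is sketched here. Everything beyond the two cleared keys is an assumption: `ModuleAwareTranslator` and `_load_helper` are hypothetical, `json` stands in for the ftfy module so the example needs no third-party install, and re-importing in `__setstate__` is one possible design rather than what the project does.

```python
import importlib
import pickle


class ModuleAwareTranslator:
    """Hypothetical translator holding both a client and a module attribute."""

    def __init__(self):
        self.client = object()  # stand-in for the live API client
        self._load_helper()

    def _load_helper(self):
        # Assumption: json stands in for the module stored on self.ftfy.
        self.ftfy = importlib.import_module("json")

    def __getstate__(self):
        # Clear both attributes that cannot be pickled.
        state = dict(self.__dict__)
        state["client"] = None
        state["ftfy"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.client = None
        # Re-import the module in the receiving process (a design assumption).
        self._load_helper()


restored = pickle.loads(pickle.dumps(ModuleAwareTranslator()))
print(restored.client, restored.ftfy.__name__)
```

Re-importing is cheap because Python caches modules in `sys.modules`; the client, by contrast, is left as `None` until it is explicitly reloaded.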


    def __setstate__(self, state):
        self.__dict__.update(state)
        self.client = None

Minor whitespace nitpick:

Suggested change
-        self.client = None
+        self.client = None

@jmartin-tech jmartin-tech merged commit 97d5644 into NVIDIA:main Jan 14, 2026
15 of 16 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 14, 2026
Development

Successfully merging this pull request may close these issues.

GoogleTranslator fails with PicklingError when using parallel_attempts > 1

3 participants