Make `test_utils.py` `fork`-safe for `torchelastic` by amorehead · Pull Request #1030 · meta-pytorch/tnt

amorehead · 2025-09-10T21:38:05Z

Summary:
Makes test_utils.py (and torchtnt in general) safe to use with start_method=fork for multi-GPU training with torchelastic. One project that would benefit from this change is fairchem, which uses torchelastic and torchtnt together for multi-GPU training.

Test plan:
I verified that this change lets me train models in the fairchem codebase when elastic_launch is called with start_method=fork. Without it, any Python package that imports torchtnt creates a CUDA context in its parent process. CUDA cannot be re-initialized in forked worker processes, so training with fork on multiple GPUs in parallel then becomes impossible.
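The diff itself is not reproduced here, so the following is only a minimal sketch of the general pattern behind this kind of fix, not the actual change to test_utils.py. It assumes the problem is a CUDA query running at module import time. The helper name `default_device_type` is hypothetical and not part of torchtnt. By default `torch.cuda.is_available()` initializes the CUDA driver, so calling it at module level in the parent poisons later forks. Moving the call into a function body defers it until a worker actually needs it.

```python
import importlib.util

# Fork-unsafe pattern (shown as a comment only): this would run in the
# parent process as soon as the module is imported.
#
#   import torch
#   DEVICE = "cuda" if torch.cuda.is_available() else "cpu"


def default_device_type() -> str:
    """Return "cuda" or "cpu", probing the GPU only when called.

    Hypothetical helper for illustration; not part of torchtnt.
    """
    if importlib.util.find_spec("torch") is None:
        # torch is not installed, so there is nothing to probe.
        return "cpu"
    import torch  # imported lazily so the sketch also runs without torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Importing this module touches no CUDA state; only this call does.
    print(default_device_type())
```

With this layout, a launcher process can import the module freely and each forked worker calls `default_device_type()` itself after the fork.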

Fixes:
Together with this fairchem PR, this fixes crashes in multi-GPU model training (local, not SLURM) with the fairchem codebase when start_method=fork.

Update test_utils.py

4ec9e16

meta-cla bot added the cla signed label Sep 10, 2025

amorehead mentioned this pull request Sep 10, 2025

Add optional start_method argument to _cli.py facebookresearch/fairchem#1476

Open

amorehead wants to merge 1 commit into meta-pytorch:master from amorehead:fork-safe
