
Commit 31ae21f

Fix for Incorrect ex_iterable used with multi num_worker (#6582)
Corrects an issue where `self._ex_iterable` was erroneously used instead of `ex_iterable` when Distributed Data Parallel (DDP) and multiple DataLoader workers (`num_workers > 1`) are used concurrently. The wrong iterable produced incorrect `shards_indices`, which in turn broke the control flow that decides whether a given dataloader worker has shards to iterate. The fix ensures the appropriate iterable is used, so that determination is made correctly.
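
The two-level shard split behind this fix can be illustrated with a minimal sketch in plain Python (a toy model, not the `datasets` API; the round-robin assignment and the helper names `split_by_node` / `split_by_worker` are illustrative assumptions). Under DDP, shards are first partitioned across nodes; each node's DataLoader workers then partition that node's shards. Splitting from the original, unsplit shard list at the second step, as the buggy line did, ignores the node-level split:

```python
# Toy model of two-level shard splitting (illustrative, not the datasets API).

def split_by_node(shards, rank, world_size):
    # Node-level split under DDP: each node keeps every world_size-th shard.
    return shards[rank::world_size]

def split_by_worker(shards, worker_id, num_workers):
    # Worker-level split inside each DataLoader worker process.
    return shards[worker_id::num_workers]

all_shards = list(range(8))      # 8 shards in total
rank, world_size = 1, 2          # this process runs on node 1 of 2
worker_id, num_workers = 0, 2    # dataloader worker 0 of 2 on this node

node_shards = split_by_node(all_shards, rank, world_size)    # [1, 3, 5, 7]

# Correct (after the fix): split this node's shards across its workers.
print(split_by_worker(node_shards, worker_id, num_workers))  # [1, 5]

# Buggy (before the fix): split the original, unsplit shard list, so
# workers on different nodes end up with overlapping shard indices.
print(split_by_worker(all_shards, worker_id, num_workers))   # [0, 2, 4, 6]
```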
1 parent e5406f9 commit 31ae21f

File tree

1 file changed (+1, -1 lines)

src/datasets/iterable_dataset.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -1275,7 +1275,7 @@ def _iter_pytorch(self):
             )
         # split workload
         _log_prefix = f"node#{self._distributed.rank} " if self._distributed else ""
-        shards_indices = self._ex_iterable.split_shard_indices_by_worker(worker_info.id, worker_info.num_workers)
+        shards_indices = ex_iterable.split_shard_indices_by_worker(worker_info.id, worker_info.num_workers)
         if shards_indices:
             logger.debug(
                 f"{_log_prefix}dataloader worker#{worker_info.id}, ': Starting to iterate over {len(shards_indices)}/{ex_iterable.n_shards} shards."
```
