* fix dataset_dict.shuffle with single seed
* add seed alias
* missing test
* Update src/datasets/dataset_dict.py
Co-authored-by: Thomas Wolf <[email protected]>
         You can either supply a NumPy BitGenerator to use, or a seed to initiate NumPy's default random generator (PCG64).

         Args:
-            seeds (Optional `Dict[str, int]`): A seed to initialize the default BitGenerator if ``generator=None``.
+            seeds (Optional `Dict[str, int]` or `int`): A seed to initialize the default BitGenerator if ``generator=None``.
                 If None, then fresh, unpredictable entropy will be pulled from the OS.
                 If an int or array_like[ints] is passed, then it will be passed to SeedSequence to derive the initial BitGenerator state.
-                You have to provide one :obj:`seed` per dataset in the dataset dictionary.
+                You can provide one :obj:`seed` per dataset in the dataset dictionary.
+            seed (Optional `int`): A seed to initialize the default BitGenerator if ``generator=None``. Alias for seeds (the seed argument has priority over seeds if both arguments are provided).
             generators (Optional `Dict[str, np.random.Generator]`): Numpy random Generator to use to compute the permutation of the dataset rows.
                 If ``generator=None`` (default), uses np.random.default_rng (the default BitGenerator (PCG64) of NumPy).
                 You have to provide one :obj:`generator` per dataset in the dataset dictionary.
@@ -451,8 +453,13 @@ def shuffle(
                 Higher value gives smaller cache files, lower value consume less temporary memory while running `.map()`.
         """
         self._check_values_type()
+        if seed is not None and seeds is not None:
+            raise ValueError("Please specify seed or seeds, but not both")
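
For readers following the change, here is a minimal, standalone sketch of how a single `seed` (or a single-int `seeds`) could be expanded to one seed per dataset. It is not the code from this commit: only the `ValueError` guard appears in the diff above. The helper name `resolve_seeds`, its `split_names` parameter, and the fallback to `None` for splits missing from a dict are assumptions made for illustration. Note that the guard shown in the diff rejects passing both arguments, so the sketch does the same rather than giving `seed` priority.

```python
from typing import Dict, Iterable, Optional, Union


def resolve_seeds(
    split_names: Iterable[str],
    seed: Optional[int] = None,
    seeds: Optional[Union[int, Dict[str, int]]] = None,
) -> Dict[str, Optional[int]]:
    """Hypothetical helper: map every split name to the seed it should use."""
    # Same guard as in the diff: the two arguments are mutually exclusive.
    if seed is not None and seeds is not None:
        raise ValueError("Please specify seed or seeds, but not both")

    # Treat `seed` as an alias for `seeds`.
    if seed is not None:
        seeds = seed

    names = list(split_names)

    # No seed at all: every split gets None (fresh OS entropy downstream).
    if seeds is None:
        return {name: None for name in names}

    # Per-split seeds. Assumption: splits missing from the dict fall back to None.
    if isinstance(seeds, dict):
        return {name: seeds.get(name) for name in names}

    # A single int: reuse the same seed for every split.
    return {name: seeds for name in names}


if __name__ == "__main__":
    print(resolve_seeds(["train", "test"], seed=42))
    print(resolve_seeds(["train", "test"], seeds={"train": 1, "test": 2}))
```

With the library itself, the call this change is meant to enable would presumably look like `dataset_dict.shuffle(seed=42)`, going by the commit message above.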