datasets doesn't work with python 3.14 #7839

@zachmoshe

Describe the bug

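It seems that datasets doesn't work with python==3.14. The root cause seems to be an API change affecting dill's pickler: per the traceback below, pickle.py calls _batch_setitems(obj.items(), obj), and the Pickler override does not accept the extra argument.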

TypeError: Pickler._batch_setitems() takes 2 positional arguments but 3 were given

Steps to reproduce the bug

(on a new folder)
uv init
uv python pin 3.14
uv add datasets
uv run python

(in REPL)
import datasets
datasets.load_dataset("cais/mmlu", "all") # will fail on any dataset

>>> datasets.load_dataset("cais/mmlu", "all")
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    datasets.load_dataset("cais/mmlu", "all")
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^                                                                                                                                
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/load.py", line 1397, in load_dataset
    builder_instance = load_dataset_builder(                          
        path=path,                                                                                                                                                           
    ...<10 lines>...                                                               
        **config_kwargs,                     
    )                                                                                                                                                                        
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/load.py", line 1185, in load_dataset_builder
    builder_instance._use_legacy_cache_dir_if_possible(dataset_module)                                                                                                       
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^                                                                                                       
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/builder.py", line 615, in _use_legacy_cache_dir_if_possible
    self._check_legacy_cache2(dataset_module) or self._check_legacy_cache() or None
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^                                                                                                                                
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/builder.py", line 487, in _check_legacy_cache2
    config_id = self.config.name + "-" + Hasher.hash({"data_files": self.config.data_files})
                                         ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                  
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/fingerprint.py", line 188, in hash
    return cls.hash_bytes(dumps(value))  
                          ~~~~~^^^^^^^                                                                                                                                       
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 120, in dumps
    dump(obj, file)             
    ~~~~^^^^^^^^^^^                                                                                                                                                          
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 114, in dump
    Pickler(file, recurse=True).dump(obj)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^                                                                                                                                    
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/dill/_dill.py", line 428, in dump
    StockPickler.dump(self, obj)                                       
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^                                                                                                                                             
  File "/Users/zmoshe/.local/uv/python/cpython-3.14.0rc2-macos-aarch64-none/lib/python3.14/pickle.py", line 498, in dump
    self.save(obj)                                  
    ~~~~~~~~~^^^^^                                                                                                                                                           
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 70, in save
    dill.Pickler.save(self, obj, save_persistent_id=save_persistent_id)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                      
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/dill/_dill.py", line 422, in save
    StockPickler.save(self, obj, save_persistent_id)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                         
  File "/Users/zmoshe/.local/uv/python/cpython-3.14.0rc2-macos-aarch64-none/lib/python3.14/pickle.py", line 572, in save
    f(self, obj)  # Call unbound method with explicit self
    ~^^^^^^^^^^^                                                                  
  File "/Users/zmoshe/temp/test_datasets_py3.14/.venv/lib/python3.14/site-packages/dill/_dill.py", line 1262, in save_module_dict
    StockPickler.save_dict(pickler, obj)                                                                                                                                     
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/Users/zmoshe/.local/uv/python/cpython-3.14.0rc2-macos-aarch64-none/lib/python3.14/pickle.py", line 1064, in save_dict
    self._batch_setitems(obj.items(), obj)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
TypeError: Pickler._batch_setitems() takes 2 positional arguments but 3 were given
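
The last two frames suggest a signature mismatch: the caller passes both the items and the dict, while the `_batch_setitems` that gets resolved only accepts the items. Below is a minimal sketch of that failure mode, not the actual datasets or dill code. `BasePickler`, `OldStylePickler`, `TolerantPickler` and `try_save` are hypothetical names used only for illustration, and the optional `obj` parameter is one possible way to accept both call forms, not a confirmed fix.

```python
class BasePickler:
    """Hypothetical stand-in for a base pickler using the two-argument call."""

    def save_dict(self, obj):
        # Mirrors the call shown in the traceback: the items plus the dict itself.
        return self._batch_setitems(obj.items(), obj)

    def _batch_setitems(self, items, obj):
        return list(items)


class OldStylePickler(BasePickler):
    # Override written against the older form that only received the items.
    def _batch_setitems(self, items):
        return sorted(items)


class TolerantPickler(BasePickler):
    # Assumption: an optional second parameter accepts both call forms.
    def _batch_setitems(self, items, obj=None):
        return sorted(items)


def try_save(pickler, obj):
    """Return the saved items, or the TypeError text if the call fails."""
    try:
        return pickler.save_dict(obj)
    except TypeError as exc:
        return f"TypeError: {exc}"


old_result = try_save(OldStylePickler(), {"b": 1, "a": 2})
new_result = try_save(TolerantPickler(), {"b": 1, "a": 2})
print(old_result)
print(new_result)
```

The old-style override fails with the same "takes 2 positional arguments but 3 were given" wording as the report, while the tolerant override returns the sorted items.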

Expected behavior

Loading a dataset with datasets.load_dataset should work on Python 3.14.

Environment info

datasets==4.3.0
python==3.14 (cpython-3.14.0rc2, per the traceback)
