
Split order is not preserved #6196

@albertvillanova

I have noticed that in some cases the split order is not preserved.

For example, consider a no-script dataset with configs:

configs:
- config_name: default
  data_files:
  - split: train
    path: train.csv
  - split: test
    path: test.csv
  • Note the defined split order is [train, test]
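
To reproduce this locally, a minimal sketch could write the config above plus two small CSV files to a temporary directory. The column names (`text`, `label`) and row counts (2 and 1) match the output reported in this issue; the row values, the directory name `split_order_repro`, and the helper `write_repro_dataset` are hypothetical placeholders, and placing the config in the YAML header of a `README.md` is an assumption about where it lives.

```python
import csv
import tempfile
from pathlib import Path

# YAML header copied from the issue; the README.md location is an assumption.
README = """\
---
configs:
- config_name: default
  data_files:
  - split: train
    path: train.csv
  - split: test
    path: test.csv
---
"""


def write_repro_dataset(root: Path) -> Path:
    """Create a no-script dataset directory with train/test CSV files."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "README.md").write_text(README, encoding="utf-8")
    # Placeholder rows: only the column names and row counts come from the issue.
    rows = {"train.csv": [("a", 0), ("b", 1)], "test.csv": [("c", 0)]}
    for name, data in rows.items():
        with open(root / name, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["text", "label"])
            writer.writerows(data)
    return root


repro_dir = write_repro_dataset(Path(tempfile.mkdtemp()) / "split_order_repro")

# With the `datasets` library installed, load the directory and inspect the order:
#   from datasets import load_dataset
#   ds = load_dataset(str(repro_dir))
#   print(list(ds))
```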

Once the dataset is loaded, the split order is not preserved:

In [16]: ds
Out[16]: 
DatasetDict({
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 1
    })
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 2
    })
})
  • Note the obtained split order is [test, train]
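
Until the order is preserved on load, one possible workaround is to rebuild the mapping in the intended order. The sketch below relies on `DatasetDict` being a `dict` subclass that can be constructed from a plain dict; the helper `reorder_splits` is hypothetical and not part of the `datasets` API, and the demonstration uses a plain `dict` with placeholder values so that it runs without the library.

```python
def reorder_splits(splits, order):
    """Return a mapping of the same type as `splits`, with keys in `order`."""
    missing = [name for name in order if name not in splits]
    if missing:
        raise KeyError(f"splits not found: {missing}")
    # Splits not mentioned in `order` are appended in their current relative order.
    names = list(order) + [name for name in splits if name not in order]
    return type(splits)({name: splits[name] for name in names})


# Stand-in for the loaded DatasetDict, in the order reported above.
loaded = {"test": "test-split", "train": "train-split"}
fixed = reorder_splits(loaded, ["train", "test"])
print(list(fixed))  # ['train', 'test']
```

With a real dataset, the call would be `ds = reorder_splits(ds, ["train", "test"])`.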

Metadata

Labels

bug: Something isn't working
