Fix OSCAR Esperanto #2693

lhoestq · 2021-07-21T14:43:50Z

The original (unshuffled) Esperanto config of OSCAR has the wrong expected number of examples in its metadata:

from datasets import load_dataset
raw_datasets = load_dataset("oscar", "unshuffled_original_eo")

raises

NonMatchingSplitsSizesError:
[{'expected': SplitInfo(name='train', num_bytes=314188336, num_examples=121171, dataset_name='oscar'),
'recorded': SplitInfo(name='train', num_bytes=314064514, num_examples=121168, dataset_name='oscar')}]

I updated the expected number of examples in dataset_infos.json so that it matches what is actually generated.
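For readers unfamiliar with this error, here is a minimal, self-contained sketch of the kind of check that produces it. It is not the library's actual implementation: the `SplitInfo` dataclass and the `find_mismatches` helper below are hypothetical stand-ins, and only the numbers are taken from the error message in this PR. The sketch also assumes the fix brings the expected values in line with the recorded ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitInfo:
    # Hypothetical stand-in for the library's SplitInfo, for illustration only.
    name: str
    num_bytes: int
    num_examples: int


def find_mismatches(expected, recorded):
    """Return the splits whose recorded sizes differ from the expected ones."""
    return [
        {"expected": expected[name], "recorded": recorded[name]}
        for name in expected
        if name in recorded and expected[name] != recorded[name]
    ]


# Values copied from the error message reported in this PR.
expected = {"train": SplitInfo("train", 314188336, 121171)}
recorded = {"train": SplitInfo("train", 314064514, 121168)}

mismatches = find_mismatches(expected, recorded)
print(len(mismatches))  # one mismatching split: "train"

# Assumed effect of the fix: the expected metadata now agrees with
# what is actually generated, so the check finds nothing to report.
fixed_expected = dict(recorded)
print(find_mismatches(fixed_expected, recorded))
```

With the original metadata the check reports the train split (3 examples fewer than expected); once the expected values match the recorded ones, it reports nothing.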

cc @sgugger

fix oscar esperanto

b5346c5

lhoestq merged commit cddc9f2 into master Jul 21, 2021

lhoestq deleted the fix-oscar-esperanto branch July 21, 2021 14:53
