Closed
Description
Describe the bug
A dataset written to disk with to_json() cannot be loaded back using json.load().
Steps to reproduce the bug
import json
from datasets import load_dataset
dataset = load_dataset("squad")
train_dataset, test_dataset = dataset["train"], dataset["validation"]
test_dataset.to_json("full_dataset.json")
# This works
loaded_test = load_dataset("json", data_files="full_dataset.json")
# This fails
with open("full_dataset.json", "r") as f:
    loaded_test = json.load(f)
Expected behavior
to_json() should write a single valid JSON document so that the resulting file can be loaded using json.load().
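For reference, the failure is consistent with to_json() writing JSON Lines (one JSON object per line) rather than a single JSON document, which json.load() does not accept. Below is a minimal sketch, assuming that layout, of reading such a file with only the standard library. The helper name read_json_lines is hypothetical and not part of datasets.

```python
import json


def read_json_lines(path):
    """Parse a JSON Lines file: one JSON value per non-empty line.

    Hypothetical helper for illustration; assumes the file holds one
    complete JSON object on each line.
    """
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records
```

With the file from the steps above, this would be called as read_json_lines("full_dataset.json") and returns a list of dicts, one per example.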
Environment info
Colab: https://colab.research.google.com/drive/1st1iStFUVgu9ZPvnzSzL4vDeYWDwYpUm?usp=sharing