[Arrow writer, Trivia_qa] Could not convert TagMe with type str: converting to null type

Running the following code:

```
import nlp
ds = nlp.load_dataset("trivia_qa", "rc", split="validation[:1%]")  # this might take 2.3 min to download but it's cached afterwards...
ds.map(lambda x: x, load_from_cache_file=False)
```

triggers an `ArrowInvalid: Could not convert TagMe with type str: converting to null type` error.

On the other hand, if we remove the `entity_pages` column of `trivia_qa`, which seems to be responsible for the bug, it works:

```
import nlp
ds = nlp.load_dataset("trivia_qa", "rc", split="validation[:1%]")  # this might take 2.3 min to download but it's cached afterwards...
ds.map(lambda x: x, remove_columns=["entity_pages"], load_from_cache_file=False)
```

It seems quite hard to debug what's going on here... @lhoestq @thomwolf, do you have a good first guess at what the problem could be?
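One thing that might help narrow it down: the message reads as if Arrow settled on a null type for some field and later met a string there. That is only a guess on my side, not something confirmed here. Under that assumption, a small pure-Python helper can list the nested fields that hold no values in the first examples but do hold values later on. The helper names, the `batch_size` default, and the toy field names below are all hypothetical; this is a rough sketch, not part of `nlp`.

```
def populated_paths(value, prefix=""):
    """Yield the dotted path of every leaf that holds a non-null scalar."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from populated_paths(child, path)
    elif isinstance(value, (list, tuple)):
        # List items share the path of the list itself.
        for child in value:
            yield from populated_paths(child, prefix)
    elif value is not None:
        yield prefix


def find_late_populated_fields(records, batch_size=1000):
    """Return fields that are empty in the first `batch_size` records
    but hold at least one value in a later record.

    `batch_size` is an assumed stand-in for however many examples the
    writer looks at before fixing the schema.
    """
    early, late = set(), set()
    for index, record in enumerate(records):
        target = early if index < batch_size else late
        target.update(populated_paths(record))
    return sorted(late - early)


# Toy records with made-up sub-fields, only to show the idea.
toy_records = [
    {"question": "q1", "entity_pages": {"title": [], "wiki_context": []}},
    {"question": "q2", "entity_pages": {"title": ["TagMe"], "wiki_context": []}},
]
suspects = find_late_populated_fields(toy_records, batch_size=1)
print(suspects)
```

If iterating over `ds` yields one dict per example (I have not checked this), `find_late_populated_fields(ds)` could be run directly on the split above to see whether `entity_pages` shows up.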

**Note:** I think this could be a good test to check that the datasets work correctly: take a tiny portion of the dataset and check that it can be written correctly.

[Arrow writer, Trivia_qa] Could not convert TagMe with type str: converting to null type #211
