@pcuenca pcuenca commented Jul 21, 2021

This prevents a schema update with unknown column types, as reported in #2644.

This is my first attempt at fixing the issue. I tested the following:

  • First batch returned by a batched map operation is empty.
  • An intermediate batch is empty.
  • python -m unittest tests.test_arrow_writer passes.

However, arrow_writer looks like a pretty generic interface, so I'm not sure whether there are other uses I may have overlooked. Let me know if that's the case, or if a different approach would be preferable.
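
To illustrate the idea, here is a minimal sketch and not the actual patch. `ToyBatchWriter`, `write_batch`, and `write_all` are hypothetical names made up for this example; they are not part of arrow_writer. The sketch assumes a batch is a dict mapping column names to lists of values, and that an empty batch gives no information about column types, so it is skipped rather than used to update the schema.

```python
from typing import Any, Dict, Iterable, List, Optional

# Assumption: a batch maps each column name to a list of values.
Batch = Dict[str, List[Any]]


class ToyBatchWriter:
    """Hypothetical stand-in for a batch writer (not the real ArrowWriter)."""

    def __init__(self) -> None:
        self.schema: Optional[Dict[str, str]] = None
        self.rows_written = 0

    def write_batch(self, batch: Batch) -> None:
        # Number of rows is the length of any column (0 if there are no columns).
        num_rows = len(next(iter(batch.values()))) if batch else 0
        if num_rows == 0:
            # Empty batch: no values to infer types from, so leave the schema alone.
            return

        # Infer a toy "type" per column from the first value.
        inferred = {name: type(values[0]).__name__ for name, values in batch.items()}
        if self.schema is None:
            self.schema = inferred
        elif inferred != self.schema:
            raise TypeError(f"schema mismatch: {inferred} != {self.schema}")

        self.rows_written += num_rows


def write_all(batches: Iterable[Batch]) -> ToyBatchWriter:
    """Feed every batch to a fresh writer and return it for inspection."""
    writer = ToyBatchWriter()
    for batch in batches:
        writer.write_batch(batch)
    return writer


if __name__ == "__main__":
    # Covers the two cases tested above: empty first batch, empty intermediate batch.
    writer = write_all([{"text": []}, {"text": ["a", "b"]}, {"text": []}, {"text": ["c"]}])
    print(writer.schema, writer.rows_written)
```

The early return is the whole point of the sketch: the schema is only ever set from a batch that actually contains rows.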

pcuenca and others added 2 commits July 22, 2021 00:19
This prevents a schema update with unknown column types.
Reference: huggingface#2644

@lhoestq lhoestq left a comment

It works like a charm, thanks :)

@lhoestq lhoestq merged commit 6b809c0 into huggingface:master Jul 26, 2021
@pcuenca pcuenca deleted the map-ignore-empty-results branch July 26, 2021 14:56