Skip to content

Conversation

@lhoestq
Copy link
Member

@lhoestq lhoestq commented Apr 23, 2021

Added fingerprint and format update wrappers + update the indices by adding the index of the newly added item in the table.

Copy link
Member

@albertvillanova albertvillanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will take this PR as an example to complete my PR on add_column.

if self._indices is None:
indices_table = None
else:
new_indices_array = pa.array([len(self._data)], type=pa.uint64())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe calling it item_indices_table to be consistent with item_table above.

dataset.reset_format()
dataset_to_test.reset_format()
assert dataset[:-1] == dataset_to_test[:]
assert {k: int(v) for k, v in dataset[-1].items()} == {k: int(v) for k, v in item.items()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to introduce an explicit test for the _indices?

For example, this test passes even if I wrongly set in add_item:

new_indices_array = pa.array([9], type=pa.uint64())

@lhoestq
Copy link
Member Author

lhoestq commented Apr 27, 2021

I renamed the variable, added a test for dataset._indices and fixed an issue with class_encode_column

@lhoestq lhoestq merged commit 80e59ef into master Apr 27, 2021
@lhoestq lhoestq deleted the add_item2 branch April 27, 2021 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants