@bpronan
Collaborator

@bpronan bpronan commented Apr 28, 2025

With a new feature in xet-core, we now support specifying a byte array as the upload data for a xet file upload. We are leveraging that to support passing an array of bytes in the path_or_fileobj parameter of the file upload methods.
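
For illustration only, here is a minimal sketch of how a parameter like path_or_fileobj can dispatch on the type it receives (a path, raw bytes, or a binary file object). The helper name `read_upload_payload` is hypothetical and is not part of huggingface_hub or hf_xet; the actual upload code in this PR may handle each case differently.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union


def read_upload_payload(path_or_fileobj: Union[str, Path, bytes, BinaryIO]) -> bytes:
    """Hypothetical helper: normalize the accepted input types to raw bytes."""
    # Raw bytes are used directly as the upload data.
    if isinstance(path_or_fileobj, (bytes, bytearray)):
        return bytes(path_or_fileobj)
    # A string or Path is treated as a path to a local file.
    if isinstance(path_or_fileobj, (str, Path)):
        return Path(path_or_fileobj).read_bytes()
    # Anything else is assumed to be a seekable binary file object.
    if hasattr(path_or_fileobj, "read"):
        position = path_or_fileobj.tell()
        data = path_or_fileobj.read()
        path_or_fileobj.seek(position)  # leave the stream where we found it
        return data
    raise TypeError(f"Unsupported upload source: {type(path_or_fileobj)!r}")


if __name__ == "__main__":
    print(len(read_upload_payload(b"hello xet")))
    print(len(read_upload_payload(io.BytesIO(b"hello xet"))))
```

Reading everything into memory is only reasonable for small payloads; it is done here to keep the sketch short.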

The xet-core change also comes with some updates to the hf_xet interface. Notably, the notion of a "pointer file" has been removed from the library entirely, and PyPointerFile will be removed in the next major release of hf_xet. This PR moves the Python library to the new data structures, and we've added a test here to ensure backward compatibility until then.

Note: the xet tests here are run against all PRs in the xet-core library.

This should allow us to address the dataset viewer issue here (cc: @lhoestq).

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@hanouticelina hanouticelina left a comment

Thanks @bpronan for the PR! We should set the minimum version of hf_xet to 1.1.0 here and here so that users won't get an incompatible huggingface_hub <> hf_xet pair in their environment.
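
As a rough illustration of the compatibility concern, the sketch below checks an installed hf_xet version against the 1.1.0 floor mentioned above. The names `MIN_HF_XET`, `parse_release`, and `hf_xet_is_compatible` are hypothetical, the distribution name passed to `importlib.metadata.version` is an assumption, and pre-release suffixes are ignored for simplicity. In practice the constraint would normally be declared in the package's dependency metadata rather than checked at runtime.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Minimum hf_xet release suggested in the review comment.
MIN_HF_XET: Tuple[int, int, int] = (1, 1, 0)


def parse_release(raw: str) -> Tuple[int, int, int]:
    """Hypothetical helper: keep the leading digits of the first three components."""
    numbers = []
    for piece in raw.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as "rc1"
            digits += ch
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def hf_xet_is_compatible(installed: Optional[str] = None) -> bool:
    """Return True if the given (or installed) hf_xet version meets the floor."""
    if installed is None:
        try:
            # Assumption: the distribution is discoverable under this name.
            installed = version("hf_xet")
        except PackageNotFoundError:
            return False
    return parse_release(installed) >= MIN_HF_XET


if __name__ == "__main__":
    print(hf_xet_is_compatible("1.0.5"))
    print(hf_xet_is_compatible("1.1.0"))
```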

Contributor

@Wauplin Wauplin left a comment

Thanks for adding support for byte arrays, @bpronan! I've left a few comments, mostly related to Python syntax. All good otherwise.

Contributor

@hanouticelina hanouticelina left a comment

Looks good to me! Thanks @bpronan for the PR.

Contributor

@Wauplin Wauplin left a comment

Thanks for updating! Let's ship this! :)

@hanouticelina hanouticelina merged commit caeaeeb into main May 6, 2025
25 checks passed
@hanouticelina hanouticelina deleted the hf-xet-upload-with-bytes branch May 6, 2025 10:18

5 participants