Is your feature request related to a problem? Please describe.
I pushed a modification of a large dataset (removing a column) to the Hub. The push was interrupted after some files had already been committed to the repo. As a result, load_dataset() raised an error (ValueError couldn’t cast … because column names don’t match). The only workaround was to pass the previous (complete) commit as revision=commit_hash to load_dataset(). After a successful, complete push, the dataset loads without error again.
Describe the solution you'd like
Would it make sense to detect an incomplete push_to_hub() and automatically revert to the previous commit/revision?
Describe alternatives you've considered
Leave everything as is; the revision parameter of load_dataset() already makes it possible to work around this problem manually.
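For illustration, the manual workaround could be wrapped in a small user-side helper that tries the latest revision first and falls back to a known-good commit when loading fails with the ValueError described above. This is only a sketch, not something the library provides: `load_with_fallback` is a hypothetical name, the loader is passed in as an argument (in practice it would be `datasets.load_dataset`), and the repo id and commit hash in the usage comment are placeholders.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def load_with_fallback(
    repo_id: str,
    revisions: Sequence[Optional[str]],
    loader: Callable[..., Any],
) -> Tuple[Any, Optional[str]]:
    """Try each revision in order and return (dataset, revision) for the first that loads.

    Hypothetical helper: `loader` is expected to accept `revision=` as a
    keyword argument, the way `datasets.load_dataset` does.
    """
    last_error: Optional[Exception] = None
    for revision in revisions:
        try:
            return loader(repo_id, revision=revision), revision
        except ValueError as err:
            # The column-mismatch cast error reported above is a ValueError,
            # so remember it and move on to the next candidate revision.
            last_error = err
    raise RuntimeError(f"no loadable revision found for {repo_id}") from last_error


# Possible usage (placeholders, requires network access and the `datasets` package):
#   from datasets import load_dataset
#   ds, used = load_with_fallback("user/my-dataset", [None, "<commit_hash>"], load_dataset)
# `None` means the default branch; "<commit_hash>" is the last complete commit.
```

Passing the loader in keeps the helper independent of the library, but it still requires the user to know which commit was the last complete one, which is exactly what automatic detection would avoid.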