Module namespace cleanup for v2.0 #3875
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@severo No, this PR doesn't fix that issue in the current state. We can fix it by adding @lhoestq @albertvillanova WDYT?
lhoestq left a comment
Thank you! This is much cleaner
    # Add vectors
    logger.info(f"Adding {len(vectors)} vectors to the faiss index")
    for i in utils.tqdm(range(0, len(vectors), batch_size), disable=not utils.is_progress_bar_enabled()):
    for i in utils.tqdm_utils.tqdm(
I think it's fine if we have tqdm directly in utils with disable_progress_bar
Is it okay if I address this in a separate PR before the release (I'll make disable_progress_bar public again + add enable_progress_bar, enable_caching and disable_caching) to close #3586?
Sure!
    from .arrow_reader import ArrowReader, ReadInstruction
    from .arrow_writer import ArrowWriter
Here ArrowWriter and ArrowReader are removed from the top level module. I've seen that the lxmert example in transformers needs datasets.ArrowWriter, so I'm not sure if we should remove them
Ok :)
Feel free to merge this one if it's good for you :)
This is an attempt to make the user-facing `datasets`' submodule namespace cleaner.

In particular, this PR does the following:
- removes `zip_nested` and `flatten_nest_dict` and their accompanying tests
- removes `pyarrow` from the top-level namespace
- uses `__all__` and the `from <module> import *` syntax to avoid importing the `<module>`'s submodules
- cleans up the `utils` namespace
- moves the `temp_seed` context manager from `datasets/utils/file_utils.py` to `datasets/utils/py_utils.py`
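
For readers less familiar with how `__all__` interacts with `from <module> import *`, here is a minimal, self-contained sketch of the general Python behavior. It is not code from this PR: the module names `demo_without_all` and `demo_with_all` and the helpers `make_module` and `star_import` are hypothetical and exist only for the illustration.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module from source text and register it for import.

    Hypothetical helper used only for this illustration.
    """
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the set of names that `from <name> import *` binds."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    # exec() adds __builtins__ to the namespace; it is not part of the import.
    namespace.pop("__builtins__", None)
    return set(namespace)


# Without __all__, every public name leaks out, including the imported `json` module.
make_module("demo_without_all", "import json\ndef load(): return 'loaded'\n")

# With __all__, only the listed names are exported by the star import.
make_module("demo_with_all", "import json\n__all__ = ['load']\ndef load(): return 'loaded'\n")

leaky = star_import("demo_without_all")
clean = star_import("demo_with_all")

print(sorted(leaky))  # ['json', 'load']
print(sorted(clean))  # ['load']
```

Assuming the package's `__init__.py` re-exports names with star imports, declaring `__all__` in each submodule is what keeps helper modules imported there from showing up in the package namespace.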