-
Notifications
You must be signed in to change notification settings - Fork 3k
Use tqdm from tqdm_utils #2667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use tqdm from tqdm_utils #2667
Conversation
| return function(data_struct) | ||
|
|
||
| disable_tqdm = bool(logger.getEffectiveLevel() > INFO) | ||
| disable_tqdm = bool(logger.getEffectiveLevel() > logging.INFO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing this line with disable_tqdm = bool(logging.get_verbosity() == logging.NOTSET) (this check was introduced in #2534) causes 2 tests to fail on Windows, which is very strange.
UPDATE:
It took me some time to find the bug. This is the PR with the fix I've opened in the tqdm repo. I'll update the line in a separate PR if it gets merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job discovering what caused this issue !
|
The current CI failure is due to modifications in the dataset script. |
lhoestq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice thanks :)
I just have one comment about the change in src/datasets/__init__.py
| return function(data_struct) | ||
|
|
||
| disable_tqdm = bool(logger.getEffectiveLevel() > INFO) | ||
| disable_tqdm = bool(logger.getEffectiveLevel() > logging.INFO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job discovering what caused this issue !
lhoestq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot :)
|
Merging since the CI is only failing because of dataset card issues, which is unrelated to this PR |
This PR replaces
tqdmfrom thetqdmlib withtqdmfromdatasets.utils.tqdm_utils. With this change, it's possible to disable progress bars just by callingdisable_progress_bar. Note this doesn't work on Windows when using multiprocessing due to how global variables are shared between processes. Currently, there is no easy way to disable progress bars in a multiprocess setting on Windows (patching logging withdatasets.utils.logging.get_verbosity = lambda: datasets.utils.logging.NOTSETdoesn't seem to work as well), so adding support for this is a future goal. Additionally, this PR adds a unit ("ba" for batches) to the bar printed byDataset.to_json(this change is motivated by #2657).