Add KS task to SUPERB #2783
thanks a lot for implementing this @anton-l !! i won't have time to review this while i'm away, so happy for @albertvillanova and @patrickvonplaten to decide when to merge :)
patrickvonplaten
left a comment
Looks very clean to me!
albertvillanova
left a comment
Hi @anton-l, thanks a lot for the addition of this SUPERB task. You did an awesome job! ^^
Just some comments and suggested changes before we merge it into master.
Co-authored-by: Albert Villanova del Moral <[email protected]>
@albertvillanova thanks! Everything should be ready now :)
albertvillanova
left a comment
Just one minor missing "id" and that's all! :)
albertvillanova
left a comment
Thank you!!
@anton-l I was thinking that maybe we could give some hints in the dataset card (in a Usage section), similar to what is done for diarization: https://github.com/huggingface/datasets/blob/master/datasets/superb/README.md#example-of-usage
@albertvillanova yeah, I'm not sure how to best implement it in pure

def map_to_array(example):
    import soundfile as sf

    # Decode the audio file into a sample array plus its sampling rate
    speech_array, sample_rate = sf.read(example["file"])
    example["speech"] = speech_array
    example["sample_rate"] = sample_rate
    return example


def sample_noise(example):
    # Use a version of this function in a stateless way to extract random 1 sec slices of background noise
    # on each epoch
    from random import randint

    # _silence_ audios are longer than 1 sec
    if example["label"] == "_silence_":
        # sample_rate samples correspond to exactly 1 sec of audio
        random_offset = randint(0, len(example["speech"]) - example["sample_rate"] - 1)
        example["speech"] = example["speech"][random_offset : random_offset + example["sample_rate"]]

    return example
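One way to call `sample_noise` statelessly is to wrap the examples in a small map-style dataset whose `__getitem__` draws a fresh window on every access. This is only a sketch under assumptions: `NoiseSlicingDataset` is a hypothetical name (not part of `datasets` or SUPERB), the examples are plain dicts that already carry `speech`, `sample_rate` and `label` (i.e. after `map_to_array`), and `sample_noise` is repeated here so the block runs on its own.

```python
from random import randint


def sample_noise(example):
    # Same logic as above: crop _silence_ clips to a random 1 sec window
    if example["label"] == "_silence_":
        random_offset = randint(0, len(example["speech"]) - example["sample_rate"] - 1)
        example["speech"] = example["speech"][random_offset : random_offset + example["sample_rate"]]
    return example


class NoiseSlicingDataset:
    """Hypothetical map-style wrapper, for illustration only."""

    def __init__(self, examples):
        self.examples = examples

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        # Shallow-copy the example so the stored long clip is never overwritten;
        # every access (and so every epoch) draws a new random window.
        return sample_noise(dict(self.examples[idx]))
```

Because the class only implements `__len__` and `__getitem__`, it should fit anywhere a map-style dataset is expected, for example as the input to a PyTorch `DataLoader`; that pairing is a suggestion, not something this PR ships.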
I see... Yes, not trivial indeed. Maybe for the moment you could add the functions above to the README (as is currently the case for diarization)? What do you think?
Add the KS (keyword spotting) task as described in the SUPERB paper.
Some notable quirks:
- _split_ks_files()).
- _background_noise_/_silence_ audio files are much longer than others, so they require some sort of slicing for downstream training. I decided to leave the implementation of that up to the users, since TFDS and s3prl take different approaches (either slicing wavs deterministically, or subsampling randomly at runtime).

Related to #2619.
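For the deterministic option mentioned in the quirks above, a long clip can be cut once into consecutive, non-overlapping 1 sec windows. The sketch below is an illustration, not the TFDS or s3prl implementation: `slice_deterministically` and its `window_sec` parameter are hypothetical names, and dropping the trailing partial window is an assumption.

```python
def slice_deterministically(speech, sample_rate, window_sec=1.0):
    """Split a clip into consecutive fixed-length windows (hypothetical helper)."""
    window = int(sample_rate * window_sec)
    # Assumption: a trailing remainder shorter than one window is discarded
    n_full = len(speech) // window
    return [speech[i * window : (i + 1) * window] for i in range(n_full)]
```

Compared with random subsampling at runtime, this yields the same slices on every run, at the cost of fixing the set of noise windows up front.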