Create new sections for audio and vision in guides #4519
Merged
Changes from 5 commits
16 commits
390730d
📝 first draft
stevhliu 7d22e2c
📝 create modality specific pages
stevhliu d3ae512
📝 create NLP section
stevhliu e5863a0
📝 update set_format section
stevhliu 493213b
📝 add use tf/torch to toctree
stevhliu b5f290c
🖍 remove visual cues
stevhliu 375d217
🖍 apply quentin review
stevhliu aa497e9
🖍 minor edits
stevhliu 8b9c7ea
🖍 apply mario review
stevhliu 54f9b3b
🖍 collapse some sections
stevhliu e41728f
🖍 try collapse again
stevhliu b203023
Update _toctree.yml
cdc963a
🖍 collapse all nested sections except for general usage
stevhliu 768d6c7
🖍 add link to install dependencies for audio/vision sections
stevhliu 104eac9
✨ add text decoration for different guides
stevhliu df6cb4f
🖍 remove text decorations for now
stevhliu
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| # Load audio data | ||
|
|
||
| Audio datasets are loaded from the `audio` column, which contains three important fields: | ||
|
|
||
| * `array`: the decoded audio data represented as a 1-dimensional array. | ||
| * `path`: the path to the downloaded audio file. | ||
| * `sampling_rate`: the sampling rate of the audio data. | ||
|
|
||
| When you load an audio dataset and access the `audio` column, the [`Audio`] feature automatically decodes and resamples the audio file: | ||
|
|
||
| ```py | ||
| >>> from datasets import load_dataset, Audio | ||
|
|
||
| >>> dataset = load_dataset("PolyAI/minds14", "en-US", split="train") | ||
| >>> dataset[0]["audio"] | ||
| {'array': array([ 0. , 0.00024414, -0.00024414, ..., -0.00024414, | ||
| 0. , 0. ], dtype=float32), | ||
| 'path': '/root/.cache/huggingface/datasets/downloads/extracted/f14948e0e84be638dd7943ac36518a4cf3324e8b7aa331c5ab11541518e9368c/en-US~JOINT_ACCOUNT/602ba55abb1e6d0fbce92065.wav', | ||
| 'sampling_rate': 8000} | ||
| ``` | ||
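The three fields are related in a simple way: the clip length in seconds is the number of samples in `array` divided by `sampling_rate`. The sketch below works this out on a hand-made dictionary that only mimics the shape of a decoded `audio` entry. The values and the `duration_seconds` helper are assumptions for illustration and are not part of 🤗 Datasets.

```python
# Stand-in for one decoded `audio` entry; the values are made up for illustration.
sample = {
    "array": [0.0, 0.00024414, -0.00024414, 0.0] * 2000,  # 8000 samples
    "path": "example.wav",  # hypothetical path, never opened
    "sampling_rate": 8000,  # samples per second
}


def duration_seconds(audio):
    """Return the clip length in seconds: sample count / samples per second."""
    return len(audio["array"]) / audio["sampling_rate"]


print(duration_seconds(sample))  # 1.0
```

The same arithmetic should apply to a real decoded entry, since `array` is 1-dimensional and `sampling_rate` is given in samples per second.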
|
|
||
| <Tip warning={true}> | ||
|
|
||
| Index into an audio dataset by row first and then by the `audio` column, as in `dataset[0]["audio"]`, so that only that row's audio file is decoded and resampled. Indexing by column first, as in `dataset["audio"][0]`, decodes and resamples every audio file in the dataset, which can be slow for a large dataset. | ||
|
|
||
| </Tip> | ||
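To make the cost difference in the tip concrete, here is a toy model of lazy decoding. `ToyAudioDataset` is a hypothetical class written for this illustration. It is not how 🤗 Datasets is implemented and only mimics the behavior the tip describes: row access decodes one file, column access decodes all of them.

```python
class ToyAudioDataset:
    """Toy stand-in that decodes lazily and counts how many files it decodes."""

    def __init__(self, paths):
        self.paths = paths
        self.decode_calls = 0

    def _decode(self, path):
        # Pretend to decode a file; a real decoder would read and resample audio.
        self.decode_calls += 1
        return {"path": path, "array": [0.0], "sampling_rate": 8000}

    def __getitem__(self, key):
        if isinstance(key, int):
            # Row access: decode only the requested row.
            return {"audio": self._decode(self.paths[key])}
        if key == "audio":
            # Column access: decode every row before anything can be indexed.
            return [self._decode(p) for p in self.paths]
        raise KeyError(key)


paths = [f"clip_{i}.wav" for i in range(1000)]  # hypothetical file names

row_first = ToyAudioDataset(paths)
row_first[0]["audio"]
print(row_first.decode_calls)  # 1

column_first = ToyAudioDataset(paths)
column_first["audio"][0]
print(column_first.decode_calls)  # 1000
```

Both expressions return the same first entry, but the column-first form does a thousand times the work in this toy setup.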
|
|
||
| ## Path | ||
|
|
||
| The `path` is useful for loading your own dataset. Use the [`~Dataset.cast_column`] method to take a column of audio file paths and decode it into arrays with the [`Audio`] feature: | ||
|
|
||
| ```py | ||
| >>> audio_dataset = audio_dataset.cast_column("paths_to_my_audio_files", Audio()) | ||
|
||
| ``` | ||
|
|
||
| If you only want the path to each audio file without decoding the file into an `array`, set `decode=False` in the [`Audio`] feature: | ||
|
|
||
| ```py | ||
| >>> dataset = load_dataset("PolyAI/minds14", "en-US", split="train").cast_column("audio", Audio(decode=False)) | ||
| >>> dataset[0] | ||
| {'audio': {'bytes': None, | ||
| 'path': '/root/.cache/huggingface/datasets/downloads/extracted/f14948e0e84be638dd7943ac36518a4cf3324e8b7aa331c5ab11541518e9368c/en-US~JOINT_ACCOUNT/602ba55abb1e6d0fbce92065.wav'}, | ||
| 'english_transcription': 'I would like to set up a joint account with my partner', | ||
| 'intent_class': 11, | ||
| 'lang_id': 4, | ||
| 'path': '/root/.cache/huggingface/datasets/downloads/extracted/f14948e0e84be638dd7943ac36518a4cf3324e8b7aa331c5ab11541518e9368c/en-US~JOINT_ACCOUNT/602ba55abb1e6d0fbce92065.wav', | ||
| 'transcription': 'I would like to set up a joint account with my partner'} | ||
| ``` | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,25 +1,28 @@ | ||
| # Overview | ||
|
|
||
| Our how-to guides will show you how to complete a specific task. These guides are intended to help you apply your knowledge of 🤗 Datasets to real-world problems you may encounter. Want to flatten a column or load a dataset from a local file? We got you covered! You should already be familiar and comfortable with the 🤗 Datasets basics, and if you aren't, we recommend reading our [tutorial](./tutorial) first. | ||
| The how-to guides offer a more comprehensive overview of all the tools 🤗 Datasets offers and how to use them. This will help you tackle some of the messier real-world datasets, where you may need to manipulate the dataset structure or content to get it ready for training. | ||
|
|
||
| The how-to guides will cover eight key areas of 🤗 Datasets: | ||
| The guides assume you are familiar and comfortable with the 🤗 Datasets basics. We recommend newer users check out our [tutorials](tutorial) first. | ||
|
|
||
| * How to load a dataset from other data sources. | ||
| <Tip> | ||
|
|
||
| * How to process a dataset. | ||
| Interested in learning more? Take a look at [Chapter 5](https://huggingface.co/course/chapter5/1?fw=pt) of the Hugging Face course! | ||
|
|
||
| * How to use a dataset with your favorite ML/DL framework. | ||
| </Tip> | ||
|
|
||
| * How to stream large datasets. | ||
| The guides cover four key areas of 🤗 Datasets: | ||
|
|
||
| * How to upload and share a dataset. | ||
| <div> | ||
| <span class="bg-pink-200 text-pink-900 dark:bg-pink-500 px-1 rounded font-bold">General usage</span>: Functions for general dataset loading and processing. The functions shown in this section are applicable across all dataset modalities. | ||
| </div> | ||
| <div> | ||
| <span class="bg-yellow-200 text-yellow-900 dark:bg-yellow-500 px-1 rounded font-bold">Audio</span>: How to load, process, and share audio datasets. | ||
| </div> | ||
| <div> | ||
| <span class="bg-green-200 text-green-900 dark:bg-green-500 px-1 rounded font-bold">Vision</span>: How to load, process, and share image datasets. | ||
| </div> | ||
| <div> | ||
| <span class="bg-blue-200 text-blue-900 dark:bg-blue-500 px-1 rounded font-bold">NLP</span>: How to load, process, and share NLP datasets. | ||
| </div> | ||
|
|
||
| * How to create a dataset loading script. | ||
|
|
||
| * How to create a dataset card. | ||
|
|
||
| * How to compute metrics. | ||
|
|
||
| * How to manage the cache. | ||
|
|
||
| You can also find guides on how to process massive datasets with Beam, how to integrate with cloud storage providers, and how to add an index to search your dataset. | ||
| If you have any questions about 🤗 Datasets, feel free to join and ask the community on our [forum](https://discuss.huggingface.co/c/datasets/10). |