Update use_dataset tutorial to integrate Albumentations for data augmentation
- Replaced torchvision transforms with Albumentations for image augmentation.
- Renumbered sections for clarity and updated descriptions accordingly.
- Emphasized key points for using Albumentations with 🤗 Datasets.
-**3**. Now, you can apply some transforms to the image. Feel free to take a look at the [various transforms available](https://docs.pytorch.org/vision/stable/transforms.html#v2-api-reference-recommended) in torchvision and choose one you'd like to experiment with. This example applies a transform that randomly rotates the image:
-...     examples["pixel_values"] = [rotate(image) for image in examples["image"]]
-...     return examples
-```
-
-**4**. Use the [`~Dataset.set_transform`] function to apply the transform on-the-fly. When you index into the image `pixel_values`, the transform is applied, and your image gets rotated.
-
-```py
->>> dataset.set_transform(transforms)
->>> dataset[0]["pixel_values"]
-```
-
-**5**. The dataset is now ready for training with your machine learning framework!
+**3**. Now let's apply data augmentations to your images. 🤗 Datasets works with any augmentation library, and in this example we'll use Albumentations.

 ### Using Albumentations

-[Albumentations](https://albumentations.ai) is another popular image augmentation library that provides a [rich set of transforms](https://albumentations.ai/docs/reference/supported-targets-by-transform/) including spatial-level transforms, pixel-level transforms, and mixing-level transforms. When running on CPU, which is typical for transformers pipelines, Albumentations is [faster than torchvision](https://albumentations.ai/docs/benchmarks/image-benchmarks/).
+[Albumentations](https://albumentations.ai) is a popular image augmentation library that provides a [rich set of transforms](https://albumentations.ai/docs/reference/supported-targets-by-transform/) including spatial-level transforms, pixel-level transforms, and mixing-level transforms. When running on CPU, which is typical for transformers pipelines, Albumentations is [faster than torchvision](https://albumentations.ai/docs/benchmarks/image-benchmarks/).

-**1**. Install Albumentations:
+Install Albumentations:

 ```bash
 pip install albumentations
 ```

-**2**. Create a typical augmentation pipeline with Albumentations:
+**4**. Create a typical augmentation pipeline with Albumentations:

 ```py
 >>> import albumentations as A
@@ -219,7 +201,7 @@ pip install albumentations
 ... ])
 ```

-**3**. Since 🤗 Datasets uses PIL images but Albumentations expects OpenCV format (numpy arrays), you need to convert between formats:
+**5**. Since 🤗 Datasets uses PIL images but Albumentations expects OpenCV format (numpy arrays), you need to convert between formats:

 ```py
 >>> def albumentations_transforms(examples):
@@ -240,14 +222,14 @@ pip install albumentations
 ...     return examples
 ```

-**4**. Apply the transform using [`~Dataset.set_transform`]:
+**6**. Apply the transform using [`~Dataset.set_transform`]: