Merged
14 changes: 7 additions & 7 deletions docs/source/object_detection.mdx
@@ -2,9 +2,9 @@

Object detection models identify something in an image, and object detection datasets are used for applications such as autonomous driving and detecting natural hazards like wildfire. This guide will show you how to apply transformations to an object detection dataset following the [tutorial](https://albumentations.ai/docs/examples/example_bboxes/) from [Albumentations](https://albumentations.ai/docs/).

To run these examples, make sure you have up-to-date versions of `albumentations` and `cv2` installed:
To run these examples, make sure you have up-to-date versions of [albumentations](https://albumentations.ai/docs/) and [cv2](https://docs.opencv.org/4.10.0/) installed:

```
```bash
pip install -U albumentations opencv-python
```

@@ -40,12 +40,12 @@ The dataset has the following fields:
- `objects`: A dictionary containing bounding box metadata for the objects in the image:
- `id`: The annotation id.
- `area`: The area of the bounding box.
- `bbox`: The object's bounding box (in the [coco](https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/#coco) format).
- `bbox`: The object's bounding box (in the [coco](https://albumentations.ai/docs/3-basic-usage/bounding-boxes-augmentations/#understanding-bounding-box-formats) format).
- `category`: The object's category, with possible values including `Coverall (0)`, `Face_Shield (1)`, `Gloves (2)`, `Goggles (3)` and `Mask (4)`.
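
As an illustration of the `bbox` field above: a COCO-format box is stored as `[x_min, y_min, width, height]`, whereas many drawing utilities expect corner coordinates `[x_min, y_min, x_max, y_max]`. The sketch below is not part of the changed file; the helper name `coco_to_xyxy` and the sample box are hypothetical and only show the arithmetic involved.

```python
def coco_to_xyxy(bbox):
    """Convert a COCO box [x_min, y_min, width, height]
    to corner format [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


# Hypothetical box for illustration; not taken from the dataset.
example_bbox = [10.0, 20.0, 30.0, 40.0]
print(coco_to_xyxy(example_bbox))  # [10.0, 20.0, 40.0, 60.0]
```

The same conversion applied to a whole tensor of boxes is what `torchvision.ops.box_convert` does with `in_fmt="xywh"` and `out_fmt="xyxy"`.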

You can visualize the `bboxes` on the image using some internal torch utilities. To do that, you will need to reference the [`~datasets.ClassLabel`] feature associated with the category IDs so you can look up the string labels:


```py
>>> import torch
>>> from torchvision.ops import box_convert
@@ -68,7 +68,7 @@ You can visualize the `bboxes` on the image using some internal torch utilities.
```

<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example.png">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example.png"/>
</div>


@@ -110,7 +110,7 @@ Now when you visualize the result, the image should be flipped, but the `bboxes`
```

<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example_transformed.png">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example_transformed.png"/>
</div>

Create a function to apply the transform to a batch of examples:
@@ -152,7 +152,7 @@ You can verify the transform works by visualizing the 10th example:
```

<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example_transformed_2.png">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/datasets/visualize_detection_example_transformed_2.png"/>
</div>

<Tip>