
Add oriented_box_iou_batch function to detection.utils #1502

Merged
LinasKo merged 5 commits into roboflow:develop from patel-zeel:feat/oriented_box_iou_batch
Sep 24, 2024

Conversation

@patel-zeel
Contributor

@patel-zeel patel-zeel commented Sep 5, 2024

Description

Implemented oriented_box_iou_batch function in detection.utils as discussed with @LinasKo in #1295.

No additional dependencies are required for this change.

Context

  • This function computes IoU (intersection over union) for oriented/rotated bounding boxes (aka OBB).

Important details for the reviewers:

  1. The implementation is generic and works for axis-aligned, oriented, and irregular polygons. There is no check in the function to verify that the passed box is an oriented rectangular bounding box.
  2. The polygon_to_mask function requires a resolution_wh argument, but I have bypassed it by constructing a pseudo-image that covers both the true and detected boxes. I am assuming that box values are always passed at the true scale of the image and are not normalized to [0, 1].
# adding 1 because coordinates are 0-indexed
max_height = max(boxes_true[:, :, 0].max(), boxes_detection[:, :, 0].max()) + 1
max_width = max(boxes_true[:, :, 1].max(), boxes_detection[:, :, 1].max()) + 1
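For illustration, the rasterize-then-count approach described above can be sketched end to end in plain numpy. This is a hypothetical reference implementation, not the PR's code: supervision's polygon_to_mask rasterizes with cv2.fillPoly, whose boundary-pixel handling differs from the even-odd point test used here, and this sketch takes index 0 of each vertex as x (width) and index 1 as y (height).

```python
import numpy as np

def polygon_to_mask_ref(polygon, resolution_wh):
    """Even-odd rasterization of one (4, 2) polygon onto an integer pixel grid.

    Hypothetical stand-in for supervision's polygon_to_mask; boundary pixels
    may be classified differently than by cv2.fillPoly.
    """
    w, h = resolution_wh
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    px, py = polygon[:, 0].astype(float), polygon[:, 1].astype(float)
    inside = np.zeros((h, w), dtype=bool)
    for i in range(len(px)):
        j = i - 1  # previous vertex (wraps around via Python indexing)
        dy = py[j] - py[i]
        dy = np.where(dy == 0, 1e-12, dy)  # avoid 0-division on horizontal edges
        crosses = (py[i] > ys) != (py[j] > ys)
        x_at_y = (px[j] - px[i]) * (ys - py[i]) / dy + px[i]
        inside ^= crosses & (xs < x_at_y)
    return inside

def oriented_box_iou_ref(boxes_true, boxes_detection):
    """IoU matrix for two (N, 4, 2) batches of polygons via a shared pseudo-image."""
    # pseudo-image just large enough to contain every vertex (pixel-scale coords)
    w = int(max(boxes_true[..., 0].max(), boxes_detection[..., 0].max())) + 1
    h = int(max(boxes_true[..., 1].max(), boxes_detection[..., 1].max())) + 1
    masks_true = np.stack([polygon_to_mask_ref(b, (w, h)) for b in boxes_true])
    masks_det = np.stack([polygon_to_mask_ref(b, (w, h)) for b in boxes_detection])
    inter = np.einsum("nhw,mhw->nm", masks_true.astype(np.int64),
                      masks_det.astype(np.int64))
    union = masks_true.sum((1, 2))[:, None] + masks_det.sum((1, 2))[None, :] - inter
    return inter / np.maximum(union, 1)  # guard 0/0 when both masks are empty
```

Under these assumptions, identical boxes give an IoU of 1.0 and the disjoint example from the plots below gives 0.0.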

Questions for the reviewers:

  1. Once this function is reviewed and merged, would/should it be used as a generic function and replace the box_iou_batch function?
  2. Do we have corner cases where we must use np.nan_to_num(ious) before returning ious?
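Regarding question 2, one concrete corner case: if both masks are empty (fully degenerate boxes), the union is 0 and the division yields nan rather than 0. A minimal sketch of what np.nan_to_num would guard against:

```python
import numpy as np

# If both masks are empty, intersection and union are both 0,
# and 0 / 0 produces nan (with a RuntimeWarning unless suppressed).
intersection = np.array([0.0])
union = np.array([0.0])
with np.errstate(invalid="ignore"):  # silence the 0/0 RuntimeWarning
    ious = intersection / union      # nan
cleaned = np.nan_to_num(ious)        # nan replaced with 0.0
```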

Type of change

  • New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

Thus far, I have not added any tests, but I have exercised the function with the following code snippet. The examples and plots are shown first, followed by the snippet:

Examples and plots

box_detection = np.array([[1, 1], [2, 0], [4, 2], [3, 3]]) # overlap
box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

[image: plot for the overlap case]

box_detection = np.array([[4, 1], [5, 0], [5, 2], [6, 1]]) # no overlap
box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

[image: plot for the no-overlap case]

box_detection = np.array([[1, 0], [0, 1], [3, 4], [4, 3]]) # perfect match

[image: plot for the perfect-match case]

Code snippet

import numpy as np
import matplotlib.pyplot as plt
from supervision.detection.utils import polygon_to_mask, oriented_box_iou_batch

# config
box_detection = np.array([[1, 1], [2, 0], [4, 2], [3, 3]]) # overlap
# box_detection = np.array([[4, 1], [5, 0], [5, 2], [6, 1]]) # no overlap
# box_detection = np.array([[1, 0], [0, 1], [3, 4], [4, 3]]) # perfect match

box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

height = 5
width = 8

# plotting function
def plot_img(img, alpha, ax):
    mappable = ax.imshow(img, interpolation="none", alpha=alpha, vmin=0, vmax=255, cmap="coolwarm")
    ax.set_xticks(np.arange(0.5, width, 1))
    ax.set_xticklabels(np.arange(0, width, 1))
    ax.set_yticks(np.arange(0.5, height, 1))
    ax.set_yticklabels(np.arange(0, height, 1))
    ax.grid(True, which='both', linestyle='--', linewidth=1, color='k')
    return mappable

# create image
img = np.zeros((height, width)).astype(int) + 128

# manual process
mask_true = polygon_to_mask(box_true, (img.shape[1], img.shape[0])).astype(int)
mask_detection = polygon_to_mask(box_detection, (img.shape[1], img.shape[0])).astype(int)
union_pixels = np.sum((mask_true + mask_detection).clip(0, 1))
intersection_pixels = np.sum((mask_true * mask_detection))
iou = intersection_pixels / union_pixels

# Using the function
func_iou = oriented_box_iou_batch(box_true, box_detection)
assert np.isclose(iou, func_iou), f"{iou=}, {func_iou=}"

# plot
fig, axs = plt.subplots(1, 3, figsize=(12, 3.5))

img_true = (img + (mask_true * 255)).clip(0, 255)
plot_img(img_true, 1, axs[0])
axs[0].set_title("True")

img_detection = (img - (mask_detection * 255)).clip(0, 255)
plot_img(img_detection, 1, axs[1])
axs[1].set_title("Detection")

plot_img(img_true, 0.8, axs[2])
plot_img(img_detection, 0.5, axs[2])
axs[2].set_title("Overlay")

manual = f"Manual: Intersection={intersection_pixels}, Union={union_pixels}, IoU={intersection_pixels}/{union_pixels}={intersection_pixels/union_pixels}"
automatic = f"{oriented_box_iou_batch(box_true, box_detection)=}"
fig.suptitle(f"{manual}\n{automatic}", ha='center')

Any specific deployment considerations (this section is not yet edited)

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs (this section is not yet edited)

  • Docs updated? What were the changes:

@CLAassistant

CLAassistant commented Sep 5, 2024

CLA assistant check
All committers have signed the CLA.

@LinasKo
Contributor

LinasKo commented Sep 5, 2024

Excellent work, @patel-zeel! I especially like the visuals.

To answer your questions:

  1. No, the usage of box_iou_batch and mask_iou_batch in the codebase should remain unchanged. Especially the former, as it provides a faster and less complex implementation than the one required by OBB. Keeping oriented_box_iou_batch separate also allows its implementation to vary independently of others.
  2. I wonder how small an OBB can be. Suppose you had an xyxyxyxy where all x values are equal, and all y values are equal too. I don't know whether a model can produce that in practice, but let's try to support it. Could you check what masks are created and what IoU is produced?

@patel-zeel
Contributor Author

Excellent work, @patel-zeel! I especially like the visuals.

Thank you, @LinasKo. While making the plots, I realized that gridlines pass through the middle of each pixel, making it hard to visually count the union, intersection, and IoU. So I shifted the gridlines by half a pixel to make sure pixels align with the cells of the grid rather than sitting on gridline intersections. It was fun working on this!

Another question:

  • Do we need to / can we vectorize this further? In my trials, cv2.fillPoly was not vectorizable.
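For context on the vectorization question: while the rasterization step runs one polygon at a time, everything after it can be fully vectorized. A hedged numpy sketch (pairwise_mask_iou is a hypothetical helper, not the PR's code), assuming the per-polygon masks have already been produced:

```python
import numpy as np

def pairwise_mask_iou(masks_a, masks_b):
    """Pairwise IoU for two stacks of boolean masks, shapes (N, H, W) and (M, H, W)."""
    a = masks_a.reshape(len(masks_a), -1).astype(np.int64)
    b = masks_b.reshape(len(masks_b), -1).astype(np.int64)
    inter = a @ b.T                                   # (N, M) intersection pixel counts
    union = a.sum(axis=1)[:, None] + b.sum(axis=1)[None, :] - inter
    return inter / np.maximum(union, 1)               # avoid 0/0 for empty masks
```

So only the mask-creation loop is sequential; the N x M IoU matrix itself comes from a single matrix product.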

Now, coming to other points raised by you:

No, the usage of box_iou_batch and mask_iou_batch in the codebase should remain unchanged. Especially the former, as it provides a faster and less complex implementation than the one required by OBB. Keeping oriented_box_iou_batch separate also allows its implementation to vary independently of others.

Yes, that makes sense! Thank you.

I wonder how small an OBB can be. Suppose you had an xyxyxyxy where all x values are equal, and all y values are equal too. I don't know if a model can produce that in reality, let's try to support it. Could you check what masks are created and what IoU is produced?

With all values equal, cv2.fillPoly fills a single pixel, which should be error-free. Here are two more examples:

box_detection = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])
box_true = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])

[image: plot for two coincident degenerate boxes]

box_detection = np.array([[1, 1], [1, 1], [1, 1], [1, 1]])
box_true = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])

[image: plot for two disjoint degenerate boxes]

@patel-zeel
Contributor Author

@LinasKo, this is a gentle ping to let you know that I can make the desired changes before you review this PR again. Feel free to let me know whenever you take a look.

@LinasKo
Contributor

LinasKo commented Sep 19, 2024

Hi @patel-zeel, I have it in my sights - I'll 100% include it before the new release, and very likely in the next few days. Apologies for the long wait!

@patel-zeel
Contributor Author

Thank you for the quick response, @LinasKo. No worries at all.

@LinasKo
Contributor

LinasKo commented Sep 19, 2024

Hi @patel-zeel, here's my feedback:

  1. Please pull before making any edits - I pushed a small change adding the new function to the global scope and the docs.
  2. xyxyxyxy arrays are expected to be of type np.float32, but an error is thrown when that is the case.
  3. I don't think we should be adding +1 to the width / height. The result is too large. An xyxy box [0, 0, 0, 0] should have an area of 0, and so should a mask that is entirely empty. We'd like OBB to match that behaviour.

Here's the Colab I worked with.

@patel-zeel
Contributor Author

patel-zeel commented Sep 24, 2024

Hi @LinasKo, thank you for the feedback.

An xyxy box [0, 0, 0, 0] should have area of 0, and so should a mask that is entirely empty.

Yes, that seems logically correct, but the current implementation of polygon_to_mask does not generate an empty mask for a zero-area polygon. Even for xyxy, a simpler case than xyxyxyxy, we would have to remove the last/first row and column of the mask to match the area formula. I think the xyxyxyxy mask will be even more complex to handle. Any ideas on how to tackle this?

To add more context: the area of the mask depends on whether we include the pixels on which the box boundary lies, but neither choice matches the area formula. For example, box [0, 0, 2, 2] has area 2 x 2 = 4. Including the pixels where the boundary lies, the mask area is 3 x 3 = 9; excluding them, it is 1 x 1 = 1.

This raises a question: shouldn't this match how object detection models handle bounding boxes - that is, how do they treat the pixels on which the box boundary lies?

import numpy as np

from supervision import polygon_to_mask

def xyxy_to_xyxyxyxy(xyxy) -> np.ndarray:
    results = []
    for box in xyxy:
        obb_single = np.array([
            (box[0], box[1]),
            (box[2], box[1]),
            (box[2], box[3]),
            (box[0], box[3]),
        ])
        results.append(obb_single)
    return np.array(results)

xyxy = np.array([[0, 0, 0, 0]])
xyxyxyxy = xyxy_to_xyxyxyxy(xyxy)
mask = polygon_to_mask(xyxyxyxy, (3, 3))
print(mask)

Output

[[1. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

@LinasKo
Contributor

LinasKo commented Sep 24, 2024

Hi @patel-zeel, thank you for your thoughts.

Indeed, I can see the issue. The problem preceding this discussion is that xyxy and xyxyxyxy are in subpixel precision (as returned by the models), and masks, as we represent them today, are integer-shaped.

For example, box [0, 0, 2, 2] has an area of 2 x 2 = 4. Including the pixels where the boundary lies, the area is 3 x 3 = 9. The area is 1 x 1 = 1, excluding the pixels where the boundary lies.

Quick note: here we should not consider the standard where all boundary pixels are excluded. The viable options are:

  • Include start, exclude end - match how programming languages treat array indexing (area = 2 x 2 = 4)
  • Include start, include end - match how masks are treated, necessitating a special case for 0-sized boxes (area = 3 x 3 = 9)
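The two conventions can be made concrete with a small sketch (illustrative only; box_mask_area and the inclusive_end flag are hypothetical names, not part of supervision's API):

```python
def box_mask_area(xyxy, inclusive_end):
    """Pixel area of an axis-aligned box under the two boundary conventions."""
    x0, y0, x1, y1 = xyxy
    extra = 1 if inclusive_end else 0  # also count the final boundary row/column?
    return (x1 - x0 + extra) * (y1 - y0 + extra)
```

Under the include-end convention a zero-sized box still covers one pixel (area 1), which is exactly the special case mentioned above.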

Putting that aside, let's sacrifice correctness for now and focus on enabling more use cases. Areas are primarily used to compare detections of the same type, after all.

I'll take over, carry out the minor doc fixes, run a few tests, and merge this in. I've also noted down that polygon_to_mask needs looking into.

@patel-zeel
Contributor Author

Sure, @LinasKo. It was interesting to work on this issue. I'm looking forward to seeing it in the next release.

Contributor

@LinasKo LinasKo left a comment


I've checked the code, verified the results are what we expected and built the docs. Looks good; merging!

Thank you @patel-zeel! 🤝
It is a true delight to see a thoughtfully made and tested PR such as yours, especially when it reveals further flaws we can investigate to make our library even better. It has been a pleasure working with you.

@LinasKo LinasKo merged commit 93190b2 into roboflow:develop Sep 24, 2024
@MoAbbasid

Hi @patel-zeel, kind of an unrelated question: how do you generate these visuals?

[image: example visual from the PR]

@patel-zeel
Contributor Author

patel-zeel commented Oct 1, 2024

Hi @MoAbbasid, you can generate these plots with the code snippet given in the PR description.
