Rotated bounding box NMS implementation for CPU #9450

Draft
zy1git wants to merge 3 commits into pytorch:main from zy1git:rotated-NMS

@zy1git commented Mar 23, 2026

Summary:
Implemented rotated box NMS (Non-Maximum Suppression) for CPU, adapted from Detectron2's nms_rotated implementation. The algorithm is the same as standard NMS (sort by score, then suppress overlapping boxes), but it computes IoU with single_box_iou_rotated instead of the axis-aligned intersection. The public API follows the existing nms op pattern in TorchVision.
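For readers unfamiliar with the algorithm, here is a minimal pure-Python sketch of greedy NMS, not the C++ kernel from this PR. It takes a precomputed pairwise IoU matrix to make the point from the summary: the suppression loop does not care how IoU was computed. The name `greedy_nms` and the strict `>` comparison are choices made for this sketch only.

```python
from typing import List, Sequence


def greedy_nms(
    iou: Sequence[Sequence[float]],
    scores: Sequence[float],
    iou_threshold: float,
) -> List[int]:
    """Return indices of kept boxes, highest score first.

    `iou[i][j]` is the pairwise IoU of boxes i and j, however it was computed
    (axis-aligned or rotated).
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    suppressed = [False] * len(scores)
    keep: List[int] = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        # Suppress every lower-scored box that overlaps the kept box too much.
        for j in order[pos + 1:]:
            if iou[i][j] > iou_threshold:
                suppressed[j] = True
    return keep


# Boxes 0 and 1 overlap heavily; box 2 is far away from both.
iou_matrix = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
kept = greedy_nms(iou_matrix, scores=[0.9, 0.95, 0.5], iou_threshold=0.5)
```

Box 1 has the highest score, so it is kept and suppresses box 0; box 2 overlaps nothing and is kept as well.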

Test Plan:

Added TestNMSRotated test class adapted from Detectron2's test suite:

  • 0° rotation test: rotated NMS with angle=0 should match reference horizontal NMS (IoU thresholds 0.2, 0.5, 0.8)

  • 90° rotation test: rotated NMS with angle=90 and swapped width/height should match reference horizontal NMS

  • 180° rotation test: rotated NMS with angle=180 should match reference horizontal NMS

  • TorchScript compatibility test
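The rotation tests above rely on a geometric fact: a box rotated by 0°, 90° (with width and height swapped), or 180° covers the same region as an axis-aligned box. The sketch below checks this numerically. It assumes the Detectron2-style layout `(cx, cy, w, h, angle_in_degrees)`, which the PR description does not spell out; the helper names are hypothetical.

```python
import math
from typing import List, Tuple


def rotated_box_corners(
    cx: float, cy: float, w: float, h: float, angle_deg: float
) -> List[Tuple[float, float]]:
    """Corners of a box centered at (cx, cy), rotated by angle_deg about its center."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2D rotation of the corner offset, then shift back to the center.
        corners.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return corners


def extent(corners: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Axis-aligned (x1, y1, x2, y2) extent of a set of corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def same_box(a, b, tol: float = 1e-9) -> bool:
    return all(math.isclose(p, q, abs_tol=tol) for p, q in zip(a, b))


# Reference axis-aligned box (x1, y1, x2, y2): center (3, 3), width 4, height 2.
reference = (1.0, 2.0, 5.0, 4.0)

match_0 = same_box(extent(rotated_box_corners(3, 3, 4, 2, 0)), reference)
match_90 = same_box(extent(rotated_box_corners(3, 3, 2, 4, 90)), reference)  # w and h swapped
match_180 = same_box(extent(rotated_box_corners(3, 3, 4, 2, 180)), reference)
```

For these three angles the direction of rotation (clockwise or counter-clockwise) does not affect the covered region, so the check holds under either sign convention.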

Results are compared using edit distance (≤ 1 allowed) to account for floating-point precision differences at IoU threshold boundaries.
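As an illustration of this kind of tolerance check, the sketch below computes a Levenshtein distance between two lists of kept indices. Whether the PR's test uses exactly this definition of edit distance is an assumption; `edit_distance` is a hypothetical helper, not code from the test suite.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev: List[int] = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


reference_keep = [0, 2, 5]
rotated_keep = [0, 2, 4, 5]  # one extra box kept, e.g. an IoU right at the threshold

within_tolerance = edit_distance(reference_keep, rotated_keep) <= 1
```

A single box flipping between kept and suppressed changes the distance by one, which is the case the tolerance is meant to absorb.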

Run pytest test/test_ops.py::TestNMSRotated -v
All tests pass locally.

pytorch-bot bot commented Mar 23, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9450

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ae5fb41 with merge base d7400a3:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@zy1git zy1git marked this pull request as draft March 23, 2026 04:28
@meta-cla meta-cla bot added the cla signed label Mar 23, 2026

auto ovr = single_box_iou_rotated<scalar_t>(
    dets[i].data_ptr<scalar_t>(), dets[j].data_ptr<scalar_t>());
if (ovr >= iou_threshold) {
Member

Flagging that this differs from the IoU threshold comparison in the non-rotated case, which uses a strict inequality:

if (ovr > iou_threshold) {

See my other comment about unifying the implementation, which should resolve this as a consequence.
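To make the difference concrete, here is a small sketch with two axis-aligned boxes whose IoU is exactly 0.5. It uses a hypothetical pure-Python `box_iou` helper rather than either kernel, and only illustrates how `>` and `>=` disagree at the boundary.

```python
from typing import Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


top_scoring = (0.0, 0.0, 2.0, 2.0)  # area 4
candidate = (0.0, 0.0, 2.0, 1.0)    # area 2, fully inside the first box

iou = box_iou(top_scoring, candidate)  # 2 / (4 + 2 - 2)
iou_threshold = 0.5

suppressed_with_gt = iou > iou_threshold   # comparison in the non-rotated kernel
suppressed_with_ge = iou >= iou_threshold  # comparison in this PR's rotated kernel
```

With the strict comparison the candidate survives; with `>=` it is suppressed, so the two ops can return different results for identical geometry.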

namespace {

template <typename scalar_t>
at::Tensor nms_rotated_cpu_kernel(
Member

This is exactly the same implementation we already have for the non-rotated case, the only difference being the iou computation:

at::Tensor nms_kernel_impl(

Could we consider fusing the two implementations, perhaps templating over the iou computation function?
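The idea can be sketched in Python by passing the IoU function as a parameter, the analogue of templating over it in C++. This is not the proposed C++ change. All names are hypothetical, and `iou_rotated_angle0_only` is a deliberately limited stand-in that handles only angle 0; a real rotated IoU needs polygon intersection.

```python
from typing import Callable, List, Sequence

Box = Sequence[float]


def iou_xyxy(a: Box, b: Box) -> float:
    """IoU of axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def iou_rotated_angle0_only(a: Box, b: Box) -> float:
    """Stand-in for a rotated IoU on (cx, cy, w, h, angle); supports angle 0 only."""
    assert a[4] == 0 and b[4] == 0, "sketch only: no real rotation support"

    def to_xyxy(r: Box) -> Box:
        cx, cy, w, h = r[:4]
        return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

    return iou_xyxy(to_xyxy(a), to_xyxy(b))


def nms_generic(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    iou_fn: Callable[[Box, Box], float],
) -> List[int]:
    """Single greedy NMS loop; only `iou_fn` differs between box formats."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep: List[int] = []
    for i in order:
        # Keep a box only if no already-kept box overlaps it above the threshold.
        if all(iou_fn(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


scores = [0.9, 0.8, 0.7]
boxes_xyxy = [(0, 0, 2, 2), (0, 0, 2, 1.8), (5, 5, 6, 6)]
boxes_rot = [(1, 1, 2, 2, 0), (1, 0.9, 2, 1.8, 0), (5.5, 5.5, 1, 1, 0)]

keep_xyxy = nms_generic(boxes_xyxy, scores, 0.5, iou_xyxy)
keep_rot = nms_generic(boxes_rot, scores, 0.5, iou_rotated_angle0_only)
```

Because both calls go through the same loop, the threshold comparison is defined in one place, which is how unifying the kernels would also settle the `>` versus `>=` discrepancy.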

return torch.ops.torchvision.nms(boxes, scores, iou_threshold)


def nms_rotated(boxes: Tensor, scores: Tensor, iou_threshold: float) -> Tensor:
Member

Any reason to expose nms_rotated instead of handling all of this within a single nms function?

For iou, we chose not to expose iou_rotated at the Python layer.
