diff --git a/docs/changelog.md b/docs/changelog.md index 6dd9097e9d..fc7a1d1efb 100644 --- a/docs/changelog.md +++ b/docs/changelog.md @@ -5,9 +5,27 @@ date_modified: 2026-04-23 # Changelog -### 0.28.0 Unreleased +### 0.28.0 Apr 30, 2026 -- Changed [#2178](https://github.com/roboflow/supervision/pull/2178): [`sv.Detections.from_inference`](https://supervision.roboflow.com/latest/detection/core/#supervision.detection.core.Detections.from_inference) now supports compressed COCO RLE masks. Inference responses with `rle` or `rle_mask` fields containing a compressed counts string (as produced by `pycocotools`) are decoded directly into binary masks, avoiding a lossy polygon round-trip. +- Added [#2159](https://github.com/roboflow/supervision/pull/2159): [`sv.CompactMask`](https://supervision.roboflow.com/develop/detection/compact_mask/#supervision.detection.compact_mask.CompactMask) for memory-efficient mask storage. Masks are stored as crop-region bounding boxes plus RLE-encoded data instead of full-resolution bitmaps, reducing memory by up to 240× for sparse masks. Integrates transparently with `sv.Detections.mask` — filtering, merging, and `area` all work without materialising the full array. + +- Added [#2227](https://github.com/roboflow/supervision/pull/2227): [`sv.CompactMask.resize(new_image_shape)`](https://supervision.roboflow.com/develop/detection/compact_mask/#supervision.detection.compact_mask.CompactMask.resize) rescales all stored crops to match a new image resolution, enabling use across frames or after image resizing pipelines. + +- Added [#2178](https://github.com/roboflow/supervision/pull/2178): [`sv.Detections.from_inference`](https://supervision.roboflow.com/latest/detection/core/#supervision.detection.core.Detections.from_inference) now supports compressed COCO RLE masks. 
Inference responses with `rle` or `rle_mask` fields containing a compressed counts string (as produced by `pycocotools`) are decoded directly into binary masks, avoiding a lossy polygon round-trip. + +- Added [#2004](https://github.com/roboflow/supervision/pull/2004): [`sv.Color.from_hex`](https://supervision.roboflow.com/latest/utils/draw/#supervision.draw.color.Color.from_hex) now accepts 8-digit hexadecimal RGBA codes (e.g. `#ff00ff80`). [`Color.as_hex()`](https://supervision.roboflow.com/latest/utils/draw/#supervision.draw.color.Color.as_hex) serialises back, including alpha when not fully opaque. New utility functions `sv.hex_to_rgba`, `sv.rgba_to_hex`, and `sv.is_valid_hex` are exported at the top level. + +- Added [#709](https://github.com/roboflow/supervision/pull/709): [`sv.BlurAnnotator`](https://supervision.roboflow.com/latest/detection/annotators/#supervision.annotators.core.BlurAnnotator) and [`sv.PixelateAnnotator`](https://supervision.roboflow.com/latest/detection/annotators/#supervision.annotators.core.PixelateAnnotator) now support dynamic sizing. When `kernel_size=None` or `pixel_size=None` (the new default), the size is computed per detection as a fraction of the shorter bounding-box dimension, producing consistent visual results across objects of different sizes. + +- Added [#2186](https://github.com/roboflow/supervision/pull/2186): [`sv.InferenceSlicer`](https://supervision.roboflow.com/latest/detection/tools/inference_slicer/#supervision.detection.tools.inference_slicer.InferenceSlicer) now emits a warning when detections returned by the callback fall outside the tile boundaries, helping catch coordinate-system bugs in custom callbacks. 
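+  The 8-digit rule is easy to state precisely. A minimal dependency-free sketch of the 6/8-digit parse (illustrative only — `sv.hex_to_rgba` / `sv.is_valid_hex` are the supported API and handle more validation):
+
+  ```python
+  def hex_to_rgba(value: str) -> tuple[int, int, int, int]:
+      """Parse '#rrggbb' or '#rrggbbaa' into an (r, g, b, a) tuple."""
+      digits = value.lstrip("#")
+      if len(digits) == 6:
+          digits += "ff"  # no alpha digits -> fully opaque
+      if len(digits) != 8:
+          raise ValueError(f"expected 6 or 8 hex digits, got {value!r}")
+      # split into rr / gg / bb / aa byte pairs
+      return tuple(int(digits[i : i + 2], 16) for i in range(0, 8, 2))
+
+  assert hex_to_rgba("#ff00ff80") == (255, 0, 255, 128)
+  assert hex_to_rgba("#1a2b3c") == (26, 43, 60, 255)
+  ```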
+ +- Added [#2103](https://github.com/roboflow/supervision/pull/2103), [#2152](https://github.com/roboflow/supervision/pull/2152): New [`sv.Detections.from_sam3()`](https://supervision.roboflow.com/latest/detection/core/#supervision.detection.core.Detections.from_sam3) classmethod parses SAM3 PCS (text-prompted) and PVS (visual-prompted video segmentation) response formats into a standard `sv.Detections`, both from the local `inference` package and from Roboflow-hosted server responses. + +- Added [#2154](https://github.com/roboflow/supervision/pull/2154): The library now uses Python's `logging` module instead of `print` for diagnostic output. Messages are emitted under the `supervision` logger so applications can capture, filter, or silence them through standard `logging` configuration. + +- Added [#932](https://github.com/roboflow/supervision/pull/932): [`sv.ImageAssets`](https://supervision.roboflow.com/latest/assets/) for downloading sample images alongside existing video assets, useful for examples and tutorials. + +- Changed [#2169](https://github.com/roboflow/supervision/pull/2169): [`sv.MeanAveragePrecisionResult`](https://supervision.roboflow.com/latest/metrics/mean_average_precision/) and related metric arrays (`mAP_scores`, `ap_per_class`, `iou_thresholds`, precision/recall) are now `float32` instead of `float64`. Reduces memory and speeds up computation; numerical results may differ in the last few digits. - Changed [#2178](https://github.com/roboflow/supervision/pull/2178): [`sv.rle_to_mask`](https://supervision.roboflow.com/latest/detection/utils/converters/#supervision.detection.utils.converters.rle_to_mask) and [`sv.mask_to_rle`](https://supervision.roboflow.com/latest/detection/utils/converters/#supervision.detection.utils.converters.mask_to_rle) moved to `supervision.detection.utils.converters`. The old import path `supervision.dataset.utils` continues to work but is deprecated. 
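+  The converters follow COCO's uncompressed RLE convention: alternating run lengths of zeros and ones over the column-major (Fortran-order) flattened mask, starting with a zero-run. A rough standalone sketch of that convention (the library functions — plus `pycocotools` for compressed counts — remain the supported path):
+
+  ```python
+  import numpy as np
+
+  def rle_to_mask(counts: list[int], resolution_wh: tuple[int, int]) -> np.ndarray:
+      w, h = resolution_wh
+      flat = np.zeros(h * w, dtype=bool)
+      pos, value = 0, False  # runs alternate, starting with a zero-run
+      for run in counts:
+          flat[pos : pos + run] = value
+          pos += run
+          value = not value
+      return flat.reshape((h, w), order="F")  # column-major, per COCO
+
+  def mask_to_rle(mask: np.ndarray) -> list[int]:
+      flat = mask.ravel(order="F")
+      change = np.flatnonzero(flat[1:] != flat[:-1]) + 1  # run boundaries
+      runs = np.diff(np.concatenate(([0], change, [flat.size]))).tolist()
+      return [0] + runs if flat[0] else runs  # counts must start with zeros
+
+  mask = np.zeros((4, 5), dtype=bool)
+  mask[1:3, 1:4] = True
+  assert rle_to_mask(mask_to_rle(mask), (5, 4)).tolist() == mask.tolist()
+  ```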
@@ -15,6 +33,46 @@ date_modified: 2026-04-23 - Fixed [#2210](https://github.com/roboflow/supervision/pull/2210): [`sv.VideoInfo.fps`](https://supervision.roboflow.com/latest/utils/video/#supervision.utils.video.VideoInfo) now returns a `float` instead of a truncated `int`. Previously, frame rates like 23.976, 29.97, and 59.94 were silently truncated, causing frame-timing drift that accumulates over long videos. The type of `VideoInfo.fps` has changed from `int` to `float`; callers that pass `fps` to APIs requiring an integer (such as `deque(maxlen=...)` or `TraceAnnotator(trace_length=...)`) should wrap the value with `int()`. +- Fixed [#2209](https://github.com/roboflow/supervision/pull/2209): [`sv.Detections.is_empty()`](https://supervision.roboflow.com/latest/detection/core/#supervision.detection.core.Detections.is_empty) now returns `True` for detections filtered down to zero rows, even when `tracker_id` is an empty array. Previously this case incorrectly returned `False`. + +- Fixed [#2199](https://github.com/roboflow/supervision/pull/2199): [`sv.CSVSink`](https://supervision.roboflow.com/latest/detection/tools/save_detections/#supervision.detection.tools.csv_sink.CSVSink) now correctly slices numpy array values in `custom_data` per row. Previously the full array was written for every detection. + +- Fixed [#2216](https://github.com/roboflow/supervision/pull/2216): [`sv.CSVSink`](https://supervision.roboflow.com/latest/detection/tools/save_detections/#supervision.detection.tools.csv_sink.CSVSink) and [`sv.JSONSink`](https://supervision.roboflow.com/latest/detection/tools/save_detections/#supervision.detection.tools.json_sink.JSONSink) now slice plain Python `list` and `tuple` values in `custom_data` per detection row. Lists and tuples matching the detection count are indexed per row, consistent with `np.ndarray` behavior. 
+ +- Fixed [#2217](https://github.com/roboflow/supervision/pull/2217): [`sv.TraceAnnotator`](https://supervision.roboflow.com/latest/detection/annotators/#supervision.annotators.core.TraceAnnotator) no longer crashes in `smooth` mode when a tracker remains stationary. Duplicate consecutive points caused `splprep` to fail; the annotator now deduplicates anchor points and falls back to a raw polyline when fewer than 4 unique points are available. + +- Fixed [#2218](https://github.com/roboflow/supervision/pull/2218): [`load_coco_annotations`](https://supervision.roboflow.com/latest/datasets/core/) now rejects COCO annotations whose `file_name` escapes the images directory via `../` traversal or absolute paths, preventing path-traversal attacks from malicious annotation files. + +- Fixed [#2187](https://github.com/roboflow/supervision/pull/2187): Extreme memory usage when loading OBB (oriented bounding box) datasets, caused by allocating full-image masks for each rotated box, has been resolved. + +- Fixed [#2188](https://github.com/roboflow/supervision/pull/2188): [`sv.KeyPoints`](https://supervision.roboflow.com/latest/keypoint/core/#supervision.key_points.core.KeyPoints) boolean mask indexing now works correctly when all instances have the same keypoint count (uniform-count selection). + +- Fixed [#2185](https://github.com/roboflow/supervision/pull/2185): [`sv.DetectionDataset.as_coco()`](https://supervision.roboflow.com/latest/datasets/core/#supervision.dataset.core.DetectionDataset.as_coco) now preserves `area` and `iscrowd` fields instead of silently dropping them in the round-trip. + +- Fixed [#1746](https://github.com/roboflow/supervision/pull/1746): Precision loss when converting annotations with `force_mask=True` in dataset format converters. 
+ +- Fixed [#1991](https://github.com/roboflow/supervision/pull/1991): [`sv.PolygonZone`](https://supervision.roboflow.com/latest/detection/tools/polygon_zone/) no longer double-counts the same object when multiple zones overlap. Detection bounding boxes were incorrectly clipped to each zone's ROI before anchor computation, causing the same detection to appear at a different anchor point in each zone; anchor is now computed from the original bounding box so containment is independent per zone. + +- Fixed [#1868](https://github.com/roboflow/supervision/pull/1868): [`sv.LineZone`](https://supervision.roboflow.com/latest/detection/tools/line_zone/) no longer mis-attributes crossings when a tracker reuses the same `tracker_id` across different classes. Class-aware bookkeeping prevents a new object from inheriting another class's prior crossing state. + +- Fixed [#2022](https://github.com/roboflow/supervision/pull/2022): [`sv.process_video`](https://supervision.roboflow.com/latest/utils/video/#supervision.utils.video.process_video) now raises immediately when the user callback throws, instead of silently swallowing the exception and hanging until the writer is flushed. + +- Fixed [#2156](https://github.com/roboflow/supervision/pull/2156): [`sv.DetectionDataset`](https://supervision.roboflow.com/latest/datasets/core/#supervision.dataset.core.DetectionDataset) now populates `data["class_name"]` on every loaded annotation, matching what model connectors produce. Downstream code can rely on `class_name` being present whether detections come from a dataset or a model. + +- Fixed [#1364](https://github.com/roboflow/supervision/pull/1364): [`sv.ByteTrack`](https://supervision.roboflow.com/latest/trackers/#supervision.tracker.byte_tracker.core.ByteTrack) now preserves externally assigned `tracker_id` values instead of overwriting them with internal ids on the first update. 
+ +- Fixed [#1853](https://github.com/roboflow/supervision/pull/1853): [`sv.ConfusionMatrix`](https://supervision.roboflow.com/latest/detection/metrics/#supervision.metrics.detection.ConfusionMatrix) `evaluate_detection_batch` now matches predictions to ground truth correctly when multiple detections fall on the same target. Previously, double-counting inflated false-positive and false-negative counts. + +- Fixed [#2136](https://github.com/roboflow/supervision/pull/2136): [`sv.MeanAverageRecall`](https://supervision.roboflow.com/latest/metrics/mean_average_recall/) now computes mAR@K using the top-K detections per image, matching the COCO definition. Previous values were inflated relative to `pycocotools`. + +- Fixed [#1086](https://github.com/roboflow/supervision/pull/1086), [#265](https://github.com/roboflow/supervision/pull/265): COCO export and `force_masks` behaviour are now consistent across dataset formats. Empty polygons no longer raise during `as_coco`, and `force_masks=True` produces masks regardless of source format. + +- Deprecated [#2215](https://github.com/roboflow/supervision/pull/2215): [`sv.ByteTrack`](https://supervision.roboflow.com/latest/trackers/#supervision.tracker.byte_tracker.core.ByteTrack) is deprecated in favour of `ByteTrackTracker` from the external [`trackers`](https://pypi.org/project/trackers/) package (`pip install trackers`). The update method is renamed from `update_with_detections()` to `update()`. Removal planned for `supervision-0.30.0`. + +- Deprecated [#2214](https://github.com/roboflow/supervision/pull/2214): `supervision.keypoint` module is deprecated; use `supervision.key_points` instead. `create_tiles` in `supervision.utils.image`, `ensure_cv2_image_for_processing` in `supervision.utils.conversion`, and keypoint validation utilities in `supervision.validators` are deprecated. 
The `LMM` enum (use `VLM`) and `from_lmm` method (use `from_vlm`) were deprecated in 0.26.0; this release migrates their deprecation mechanism to `pydeprecate`. + +- Deprecated: `normalized_xyxy` argument in [`sv.denormalize_boxes`](https://supervision.roboflow.com/latest/detection/utils/boxes/#supervision.detection.utils.boxes.denormalize_boxes) renamed to `xyxy`. Passing `normalized_xyxy=` now emits a `FutureWarning`; support will be removed in `supervision-0.30.0`. + ### 0.27.0 Nov 16, 2025 - Added [#2008](https://github.com/roboflow/supervision/pull/2008): [`sv.filter_segments_by_distance`](https://supervision.roboflow.com/0.27.0/detection/utils/masks/#supervision.detection.utils.masks.filter_segments_by_distance) to keep the largest connected component and nearby components within an absolute or relative distance threshold. Useful for cleaning segmentation predictions from models such as SAM, SAM2, YOLO segmentation, and RF-DETR segmentation. diff --git a/notebooks/convert_to_ipynb.sh b/notebooks/convert_to_ipynb.sh new file mode 100755 index 0000000000..602c70718d --- /dev/null +++ b/notebooks/convert_to_ipynb.sh @@ -0,0 +1,9 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +for py in "$SCRIPT_DIR"/*.py; do + echo "Converting: $(basename "$py")" + jupytext --to ipynb "$py" +done diff --git a/notebooks/release-demo_0-28.py b/notebooks/release-demo_0-28.py new file mode 100644 index 0000000000..e5166b1043 --- /dev/null +++ b/notebooks/release-demo_0-28.py @@ -0,0 +1,450 @@ +# --- +# jupyter: +# jupytext: +# cell_metadata_filter: -all +# formats: ipynb,py:percent +# text_representation: +# extension: .py +# format_name: percent +# format_version: '1.3' +# jupytext_version: 1.19.1 +# --- +# ruff: noqa: E402 + +# %% [markdown] +# # supervision 0.28.0: Memory-Efficient Instance Segmentation +# +# [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/supervision/blob/develop/notebooks/release-demo_0-28.ipynb)
+#
+# **supervision** is a set of reusable tools for computer vision.
+# Two headlining changes in 0.28.0:
+#
+# 1. **`sv.Detections.from_sam3`** -- first-class support for SAM3 (Segment
+#    Anything Model 3) inference responses. supervision now parses both the
+#    PCS (promptable concept segmentation, i.e. text-prompted) and PVS
+#    (promptable visual segmentation) output formats directly into a
+#    `sv.Detections` object.
+#
+# 2. **`sv.CompactMask`** -- instance masks stored as RLE-encoded bounding-box
+#    crops instead of full-resolution bitmaps. Any segmentation model --
+#    RF-DETR Seg, SAM3, YOLO-Seg -- can feed into CompactMask. Memory drops
+#    10-100x without changing the API anywhere in supervision.
+#
+# **Story**: run RF-DETR Seg on a real image, visualise the masks, then convert
+# to CompactMask and watch the memory footprint collapse.
+#
+# **Sections:**
+# 1. [Install](#1-install)
+# 2. [Download sample image](#2-download-sample-image)
+# 3. [RF-DETR Seg -- instance segmentation](#3-rf-detr-seg)
+# 4. [CompactMask -- memory-efficient storage](#4-compactmask)
+# 5. [SAM3 -- text-prompted segmentation](#5-sam3)
+# 6. [Other notable changes in 0.28.0](#6-other-notable-changes)
+# 7. [Next steps](#7-next-steps)
+
+# %% [markdown]
+# ## 1. Install
+
+# %%
+# !pip install -q 'supervision==0.28.0' 'rfdetr' 'inference-sdk>=0.9' numpy matplotlib
+
+# %% [markdown]
+# ## 2. Download sample image
+#
+# `sv.ImageAssets` is new in 0.28.0 -- a counterpart to the existing
+# `sv.VideoAssets`. `download_assets` caches locally and returns the path.
+ +# %% +# %matplotlib inline + +import cv2 +import matplotlib.pyplot as plt +import numpy as np + +import supervision as sv +from supervision.assets import ImageAssets, download_assets + +image_path = download_assets(ImageAssets.PEOPLE_WALKING) +print(f"Image: {image_path}") + +image_bgr = cv2.imread(image_path) +image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB) +H, W = image_bgr.shape[:2] +print(f"Resolution: {W} x {H}") + +plt.figure(figsize=(12, 7)) +plt.imshow(image_rgb) +plt.axis("off") +plt.title("people-walking.jpg") +plt.tight_layout() +plt.show() + +# %% [markdown] +# ## 3. RF-DETR Seg +# +# **RF-DETR** is a real-time transformer-based object detection model from Roboflow. +# The `RFDETRSegSmall` variant adds an instance segmentation head -- it produces +# one binary mask per detected instance alongside the bounding box. +# +# Key facts for this demo: +# +# - Pretrained on **COCO** (80 object categories) -- detects people, bags, cars, etc. +# - Weights download automatically on first `RFDETRSegSmall()` call (~100 MB). +# - `model.predict()` returns **`sv.Detections`** directly -- no converter needed. +# Masks are a `(N, H, W)` bool array attached as `detections.mask`. + +# %% +from rfdetr.detr import RFDETRSegSmall + +model = RFDETRSegSmall() +model.optimize_for_inference() + +# predict accepts a file path, PIL Image, or RGB numpy array +detections = model.predict(image_path, threshold=0.3) +if not isinstance(detections, sv.Detections): + raise TypeError(f"Expected sv.Detections, got {type(detections).__name__}") + +n_masks = 0 if detections.mask is None else len(detections.mask) +print(f"Detections: {len(detections)} (with masks: {n_masks})") + +# %% [markdown] +# ### 3.1 COCO class names +# +# COCO has 90 numeric class IDs; map them to readable names for annotation. + +# %% +# Subset of COCO class names (IDs 0-based after RF-DETR's remapping). 
+COCO_NAMES: dict[int, str] = { + 0: "person", + 1: "bicycle", + 2: "car", + 3: "motorcycle", + 4: "airplane", + 5: "bus", + 6: "train", + 7: "truck", + 8: "boat", + 24: "backpack", + 25: "umbrella", + 26: "handbag", + 28: "suitcase", + 56: "chair", + 57: "couch", + 58: "potted plant", + 59: "bed", + 60: "dining table", + 62: "tv", + 63: "laptop", + 67: "cell phone", + 72: "refrigerator", + 74: "clock", + 76: "scissors", +} + +labels = [] +assert detections.class_id is not None +for cid, conf in zip( + detections.class_id, + detections.confidence + if detections.confidence is not None + else [None] * len(detections), +): + name = COCO_NAMES.get(int(cid), f"cls_{cid}") + labels.append(f"{name} {conf:.2f}" if conf is not None else name) + +# %% [markdown] +# ### 3.2 Visualise RF-DETR Seg output + +# %% +PALETTE = sv.ColorPalette.DEFAULT + +annotated = image_bgr.copy() +annotated = sv.MaskAnnotator(color=PALETTE, opacity=0.45).annotate( + annotated, detections +) +annotated = sv.BoxAnnotator(color=PALETTE, thickness=2).annotate(annotated, detections) +annotated = sv.LabelAnnotator(color=PALETTE, text_scale=0.5, text_thickness=1).annotate( + annotated, detections, labels=labels +) + +plt.figure(figsize=(12, 7)) +plt.imshow(cv2.cvtColor(annotated, cv2.COLOR_BGR2RGB)) +plt.axis("off") +plt.title(f"RF-DETR Seg -- {len(detections)} instance(s)") +plt.tight_layout() +plt.show() + +# %% [markdown] +# ## 4. CompactMask +# +# RF-DETR Seg returns one full-resolution binary mask per detected instance. +# On a 1280 x 720 image with 12 people that is: +# +# `12 x 720 x 1280 x 1 byte = 11 MB` +# +# Most of those pixels are background. The actual person silhouette fits in a +# tight bounding box. 
`sv.CompactMask` stores **only the bounding-box crop**, +# RLE-encoded: +# +# - A 200 x 100 person crop: `~2.5 KB` instead of `900 KB` +# - Drop-in replacement -- all annotators, filters, and `area` keep working + +# %% [markdown] +# ### 4.1 Measure dense mask footprint + +# %% +from typing import Any + +dense_bytes: int = 0 +dense_mask: "np.ndarray[Any, np.dtype[np.bool_]] | None" = None + +assert detections.mask is not None and isinstance(detections.mask, np.ndarray) + +dense_mask = detections.mask +dense_bytes = dense_mask.nbytes +n_inst = len(dense_mask) +print(f"Instances: {n_inst}") +print(f"Mask shape: {dense_mask.shape} (N x H x W, bool)") +print(f"Dense footprint: {dense_bytes / 1024:.1f} KB") +print(f" = {n_inst} masks x {H} x {W} x 1 byte") + +# %% [markdown] +# ### 4.2 Convert to CompactMask + +# %% +compact: "sv.CompactMask | None" = None +crop_bytes: int = 0 + +assert dense_mask is not None +compact = sv.CompactMask.from_dense( + masks=dense_mask, + xyxy=detections.xyxy, + image_shape=(H, W), +) + +# Measure compact size via uncompressed crop booleans (upper bound; RLE < this). +crop_bytes = sum(compact.crop(i).nbytes for i in range(len(compact))) + +print(f"Crop size (est.): {crop_bytes / 1024:.1f} KB (uncompressed crops)") +if crop_bytes > 0 and dense_bytes > 0: + ratio = dense_bytes / crop_bytes + print(f"Reduction factor: {ratio:.1f}x (before RLE compression)") + +# Swap in CompactMask -- supervision uses it transparently from here on. +detections.mask = compact +print(f"\ndetections.mask type: {type(detections.mask).__name__}") + +# %% [markdown] +# ### 4.3 Filtering by mask area +# +# `compact.area` returns the true pixel count of each instance mask. +# Filter out tiny detections (partial occlusions, image-edge artefacts). 
+ +# %% +large: sv.Detections = detections +large_labels: list[str] = labels + +assert isinstance(detections.mask, sv.CompactMask) +areas = detections.mask.area +print( + f"Mask areas (px): min={areas.min():.0f} " + f"mean={areas.mean():.0f} max={areas.max():.0f}" +) + +# Keep instances larger than 0.1% of the image. +min_area = 0.001 * H * W +keep_idx = np.where(areas > min_area)[0] +_filtered = detections[keep_idx] +if isinstance(_filtered, sv.Detections): + large = _filtered + large_labels = [labels[i] for i in keep_idx] if labels else [] +print(f"\nInstances > {min_area:.0f} px: {len(large)}") + +# %% [markdown] +# ### 4.4 Annotate with CompactMask +# +# Annotators call `.to_dense()` internally -- CompactMask is invisible to them. + +# %% +assert isinstance(detections.mask, sv.CompactMask) and dense_bytes > 0 + +annotated_compact = image_bgr.copy() +annotated_compact = sv.MaskAnnotator(color=PALETTE, opacity=0.45).annotate( + annotated_compact, large +) +annotated_compact = sv.BoxAnnotator(color=PALETTE, thickness=2).annotate( + annotated_compact, large +) +annotated_compact = sv.LabelAnnotator( + color=PALETTE, text_scale=0.5, text_thickness=1 +).annotate(annotated_compact, large, labels=large_labels) + +plt.figure(figsize=(12, 7)) +plt.imshow(cv2.cvtColor(annotated_compact, cv2.COLOR_BGR2RGB)) +plt.axis("off") +plt.title( + f"CompactMask (filtered) -- {len(large)} instance(s) " + f"| {dense_bytes / 1024:.0f} KB dense -> {crop_bytes / 1024:.0f} KB crops" +) +plt.tight_layout() +plt.show() + +# %% [markdown] +# ### 4.5 Per-instance crop +# +# `compact.crop(i)` decodes only the bounding-box crop for instance `i` as a +# `(H_crop, W_crop)` bool array -- no full mask materialised. 
+ +# %% +assert isinstance(detections.mask, sv.CompactMask) and len(detections) > 0 + +crop = detections.mask.crop(0) +bbox = detections.mask.bbox_xyxy[0].astype(int) + +fig, axes = plt.subplots(1, 2, figsize=(10, 4)) +axes[0].imshow(image_rgb[bbox[1] : bbox[3], bbox[0] : bbox[2]]) +axes[0].set_title("Image crop (instance 0)") +axes[0].axis("off") + +axes[1].imshow(crop, cmap="gray") +axes[1].set_title(f"Mask crop ({crop.shape[1]} x {crop.shape[0]} px)") +axes[1].axis("off") + +plt.tight_layout() +plt.show() + +full_px = H * W +crop_kb = crop.nbytes / 1024 +print(f"Full-res mask slot: {H} x {W} = {full_px / 1024:.0f} KB") +print(f"Compact crop: {crop.shape[0]} x {crop.shape[1]} = {crop_kb:.1f} KB") + +# %% [markdown] +# ## 5. SAM3 +# +# `sv.Detections.from_sam3()` is the other headline in 0.28.0. +# SAM3 segments objects by free-text prompts -- `"person"`, `"bag"`, any phrase. +# supervision parses both the PCS and PVS response formats into a standard +# `sv.Detections`, with `class_id` set to the prompt index. +# +# This section runs only when `ROBOFLOW_API_KEY` is available. 
+ +# %% +import base64 +import os +from typing import Optional + +import requests + +try: + from google.colab import userdata # type: ignore[import, unused-ignore] + + ROBOFLOW_API_KEY: str = userdata.get("ROBOFLOW_API_KEY") or "" +except Exception: + ROBOFLOW_API_KEY = os.environ.get("ROBOFLOW_API_KEY", "") + +PROMPTS = ["person", "bag"] +sam3_detections: Optional[sv.Detections] = None + +assert ROBOFLOW_API_KEY + +with open(image_path, "rb") as _f: + _img_b64 = base64.b64encode(_f.read()).decode("utf-8") + +_response = requests.post( + f"https://api.roboflow.com/inferenceproxy/seg-preview?api_key={ROBOFLOW_API_KEY}", + json={ + "image": {"type": "base64", "value": _img_b64}, + "prompts": [{"type": "text", "text": p} for p in PROMPTS], + "output_prob_thresh": 0.3, + }, + headers={"Content-Type": "application/json"}, + timeout=60, +) +_response.raise_for_status() +sam3_result: dict[str, Any] = _response.json() +sam3_detections = sv.Detections.from_sam3(sam3_result=sam3_result, resolution_wh=(W, H)) +print(f"SAM3 detections: {len(sam3_detections)}") +if sam3_detections.class_id is not None: + for idx, prompt in enumerate(PROMPTS): + count = int((sam3_detections.class_id == idx).sum()) + print(f" [{idx}] '{prompt}': {count} instance(s)") + +# %% +assert sam3_detections is not None and len(sam3_detections) > 0 + +sam3_labels = ( + [PROMPTS[c] for c in sam3_detections.class_id] + if sam3_detections.class_id is not None + else [] +) +SAM3_PALETTE = sv.ColorPalette.from_hex(["#ff6b6b", "#4ecdc4"]) + +annotated_sam3 = image_bgr.copy() +annotated_sam3 = sv.MaskAnnotator(color=SAM3_PALETTE, opacity=0.45).annotate( + annotated_sam3, sam3_detections +) +annotated_sam3 = sv.BoxAnnotator(color=SAM3_PALETTE, thickness=2).annotate( + annotated_sam3, sam3_detections +) +annotated_sam3 = sv.LabelAnnotator( + color=SAM3_PALETTE, text_scale=0.5, text_thickness=1 +).annotate(annotated_sam3, sam3_detections, labels=sam3_labels) + +plt.figure(figsize=(12, 7)) 
+plt.imshow(cv2.cvtColor(annotated_sam3, cv2.COLOR_BGR2RGB)) +plt.axis("off") +plt.title(f"SAM3 -- from_sam3() -- {len(sam3_detections)} instance(s)") +plt.tight_layout() +plt.show() + +# %% [markdown] +# ## 6. Other notable changes in 0.28.0 +# +# ### `VideoInfo.fps` is now `float` +# +# NTSC frame rates (23.976, 29.97, 59.94) were silently truncated to `int`. +# Wrap with `int()` at call sites that require an integer. + +# %% +import collections + +from supervision.assets import VideoAssets + +video_path = download_assets(VideoAssets.PEOPLE_WALKING) +info = sv.VideoInfo.from_video_path(video_path) + +print(f"fps: {info.fps} ({type(info.fps).__name__}) -- was int before 0.28.0") + +fps_int = int(info.fps) +buf: collections.deque[sv.Detections] = collections.deque(maxlen=fps_int) +trace = sv.TraceAnnotator(trace_length=fps_int) +print(f"deque maxlen: {buf.maxlen} (= int({info.fps}))") + +# %% [markdown] +# ### `sv.ByteTrack` deprecated +# +# `sv.ByteTrack` still works in 0.28.0 and 0.29.0 but emits a +# `DeprecationWarning`. Migrate to `ByteTrackTracker` from the external +# [`trackers`](https://pypi.org/project/trackers/) package before 0.30.0. +# +# ```python +# # Before +# tracker = sv.ByteTrack() +# detections = tracker.update_with_detections(detections) +# +# # After (pip install trackers) +# from trackers import ByteTrackTracker +# tracker = ByteTrackTracker() +# detections = tracker.update(detections) +# ``` + +# %% [markdown] +# ## 7. 
Next steps +# +# - [`sv.CompactMask` docs](https://supervision.roboflow.com/develop/detection/compact_mask/) +# -- full API reference: `resize`, `merge`, `with_offset` +# - [`sv.Detections.from_sam3` docs](https://supervision.roboflow.com/develop/detection/core/) +# -- PCS and PVS format reference +# - [RF-DETR docs](https://github.com/roboflow/rf-detr) +# -- training, export, and deployment +# - [Full changelog](https://supervision.roboflow.com/develop/changelog/) +# -- every change in 0.28.0 diff --git a/pyproject.toml b/pyproject.toml index ed0f83f779..6ae70d1e81 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ requires = [ "setuptools>=61" ] [project] name = "supervision" -version = "0.28.0rc2" +version = "0.28.0" description = "A set of easy-to-use utils that will come in handy in any Computer Vision project" readme = "README.md" keywords = [ @@ -154,6 +154,10 @@ lint.select = [ ] lint.ignore = [] lint.per-file-ignores."__init__.py" = [ "E402", "F401" ] +lint.per-file-ignores."notebooks/**" = [ + "PT018", # Assertion should be broken down into multiple parts + "S101", # Use of `assert` detected +] lint.per-file-ignores."src/**" = [ "S101", # TODO: Replace asserts with proper error handling ] diff --git a/src/supervision/detection/utils/boxes.py b/src/supervision/detection/utils/boxes.py index e92924a678..a19b05d445 100644 --- a/src/supervision/detection/utils/boxes.py +++ b/src/supervision/detection/utils/boxes.py @@ -2,6 +2,7 @@ import numpy as np import numpy.typing as npt +from deprecate import deprecated from supervision.detection.utils.iou_and_nms import box_iou_batch @@ -95,10 +96,17 @@ def pad_boxes( return result +@deprecated( # type: ignore[untyped-decorator] + target=True, + deprecated_in="0.27.0", + remove_in="0.30.0", + args_mapping={"normalized_xyxy": "xyxy"}, +) def denormalize_boxes( xyxy: npt.NDArray[np.number], resolution_wh: tuple[int, int], normalization_factor: float = 1.0, + normalized_xyxy: npt.NDArray[np.number] | None = 
None, ) -> npt.NDArray[np.number]: """ Convert normalized bounding box coordinates to absolute pixel coordinates.