Skip to content

geotiff: forward missing_sources from open_geotiff to read_vrt (#1810)#1811

Open
brendancol wants to merge 3 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-13-1778686244-2987753
Open

geotiff: forward missing_sources from open_geotiff to read_vrt (#1810)#1811
brendancol wants to merge 3 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-13-1778686244-2987753

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Background

read_vrt gained missing_sources='warn'|'raise' in #1806, but the dispatcher open_geotiff did not expose it. Callers wanting strict failure on broken VRT sources had to call read_vrt directly. Same class of bug as #1561 / #1605 / #1685 / #1795.

Repro (pre-fix)

from xrspatial.geotiff import open_geotiff
open_geotiff("mosaic.vrt", missing_sources="raise")
# TypeError: open_geotiff() got an unexpected keyword argument 'missing_sources'

Test plan

  • New regression suite test_open_geotiff_missing_sources_1810.py exercises signature, warn/raise/invalid policies through the dispatcher, non-VRT rejection, and the no-kwarg backward-compat path (6 tests pass)
  • Existing parity suites still pass: test_open_geotiff_vrt_kwarg_drop_1685.py, test_vrt_missing_sources_policy_1799.py, test_backend_kwarg_parity_1561.py, test_signature_annotations_1654.py, test_signature_parity_1631.py (48 tests pass)
  • Broader smoke: 253 tests pass for open_geotiff or read_vrt or write_vrt or signature or kwarg or missing_sources or namespace
  • No source signature is renamed; default 'warn' preserves the existing behaviour for callers that never set the kwarg

read_vrt gained a missing_sources='warn'|'raise' policy kwarg in #1806
but the documented dispatcher open_geotiff did not expose it. Callers
who wanted strict failure on broken VRT sources had to call read_vrt
directly, defeating the dispatcher contract.

Same class of bug as #1561 / #1605 / #1685 / #1795: a backend kwarg
reachable on the routed-to function is unreachable through the
documented entry point. Fix mirrors the on_gpu_failure pattern that
open_geotiff already uses for GPU-only kwargs:

- Add missing_sources= to open_geotiff with a sentinel default so the
  dispatcher can tell "caller never set it" (skip forwarding, let the
  read_vrt default of 'warn' apply) from "caller set it" (forward
  verbatim).
- Reject missing_sources on non-VRT sources with a clear ValueError
  so callers learn the policy is being ignored instead of getting a
  silent drop.
- Document the kwarg in the open_geotiff docstring with the same
  values and semantics as read_vrt.

Regression tests in test_open_geotiff_missing_sources_1810.py cover
the signature, warn/raise/invalid policies through the dispatcher,
non-VRT rejection, and the no-kwarg backward-compatible path.
Record the v5 api-consistency sweep against geotiff. One MEDIUM
finding filed and fixed (#1810). Cross-sibling write-return-type
drift remains deferred at LOW.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 13, 2026
PR #1803 forwarded the caller's max_pixels to read_to_array inside
read_vrt's source loop so a tiny VRT output cannot force a huge source
decode (#1796). The output-window check at the source read enforces that
correctly. A separate per-tile dimension check at the same call sites
also consumed the caller's max_pixels, so a caller setting max_pixels as
an output budget (e.g. 10_000) failed the per-tile sanity check on any
normal source whose default tile size is 256x256 (= 65_536 pixels).

Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites
in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window
check at the same functions continues to enforce the user-supplied
max_pixels, preserving the #1796 protection.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves GeoTIFF/VRT API consistency by exposing missing_sources on the public open_geotiff dispatcher and forwarding it to read_vrt when opening .vrt sources. It also adjusts the internal per-tile dimension safety checks so that a caller’s max_pixels (used as an output/window budget) no longer rejects normal source tile sizes during VRT assembly.

Changes:

  • Add missing_sources to open_geotiff, forward it to read_vrt for .vrt sources, and reject it for non-VRT sources with a clear ValueError.
  • Update _reader.py per-tile dimension sanity checks to use MAX_PIXELS_DEFAULT rather than the caller’s max_pixels.
  • Add regression tests for both the open_geotiff(missing_sources=...) dispatcher behavior and the VRT per-tile safety-check behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
xrspatial/geotiff/__init__.py Adds missing_sources to open_geotiff, validates/rejects on non-VRT, and forwards to read_vrt.
xrspatial/geotiff/_reader.py Changes per-tile dimension checks to use MAX_PIXELS_DEFAULT instead of caller-provided max_pixels.
xrspatial/geotiff/tests/test_open_geotiff_missing_sources_1810.py New regression tests ensuring the dispatcher exposes/forwards/validates missing_sources.
xrspatial/geotiff/tests/test_vrt_source_tile_check_1823.py New regression tests covering the per-tile check behavior vs caller max_pixels.
.claude/sweep-api-consistency-state.csv Updates sweep state metadata for the API consistency sweep.
Comments suppressed due to low confidence (1)

xrspatial/geotiff/_reader.py:2031

  • Same issue as the local tile path: using MAX_PIXELS_DEFAULT unconditionally for the per-tile dimension check prevents callers from increasing max_pixels to handle very-large-tile COGs. Consider basing this check on max(max_pixels, MAX_PIXELS_DEFAULT) (or a dedicated per-tile override) so the check ignores small output budgets but still supports explicit opt-in for big tiles.
    # ``max_pixels`` -- intended as an output-window budget -- does not
    # reject normal 256x256 tiles. See #1823.
    if window is None:
        _check_dimensions(width, height, samples, max_pixels)
    _check_dimensions(tw, th, samples, MAX_PIXELS_DEFAULT)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1 to +5
"""Regression tests for #1823.

PR #1803 forwarded the caller's ``max_pixels`` to ``read_to_array`` inside
the VRT source loop so that a tiny VRT output could not force a huge
source decode (#1796). The output-window check enforces that. A separate
Comment on lines +1569 to +1570
# #1823.
_check_dimensions(tw, th, samples, MAX_PIXELS_DEFAULT)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: open_geotiff silently drops missing_sources when routing to read_vrt

2 participants