Skip to content

Refactor GeoTIFF Phase 5c-read: extract _layout.py from _reader.py#2257

Merged
brendancol merged 2 commits into
mainfrom
issue-2247-layout-extraction
May 21, 2026
Merged

Refactor GeoTIFF Phase 5c-read: extract _layout.py from _reader.py#2257
brendancol merged 2 commits into
mainfrom
issue-2247-layout-extraction

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Phase 5c-read of the GeoTIFF refactor epic. Move read-side layout and validation helpers out of _reader.py into a new _layout.py.

Moved

  • MAX_PIXELS_DEFAULT, PixelSafetyLimitError
  • _check_dimensions, _check_source_dimensions
  • _sparse_fill_value, _has_sparse
  • _FULL_IMAGE_BUDGET_HEADER_SLACK, _compute_full_image_byte_budget
  • _ifd_required_extent

Kept in _reader.py

  • Top-level orchestration: _read_to_array, the read_to_array alias.
  • COG-HTTP transport: _parse_cog_http_meta, _read_cog_http, _fetch_decode_cog_http_strips, _fetch_decode_cog_http_tiles. The issue scopes the COG-HTTP extraction to a separate follow-up.

Compatibility

  • _reader.py re-exports every moved name, so from xrspatial.geotiff._reader import PixelSafetyLimitError (and friends) keeps working for the backends, the VRT module, and the test suite.
  • _decode.py still reads MAX_PIXELS_DEFAULT via _reader, so the existing monkeypatch.setattr(reader_mod, "MAX_PIXELS_DEFAULT", N) contract on the per-tile safety check keeps firing.

Line count

_reader.py: 1406 to 1258 (148 lines moved). _layout.py is 217 lines.

Test plan

  • pytest xrspatial/geotiff/tests/ locally: 5037 passed, 68 skipped. One pre-existing lz4 failure (test_lowlevel_write_pushdown_2138.py::test_write_vs_to_geotiff_byte_parity_uint8[lz4]) reproduces on main and is unrelated.
  • Smoke-import the backends and the VRT module.

Closes #2247
Part of #2211

…2247)

Move read-side layout / validation helpers out of ``_reader.py`` into a
new ``_layout`` module so the top-level reader keeps just orchestration
and the COG-HTTP transport layer.

Moved:
- ``MAX_PIXELS_DEFAULT``, ``PixelSafetyLimitError``,
  ``_check_dimensions``, ``_check_source_dimensions``
- ``_sparse_fill_value``, ``_has_sparse``
- ``_FULL_IMAGE_BUDGET_HEADER_SLACK``, ``_compute_full_image_byte_budget``
- ``_ifd_required_extent``

``_reader.py`` re-exports each name so existing
``from xrspatial.geotiff._reader import ...`` call sites and test
monkeypatches against ``_reader.MAX_PIXELS_DEFAULT`` keep working.
``_decode.py`` switches to importing the helpers from ``_layout``
directly and keeps the ``_reader.MAX_PIXELS_DEFAULT`` lookup so the
existing monkeypatch contract on the per-tile safety check still
fires.

Mechanical, behaviour-neutral. ``_reader.py`` drops from 1406 to 1258
lines; the remaining bulk is the COG-HTTP transport which the issue
scopes out of this PR.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 21, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Refactor GeoTIFF Phase 5c-read: extract _layout.py from _reader.py

Mechanical extraction. Code lifts cleanly, the re-export in _reader.py keeps every historical import working, and the documented _reader.MAX_PIXELS_DEFAULT monkeypatch contract is preserved.

Blockers

None.

Suggestions

None blocking.

Nits

  • xrspatial/geotiff/_layout.py:23 -- TIFFHeader is imported and appears in _ifd_required_extent's signature but is unused in the body (pre-existing in _reader.py, just travels with the move). Same for the data_len: int parameter on that function. Worth pruning in a follow-up cleanup PR, not here -- this PR is scoped to a behaviour-neutral move.
  • xrspatial/geotiff/_layout.py:91 and :154 -- the lazy from ._decode import _int_nodata_in_range and from ._sources import _max_tile_bytes_from_env are defensible (they keep _layout import-cheap and side-step a load-order surprise if _decode is re-shuffled later), but _decode does not import _layout at top level and _sources has no path back to _layout either, so both could be hoisted to module-level imports without creating a cycle. The lazy form is the safer choice; just flagging that it is a choice, not a constraint.
  • xrspatial/geotiff/_decode.py:562 -- the split (MAX_PIXELS_DEFAULT from _reader, the rest from _layout) is intentional and well-commented for monkeypatch compatibility. If a future cleanup unifies the patch surface (e.g. patch _layout.MAX_PIXELS_DEFAULT and re-export downstream), this hop can collapse.

What looks good

  • _reader.py keeps the re-export block prominent at the top with a clear "Source: PR-H, issue #2247" pointer.
  • _decode.py docstring is updated to reflect the new layout-module relationship and explains exactly why the lazy imports stay lazy.
  • The monkeypatch contract (test_vrt_source_tile_check_1823::test_per_tile_check_caps_at_default) is the load-bearing test; it passes locally, which confirms the _reader -> _layout aliasing did not break the rebinding semantics.
  • Documented circular-relationship management at the head of _decode.py is accurate and pays off when reading _layout.py's lazy imports.

Checklist

  • Mechanical move only; no algorithm changes (correctness review is N/A)
  • No backend dispatch changes
  • No NaN-handling changes
  • Test contract preserved (5037 passed locally; one pre-existing lz4 failure unrelated to this PR)
  • No dask chunk-boundary changes
  • No new materialization
  • No benchmark needed (no perf-sensitive code touched)
  • No README feature-matrix change needed
  • Docstrings preserved verbatim on moved functions

- Drop unused TIFFHeader import and unused header/data_len params from
  _ifd_required_extent. The function never read them; they traveled
  with the move from _reader.py.
- Hoist lazy imports (_int_nodata_in_range, _parse_nodata_str,
  _max_tile_bytes_from_env) to module-level. No cycle exists: _decode
  and _sources do not import _layout.

Leaves the _decode.py MAX_PIXELS_DEFAULT patch-surface split alone
(the third nit) since it is load-bearing for monkeypatch compatibility.
@brendancol brendancol merged commit fe5aea3 into main May 21, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor GeoTIFF Phase 5c-read: extract _layout.py from _reader.py (PR-H of #2211)

1 participant