Pin HTTP/range COG byte-budget contract (#2293) #2298
Merged
brendancol merged 2 commits on May 22, 2026
Add test_http_cog_range_contract_2286.py to lock down the HTTP COG reader's transport behaviour with explicit byte-count and range-count assertions. Tests only; no production code changes.

Coverage rows:

* Windowed tile and multi-tile reads fetch only intersecting tiles (no read_all fallback; total bytes below file size and bounded by the windowed footprint).
* Overview reads pull the overview IFD's tiles, not the full-res pixel data.
* band= on multi-band chunky COGs returns correct pixels with bounded reads, alone and combined with window=.
* Dask graphs parse IFDs once across all chunk tasks (O(1) header GETs in chunk count).
* Truncated buffers, malformed IFD chains, and short pixel bodies close the HTTP source via the try/finally guard and raise a clear exception rather than hanging.
* coalesce_ranges respects the configured and default max-merged-range caps; split_coalesced_bytes round-trips bytes under the cap.

Closes xarray-contrib#2293.
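For readers unfamiliar with the close-on-error row, here is a minimal sketch of the pattern it describes. It is not the xrspatial implementation: `CloseCountingSource` and `read_header` are hypothetical in-memory stand-ins, written only to show a try/finally guard that closes the source and surfaces a clear exception when the body is short.

```python
class CloseCountingSource:
    """Hypothetical in-memory stand-in for an HTTP source; counts close() calls."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.close_calls = 0

    def read_range(self, start: int, length: int) -> bytes:
        chunk = self.payload[start:start + length]
        if len(chunk) < length:
            # Report a short body as an error instead of returning partial data.
            raise OSError(
                f"short read: wanted {length} bytes at {start}, got {len(chunk)}"
            )
        return chunk

    def close(self) -> None:
        self.close_calls += 1


def read_header(source: CloseCountingSource, header_bytes: int) -> bytes:
    """Read a fixed-size header and always close the source, even on failure."""
    try:
        return source.read_range(0, header_bytes)
    finally:
        source.close()


# A 4-byte buffer cannot satisfy a 16-byte header read.
truncated = CloseCountingSource(b"II*\x00")
try:
    read_header(truncated, 16)
    raised = False
except OSError:
    raised = True
```

A test in this style would assert both that the exception was raised and that `close_calls` is exactly 1, so a leaked or double-closed source fails loudly.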
brendancol (Contributor, Author) left a comment on May 22, 2026
Self-review: PR #2298
Single new test file, 864 lines. All 15 tests pass locally. Read the file in full; here is what stood out.
Blockers
None.
Suggestions
- `xrspatial/geotiff/tests/test_http_cog_range_contract_2286.py:43` -- `_parse_cog_http_meta` is imported but never used. `pyflakes` flags it. Drop the import to keep the test file lint-clean.
- `xrspatial/geotiff/tests/test_http_cog_range_contract_2286.py:112-120` -- `_serve` is defined but never called. The two tests that stand up loopback servers (`test_short_body_during_pixel_fetch_closes_source` and `test_loopback_end_to_end_windowed_byte_budget`) each inline their own handler/server setup. Either delete the helper or refactor the two sites to use it. Leaving it as dead code adds 9 lines of misleading scaffolding.
Nits (stale comments after the fixture resize)
The fixtures were bumped from 64x64 / tile=16 to 256x256 / tile=32 to clear the 16 KiB header probe, but a handful of docstrings and inline comments still reference the old sizes. Worth scrubbing:
- `test_http_cog_range_contract_2286.py:247` -- docstring says "A 16x16 window aligned to one tile" but the window is 32x32.
- `test_http_cog_range_contract_2286.py:292` -- "A window that touches 2x2=4 tiles must not fetch all 16 tiles" but the file has 64 tiles, not 16. The comment two lines below (>=64 separate GETs) is correct; the docstring is stale.
- `test_http_cog_range_contract_2286.py:421` -- "16x16 window aligned to one tile" but the window is 32x32.
- `test_http_cog_range_contract_2286.py:483` -- docstring says "two chunk granularities (8 and 16)" but the run uses chunks=32 and chunks=64.
Unused locals
`test_band_selection_multiband_chunky_bounded_reads` (line 381) and `test_band_selection_with_window_bounded_range_count` (line 409) unpack `expected` from the fixture but never use it. Replace with `_expected` or drop the unpack to silence linters.
What looks good
- Coverage matches the issue checklist row-for-row: windowed reads, overview reads, band+window, dask once-per-graph, close-on-error (truncated buffer, malformed IFD, short pixel body), and the coalesce cap. All six required assertions have at least one test row.
- Fixture sizing is deliberate (256x256 random pixels so deflate cannot collapse below the header probe) and the rationale is documented in the docstrings.
- `_no_sidecar_probe` autouse fixture is the right call here -- sidecar discovery on a 200-everything mock server would skew the byte-count assertions, and the issue explicitly says sidecar behaviour belongs in `test_remote_sidecar_chunked_2239.py`.
- `_RecordingHTTPSource` extends `_HTTPSource` so it inherits `read_ranges_coalesced`, which is what makes `test_coalesced_get_size_capped_on_real_http_source` exercise the real code path instead of a stub.
- The `expected = _Handler.payload[:64]` style upper bounds (file size, header + tile budget, 75% of base for overviews) are tight enough to catch a regression that pulls everything but loose enough to survive codec/compression changes. The hard bound is always "less than file size"; the soft bound is the early-warning indicator.
- Short-body test accepts both `OSError` and `urllib3.exceptions.ProtocolError` so the test does not become brittle if `read_range`'s wrapping changes.
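To make the hard-bound / soft-bound distinction concrete, here is a rough sketch of how a recording source can back both assertions. Everything in it is an assumption for illustration: `RecordingSource`, `read_window`, the 1024-byte header, and the uncompressed one-byte-per-pixel tile layout are hypothetical and simpler than the real fixtures; only the 256x256 / tile=32 geometry is taken from the review.

```python
class RecordingSource:
    """Hypothetical in-memory source that records every (start, length) request."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.calls = []

    def read_range(self, start: int, length: int) -> bytes:
        self.calls.append((start, length))
        return self.payload[start:start + length]

    @property
    def bytes_fetched(self) -> int:
        return sum(length for _, length in self.calls)


HEADER = 1024              # assumed header size in bytes
TILE = 32                  # tile edge in pixels
TILES_PER_ROW = 8          # 256 / 32
TILE_BYTES = TILE * TILE   # one byte per pixel, uncompressed, for simplicity


def read_window(source, row0, col0, row1, col1):
    """Fetch the header once, then only the tiles the window intersects."""
    source.read_range(0, HEADER)
    fetched = []
    for ty in range(row0 // TILE, (row1 - 1) // TILE + 1):
        for tx in range(col0 // TILE, (col1 - 1) // TILE + 1):
            index = ty * TILES_PER_ROW + tx
            source.read_range(HEADER + index * TILE_BYTES, TILE_BYTES)
            fetched.append(index)
    return fetched


file_size = HEADER + TILES_PER_ROW * TILES_PER_ROW * TILE_BYTES
source = RecordingSource(bytes(file_size))
tiles = read_window(source, 16, 16, 48, 48)  # straddles a 2x2 block of tiles

hard_bound = file_size                         # reaching this means read-all
soft_bound = HEADER + len(tiles) * TILE_BYTES  # header plus touched tiles
assert source.bytes_fetched < hard_bound
assert source.bytes_fetched <= soft_bound
assert len(source.calls) == 1 + len(tiles)     # one header GET plus one per tile
```

In this toy layout the soft bound is exact; with real compression it would be a looser budget, which is why the hard bound stays as the assertion that must never fail.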
Checklist
- Tests only; no production code touched (per #2293 scope).
- All six required assertions present.
- Reuses fixtures from the sibling files listed in the issue.
- Deterministic (in-process mock + loopback server, no real network).
- Suite green on CPU CI -- 15 passed locally; 46 passed across this file + two sibling files combined.
- No real production bugs uncovered; no `xfail` rows needed.
Drop unused _parse_cog_http_meta import (flagged by pyflakes), drop the unused _serve helper that the two loopback tests bypassed inline, update stale docstrings/comments left over from the 64x64 -> 256x256 fixture resize, and rename unused `expected` unpacks to `_expected`. No assertion changes; all 15 tests still pass.
brendancol (Contributor, Author) left a comment on May 22, 2026
Self-review round 2 (post-fix)
All findings from round 1 addressed in 1a82640:
- Dropped unused `_parse_cog_http_meta` import. `pyflakes` now clean.
- Removed unused `_serve` helper. The two tests that stand up loopback servers (`test_short_body_during_pixel_fetch_closes_source`, `test_loopback_end_to_end_windowed_byte_budget`) both inline their setup so the helper was dead code.
- Fixed four stale docstrings/comments referring to the old 64x64 / tile=16 fixture sizes (lines 247, 292, 421, 483). All four now match the actual 256x256 / tile=32 fixture and the chunks=32/64 dask runs.
- Renamed `expected` to `_expected` in the two band-selection tests where the fixture's reference array is not used (the tests compare against `open_geotiff(path, band=...)` directly).
15 tests still pass locally; no assertion logic touched.
Nothing else to flag. Ready for CI.
brendancol added a commit that referenced this pull request on May 22, 2026
* Promote local COG contract to stable (#2300)

Flip the local COG read and write paths to the stable tier in ``xrspatial.geotiff.SUPPORTED_FEATURES``:

- ``SUPPORTED_FEATURES['writer.cog']``: advanced -> stable.
- ``SUPPORTED_FEATURES['reader.local_cog']``: advanced -> stable.
- ``SUPPORTED_FEATURES['reader.http_cog']`` stays advanced; the inline comment now spells out why (range fetching, redirect handling, SSRF filter, cache / retry behaviour not yet contracted).

Document the stable COG contract:

- ``xrspatial/geotiff/_attrs.py`` carries a block comment above ``SUPPORTED_FEATURES`` describing what the stable contract guarantees and what stays advanced.
- ``docs/source/reference/geotiff.rst`` grows a *Stable COG contract* section at the top of the page plus an *Outside the stable contract* list.
- The COG overview notebook (``examples/user_guide/52_COG_Overview_Generation.ipynb``) carries a short note that the examples sit inside the stable contract while HTTP / GPU / BigTIFF stay outside.
- ``CHANGELOG.md`` records the promotion under Unreleased.

Backed by the writer compliance suite (#2292), the cross-backend parity gate (#2293), and the per-tile byte-budget contract (#2294 / #2298). The full 5169-test geotiff suite passes locally.

Closes #2300.

* Address self-review: bump RST underline above title length
Part of #2286 (COG readiness rollout). Adds `xrspatial/geotiff/tests/test_http_cog_range_contract_2286.py` to lock down the HTTP COG reader's transport behaviour with explicit byte-count and range-count assertions. Tests only; no production code changes.

Coverage rows

- Windowed tile and multi-tile reads fetch only intersecting tiles: no `read_all` fallback, total bytes stay below the file size, and the byte budget is bounded by the windowed footprint.
- `band=` on multi-band chunky COGs returns the same pixels as the local path with bounded reads, both alone and combined with `window=`.
- `coalesce_ranges` respects the configured `max_coalesced_range_bytes` and the default cap (#2266, "COG range coalescing can turn safe tile reads into huge over-fetches"). `split_coalesced_bytes` round-trips bytes under the cap. The real `_HTTPSource.read_ranges_coalesced` path propagates the cap to wire-level GETs.

Fixture choices

- Sidecar behaviour is left to `test_remote_sidecar_chunked_2239.py`.
- Reuses the `_RecordingHTTPSource` / `_CloseCountingSource` patterns from the sibling test files referenced in the issue (`test_http_cog_coalesce.py`, `test_cog_http_close_on_error_1816.py`, `test_http_stripped_window_max_pixels_issue_A_1842.py`).

Test plan

- `pytest xrspatial/geotiff/tests/test_http_cog_range_contract_2286.py -v` -- 15 passed locally
- This file plus two sibling files (`test_http_cog_coalesce.py`, `test_http_window_band_planar_1669.py`) -- 46 passed, no regressions

Closes #2293.
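As background for the coalesce-cap row, here is a minimal sketch of capped range coalescing and the matching split step. It is an illustration of the general technique only: `coalesce` and `split` are hypothetical helpers, and their signatures are assumptions that do not mirror `coalesce_ranges` or `split_coalesced_bytes` in xrspatial.

```python
def coalesce(ranges, max_gap, max_merged_bytes):
    """Merge (start, length) ranges whose gap is <= max_gap, respecting the cap.

    A merge is skipped when it would push the combined span past
    max_merged_bytes. A single input range larger than the cap is passed
    through unchanged in this sketch.
    """
    merged = []  # list of (start, end) spans
    for start, length in sorted(ranges):
        end = start + length
        if merged:
            m_start, m_end = merged[-1]
            within_gap = start - m_end <= max_gap
            within_cap = max(m_end, end) - m_start <= max_merged_bytes
            if within_gap and within_cap:
                merged[-1] = (m_start, max(m_end, end))
                continue
        merged.append((start, end))
    return [(s, e - s) for s, e in merged]


def split(merged_start, merged_bytes, ranges):
    """Slice one merged response back into the originally requested ranges."""
    return [
        merged_bytes[s - merged_start:s - merged_start + n] for s, n in ranges
    ]


requested = [(0, 100), (100, 100), (200, 100), (1000, 50)]
plan = coalesce(requested, max_gap=0, max_merged_bytes=250)

# Round-trip: fetch the first merged span, then recover the two tile bodies.
payload = bytes(range(256)) * 5
first_start, first_len = plan[0]
blob = payload[first_start:first_start + first_len]
parts = split(first_start, blob, requested[:2])
```

With the 250-byte cap the three adjacent 100-byte ranges cannot collapse into one 300-byte GET, which is the over-fetch the cap is meant to prevent.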