Centralise VRT capability validator at both read entry points (#2371) #2376
Open
brendancol wants to merge 4 commits into
…-contrib#2371) Extend ``validate_parsed_vrt`` to cover nested VRT references, per-source mask-band semantics, and dataset / band-level ``<GDALWarpOptions>`` blocks. Route the internal ``_vrt.read_vrt`` entry point through the validator so direct callers and the chunked dask path get the same capability rejections as ``_backends/vrt.read_vrt``. Expose ``validate_vrt_capability`` as a public alias on ``_vrt_validation`` matching the epic naming (the implementation continues under ``validate_parsed_vrt`` for backward compatibility). Update the legacy issue xarray-contrib#1751 resample-alg regression tests to accept the validator's ``VRTUnsupportedError`` alongside the original ``NotImplementedError``; both encode the same contract that nearest must not be silently substituted for an unsupported algorithm.
Cover nested VRT, dataset-level and band-level <GDALWarpOptions>, per-source <UseMaskBand> and <MaskBand> children, and the now-typed resample-alg rejection at the internal entry point. Each rejection is exercised at validator-direct, public _backends/vrt.read_vrt, internal _vrt.read_vrt, and open_geotiff entry points so the wiring is covered.
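To make the rejections described in these commits concrete, here is a rough standalone sketch of a metadata-only gate over VRT XML. It is not the xrspatial implementation: `VRTUnsupportedError` is a local stand-in class, `find_unsupported` and `validate_vrt_xml` are hypothetical names, the `.vrt` suffix test for nested sources is an assumption, and the truthy set follows the narrowed `('1', 'true')` form adopted later in this PR.

```python
import xml.etree.ElementTree as ET


class VRTUnsupportedError(ValueError):
    """Local stand-in for the typed error named in this PR."""


SOURCE_TAGS = ("SimpleSource", "ComplexSource")
MASK_TRUTHY = ("1", "true")  # assumption: matched case-insensitively


def find_unsupported(vrt_xml: str) -> list:
    """Return one message per unsupported feature found in the XML."""
    root = ET.fromstring(vrt_xml)
    problems = []

    # Warp options directly under the dataset element.
    if root.find("GDALWarpOptions") is not None:
        problems.append("dataset-level <GDALWarpOptions>")

    for band in root.iter("VRTRasterBand"):
        band_id = band.get("band", "?")
        # Warp options nested inside a band.
        if band.find("GDALWarpOptions") is not None:
            problems.append(f"band {band_id}: <GDALWarpOptions>")
        for tag in SOURCE_TAGS:
            for src in band.findall(tag):
                filename = (src.findtext("SourceFilename") or "").strip()
                # Nested VRT: a source that is itself a .vrt (any case).
                if filename.lower().endswith(".vrt"):
                    problems.append(
                        f"band {band_id}: nested VRT source {filename}")
                flag = (src.findtext("UseMaskBand") or "").strip().lower()
                if flag in MASK_TRUTHY:
                    problems.append(
                        f"band {band_id}: <UseMaskBand> on {filename}")
                if src.find("MaskBand") is not None:
                    problems.append(
                        f"band {band_id}: <MaskBand> on {filename}")
    return problems


def validate_vrt_xml(vrt_xml: str) -> None:
    """Raise if the XML uses any feature this sketch treats as unsupported."""
    problems = find_unsupported(vrt_xml)
    if problems:
        raise VRTUnsupportedError("; ".join(problems))
```

Each message names the band and source file, in the spirit of the review note that every rejection should tell the caller where to look.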
brendancol (Contributor, Author) commented May 25, 2026
PR Review: Centralise VRT capability validator at both read entry points (#2371)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- `xrspatial/geotiff/_vrt.py:713`: the `<UseMaskBand>` truthy set is `('1', 'true', 'yes')`. GDAL writes lowercase `true`, so `'1'` and `'yes'` are spellings the parser will never see from GDAL itself. Tightening to `('1', 'true')` keeps the surface narrow and matches what real VRTs contain. Not blocking; the wider set is harmless.
Nits (optional improvements)
- `xrspatial/geotiff/_vrt_validation.py:303`: the nested-VRT message ends with "are not a supported feature in this release". The `parse_vrt` rejections point readers at `xrspatial.geotiff.SUPPORTED_FEATURES`. Adding the same pointer here keeps the rejection messages aligned on one discovery anchor.
- `xrspatial/geotiff/_vrt.py:1247`: the validator import is lazy inside the function body. That matches the existing pattern elsewhere in the file, but a one-line comment ("lazy to avoid circular import") would help the next reader.
- `xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py:339`: the `test_use_mask_band_truthy_spellings_rejected` parametrize list includes `'yes'`. If you tighten the truthy set per the suggestion above, drop that row or move it to a "must be accepted as not-truthy" test.
What looks good
- The validator extends cleanly. Every new rejection names the offending source path and field, so the error message tells the caller where to look.
- `validate_vrt_capability` as an alias keeps every existing call site working.
- Test coverage is strong: every new rejection path runs at four entry points (validator-direct, internal `_vrt.read_vrt`, public `_backends/vrt.read_vrt`, `open_geotiff`).
- The `test_vrt_resample_alg_1751.py` fix accepts both `NotImplementedError` and `VRTUnsupportedError`, so the legacy contract intent is preserved without breaking the new routing.
- The `parsed is None` short-circuit in `_vrt.read_vrt` avoids double-validation when the chunked path threads a pre-validated instance in.
- `<GDALWarpOptions>` rejection covers dataset-level, band-level, and the existing `subClass="VRTWarpedRasterBand"` marker. All three locations a warp config can appear are caught.
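The alias and the `parsed is None` short-circuit noted above can be pictured with a small sketch. Everything here is hypothetical scaffolding rather than the actual `_vrt.read_vrt` code: `ParsedVRT`, `parse`, `read_vrt_chunked`, the `parsed=` keyword, and the `validation_calls` log are assumptions made only to show that the chunked path validates once while direct callers still get validated.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParsedVRT:
    """Hypothetical stand-in for a parsed VRT description."""
    path: str
    unsupported: list = field(default_factory=list)


class VRTUnsupportedError(ValueError):
    """Local stand-in for the typed error named in this PR."""


validation_calls = []  # records each validation, for demonstration only


def validate_parsed_vrt(parsed: ParsedVRT) -> None:
    validation_calls.append(parsed.path)
    if parsed.unsupported:
        raise VRTUnsupportedError(", ".join(parsed.unsupported))


# Alias: both names are bound to the same function object.
validate_vrt_capability = validate_parsed_vrt


def parse(path: str) -> ParsedVRT:
    """Hypothetical parser; a real one would read the XML at `path`."""
    return ParsedVRT(path)


def read_vrt(path: str, parsed: Optional[ParsedVRT] = None) -> ParsedVRT:
    if parsed is None:
        # Direct caller: parse and validate here.
        parsed = parse(path)
        validate_parsed_vrt(parsed)
    # Otherwise the caller passed a pre-validated instance; skip the gate.
    return parsed  # a real reader would go on to decode pixels


def read_vrt_chunked(path: str, n_chunks: int) -> list:
    parsed = parse(path)
    validate_vrt_capability(parsed)  # validate once, up front
    return [read_vrt(path, parsed=parsed) for _ in range(n_chunks)]
```

Calling `read_vrt_chunked("a.vrt", 3)` leaves a single entry in `validation_calls`, whereas each direct `read_vrt(path)` call adds one.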
Checklist
- Algorithm matches the VRT spec
- Implemented backends produce consistent results (validator runs before backend dispatch)
- NaN handling not applicable (metadata-only gate)
- Edge cases covered: uppercase `.VRT`, `false` flag accepted, truthy spellings, both source tag types
- Dask chunk boundaries handled correctly (chunked path threads a pre-validated VRT)
- No premature materialization (metadata gate runs before any decode)
- Benchmark not needed (cheap XML-only check)
- README feature matrix not applicable (no new public API)
- Docstrings present and accurate
…n SUPPORTED_FEATURES (xarray-contrib#2371)
- Narrow <UseMaskBand> truthy set to ('1', 'true') to match what GDAL actually emits. Tokens like yes / on / Y now pass through as not-mask rather than tripping the rejection.
- Anchor the nested-VRT rejection message on xrspatial.geotiff.SUPPORTED_FEATURES so the new path matches the parse_vrt rejection style.
- Add a one-line comment explaining why the validator import in _vrt.read_vrt is lazy (circular import with _vrt_validation).
- Replace the original yes parametrize row with a separate non_canonical_truthy_accepted test so the contract for outside the canonical set is explicit.
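A minimal sketch of the narrowed truthy check this commit describes might look like the following. The helper name `use_mask_band_enabled` is hypothetical, and stripping whitespace and lowercasing before the comparison is an assumption; the commit message only states the canonical set `('1', 'true')`.

```python
MASK_TRUTHY = ("1", "true")  # canonical set from the commit message


def use_mask_band_enabled(text):
    """Return True only for the canonical truthy spellings.

    `text` is the raw content of a <UseMaskBand> element, or None when
    the element is absent. Normalising case and whitespace is an
    assumption of this sketch.
    """
    if text is None:
        return False
    return text.strip().lower() in MASK_TRUTHY


# Canonical tokens are treated as "mask requested"...
canonical = [t for t in ("1", "true") if use_mask_band_enabled(t)]
# ...while non-canonical tokens pass through as not-mask.
non_canonical = [t for t in ("yes", "on", "Y") if not use_mask_band_enabled(t)]
```

Under this reading, `yes`, `on`, and `Y` are simply not recognised, so a VRT carrying them is accepted rather than rejected.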
brendancol (Contributor, Author) commented May 25, 2026
Follow-up review after addressing prior comments
Disposition of original findings
- Suggestion (`_vrt.py:713` UseMaskBand truthy set): Fixed. Narrowed to `('1', 'true')`. Added `test_use_mask_band_non_canonical_truthy_accepted` to lock in the contract for non-canonical tokens.
- Nit 1 (`_vrt_validation.py:303` SUPPORTED_FEATURES anchor): Fixed. Nested-VRT message now references `xrspatial.geotiff.SUPPORTED_FEATURES`.
- Nit 2 (lazy import comment): Fixed. Added a one-line note explaining the circular-import avoidance.
- Nit 3 (drop `'yes'` parametrize row): Fixed. Row removed; non-canonical tokens are covered by the new explicit test instead.
Verification
- Full geotiff test suite: 5663 passed, 68 skipped, 4 xfailed, 1 xpassed.
- New test count: 24 (was 22). The two new tests cover the narrowed truthy set.
No further findings.
…ntrib#2371) Switch the nested-VRT rejection from {src.filename!r} to direct interpolation so Windows paths render with single backslashes instead of the doubled escapes repr emits, matching the path-containment pattern in parse_vrt. Update the test assertion to compare on the file basename so any further normalisation difference (short-name vs long-name, symlink resolution) between str(tmp_path / name) and the os.path.realpath form parse_vrt stores does not break the match.
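The difference between `!r` and direct interpolation for a Windows path is easy to see in isolation. The message wording below is made up for illustration; `ntpath` is used so the basename split behaves the same on any platform.

```python
import ntpath

# A Windows-style path; the string itself holds single backslashes.
path = "C:\\data\\inner.vrt"

# repr() escapes each backslash, so the rendered message doubles them.
with_repr = f"nested VRT source {path!r} is not supported"

# Direct interpolation renders the path exactly as stored.
direct = f"nested VRT source {path} is not supported"

print(with_repr)  # nested VRT source 'C:\\data\\inner.vrt' is not supported
print(direct)     # nested VRT source C:\data\inner.vrt is not supported

# Comparing on the basename sidesteps differences in how the directory
# part is normalised (short vs long names, symlink resolution).
basename = ntpath.basename(path)
basename_in_message = basename in direct
```

Asserting on the basename alone is a looser check than matching the full path, which is the trade-off the commit accepts in exchange for stability across platforms.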
Closes #2371.
Summary
- Extend `validate_parsed_vrt` to reject nested VRTs, per-source `<UseMaskBand>` flags, and per-source `<MaskBand>` children with `VRTUnsupportedError`.
- Extend `parse_vrt` to reject dataset-level and band-level `<GDALWarpOptions>` blocks with `UnsupportedGeoTIFFFeatureError`.
- Route the internal `_vrt.read_vrt` entry point through `validate_parsed_vrt` so direct callers and the chunked dask path get the same capability rejections as the public `_backends/vrt.read_vrt` entry point.
- Expose `validate_vrt_capability` as a public alias on `_vrt_validation` matching the epic naming.

Backend coverage
The validator is a metadata gate that runs before any backend dispatch, so numpy, cupy, dask+numpy, and dask+cupy all see the same rejections at graph-build / eager-read setup time.
Test plan
- `xrspatial/geotiff/tests/test_vrt_capability_validator_2371.py` (22 tests) covers every new rejection path at four entry points: validator-direct, internal `_vrt.read_vrt`, public `_backends/vrt.read_vrt`, and `open_geotiff`.
- `test_vrt_resample_alg_1751.py` accepts `VRTUnsupportedError` alongside the historical `NotImplementedError`. Both encode the same contract.
- `xrspatial/geotiff/tests/` suite passes (5661 passed).

Sub-task of epic #2342. Sibling PRs in flight: positive simple-mosaic tests, negative tests for unsupported VRT features, missing-source policy tests, and VRT contract docs.
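The "accepts either exception" item in the test plan can be sketched without any test framework. The two reader functions below are hypothetical stand-ins for the pre-validator and post-validator code paths, and `VRTUnsupportedError` is a local class rather than the real one; the point is only that the check passes on either error type and fails if an unsupported algorithm is silently accepted.

```python
class VRTUnsupportedError(ValueError):
    """Local stand-in for the typed error named in this PR."""


def legacy_reader(resample_alg: str) -> str:
    """Hypothetical pre-validator path: rejects with NotImplementedError."""
    if resample_alg != "nearest":
        raise NotImplementedError(f"resample algorithm {resample_alg!r}")
    return "ok"


def validated_reader(resample_alg: str) -> str:
    """Hypothetical post-validator path: rejects with the typed error."""
    if resample_alg != "nearest":
        raise VRTUnsupportedError(f"resample algorithm {resample_alg!r}")
    return "ok"


def rejects_unsupported(reader, resample_alg: str) -> bool:
    """True if `reader` refuses `resample_alg` with either accepted error."""
    try:
        reader(resample_alg)
    except (NotImplementedError, VRTUnsupportedError):
        return True
    # No exception means the algorithm was accepted (or silently swapped).
    return False
```

With pytest, the same check is commonly written as `with pytest.raises((NotImplementedError, VRTUnsupportedError)):`, since `pytest.raises` accepts a tuple of exception types.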