test: run the unit suite against both validation engines (#467) #487
Open
SJrX wants to merge 1 commit into
To gain confidence before any of this is released, run the whole suite twice, once on the original SyntacticMatch/SemanticMatch engine and once on the new list-of-successes engine, rather than relying on a couple of e2e tests.

- GrammarOptionValue.FORCE_PARSE_ENGINE: a system property (-Dsystemd.unit.grammarParseEngine=true) forces validation onto the new engine, independent of the per-project experimental flag. Only validation is forced; the cosmetic annotators (coloring, key marker) stay on the user flag, so problem COUNTS are unchanged and only exact error spans can differ between engines.
- build.gradle.kts forwards the property to the forked test JVM.
- CI (main.yml) gains a grammarParseEngine: [false, true] matrix dimension, running both engines in parallel.

Result: the entire suite passes under BOTH engines. The only test needing engine-aware expectations is InvalidValueInspectionForCGroupSocketBind, where the new engine localizes two errors differently (and arguably better): "ipv6::tcp" highlights ":tcp" after the consumed "ipv6:" rather than "::tcp"; "12--21485" points at the out-of-range port "-21485" rather than the dash. Counts/validity match everywhere else, which is strong evidence the new engine is at parity.

Refs #467

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
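For orientation, the forwarding step mentioned in the commit message could look roughly like the fragment below in Gradle's Kotlin DSL. This is a sketch, not the PR's actual diff: only the property key is taken from the text above, and it assumes the flag reaches Gradle as a JVM system property and that "false" is the right default when it is absent.

```kotlin
// build.gradle.kts (sketch, not the PR's actual change)
tasks.withType<Test>().configureEach {
    // Assumption: the flag is supplied to the Gradle JVM as a system property.
    // Tests run in a forked JVM, which does not inherit it automatically.
    val forceParseEngine = System.getProperty("systemd.unit.grammarParseEngine") ?: "false"

    // Hand the value to the forked test JVM so the code under test can read it.
    systemProperty("systemd.unit.grammarParseEngine", forceParseEngine)
}
```

Under that assumption, a local run would select the new engine by passing the same -D flag to the Gradle invocation, and each CI matrix entry would do the same with its own value.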
Unit Test Results (grammar engine false): 1 159 tests, 1 159 ✅, 50s ⏱️. Results for commit 5b746be.
Unit Test Results (grammar engine true): 1 159 tests, 1 159 ✅, 49s ⏱️. Results for commit 5b746be.
What
Runs the whole unit-test suite against both validation engines, the original SyntacticMatch/SemanticMatch and the new list-of-successes parse(), so the new engine is exercised across every grammar before any of this is released, rather than trusted on a handful of e2e tests.
How
- GrammarOptionValue.FORCE_PARSE_ENGINE: a system property (-Dsystemd.unit.grammarParseEngine=true) forces validation onto the new engine regardless of the per-project experimental flag. Crucially, it forces only validation; the cosmetic annotators (coloring, key marker, deprecation, IPv6) stay on the user flag. Problem counts are therefore unchanged between engines, and only exact error spans/messages can differ. That keeps the parity comparison clean (otherwise the cosmetic annotators' INFORMATION highlights would break every assertSize).
- build.gradle.kts forwards the property to the forked test JVM, so a local run can select the new engine by setting the same property.
- CI (main.yml) gains a grammarParseEngine: [false, true] matrix dimension: both engines run in parallel, each publishing its own test results. (Mirror this on the Kubernetes jobs by duplicating with the -D flag.)
Result: strong parity signal
The entire suite passes under both engines. The only test needing engine-aware expectations is InvalidValueInspectionForCGroupSocketBind, where the new engine localizes two errors differently (and arguably better):
- ipv6::tcp → highlights :tcp (after consuming ipv6:) vs ::tcp
- ipv6:tcp:12--21485 → points at the out-of-range port -21485 vs the dash -
Counts and validity verdicts match everywhere else, i.e. the new engine reproduces the original's validation behavior across all grammars, differing only in where a couple of errors are underlined.
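To make the forcing rule and the two engine-aware expectations concrete, here is a minimal Kotlin sketch. It is not the plugin's code: every function name is hypothetical, and it assumes the per-project experimental flag is what selects the new engine when nothing is forced. Only the property key and the expected highlighted substrings come from the description above.

```kotlin
// Sketch only: all names here are hypothetical; the property key is from the PR text.
const val FORCE_PROPERTY = "systemd.unit.grammarParseEngine"

// Validation moves to the new engine if the user opted in OR the property forces it.
// (Assumption: the experimental flag selects the new engine when nothing is forced.)
fun validationUsesNewEngine(userFlag: Boolean, forced: Boolean): Boolean =
    userFlag || forced

// Cosmetic annotators ignore the forcing property and follow the user flag only,
// which is why problem counts stay the same under both engines.
fun cosmeticAnnotatorsUseNewEngine(userFlag: Boolean): Boolean = userFlag

// Expected highlighted text for the two inputs whose error span differs by engine.
fun expectedHighlight(input: String, newEngine: Boolean): String = when (input) {
    "ipv6::tcp" -> if (newEngine) ":tcp" else "::tcp"
    "ipv6:tcp:12--21485" -> if (newEngine) "-21485" else "-"
    else -> error("no engine-specific expectation for $input")
}

fun main() {
    // Read the switch the same way a test JVM started with -D... would see it.
    val forced = System.getProperty(FORCE_PROPERTY)?.toBoolean() ?: false
    val newEngine = validationUsesNewEngine(userFlag = false, forced = forced)

    println("validation on new engine: $newEngine")
    println("expected highlight for ipv6::tcp: ${expectedHighlight("ipv6::tcp", newEngine)}")
}
```

Keeping the engine choice in one boolean means a test can state both expectations side by side and stay valid in either CI matrix entry.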
Refs #467
🤖 Generated with Claude Code