feat(compat): runtime SDK↔backend version guard at ACP startup #408
Complements the build-time cross-version compat tests (#407): on ACP/worker startup, read the backend's reported contract version (/openapi.json info.version) and fail fast with an actionable error if the backend is older than MIN_BACKEND_CONTRACT, instead of the mismatch surfacing later as opaque 500s / missing-field errors (the agentex-sdk 0.13 friction).
- agentex/lib/core/compat/version_guard.py: assert_backend_compatible() + MIN_BACKEND_CONTRACT (kept in sync with tests/compat min-supported) + AGENTEX_SKIP_VERSION_CHECK escape hatch; warns (no crash) on unreachable/unknown.
- wired into BaseACPServer lifespan (runs before register_agent when AGENTEX_BASE_URL set).
- unit tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
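The decision flow described in this commit can be sketched roughly as follows. This is a simplified illustration, not the code in version_guard.py: `check_backend`, `_stable_key`, the return strings, and the error wording are hypothetical, the floor value is the 0.1.0 seed mentioned in the PR description, and only stable MAJOR.MINOR.PATCH versions are handled here.

```python
import logging
import os
from typing import Optional, Tuple

logger = logging.getLogger(__name__)

# Assumed floor for illustration (the PR description says it is seeded at 0.1.0).
MIN_BACKEND_CONTRACT = "0.1.0"


class IncompatibleBackendError(RuntimeError):
    """Raised when the backend reports a contract older than the floor."""


def _stable_key(version: str) -> Optional[Tuple[int, int, int]]:
    # Simplification: stable MAJOR.MINOR.PATCH only; anything else is "unknown".
    parts = version.strip().split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return None
    return (int(parts[0]), int(parts[1]), int(parts[2]))


def check_backend(reported: Optional[str]) -> str:
    """Return 'skipped', 'unknown', or 'ok'; raise if the backend is too old."""
    flag = os.environ.get("AGENTEX_SKIP_VERSION_CHECK", "").strip().lower()
    if flag in ("1", "true", "yes", "on"):
        return "skipped"  # escape hatch: never block startup
    key = _stable_key(reported) if reported is not None else None
    floor = _stable_key(MIN_BACKEND_CONTRACT)
    if key is None or floor is None:
        # Unreachable or unparseable version: warn and proceed.
        logger.warning("backend contract version unknown (%r); proceeding", reported)
        return "unknown"
    if key < floor:
        raise IncompatibleBackendError(
            f"backend contract {reported} is older than the required "
            f"{MIN_BACKEND_CONTRACT}"
        )
    return "ok"
```

Only the "older than the floor" case raises; every uncertain case degrades to a warning so that a transient outage cannot block startup.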
SemVer §11: a prerelease precedes its stable release (0.1.0-rc.1 < 0.1.0). The old _parse dropped the suffix, so a release-candidate backend compared equal to a stable floor and slipped past the guard even though it may lack the final contract. Parse the prerelease and compare via a SemVer precedence key; a prerelease of a higher version (0.2.0-rc.1) still clears a 0.1.0 floor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stable vs prerelease key branches returned different tuple shapes ((maj,min,patch,(1,)) vs (...,(0,list))), so pyright couldn't prove < was defined on the union. Make the 4th element a uniform (rank, identifiers) pair — stable rank 1 > prerelease rank 0 — keeping the ordering identical. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
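A precedence key with a uniform shape, as the two commits above describe, might look like the sketch below. The function name `precedence_key`, the regex, and the triple encoding of prerelease identifiers are assumptions made for illustration; the PR's actual `_parse` may differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: MAJOR.MINOR.PATCH with an optional -prerelease suffix.
_SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?")

# A prerelease identifier is encoded as (kind, number, text):
# numeric -> (0, n, ""), alphanumeric -> (1, 0, s). Numeric identifiers
# therefore sort below alphanumeric ones, as SemVer §11 requires.
Ident = Tuple[int, int, str]
Key = Tuple[int, int, int, Tuple[int, Tuple[Ident, ...]]]


def precedence_key(version: str) -> Optional[Key]:
    """Return a sortable key, or None if the string is not a version."""
    m = _SEMVER.fullmatch(version.strip())
    if m is None:
        return None
    major, minor, patch = int(m[1]), int(m[2]), int(m[3])
    if m[4] is None:
        # Stable release: rank 1, no identifiers.
        return (major, minor, patch, (1, ()))
    idents = tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in m[4].split(".")
    )
    # Prerelease: rank 0, so it sorts below the stable release of the same core.
    return (major, minor, patch, (0, idents))
```

Because the fourth element is always a `(rank, identifiers)` pair, both branches return the same tuple type, and plain tuple comparison gives the ordering: core version first, then stable above prerelease, then identifiers left to right.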
Async/Temporal agents run a separate worker process that never goes through the ACP server lifespan, so the guard there wouldn't cover them. Wire it into AgentexWorker._register_agent (same AGENTEX_BASE_URL gate, before register_agent). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
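The ordering this commit relies on (same AGENTEX_BASE_URL gate, guard before registration) can be illustrated with a small sketch. `start_with_guard` and its two callable parameters are hypothetical stand-ins for the real worker and server hooks, which are not shown in this thread.

```python
import asyncio
import os
from typing import Awaitable, Callable


async def start_with_guard(
    guard: Callable[[str], Awaitable[None]],
    register: Callable[[], Awaitable[None]],
) -> None:
    """Run the version guard, then registration, only if a backend is configured."""
    base_url = os.environ.get("AGENTEX_BASE_URL")
    if not base_url:
        return  # no backend configured: skip both the guard and registration
    await guard(base_url)  # may raise; registration then never runs
    await register()
```

Passing the guard and the registration step in as callables keeps the ordering testable without a live backend: a stub guard that raises should leave the registration stub uncalled.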
max-parke-scale
left a comment
Re-reviewed at 5bbb9356 — worker-startup wiring and prerelease ordering both handled now, nice. Verified the version axis works: scale-agentex main serves info.version from _version.py (#321), so it's self-describing and moves with releases. One inline ask left.
🧑💻🤖 — posted via Claude Code
return os.environ.get(name, "").strip().lower() in ("1", "true", "yes", "on")

async def fetch_backend_version(base_url: str, *, timeout: float = 5.0) -> str | None:
Nothing tests this — every test mocks fetch_backend_version out. Worth a respx test for the parse paths (missing info, missing version, non-2xx, non-JSON → all should degrade to None).
🧑💻🤖 — posted via Claude Code
Addressed in 4d9e2c2. fetch_backend_version is now exercised for real via httpx.MockTransport (the function actually runs — request build, status check, JSON parse — just no network):
- success + asserts URL is …/openapi.json and method GET
- missing info.version, info absent, info: null → None
- HTTP 404 / 503 → None
- non-JSON body → None
- httpx.ConnectError → None
Plus end-to-end assert_backend_compatible through the real fetch (old → raises, new → passes, unreachable → proceeds). Writing these caught a real bug in the first test helper (it recursed infinitely), which the mock-everything tests would never have surfaced.
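The "degrade to None" parse paths listed above can be sketched with the standard library alone. This is an illustration of the response-handling step only, under the assumption that it receives a status code and a raw body; `version_from_response` is a hypothetical helper, not the PR's `fetch_backend_version`, and connection errors (which occur before any response exists) are outside its scope.

```python
import json
from typing import Optional


def version_from_response(status_code: int, body: bytes) -> Optional[str]:
    """Extract info.version from an /openapi.json response, or None if unavailable."""
    if not 200 <= status_code < 300:
        return None  # non-2xx: treat the version as unknown
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # non-JSON body (JSONDecodeError is a ValueError)
    info = payload.get("info") if isinstance(payload, dict) else None
    version = info.get("version") if isinstance(info, dict) else None
    # Only a non-empty string counts as a reported version.
    return version if isinstance(version, str) and version else None
```

Each failure mode returns None rather than raising, which matches the guard's stated policy of warning and proceeding when the version is unknown.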
…tests
- Anchor _VERSION_RE at both ends so a malformed tail (0.1.0rc1, 0.1.0foo,
0.1.0.1) is rejected to None ('unknown, proceed') instead of silently
parsing as stable 0.1.0 and satisfying MIN_BACKEND_CONTRACT.
- Test fetch_backend_version for real via httpx.MockTransport (success/URL,
missing version, missing/null info, 404/503, non-JSON, connection error)
plus end-to-end assert_backend_compatible through the real fetch.
- Test the regex anchors explicitly (leading/trailing junk rejected;
whitespace + leading v permitted).
- Test AgentexWorker._register_agent wiring: guard runs before register_agent,
incompatible backend blocks registration, no AGENTEX_BASE_URL skips both.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
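The effect of anchoring the pattern at both ends can be shown with a small sketch. The pattern below is an assumption written to match the behaviour the commit message describes (leading v and surrounding whitespace permitted, any other leading or trailing junk rejected); the PR's actual `_VERSION_RE` is not shown here, and `parse_version` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: anchored with ^ and $ so nothing may trail the version
# except whitespace; an optional leading "v" is tolerated.
_VERSION_RE = re.compile(
    r"^\s*v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?\s*$"
)


def parse_version(raw: str) -> Optional[Tuple[int, int, int, Optional[str]]]:
    """Return (major, minor, patch, prerelease) or None for malformed input."""
    m = _VERSION_RE.match(raw)
    if m is None:
        return None  # malformed: the caller treats this as "unknown, proceed"
    return (int(m[1]), int(m[2]), int(m[3]), m[4])
```

Without the trailing anchor, a string such as 0.1.0rc1 would match its 0.1.0 prefix and be read as a stable release; with it, the whole string must be a version or the result is None.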
What
A runtime SDK↔backend contract-version guard. On ACP/worker startup it reads the backend's reported contract version (/openapi.json info.version) and fails fast with an actionable error if the backend is older than this SDK supports, instead of the mismatch surfacing later as opaque 500s / missing-field errors deep in a request.
Why
This is the deploy-time complement to the build-time cross-version compat tests (#407). They cover different moments: the #407 compat tests run at build time, and this guard runs at startup.
Same source of truth: the supported window (min-supported..current); #321 provides the version the backend reports. It directly addresses the agentex-sdk 0.13 friction (e.g. Cengage): a client on an older backend got late, opaque failures (dropped task_id/agent_id → 500s) with no startup signal. This turns that into one actionable line at boot, and covers the whole "SDK needs a newer backend" class, not just this one break.
How
agentex/lib/core/compat/version_guard.py
- assert_backend_compatible(base_url): fetch /openapi.json info.version, compare to MIN_BACKEND_CONTRACT via a SemVer §11 precedence key (a prerelease like 0.1.0-rc.1 correctly sorts below the stable 0.1.0 floor), raise IncompatibleBackendError if older.
- MIN_BACKEND_CONTRACT: kept in sync with tests/compat min-supported (test(compat): cross-version request-compatibility against supported server contracts #407); version axis from #321 tags.
- AGENTEX_SKIP_VERSION_CHECK=1 escape hatch; warns (does not crash) on unreachable/unknown/unparseable version (a transient blip or contract-less server shouldn't kill startup).
Runs before register_agent and gated on AGENTEX_BASE_URL, so a bad pairing fails startup with a clear message rather than serving broken traffic:
- BaseACPServer lifespan (sync + async ACP servers).
- AgentexWorker._register_agent (Temporal worker; it never goes through the ACP lifespan, so it needs its own guard).
Unit tests (tests/test_version_guard.py): parse, prerelease precedence, compatible-passes, incompatible-raises, prerelease-below-stable-floor-raises, skip-env, unknown-version-no-crash, no-base-url-noop. ✅ 8 passing.
Open / follow-ups (draft)
- … MIN_BACKEND_CONTRACT to a real #321 release tag once those land (today it's seeded at 0.1.0, mirroring tests/compat min-supported); ideally derive it from the same manifest so the two can't drift. Interim: a drift-lock test asserting MIN_BACKEND_CONTRACT == min-supported.yaml info.version.
- … current) as a soft warning.
- … _server_compat.py (per-response header check). This guard is the boot-time fail-fast via /openapi.json (works today, before traffic); feat(adk): warn when the agentex server reports an unsupported version #410 is the per-response soft warning (dormant until a server sets x-agentex-version). They're complementary; reconcile before un-drafting.
Draft for design review: happy to adjust the wiring (e.g. derive MIN_BACKEND_CONTRACT from tests/compat/server_specs/manifest.json) before un-drafting.
🤖 Generated with Claude Code
Greptile Summary
- … /openapi.json.
- … MIN_BACKEND_CONTRACT with prerelease-aware SemVer ordering.
Confidence Score: 5/5
The compatibility guard is narrowly scoped to startup checks and is covered across the direct guard behavior plus ACP/worker wiring.
No issues were identified in the reviewed changes; tests cover the expected version comparison, bypass, warning, and startup integration paths.
Reviews (3): Last reviewed commit: "fix(compat): anchor version regex + add ..."