fix(persona/PO-QA): tighten MET vs PARTIAL — runtime steps don't downgrade#237
… don't downgrade

Anti-pattern observed: PO/QA grading an AC PARTIAL because 'SSM param population is wired correctly but is a post-apply verification step from the test plan, not assertable from the diff'. That is not a partial state. The code change is complete; only the runtime / deploy / manual step lives outside the diff. Calling it PARTIAL silently holds the auto-approve gate for no real reason (per the gate's existing rule: any PO/QA PARTIAL = hesitation = drop to COMMENT, see auto_approve.py).

Tighten the persona's grading bar with explicit anchor definitions:

- MET = code wiring is right. Runtime / deploy / manual-verify steps that aren't in the diff DO NOT downgrade.
- PARTIAL = the code itself has a concrete gap (handler missing the call; conditional misses an enum case; field not wired).
- MISSED = code change doesn't address the AC at all.

Plus an explicit anti-pattern guard: 'If you find yourself writing "wired correctly but not assertable from the diff", that's MET with a caveat in the body. Never PARTIAL.'

Tests: 506/506 still pass. mypy + ruff clean.

Personas are prompt strings: no behaviour change in code, just the prompt the PO/QA judge sees.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
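To make the cost of a stray PARTIAL concrete, here is a minimal sketch of the gate rule as the message describes it. It is not the code in auto_approve.py: the names `Verdict` and `gate_decision` are hypothetical, and treating MISSED as blocking in the same way as PARTIAL is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    MET = "MET"
    PARTIAL = "PARTIAL"
    MISSED = "MISSED"


def gate_decision(po_qa_verdicts: list[Verdict]) -> str:
    """Hypothetical stand-in for the gate rule; not the real auto_approve.py logic."""
    # Stated rule: any PO/QA PARTIAL = hesitation = drop to COMMENT.
    # Assumption: MISSED blocks auto-approval as well.
    if all(v is Verdict.MET for v in po_qa_verdicts):
        return "APPROVE"
    return "COMMENT"


# One PARTIAL among otherwise MET verdicts is enough to hold the gate.
print(gate_decision([Verdict.MET, Verdict.MET]))
print(gate_decision([Verdict.MET, Verdict.PARTIAL]))
```

Under this sketch a single mis-graded AC changes the outcome for the whole PR, which is why the grading bar matters.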
No actionable comments were generated in the recent review.

Run configuration: Configuration used: Organization UI. Review profile: CHILL. Plan: Pro.
Files selected for processing: 1
Estimated code review effort: 2 (Simple), ~8 minutes
Pre-merge checks: 3 passed
Warning: review ran into problems. Git: Failed to clone repository.
🤖 Agent Cube Review
✅ All 5/5 judges approved this PR.
✅ Auto-approved by cube — 5/5 judges APPROVED, zero review blockers.
Per-judge:
- judge_1: APPROVED
- judge_2: APPROVED
- judge_3: APPROVED
- judge_4: APPROVED
- judge_5: APPROVED
CI runs in parallel and continues to gate merge; cube does not race it for approval.
🤖 Agent Cube Peer Review
Summary
Anti-pattern Jacob caught: PO/QA grading an AC PARTIAL because "X is wired correctly but is a post-apply verification step from the test plan, not assertable from the diff".

That isn't a partial state: the code change is complete; only the runtime / deploy / manual step lives outside the diff. Calling it PARTIAL silently holds the auto-approve gate (per auto_approve.py: any PO/QA PARTIAL = hesitation = drop to COMMENT).

Change
Adds explicit anchor definitions to the PO/QA persona's grading bar:
- MET = code wiring is right. Runtime / deploy / manual-verify steps that aren't in the diff DO NOT downgrade.
- PARTIAL = the code itself has a concrete gap (handler missing the call; conditional misses an enum case; field not wired).
- MISSED = code change doesn't address the AC at all.

Plus an explicit anti-pattern guard near the bottom: "If you find yourself writing 'wired correctly but not assertable from the diff', that's MET with whatever caveat about test-plan steps lives in the body. Never PARTIAL."
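The anchors above can be read as a small decision procedure. The sketch below is only an illustration of that reading: the persona is a prompt string, not code, and `AcFinding` and `grade` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class AcFinding:
    """Hypothetical record of what a grader found for one acceptance criterion."""
    addresses_ac: bool  # does the diff address the AC at all?
    code_gaps: list[str] = field(default_factory=list)      # concrete gaps in the code itself
    runtime_steps: list[str] = field(default_factory=list)  # deploy / manual steps outside the diff


def grade(finding: AcFinding) -> tuple[str, list[str]]:
    """Return (verdict, caveats) following the anchor definitions."""
    if not finding.addresses_ac:
        return "MISSED", []
    if finding.code_gaps:
        return "PARTIAL", []
    # Runtime-only steps never downgrade; they travel as caveats in the body.
    return "MET", finding.runtime_steps


verdict, caveats = grade(
    AcFinding(addresses_ac=True, runtime_steps=["verify SSM param after apply"])
)
print(verdict, caveats)
```

The point of the ordering is that `runtime_steps` is only ever consulted after the verdict is already MET, so it cannot influence the grade.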
Test plan
🤖 Generated with Claude Code
Summary
Updated the PO/QA persona grading prompt to establish explicit definitions for the MET, PARTIAL, and MISSED acceptance criteria verdicts. The change clarifies that runtime, deployment, or manual-verification steps existing outside the code diff do not downgrade a verdict from MET to PARTIAL.

Key clarifications
- MET: Code implementation is correct; runtime steps beyond the diff don't downgrade the verdict
- PARTIAL: Concrete code-level gaps only (e.g., missing handler calls, incomplete field wiring, missing enum cases)
- MISSED: AC unaddressed in code

The update includes an explicit anti-pattern guard: graders noting "wired correctly but not assertable from the diff" should mark MET with caveats in the comment, not PARTIAL. This prevents the gate from being incorrectly blocked when code changes are complete and only non-diff test-plan steps remain.

Change scope
Single file (judge_personas.py), 7 lines added. Persona text only; no runtime behaviour change.
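Because the change is prompt text only, one way to keep it from regressing silently would be a check that the anchor phrases stay in the persona string. This is a sketch under assumptions: `PERSONA_EXCERPT` is a stand-in built from the definitions quoted in this PR, not the actual contents of judge_personas.py, and `missing_anchors` is a hypothetical helper.

```python
# Stand-in excerpt; the real prompt string lives in judge_personas.py and may differ.
PERSONA_EXCERPT = """\
MET = code wiring is right. Runtime / deploy / manual-verify steps that aren't in the diff DO NOT downgrade.
PARTIAL = the code itself has a concrete gap.
MISSED = code change doesn't address the AC at all.
"""

# Phrases the grading bar depends on (chosen for illustration).
REQUIRED_ANCHORS = ("MET =", "PARTIAL =", "MISSED =", "DO NOT downgrade")


def missing_anchors(prompt: str) -> list[str]:
    """Return the required anchor phrases that are absent from the prompt."""
    return [anchor for anchor in REQUIRED_ANCHORS if anchor not in prompt]


print(missing_anchors(PERSONA_EXCERPT))
```

A test suite could assert that this list is empty for the real persona string.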