fix(persona/PO-QA): tighten MET vs PARTIAL — runtime steps don't downgrade #237

Merged
jacsamell merged 1 commit into main from fix/poqa-partial-not-for-runtime-verification on May 25, 2026

Conversation

@jacsamell (Contributor) commented May 25, 2026

Summary

Anti-pattern Jacob caught: PO/QA grading an AC PARTIAL because "X is wired correctly but is a post-apply verification step from the test plan, not assertable from the diff".

That isn't a partial state — the code change is complete; only the runtime / deploy / manual step lives outside the diff. Calling it PARTIAL silently holds the auto-approve gate (per auto_approve.py: any PO/QA PARTIAL = hesitation = drop to COMMENT).
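
The gate rule cited above can be pictured with a minimal sketch. It is not the contents of auto_approve.py: `AcVerdict`, `ReviewAction` and `gate_action` are hypothetical names, and treating MISSED the same way as PARTIAL is an assumption made only to keep the example total.

```python
from enum import Enum


class AcVerdict(Enum):
    """Hypothetical PO/QA verdict for a single acceptance criterion."""
    MET = "MET"
    PARTIAL = "PARTIAL"
    MISSED = "MISSED"


class ReviewAction(Enum):
    """Hypothetical outcomes the gate can post on the PR."""
    APPROVE = "APPROVE"
    COMMENT = "COMMENT"


def gate_action(po_qa_verdicts: list[AcVerdict]) -> ReviewAction:
    """Any PARTIAL counts as hesitation and drops the review to COMMENT.

    Assumption: MISSED is handled like PARTIAL here; the real gate may
    treat it more strictly.
    """
    if any(v is not AcVerdict.MET for v in po_qa_verdicts):
        return ReviewAction.COMMENT
    return ReviewAction.APPROVE
```

Under this reading, one spurious PARTIAL among otherwise MET criteria is enough to withhold approval, which is why a mislabelled runtime-only caveat is costly.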

Change

Adds explicit anchor definitions to the PO/QA persona's grading bar:

  • MET = code wiring is right. Runtime / deploy / manual-verify steps that aren't in the diff DO NOT downgrade.
  • PARTIAL = the code itself has a concrete gap (handler missing the call; conditional misses an enum case; field not wired).
  • MISSED = code change doesn't address the AC at all.

Plus an explicit anti-pattern guard near the bottom: "If you find yourself writing 'wired correctly but not assertable from the diff' — that's MET with whatever caveat about test-plan steps lives in the body. Never PARTIAL."
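
The three anchors amount to a small decision procedure. The sketch below restates them as code purely for illustration; the PR itself only changes prompt text, and `AcAssessment` and `grade` are hypothetical names that do not exist in judge_personas.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcAssessment:
    """Hypothetical summary of what a grader observed for one AC."""
    addressed_in_diff: bool           # does the code change touch the AC at all?
    concrete_code_gap: bool           # e.g. handler missing the call, enum case missed
    runtime_steps_outside_diff: bool  # deploy / manual-verify steps from the test plan


def grade(assessment: AcAssessment) -> str:
    """Apply the MET / PARTIAL / MISSED anchors in order of severity."""
    if not assessment.addressed_in_diff:
        return "MISSED"
    if assessment.concrete_code_gap:
        return "PARTIAL"
    # Runtime steps outside the diff never downgrade the verdict;
    # they belong in the body as a caveat.
    return "MET"
```

Note that `runtime_steps_outside_diff` is deliberately never read by `grade`: it may shape the caveat text, but it has no influence on the verdict.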

Test plan

  • 506/506 tests pass (persona is a prompt string, no code behaviour change)
  • mypy clean
  • ruff clean
  • Scope: 1 file (judge_personas.py), one section edit

🤖 Generated with Claude Code

Summary

Updated the PO/QA persona grading prompt to establish explicit definitions for MET, PARTIAL, and MISSED acceptance criteria verdicts. The change clarifies that runtime, deployment, or manual-verification steps existing outside the code diff do not downgrade a verdict from MET to PARTIAL.

Key clarifications

  • MET: Code implementation is correct; runtime steps beyond the diff don't downgrade the verdict
  • PARTIAL: Concrete code-level gaps only (e.g., missing handler calls, incomplete field wiring, missing enum cases)
  • MISSED: AC unaddressed in code

The update includes an explicit anti-pattern guard: graders noting "wired correctly but not assertable from the diff" should mark MET with caveats in the comment, not PARTIAL. This prevents the gate from being incorrectly blocked when code changes are complete and only non-diff test-plan steps remain.

Change scope

Single file (judge_personas.py), 7 lines added. Persona text only; no runtime behaviour change.

… don't downgrade

Anti-pattern observed: PO/QA grading an AC PARTIAL because 'SSM
param population is wired correctly but is a post-apply
verification step from the test plan, not assertable from the
diff'.

That is not a partial state. The code change is complete; only the
runtime / deploy / manual step lives outside the diff. Calling it
PARTIAL silently holds the auto-approve gate for no real reason
(per the gate's existing rule: any PO/QA PARTIAL = hesitation =
drop to COMMENT, see auto_approve.py).

Tighten the persona's grading bar with explicit anchor definitions:
- MET = code wiring is right. Runtime / deploy / manual-verify
  steps that aren't in the diff DO NOT downgrade.
- PARTIAL = the code itself has a concrete gap (handler missing
  the call; conditional misses an enum case; field not wired).
- MISSED = code change doesn't address the AC at all.

Plus an explicit anti-pattern guard: 'If you find yourself writing
"wired correctly but not assertable from the diff" — that's MET
with a caveat in the body. Never PARTIAL.'

Tests: 506/506 still pass. mypy + ruff clean. Personas are prompt
strings — no behaviour change in code, just the prompt the PO/QA
judge sees.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@coderabbitai (Bot) commented May 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 87389fba-5ef4-4c26-84e1-744d95e177e5

📥 Commits

Reviewing files that changed from the base of the PR and between 62dbc6e and 924e673.

📒 Files selected for processing (1)
  • python/cube/core/judge_personas.py
📜 Recent review details
🔇 Additional comments (1)
python/cube/core/judge_personas.py (1)

365-370: LGTM!


Walkthrough

This PR updates the _PRODUCT_OWNER_QA prompt template in python/cube/core/judge_personas.py by adding 7 lines of detailed grading-bar definitions. The change introduces explicit criteria for MET, PARTIAL, and MISSED acceptance grades, clarifies how deploy and manual-verification steps should be treated during grading, and reinforces that grading should centre on code-change assertions rather than general reviewer caveats.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • aetheronhq/agent-cube#155: Both PRs update python/cube/core/judge_personas.py prompt text to tighten how judges grade findings based on verifiable, cited code facts (main PR via MET/PARTIAL/MISSED grading-bar rules; retrieved PR via "Verification discipline" downgrade-to-question requirements).
  • aetheronhq/agent-cube#152: Both PRs modify the judge-persona prompts in python/cube/core/judge_personas.py, specifically the Product Owner QA persona's grading/decision expectations for MET/PARTIAL/MISSED (within the larger 5-seat persona panel + AC/UX-capable persona overhaul).
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and specifically describes the main change: clarifying that runtime/deploy verification steps outside the diff should not cause a PARTIAL grade (downgrade) when the code change itself is complete. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


@the-agent-cube the-agent-cube Bot left a comment

🤖 Agent Cube Review

✅ All 5/5 judges approved this PR.

✅ Auto-approved by cube — 5/5 judges APPROVED, zero review blockers.

Per-judge:

  • judge_1: APPROVED
  • judge_2: APPROVED
  • judge_3: APPROVED
  • judge_4: APPROVED
  • judge_5: APPROVED

CI runs in parallel and continues to gate merge; cube does not race it for approval.
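
Read literally, the bot's message suggests approval requires every judge to approve and no review blockers to remain. A minimal sketch of that condition follows; `cube_auto_approves` and its parameters are hypothetical, and the actual logic in the cube codebase is not shown in this PR.

```python
def cube_auto_approves(judge_decisions: dict[str, str], review_blockers: int) -> bool:
    """Return True only when all judges approved and nothing blocks the review.

    Assumption: an empty panel never auto-approves.
    """
    if not judge_decisions:
        return False
    all_approved = all(d == "APPROVED" for d in judge_decisions.values())
    return all_approved and review_blockers == 0


# The panel reported in this review: five judges, all APPROVED, zero blockers.
panel = {f"judge_{i}": "APPROVED" for i in range(1, 6)}
approved = cube_auto_approves(panel, review_blockers=0)
```

CI is not part of this check; as the comment notes, it continues to gate the merge separately.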


🤖 Agent Cube Peer Review

@jacsamell merged commit 0853888 into main on May 25, 2026
2 checks passed
@jacsamell deleted the fix/poqa-partial-not-for-runtime-verification branch on May 25, 2026 17:48