fix(pr-review): eliminate stale legacy worktree + signal HEAD to resumed judges by jacsamell · Pull Request #176 · aetheronhq/agent-cube

jacsamell · 2026-05-19T12:55:38Z

Symptom

Even after #175 landed (--force fetch), cube kept posting the same 10+ findings against pre-rewrite code on aetheronhq/aetheron-connect-v2#1266. Roy's diagnosis: cube is anchored on diff line numbers from an earlier commit.

Two more bugs surfaced

1. Duplicate worktrees, only one synced

PR reviews historically used `~/.cube/worktrees//pr-/` (legacy, created by `_prefetch_worktrees`). PR #172 added a second one at `pr-review-/` for the new file-routing flow. Both existed in parallel; only the new one had #175's --force + HEAD verification.

Worse: `build_peer_review_prompt` told judges to Read from the LEGACY path while `_apply_file_scope` pointed them at the NEW path. Same prompt, two locations, judges sometimes picked the stale one.

Confirmed live: both worktrees existed on PR #1266; the legacy one was at fda636a3 (6 commits behind the actual head 62e9fbd1).

2. Resumed judges had stale session memory

Sessions persist across panel runs (token-efficient). On the next run, the LLM continues its prior conversation including its prior reasoning about file contents and line numbers. Even when the resumed judge Reads fresh files, its reasoning anchors on the old memory. The previous "RE-REVIEW" directive was too soft to displace stale memory.

Fixes

Skip `_prefetch_worktrees` in PR-review mode when a dedicated `review_worktree` was passed. The new sync is authoritative.
Pipe `review_worktree` through `build_peer_review_prompt` so the prompt's worktree path matches the file-scope block. One location.
Replace the meek "verify commit" hint with a hard "CURRENT HEAD" callout: names the SHA, says "your prior reasoning may be out of date", lists 4 explicit steps (re-Read fresh, verify each prior finding still applies, add new ones, drop the rest).

Test plan

Force-push a PR, re-run `cube pr-review ` (no `--fresh`) — judges receive the HEAD callout, re-raise findings only if they're still real
Only one worktree exists per PR review (no legacy `pr-/` duplicate created)
Prompt's "Code Location" path matches `_apply_file_scope`'s path

🤖 Generated with Claude Code

Context

Judges posted stale findings after PRs were force-pushed. Two root causes: (1) duplicate worktrees — legacy pr-/ worktrees could remain unsynced while pr-review-/ (the new path) received --force updates, and prompts pointed judges to the stale legacy path; (2) resumed judges retained stale session memory across panel runs so prior reasoning could override fresh reads.

Changes

python/cube/automation/judge_panel.py

When a dedicated review_worktree is provided for PR-review runs (with a LOCAL: winner), skip the legacy _prefetch_worktrees sync path.
Compute HEAD SHA from the provided worktree (git rev-parse HEAD) and pass review_worktree into prompt construction so prompt paths match the file-scope used by _apply_file_scope.

python/cube/automation/prompts.py

Add worktree_path_override: Optional[Path] = None to build_peer_review_prompt(...).
Replace the previous soft "VERIFY COMMIT" hint with a hard "CURRENT HEAD" callout that names the SHA, warns prior reasoning may be out of date, and lists four explicit steps: re-read fresh files at HEAD, verify each prior finding still applies, add new findings, and drop invalid ones.
When worktree_path_override is given, instruct judges to read from that exact path.

python/cube/commands/peer_review.py

Restore the single per-PR worktree path (~/.cube/worktrees/<repo>/pr-<n>) instead of pr-review-<n>/, delete orphan pr-review-<n>/ directories and run git worktree prune in commits to ensure one canonical worktree per PR.

Impact

Eliminates stale duplicate-worktree behaviour by ensuring a single, caller-synced worktree receives the --force fetch and HEAD verification.
Prompts now point judges at the same worktree path used for file scoping.
Resumed judges are explicitly forced to re-evaluate prior findings against the current HEAD, reducing false positives from stale session memory.

Test plan

Force-push a PR and run cube pr-review <n> (without --fresh): judges should see the HEAD callout and only re-raise findings that still apply.
Confirm only one worktree exists per PR review (no legacy duplicate).
Confirm prompt "Code Location" path matches _apply_file_scope's path.

…med judges Symptom: cube panel kept posting findings against pre-rewrite code on aetheronhq/aetheron-connect-v2#1266 even after PR #175's --force fetch landed. Same 10+ findings against lines/tables that no longer exist. Two more layers of the bug surfaced: (1) Duplicate worktrees, only one synced. PR reviews historically used `~/.cube/worktrees/<project>/pr-<n>/` (legacy, created by `_prefetch_worktrees`). PR #172 added a second one at `pr-review-<n>/` for the new file-routing flow. Both worktrees existed in parallel; only the new one had the --force fetch + HEAD verification from #175. The other stayed pinned to whatever it was last synced to. Worse: `build_peer_review_prompt` told judges to Read from the LEGACY path, while `_apply_file_scope` told them to Read from the NEW path. Same prompt, two locations. Confirmed live: both worktrees existed on PR #1266; the legacy one was stuck at fda636a3 (6 commits behind 62e9fbd1). (2) Resumed judges had stale session memory. Sessions persist across panel runs. On the next run, the LLM continues its prior conversation — including its prior reasoning about file contents, line numbers, function shapes. Even when the resumed judge Reads fresh files, its reasoning anchors on the old memory. The previous "RE-REVIEW" hint was too soft to displace it. Fixes: * Skip `_prefetch_worktrees` in PR-review mode when a dedicated `review_worktree` was passed. The new sync is authoritative; the legacy sync just made a second stale copy. * Pipe `review_worktree` through `build_peer_review_prompt` so the prompt's worktree path matches the file-scope block. One location. * Replace the meek "verify commit" hint with a hard "CURRENT HEAD" callout that names the SHA, says "your prior reasoning may be out of date", and lists 4 explicit steps: Re-Read fresh, verify each prior finding still applies, add new ones, drop the rest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai · 2026-05-19T12:58:07Z

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 664d17dd-5365-4a5e-b984-fa3dc3dd6fb1

📥 Commits

Reviewing files that changed from the base of the PR and between 621edf2 and 5845ca1.

📒 Files selected for processing (1)

python/cube/commands/peer_review.py

📜 Recent review details

🔇 Additional comments (1)

python/cube/commands/peer_review.py (1)

230-235: LGTM!

Walkthrough

The PR updates judge-panel PR-review logic to use a caller-provided dedicated review worktree instead of relying on potentially stale synced duplicates. It derives the PR HEAD SHA from that worktree directly, standardises the per-PR worktree path, and routes the path into peer-review prompts which instruct judges to re-read all code at the current HEAD before reviewing.

Changes

Dedicated review worktree integration

Layer / File(s)	Summary
Prompt parameter and current HEAD verification `python/cube/automation/prompts.py`	`build_peer_review_prompt` signature extended with optional `worktree_path_override` parameter; docstring updated to describe how the override guides judges to the correct worktree path; prompt text replaces prior commit-verification block with expanded "CURRENT HEAD" section that instructs re-reading all files at current HEAD, removing invalidated findings, and running a `git rev-parse HEAD` sanity check.
Worktree resolution and prompt wiring `python/cube/automation/judge_panel.py`, `python/cube/commands/peer_review.py`	`launch_judge_panel` now conditionally handles PR review: when `review_type != "ui-review"` and a `review_worktree` is provided alongside a `LOCAL:` form winner, it derives the PR HEAD SHA by running `git rev-parse HEAD` in that worktree, bypassing the legacy `_prefetch_worktrees` path; `_ensure_pr_review_worktree` now reuses the unified `pr-<pr_number>` per-repo worktree directory. The derived worktree path is routed to `build_peer_review_prompt` via the new `worktree_path_override` argument, ensuring judges use the caller-provided worktree.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

aetheronhq/agent-cube#175: Both PRs modify PR peer-review worktree setup and address stale worktree HEAD correctness.
aetheronhq/agent-cube#172: Both PRs change judge-panel and prompt flows to accept a caller-provided review_worktree and adjust prompt instructions accordingly.
aetheronhq/agent-cube#155: Related changes to peer/judge review prompt verification discipline and reading code facts from the correct HEAD.

Poem

🐰 I hopped to the worktree that you gave,
I sniffed the HEAD, I read each brave save,
"Re-read the files!" I softly plea,
Fresh code, fresh findings — that's the key.
🥕✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly addresses the primary fix: eliminating stale legacy worktrees and adding HEAD signalling to judges, matching the core objectives of consolidating duplicate worktrees and improving judge context.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

python/cube/automation/judge_panel.py (1)

518-518: ⚡ Quick win

Move subprocess import to module level.

Importing inside a conditional block is non-idiomatic Python. Module-level imports improve readability and make dependencies easier to track.

♻️ Proposed refactor

At the top of the file, after the existing imports (around line 3):

 import asyncio
+import subprocess
 from datetime import datetime

Then remove line 518:

         if is_pr_review_with_dedicated_worktree:
             # Compute SHA from the worktree the caller synced.
-            import subprocess

             sha_result = subprocess.run(

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cube/automation/judge_panel.py` at line 518, Move the local "import
subprocess" out of the conditional and add it to the module-level imports in
python/cube/automation/judge_panel.py alongside the other top imports, then
delete the in-function/conditional import site (the inline "import subprocess"
currently present) so the module consistently uses the top-level subprocess
symbol; run linting/tests to ensure no circular import or side-effect issues
after the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/cube/automation/judge_panel.py`:
- Around line 520-527: The code silently sets worktree_head_sha to None when the
git call fails; change this to fail-fast or at least emit a clear warning: after
subprocess.run(...) check sha_result.returncode and if non-zero either raise an
exception (so the caller of the function in judge_panel.py fails fast) or call
the module logger to log an explicit error/warning including sha_result.stderr
and the review_worktree path, instead of continuing with worktree_head_sha=None;
update any callers that expect a string accordingly so the HEAD verification
block (which uses worktree_head_sha) always receives a valid SHA or the flow is
aborted with an explanatory error.

---

Nitpick comments:
In `@python/cube/automation/judge_panel.py`:
- Line 518: Move the local "import subprocess" out of the conditional and add it
to the module-level imports in python/cube/automation/judge_panel.py alongside
the other top imports, then delete the in-function/conditional import site (the
inline "import subprocess" currently present) so the module consistently uses
the top-level subprocess symbol; run linting/tests to ensure no circular import
or side-effect issues after the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7e2077f8-bd70-4b16-a06e-1bf46588afd4

📥 Commits

Reviewing files that changed from the base of the PR and between 09bf3d4 and 621edf2.

📒 Files selected for processing (2)

python/cube/automation/judge_panel.py
python/cube/automation/prompts.py

📜 Review details

🔇 Additional comments (5)

python/cube/automation/prompts.py (4)

4-4: LGTM!

386-397: LGTM!

398-410: LGTM!

413-433: LGTM!

python/cube/automation/judge_panel.py (1)

601-608: LGTM!

coderabbitai · 2026-05-19T13:01:33Z

+            sha_result = subprocess.run(
+                ["git", "rev-parse", "HEAD"],
+                cwd=review_worktree,
+                capture_output=True,
+                text=True,
+                timeout=10,
+            )
+            worktree_head_sha = sha_result.stdout.strip() if sha_result.returncode == 0 else None


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Silent failure defeats the HEAD verification fix.

If git rev-parse HEAD fails (e.g., corrupted worktree, not a git repository), worktree_head_sha is set to None, converted to "" at line 606, and the entire verification block is skipped in the prompt. Judges don't receive the explicit "CURRENT HEAD" instructions to re-read all files, allowing stale session memory to persist — one of the two bugs this PR aims to fix.

Raise an error when SHA extraction fails, or at minimum log a warning.

🛡️ Proposed fix to fail fast when SHA extraction fails

sha_result = subprocess.run( ["git", "rev-parse", "HEAD"], cwd=review_worktree, capture_output=True, text=True, timeout=10, ) - worktree_head_sha = sha_result.stdout.strip() if sha_result.returncode == 0 else None + if sha_result.returncode != 0: + raise RuntimeError( + f"Failed to extract HEAD SHA from review worktree {review_worktree}. " + f"Cannot verify code freshness. stderr: {sha_result.stderr}" + ) + worktree_head_sha = sha_result.stdout.strip()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/cube/automation/judge_panel.py` around lines 520 - 527, The code silently sets worktree_head_sha to None when the git call fails; change this to fail-fast or at least emit a clear warning: after subprocess.run(...) check sha_result.returncode and if non-zero either raise an exception (so the caller of the function in judge_panel.py fails fast) or call the module logger to log an explicit error/warning including sha_result.stderr and the review_worktree path, instead of continuing with worktree_head_sha=None; update any callers that expect a string accordingly so the HEAD verification block (which uses worktree_head_sha) always receives a valid SHA or the flow is aborted with an explanatory error.

…<n>/ PR #172 invented `pr-review-<n>/` as a parallel worktree to hold the file-routing flow's check-out. The original `pr-<n>/` (managed by `_prefetch_worktrees` for the cube panel + `_get_cli_review_worktrees` for CodeRabbit) kept existing alongside it. Result: two worktrees per PR, only one kept synced by the new code, prompts pointing at the wrong one, judges Reading stale content. Reverting the path so the file-routing flow reuses `pr-<n>/`. Combined with the #176 skip-`_prefetch_worktrees`-when-already-synced logic, this gives one worktree per PR with the new --force fetch + HEAD verification applied to it. CodeRabbit and cube judges both read from the same place. `pr-review-<n>/` orphan dirs deleted from the local cache; `git worktree prune` cleaned the stale refs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

jacsamell merged commit 7740fa4 into main May 19, 2026
0 of 2 checks passed

jacsamell deleted the fix/pr-review-stale-worktree-and-head-signal branch May 19, 2026 13:20

coderabbitai Bot mentioned this pull request May 19, 2026

fix: judge cwd + shared cube state across worktrees (rebase of #182+#183) #184

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(pr-review): eliminate stale legacy worktree + signal HEAD to resumed judges#176

fix(pr-review): eliminate stale legacy worktree + signal HEAD to resumed judges#176
jacsamell merged 2 commits into
mainfrom
fix/pr-review-stale-worktree-and-head-signal

jacsamell commented May 19, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 19, 2026 •

edited

Loading

Review failed

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jacsamell commented May 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Symptom

Two more bugs surfaced

1. Duplicate worktrees, only one synced

2. Resumed judges had stale session memory

Fixes

Test plan

Context

Changes

Impact

Test plan

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jacsamell commented May 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 19, 2026 •

edited

Loading