Skip to content

feat(panel): per-judge file routing for PR reviews (~3-5x token savings)#172

Merged
jacsamell merged 3 commits into
mainfrom
feat/per-judge-file-routing
May 19, 2026
Merged

feat(panel): per-judge file routing for PR reviews (~3-5x token savings)#172
jacsamell merged 3 commits into
mainfrom
feat/per-judge-file-routing

Conversation

@jacsamell
Copy link
Copy Markdown
Contributor

@jacsamell jacsamell commented May 19, 2026

Summary

The 5-seat panel currently makes every judge reason over every file in the PR. For mid-sized PRs that's mostly wasted tokens — the Frontend judge doesn't need to see SQL migrations, the Backend judge doesn't need to read `.tsx`. Route files to the judges whose lens they fall in; let cross-cutting judges (PO/QA, Principal, Security) keep seeing everything because their scope genuinely spans the PR.

What landed

  • automation/file_routing.py — pure-heuristic classifier (no AI calls). Files get tagged with one or more lenses (frontend / backend / infra / other). `files_for_persona(persona, files)` returns the subset that persona should focus on. Unknown personas safely fall back to all files.
  • PR review worktree — `_ensure_pr_review_worktree(pr, head, base)` syncs `~/.cube/worktrees//pr-review-/` to `origin/` and enumerates touched files via `git diff --name-only`. Reused across panel runs.
  • JudgeInfo extension — carries scoped files, worktree path, total PR file count.
  • `_apply_file_scope()` prepends a "Files in your lens (N of M)" section + worktree path to each judge's prompt, instructing them to use Read on absolute paths instead of pulling the whole PR diff.
  • Empty-lens guard — Backend judge on a frontend-only PR (or vice versa) is dropped from the panel entirely. The gate already tolerates absent decisions.
  • Cross-cutting routing stays loose — Security sees everything because security findings live in client code too (XSS, exposed secrets in browser code, auth bypasses in frontend routing).
  • 33 unit tests pin classifier behaviour so the routing rules don't silently drift.

Expected token savings

  • Mid-sized PR (12 backend / 4 frontend / 2 infra files):
    • Backend judge: same 14 files (was 18 before; .tsx + .css dropped)
    • Frontend judge: 4 files (was 18) — ~78% less input on this seat
    • PO/QA, Principal, Security: unchanged (still see all 18)
  • Frontend-only PR: Backend judge dropped entirely (one full seat saved)
  • Backend-only PR: Frontend judge dropped entirely (one full seat saved)

Risk

  • Heuristic could mis-route (e.g. a `.ts` file in `src/` not under `components/` defaults to backend). Test suite locks down the cases we care about; new patterns are easy to add.
  • Cross-cutting personas (PO/QA, Principal, Security) are intentionally unrouted — they still see everything. No behaviour change for them.
  • Personas not in the map fall back to "all files" — custom org personas are safe.

Test plan

  • Run `cube pr-review` on a mid-sized PR; confirm Backend judge prompt does NOT list `.tsx` files
  • Confirm Frontend judge prompt lists only `.tsx` / `.css` / frontend-dir files
  • Confirm PO/QA / Principal / Security prompts still list all PR files
  • Run on a docs-only PR; confirm Backend + Frontend judges are SKIPPED, only cross-cutting judges run
  • Confirm `~/.cube/worktrees//pr-review-/` exists and is checked out to `origin/` after the run

🤖 Generated with Claude Code

Per‑Judge File Routing for PR Reviews — Updated Summary

Overview

  • Routes PR files to panel judges using deterministic folder‑prefix heuristics so each seat sees only files in their lens (frontend / backend / shared / infra / other). Cross‑cutting and unknown personas still see all files. Judges with zero scoped files are dropped (empty‑lens guard), reducing tokens for scoped seats and enabling whole‑seat savings when a judge is skipped.

Key changes

  • python/cube/automation/file_routing.py

    • New folder‑prefix rule set (_FOLDER_RULES) with first‑match wins; replaced previous extension‑based logic.
    • classify(path) normalises paths (POSIX separators, lowercasing, removes leading "./") and returns a single lens via FileClassification(lens).
    • Adds LENSES = ("frontend","backend","shared","infra","other"), PERSONA_LENS_MAP mapping personas → frozenset(lenses) or None (cross‑cutting).
    • files_for_persona(persona, files) returns a deterministic, sorted subset; lens_summary(files) returns per‑lens counts.
    • Includes tests and a normalization fix (use removeprefix behavior, not lstrip).
  • python/cube/automation/judge_panel.py

    • Adds _apply_file_scope() to prepend a "FILES IN YOUR LENS (N of M)" block with scoped file list, review worktree path and read/diff instructions; emits a SKIPPED decision instruction for empty lenses.
    • launch_judge_panel(...) gains review_worktree and pr_files parameters, computes scoped_files per JudgeInfo, injects scoped_files/review_worktree/total_pr_files into JudgeInfo, and filters out seats whose lens yields zero files (preserving cross‑cutting/generic seats as intended).
  • python/cube/commands/peer_review.py

    • Adds _ensure_pr_review_worktree(pr_number, head_branch, base_branch) to fetch refs and reuse/create a worktree at ~/.cube/worktrees//pr-review- checked out at origin/, and enumerate PR‑touched files via git diff --name-only --diff-filter=AMR.
    • PR review flow now syncs the worktree up front, prints worktree path and routed counts (warns on enumeration failure), and passes review_worktree + pr_files into launch_judge_panel (falls back to full diff on failure).
  • python/cube/models/types.py

    • JudgeInfo extended with optional scoped_files: Optional[list[str]], review_worktree: Optional[Path], total_pr_files: Optional[int] for per‑judge context.

Testing

  • tests/core/test_file_routing.py: 33 unit tests covering classify(), files_for_persona(), lens_summary(); verifies folder rules, case‑insensitivity, leading ./ handling, first‑match precedence, persona filtering, empty‑lens behaviour, deterministic sorting and lens counts.
  • tests/cli/test_adapters.py: minor adjustments removing --full-auto from expected codex invocations.

Impact & risks

  • Expected substantial token savings for scoped judges (example: frontend judge input ~78% smaller in a mixed PR); entire seat savings when a judge is dropped.
  • Risk: heuristic mis‑routing possible. Mitigations: unit tests, clear first‑match rules, easily extendable prefix rules; cross‑cutting and unknown personas preserve previous behaviour.

Notes

  • Behavioural change: classification is now single‑lens per file (not multi‑lens) and routing is folder‑prefix based with deterministic ordering. Minor wording tweak avoids confusing "remaining 0 files" messaging when a lens covers the whole PR.

Review Change Stack

The 5-seat panel currently makes every judge reason over every file in
the PR. For mid-sized PRs that's mostly wasted tokens — the Frontend
judge doesn't need to see SQL migrations, the Backend judge doesn't
need to read `.tsx`. Route files to the judges whose lens they fall in;
let cross-cutting judges (PO/QA, Principal, Security) keep seeing
everything because their scope genuinely spans the PR.

What landed:

1. `automation/file_routing.py` — pure-heuristic classifier. Each file
   gets one or more lens tags (frontend / backend / infra / other).
   `files_for_persona(persona, files)` returns the subset a persona
   should focus on. Unknown personas safely fall back to all files.

2. PR review worktree. `_ensure_pr_review_worktree(pr, head, base)`
   syncs `~/.cube/worktrees/<project>/pr-review-<n>/` to `origin/<head>`
   and enumerates touched files via `git diff --name-only`. Reused
   across panel runs.

3. `JudgeInfo` carries the routing context — scoped files, worktree
   path, total PR file count.

4. `_apply_file_scope()` prepends a "Files in your lens (N of M)"
   section + worktree path to each judge's prompt, instructing them
   to use Read directly on absolute paths instead of pulling the
   whole PR diff.

5. Empty-lens guard: if a frontend-only PR routes zero files to the
   Backend judge (or vice versa), that judge is dropped from the
   panel entirely. The gate already tolerates absent decisions.

6. Cross-cutting routing stays loose: Security sees everything because
   security findings live in client code too (XSS, exposed secrets in
   browser code, auth bypasses in frontend routing).

7. Unit tests pin down classifier behaviour so the routing rules don't
   silently drift (33 cases).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

Walkthrough

Adds deterministic single-lens file classification and persona-to-lens routing, injects per-judge file-scope instructions into judge prompts, syncs a per-PR worktree and enumerates touched files, and updates tests and types to carry scoped review context.

Changes

File-scoped PR review routing

Layer / File(s) Summary
File routing module and tests
python/cube/automation/file_routing.py, tests/core/test_file_routing.py
Adds classify() returning a single lens per path, defines ordered _FOLDER_RULES and LENSES, maps personas to allowed single-lens sets in PERSONA_LENS_MAP, implements files_for_persona() (sorted scoped subset) and lens_summary() (counts per lens), with tests validating classification, persona visibility, sorting and counts.
JudgeInfo type extensions
python/cube/models/types.py
Extends JudgeInfo with optional fields scoped_files, review_worktree, and total_pr_files to carry per-judge file scope and worktree context.
Judge panel file scope application
python/cube/automation/judge_panel.py
Extends launch_judge_panel to accept review_worktree and pr_files, computes lens breakdowns, filters judge seats by persona-scoped files (preserving pinned generic seats), and injects a # FILE SCOPE block or SKIPPED directive into persona prompts.
PR review worktree syncing and file enumeration
python/cube/commands/peer_review.py
Adds _ensure_pr_review_worktree to fetch refs and create/reset a per-PR detached worktree under ~/.cube/worktrees/..., enumerates touched files using git diff --name-only --diff-filter=AMR, and wires review_worktree/pr_files into _run_pr_review to pass into the judge panel.
Adapter test CLI adjustments
tests/cli/test_adapters.py
Removes --full-auto from expected codex exec arguments and updates stdin-related assertions to expect None.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • aetheronhq/agent-cube#153: Modifies peer-review flow in python/cube/commands/peer_review.py and related review orchestration.

"I'm a rabbit with a routing map,
I hop through folders, quiet and apt,
Frontend, backend, shared or infra,
I nudge each judge to the files they know,
Hooray — neat scopes for reviews that flow!"

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main change: introducing per-judge file routing for PR reviews to achieve significant token savings, which aligns with the core objective of scoping files to judges based on their personas.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
python/cube/automation/judge_panel.py (1)

529-545: 💤 Low value

Judge skipping logic may be too narrow.

The current condition only skips judges when jconfig.persona in {"backend-generic", "frontend-generic"}. If a new scoped persona is added to PERSONA_LENS_MAP in the future, it won't be skipped when its lens has zero files—the check would need updating. Consider deriving the "skippable" set from PERSONA_LENS_MAP (personas with non-None values) rather than hardcoding the persona names.

🔧 Proposed approach using PERSONA_LENS_MAP
         kept: list = []
         for jconfig in judge_configs:
             scoped = files_for_persona(jconfig.persona, pr_files)
-            if not scoped and jconfig.persona in {"backend-generic", "frontend-generic"}:
+            # Skip lens-scoped personas (non-None in PERSONA_LENS_MAP) when they have no files.
+            # Cross-cutting personas (None) always see all files so scoped is never empty for them.
+            from .file_routing import PERSONA_LENS_MAP
+            is_lens_scoped = jconfig.persona in PERSONA_LENS_MAP and PERSONA_LENS_MAP.get(jconfig.persona) is not None
+            if not scoped and is_lens_scoped:
                 print_info(f"  ⏭️  Skipping {jconfig.label} — no files in lens ({jconfig.persona})")
                 continue
             kept.append(jconfig)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cube/automation/judge_panel.py` around lines 529 - 545, The skip logic
currently hardcodes {"backend-generic","frontend-generic"} so new scoped
personas won't be skipped when their lens has no files; import PERSONA_LENS_MAP
and compute the skippable set dynamically (e.g., skippable = {persona for
persona, lens in PERSONA_LENS_MAP.items() if lens is not None}) and replace the
hardcoded membership check in the loop that iterates over judge_configs (which
uses pr_files, files_for_persona, lens_summary, and jconfig.persona) with a
check against that derived skippable set, preserving the existing print_info
skip message and leaving cross-cutting personas (with None lens) unskipped.
tests/core/test_file_routing.py (1)

18-63: 💤 Low value

Multi-lens test assertion could be stricter.

The test test_classification_can_be_multi_lens uses an or condition which passes if only one lens matches. Since apps/web/Dockerfile should match both frontend (via the apps/.../(web|...)/ pattern) and infra (via the Dockerfile pattern), consider asserting both are present to truly verify multi-lens behaviour.

🔧 Proposed stricter assertion
     def test_classification_can_be_multi_lens(self):
         # A Dockerfile in a frontend directory hits both
         result = classify("apps/web/Dockerfile")
-        assert "frontend" in result.lenses or "infra" in result.lenses
-        # At minimum it must have at least one lens
-        assert len(result.lenses) >= 1
+        assert "frontend" in result.lenses, f"Expected 'frontend' in {result.lenses}"
+        assert "infra" in result.lenses, f"Expected 'infra' in {result.lenses}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_file_routing.py` around lines 18 - 63, The test
test_classification_can_be_multi_lens is too lax (uses an "or"); update it to
require both lenses for apps/web/Dockerfile by calling
classify("apps/web/Dockerfile") and asserting that "frontend" in result.lenses
and "infra" in result.lenses (and optionally assert len(result.lenses) >= 2) so
the test verifies true multi-lens behaviour; modify the assertions in
test_classification_can_be_multi_lens accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/cube/commands/peer_review.py`:
- Around line 187-258: The _ensure_pr_review_worktree function can leave a
corrupted/stale worktree when "git worktree add" or "git reset --hard" fail;
update it to detect those failures and clean up/recreate the worktree before
returning so subsequent runs don't operate on stale content: when add fails and
the target worktree path exists, remove the partial directory (worktree) and log
a clear warning with remediation guidance; when reset fails on an existing
worktree, attempt a safe recovery sequence (e.g., run "git worktree remove" or
delete the worktree directory and re-run worktree creation after a fresh git
fetch) and if recovery fails return empty files and log the error; ensure all
changes reference _ensure_pr_review_worktree, the subprocess calls for
["git","worktree","add"...] and ["git","reset","--hard",
f"origin/{head_branch}"] so maintainers can find and modify the exact spots.

---

Nitpick comments:
In `@python/cube/automation/judge_panel.py`:
- Around line 529-545: The skip logic currently hardcodes
{"backend-generic","frontend-generic"} so new scoped personas won't be skipped
when their lens has no files; import PERSONA_LENS_MAP and compute the skippable
set dynamically (e.g., skippable = {persona for persona, lens in
PERSONA_LENS_MAP.items() if lens is not None}) and replace the hardcoded
membership check in the loop that iterates over judge_configs (which uses
pr_files, files_for_persona, lens_summary, and jconfig.persona) with a check
against that derived skippable set, preserving the existing print_info skip
message and leaving cross-cutting personas (with None lens) unskipped.

In `@tests/core/test_file_routing.py`:
- Around line 18-63: The test test_classification_can_be_multi_lens is too lax
(uses an "or"); update it to require both lenses for apps/web/Dockerfile by
calling classify("apps/web/Dockerfile") and asserting that "frontend" in
result.lenses and "infra" in result.lenses (and optionally assert
len(result.lenses) >= 2) so the test verifies true multi-lens behaviour; modify
the assertions in test_classification_can_be_multi_lens accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09325aed-9ee7-4cbb-a0be-43be597ec955

📥 Commits

Reviewing files that changed from the base of the PR and between c55a3a8 and 95d69ac.

📒 Files selected for processing (6)
  • python/cube/automation/file_routing.py
  • python/cube/automation/judge_panel.py
  • python/cube/commands/peer_review.py
  • python/cube/models/types.py
  • tests/cli/test_adapters.py
  • tests/core/test_file_routing.py
📜 Review details
🧰 Additional context used
🪛 Ruff (0.15.13)
python/cube/commands/peer_review.py

[error] 203-203: subprocess call: check for execution of untrusted input

(S603)


[error] 204-204: Starting a process with a partial executable path

(S607)


[error] 216-216: subprocess call: check for execution of untrusted input

(S603)


[error] 217-217: Starting a process with a partial executable path

(S607)


[error] 227-227: subprocess call: check for execution of untrusted input

(S603)


[error] 228-228: Starting a process with a partial executable path

(S607)


[error] 240-240: subprocess call: check for execution of untrusted input

(S603)


[error] 241-247: Starting a process with a partial executable path

(S607)

tests/core/test_file_routing.py

[warning] 66-74: Mutable default value for class attribute

(RUF012)

python/cube/automation/judge_panel.py

[warning] 131-131: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

🔇 Additional comments (18)
tests/cli/test_adapters.py (1)

153-164: LGTM!

Also applies to: 173-175, 191-204

python/cube/automation/file_routing.py (7)

1-12: LGTM!


14-18: LGTM!


20-59: LGTM!


62-87: LGTM!


90-107: LGTM!


110-131: LGTM!


133-143: LGTM!

tests/core/test_file_routing.py (3)

1-16: LGTM!


65-121: LGTM!


123-140: LGTM!

python/cube/models/types.py (1)

36-45: LGTM!

python/cube/automation/judge_panel.py (4)

109-123: LGTM!


126-192: LGTM!


470-492: LGTM!


719-744: LGTM!

python/cube/commands/peer_review.py (2)

317-324: LGTM!


409-425: LGTM!

Comment on lines +187 to +258
def _ensure_pr_review_worktree(pr_number: int, head_branch: str, base_branch: str) -> tuple[Path, list[str]]:
"""Sync a dedicated review worktree to origin/<head_branch>.

Returns the worktree path AND the list of files this PR touches (relative
to repo root), so judges can be routed to a subset of them via
``file_routing.files_for_persona``.

The worktree is reused across panel runs for the same PR — judges read
files directly via Read instead of reasoning over diff blobs.
"""
import subprocess

project = Path(PROJECT_ROOT).name
worktree = Path.home() / ".cube" / "worktrees" / project / f"pr-review-{pr_number}"

# Make sure origin has the head branch.
fetch = subprocess.run(
["git", "fetch", "origin", head_branch, base_branch],
cwd=PROJECT_ROOT,
capture_output=True,
text=True,
timeout=120,
)
if fetch.returncode != 0:
# Not fatal — branch may be a fork; we fall back to repo refs.
print_warning(f"git fetch origin {head_branch}: {fetch.stderr.strip()}")

if worktree.exists():
# Re-sync existing worktree to latest origin/<head>.
reset = subprocess.run(
["git", "reset", "--hard", f"origin/{head_branch}"],
cwd=worktree,
capture_output=True,
text=True,
timeout=60,
)
if reset.returncode != 0:
print_warning(f"Could not reset PR review worktree: {reset.stderr.strip()}")
else:
worktree.parent.mkdir(parents=True, exist_ok=True)
add = subprocess.run(
["git", "worktree", "add", "--detach", str(worktree), f"origin/{head_branch}"],
cwd=PROJECT_ROOT,
capture_output=True,
text=True,
timeout=120,
)
if add.returncode != 0:
print_warning(f"Could not create PR review worktree: {add.stderr.strip()}")
return worktree, []

# Enumerate files this PR touches (added/modified/renamed; not deleted —
# judges can't Read a deleted file).
diff = subprocess.run(
[
"git",
"diff",
"--name-only",
"--diff-filter=AMR",
f"origin/{base_branch}...origin/{head_branch}",
],
cwd=worktree,
capture_output=True,
text=True,
timeout=30,
)
if diff.returncode != 0:
print_warning(f"git diff --name-only failed: {diff.stderr.strip()}")
return worktree, []

files = [line.strip() for line in diff.stdout.splitlines() if line.strip()]
return worktree, files
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Consider handling git worktree cleanup on persistent failures.

When git worktree add fails (line 234), the function returns an empty file list, which is a reasonable fallback. However, if the worktree directory exists in a corrupted state (e.g., partial creation), subsequent runs might also fail. Consider adding a cleanup step or at least logging guidance when the worktree path exists but git operations fail.

Additionally, when git reset --hard fails on an existing worktree (line 223), execution continues but pr_files enumeration may operate on stale content. This could lead to judges reviewing outdated file contents while the file list reflects the new PR state.

🔧 Proposed fix to handle stale worktree
     if worktree.exists():
         # Re-sync existing worktree to latest origin/<head>.
         reset = subprocess.run(
             ["git", "reset", "--hard", f"origin/{head_branch}"],
             cwd=worktree,
             capture_output=True,
             text=True,
             timeout=60,
         )
         if reset.returncode != 0:
             print_warning(f"Could not reset PR review worktree: {reset.stderr.strip()}")
+            # Worktree may be stale - fall back to no file routing
+            return worktree, []
🧰 Tools
🪛 Ruff (0.15.13)

[error] 203-203: subprocess call: check for execution of untrusted input

(S603)


[error] 204-204: Starting a process with a partial executable path

(S607)


[error] 216-216: subprocess call: check for execution of untrusted input

(S603)


[error] 217-217: Starting a process with a partial executable path

(S607)


[error] 227-227: subprocess call: check for execution of untrusted input

(S603)


[error] 228-228: Starting a process with a partial executable path

(S607)


[error] 240-240: subprocess call: check for execution of untrusted input

(S603)


[error] 241-247: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cube/commands/peer_review.py` around lines 187 - 258, The
_ensure_pr_review_worktree function can leave a corrupted/stale worktree when
"git worktree add" or "git reset --hard" fail; update it to detect those
failures and clean up/recreate the worktree before returning so subsequent runs
don't operate on stale content: when add fails and the target worktree path
exists, remove the partial directory (worktree) and log a clear warning with
remediation guidance; when reset fails on an existing worktree, attempt a safe
recovery sequence (e.g., run "git worktree remove" or delete the worktree
directory and re-run worktree creation after a fresh git fetch) and if recovery
fails return empty files and log the error; ensure all changes reference
_ensure_pr_review_worktree, the subprocess calls for ["git","worktree","add"...]
and ["git","reset","--hard", f"origin/{head_branch}"] so maintainers can find
and modify the exact spots.

Folder-based routing matches how monorepos are actually organized —
`apps/web/` IS the frontend, `apps/api/` IS the backend — and sidesteps
extension coin-flips ("is this `.ts` file backend or frontend?").

Refactor:
- Replace regex-per-extension lookup with `_FOLDER_RULES`: an ordered
  list of (prefix, lens) pairs. First-matching prefix wins; everything
  unmatched falls into the `other` lens.
- Built-in rules cover aetheron-connect-v2 directly: `apps/web/` →
  frontend, `apps/api|workloads|email-worker|migrations-lambda` →
  backend, `packages/` and `apps/e2e/` → shared, `infra|infrastructure|
  terraform|.github/workflows` → infra.
- New `shared` lens replaces "files seen by both lens-scoped judges".
  Backend judge gets {backend, infra, shared, other}, Frontend judge
  gets {frontend, shared}.
- `classify()` now returns a single lens (not a set) — folder prefixes
  are mutually exclusive by construction.
- Tests pivot from extension-based assertions to v2-style paths.

Fixed a normalization bug along the way: `lstrip("./")` was stripping
the leading dot from `.github/...` paths, sending them to `other`.
Switched to `removeprefix("./")` which only trims a literal "./".

Routing impact on a realistic 17-file v2 PR (4 frontend / 6 backend /
3 shared / 2 infra / 2 other):
- Frontend judge: 7/17 files (41%)  — 59% scope cut
- Backend judge:  13/17 files (76%) — 24% scope cut
- Cross-cutting:  17/17 files (100%) — unchanged

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/core/test_file_routing.py (1)

82-97: ⚡ Quick win

Convert class-level SAMPLE from mutable list to immutable tuple.

SAMPLE is a class-level list that is only read in tests, never mutated. Converting to a tuple eliminates the risk of accidental in-place edits leaking state across tests and is consistent with best practices for shared test fixtures.

Proposed fix
-    SAMPLE = [
+    SAMPLE = (
         # Frontend
         "apps/web/src/components/Booking.tsx",
         "apps/web/src/styles/main.css",
         # Backend
         "apps/api/src/routes/booking.ts",
         "apps/workloads/src/agent-runner.ts",
         "apps/migrations-lambda/migrations/202401_init.sql",
         # Shared
         "packages/contracts/src/booking.ts",
         "apps/e2e/tests/booking.spec.ts",
         # Infra
         ".github/workflows/ci.yml",
         # Other
         "README.md",
-    ]
+    )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_file_routing.py` around lines 82 - 97, Change the class-level
mutable list SAMPLE to an immutable tuple by replacing the bracketed list
literal with a parenthesized tuple literal so tests read from an immutable
sequence; update the declaration of SAMPLE (the constant used in
tests/core/test_file_routing.py) to use a tuple of the same string entries and
leave all usages unchanged to prevent accidental in-place edits leaking state
across tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/core/test_file_routing.py`:
- Around line 82-97: Change the class-level mutable list SAMPLE to an immutable
tuple by replacing the bracketed list literal with a parenthesized tuple literal
so tests read from an immutable sequence; update the declaration of SAMPLE (the
constant used in tests/core/test_file_routing.py) to use a tuple of the same
string entries and leave all usages unchanged to prevent accidental in-place
edits leaking state across tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a962e994-7cb7-4409-84f4-09919fa84303

📥 Commits

Reviewing files that changed from the base of the PR and between 95d69ac and 81db859.

📒 Files selected for processing (2)
  • python/cube/automation/file_routing.py
  • tests/core/test_file_routing.py
📜 Review details
🧰 Additional context used
🪛 Ruff (0.15.13)
tests/core/test_file_routing.py

[warning] 82-97: Mutable default value for class attribute

(RUF012)

🔇 Additional comments (3)
python/cube/automation/file_routing.py (1)

1-174: LGTM!

tests/core/test_file_routing.py (2)

1-81: LGTM!


99-182: LGTM!

Cross-cutting personas (PO/QA, Principal, Security) own every file by
design. The previous intro emitted "The remaining 0 files are out of
scope" — confusing for the judge and a tiny bit comedic. Branch on
whether the lens covers everything and emit a clear phrasing instead:

  The PR touches **3 file(s)** and **all of them fall in your lens**
  (cross-cutting persona).

Verified live against PR aetheronhq/aetheron-connect-v2#1310 — Frontend
judge still gets its 7-of-9 scope, Security gets the new cross-cutting
intro.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
python/cube/automation/judge_panel.py (2)

731-735: 💤 Low value

Redundant import of files_for_persona.

files_for_persona is already imported at line 542 within the earlier if pr_files is not None block. While Python's import caching makes this functionally correct, consolidating the import would be cleaner.

♻️ Suggested approach

Move the import at line 542 to the top of the if pr_files is not None block at line 541, then remove the duplicate import at line 733:

     if pr_files is not None:
         from .file_routing import files_for_persona, lens_summary
-
-        counts = lens_summary(pr_files)
+        counts = lens_summary(pr_files)

Then at line 732-735:

         if pr_files is not None:
-            from .file_routing import files_for_persona
-
             scoped_files = files_for_persona(jconfig.persona, pr_files)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cube/automation/judge_panel.py` around lines 731 - 735, The duplicate
local import of files_for_persona in the block that sets scoped_files is
redundant; consolidate by importing files_for_persona once at the start of the
pr_files handling block (the earlier if pr_files is not None branch where
jconfig.persona and pr_files are used) and remove the second import inside the
scoped_files assignment; ensure you still call
files_for_persona(jconfig.persona, pr_files) to set scoped_files and that
file_routing.files_for_persona remains referenced correctly.

138-138: 💤 Low value

Potential issue with falsy zero value in or expression.

If total_pr_files is explicitly set to 0, this expression will fall back to len(files) instead. While a PR with zero files is unlikely, using an explicit None check is clearer and avoids the edge case.

♻️ Suggested fix
-    total = judge_info.total_pr_files or len(files)
+    total = judge_info.total_pr_files if judge_info.total_pr_files is not None else len(files)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cube/automation/judge_panel.py` at line 138, The current assignment
"total = judge_info.total_pr_files or len(files)" treats 0 as falsy and
incorrectly substitutes len(files); change it to explicitly check for None so an
intentional 0 is preserved. Locate the assignment to total in judge_panel.py and
replace the or-expression with a None-check using judge_info.total_pr_files (for
example: use a conditional expression or an if/else) so that if
judge_info.total_pr_files is None you use len(files), otherwise use
judge_info.total_pr_files.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@python/cube/automation/judge_panel.py`:
- Around line 731-735: The duplicate local import of files_for_persona in the
block that sets scoped_files is redundant; consolidate by importing
files_for_persona once at the start of the pr_files handling block (the earlier
if pr_files is not None branch where jconfig.persona and pr_files are used) and
remove the second import inside the scoped_files assignment; ensure you still
call files_for_persona(jconfig.persona, pr_files) to set scoped_files and that
file_routing.files_for_persona remains referenced correctly.
- Line 138: The current assignment "total = judge_info.total_pr_files or
len(files)" treats 0 as falsy and incorrectly substitutes len(files); change it
to explicitly check for None so an intentional 0 is preserved. Locate the
assignment to total in judge_panel.py and replace the or-expression with a
None-check using judge_info.total_pr_files (for example: use a conditional
expression or an if/else) so that if judge_info.total_pr_files is None you use
len(files), otherwise use judge_info.total_pr_files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2f42afd5-a999-4c48-a8a7-7865cf8d1224

📥 Commits

Reviewing files that changed from the base of the PR and between 81db859 and 9212e9d.

📒 Files selected for processing (1)
  • python/cube/automation/judge_panel.py
📜 Review details
🧰 Additional context used
🪛 Ruff (0.15.13)
python/cube/automation/judge_panel.py

[warning] 131-131: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

🔇 Additional comments (6)
python/cube/automation/judge_panel.py (6)

6-6: LGTM!


109-123: LGTM!


141-201: LGTM!


488-501: LGTM!


538-554: LGTM!


737-754: LGTM!

@jacsamell jacsamell merged commit 77b127f into main May 19, 2026
1 of 2 checks passed
@jacsamell jacsamell deleted the feat/per-judge-file-routing branch May 19, 2026 04:33
jacsamell added a commit that referenced this pull request May 19, 2026
…med judges

Symptom: cube panel kept posting findings against pre-rewrite code on
aetheronhq/aetheron-connect-v2#1266 even after PR #175's --force fetch
landed. Same 10+ findings against lines/tables that no longer exist.

Two more layers of the bug surfaced:

(1) Duplicate worktrees, only one synced.
    PR reviews historically used `~/.cube/worktrees/<project>/pr-<n>/`
    (legacy, created by `_prefetch_worktrees`). PR #172 added a
    second one at `pr-review-<n>/` for the new file-routing flow.
    Both worktrees existed in parallel; only the new one had the
    --force fetch + HEAD verification from #175. The other stayed
    pinned to whatever it was last synced to.

    Worse: `build_peer_review_prompt` told judges to Read from the
    LEGACY path, while `_apply_file_scope` told them to Read from the
    NEW path. Same prompt, two locations.

    Confirmed live: both worktrees existed on PR #1266; the legacy
    one was stuck at fda636a3 (6 commits behind 62e9fbd1).

(2) Resumed judges had stale session memory.
    Sessions persist across panel runs. On the next run, the LLM
    continues its prior conversation — including its prior reasoning
    about file contents, line numbers, function shapes. Even when the
    resumed judge Reads fresh files, its reasoning anchors on the old
    memory. The previous "RE-REVIEW" hint was too soft to displace it.

Fixes:
  * Skip `_prefetch_worktrees` in PR-review mode when a dedicated
    `review_worktree` was passed. The new sync is authoritative; the
    legacy sync just made a second stale copy.
  * Pipe `review_worktree` through `build_peer_review_prompt` so the
    prompt's worktree path matches the file-scope block. One location.
  * Replace the meek "verify commit" hint with a hard "CURRENT HEAD"
    callout that names the SHA, says "your prior reasoning may be out
    of date", and lists 4 explicit steps: Re-Read fresh, verify each
    prior finding still applies, add new ones, drop the rest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jacsamell added a commit that referenced this pull request May 19, 2026
…<n>/

PR #172 invented `pr-review-<n>/` as a parallel worktree to hold the
file-routing flow's check-out. The original `pr-<n>/` (managed by
`_prefetch_worktrees` for the cube panel + `_get_cli_review_worktrees`
for CodeRabbit) kept existing alongside it. Result: two worktrees per
PR, only one kept synced by the new code, prompts pointing at the
wrong one, judges Reading stale content.

Reverting the path so the file-routing flow reuses `pr-<n>/`. Combined
with the #176 skip-`_prefetch_worktrees`-when-already-synced logic,
this gives one worktree per PR with the new --force fetch + HEAD
verification applied to it. CodeRabbit and cube judges both read
from the same place.

`pr-review-<n>/` orphan dirs deleted from the local cache; `git
worktree prune` cleaned the stale refs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jacsamell added a commit that referenced this pull request May 19, 2026
…med judges (#176)

* fix(pr-review): eliminate stale legacy worktree + signal HEAD to resumed judges

Symptom: cube panel kept posting findings against pre-rewrite code on
aetheronhq/aetheron-connect-v2#1266 even after PR #175's --force fetch
landed. Same 10+ findings against lines/tables that no longer exist.

Two more layers of the bug surfaced:

(1) Duplicate worktrees, only one synced.
    PR reviews historically used `~/.cube/worktrees/<project>/pr-<n>/`
    (legacy, created by `_prefetch_worktrees`). PR #172 added a
    second one at `pr-review-<n>/` for the new file-routing flow.
    Both worktrees existed in parallel; only the new one had the
    --force fetch + HEAD verification from #175. The other stayed
    pinned to whatever it was last synced to.

    Worse: `build_peer_review_prompt` told judges to Read from the
    LEGACY path, while `_apply_file_scope` told them to Read from the
    NEW path. Same prompt, two locations.

    Confirmed live: both worktrees existed on PR #1266; the legacy
    one was stuck at fda636a3 (6 commits behind 62e9fbd1).

(2) Resumed judges had stale session memory.
    Sessions persist across panel runs. On the next run, the LLM
    continues its prior conversation — including its prior reasoning
    about file contents, line numbers, function shapes. Even when the
    resumed judge Reads fresh files, its reasoning anchors on the old
    memory. The previous "RE-REVIEW" hint was too soft to displace it.

Fixes:
  * Skip `_prefetch_worktrees` in PR-review mode when a dedicated
    `review_worktree` was passed. The new sync is authoritative; the
    legacy sync just made a second stale copy.
  * Pipe `review_worktree` through `build_peer_review_prompt` so the
    prompt's worktree path matches the file-scope block. One location.
  * Replace the meek "verify commit" hint with a hard "CURRENT HEAD"
    callout that names the SHA, says "your prior reasoning may be out
    of date", and lists 4 explicit steps: Re-Read fresh, verify each
    prior finding still applies, add new ones, drop the rest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* revert(pr-review): use the original pr-<n>/ worktree, drop pr-review-<n>/

PR #172 invented `pr-review-<n>/` as a parallel worktree to hold the
file-routing flow's check-out. The original `pr-<n>/` (managed by
`_prefetch_worktrees` for the cube panel + `_get_cli_review_worktrees`
for CodeRabbit) kept existing alongside it. Result: two worktrees per
PR, only one kept synced by the new code, prompts pointing at the
wrong one, judges Reading stale content.

Reverting the path so the file-routing flow reuses `pr-<n>/`. Combined
with the #176 skip-`_prefetch_worktrees`-when-already-synced logic,
this gives one worktree per PR with the new --force fetch + HEAD
verification applied to it. CodeRabbit and cube judges both read
from the same place.

`pr-review-<n>/` orphan dirs deleted from the local cache; `git
worktree prune` cleaned the stale refs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
jacsamell added a commit that referenced this pull request May 21, 2026
…fallback (#197)

Rich treats every `[anything]` in printed strings as a markup tag. Judge
output, file paths, dedupe agent text, and CodeRabbit feedback all carry
literal brackets (`[/path/foo.ts]`, `[bug]`, `[ ]`) that look like closing
tags to the parser. Any interpolation site that forgot `escape()` was a
latent panel killer — MarkupError bubbled mid-render, the background task
exited 0, and the GitHub review never got posted.

Per-site escapes (PR #193, partially #196) didn't scale: in the last 2
weeks we added 4 new high-surface print sites (#172 file routing, #180
reply-chain rendering, #191 git-delta injection, #196 missing-judge
list) and the crash kept jumping to whichever one had unescaped content.
Three panel runs on pr-1361 crashed on judge_3 then judge_4 from
different render paths.

Architectural fix: subclass Console once, catch MarkupError, re-render
with markup disabled and dynamic content pre-escaped. One chokepoint.
All existing styled cube output keeps working — only crash-prone inputs
trip the escaped fallback (literal brackets, no tag styling for THAT
print). Future print sites are safe-by-default — no more whack-a-mole.

Applied to:
- `cube.core.output.console` / `console_err` (covers ~95% of cube prints)
- `cube.core.base_layout.BaseLayout.console` (Live render path)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant