Skip to content

Commit f6c80ac

Browse files
authored
fix(skill): ban untracked deferrals in /review skill (#568)
* feat: add maintenance skills — deps-audit, bench-check, test-health, housekeep Four recurring maintenance routines as Claude Code skills: - /deps-audit: vulnerability scanning, staleness, unused deps, license checks - /bench-check: benchmark regression detection against saved baselines - /test-health: flaky test detection, dead tests, coverage gap analysis - /housekeep: clean worktrees, dirt files, sync main, prune branches * fix(bench-check): capture stderr, guard division-by-zero, commit baseline - Replace 2>/dev/null with output=$(... 2>&1) + exit_code check on all four benchmark invocations so error messages are captured and recorded - Add division-by-zero guard in Phase 3: when baseline == 0, mark delta as "N/A — baseline was zero" (informational only, not a regression) - Add git add + git commit step in Phase 5 so the baseline file is actually committed after each save, matching the documented rule * fix(deps-audit): run npm ci after revert, document tokenizer skip reason - After reverting package.json + package-lock.json on --fix test failure, also run `npm ci` to resync node_modules/ with the restored lock file; without this the manifest is reverted but installed packages are not - Add explanatory comment on @anthropic-ai/tokenizer skip-list entry clarifying it is a peer dependency of @anthropic-ai/sdk and may be required at runtime without an explicit import in our code * fix(housekeep): guard Phase 5 in source repo, fix stale-worktree criterion - Phase 5 (Update Codegraph): add source-repo guard that skips the self-update logic when running inside the codegraph source repo; comparing the dev version to the published release and running npm install is a no-op since codegraph is not one of its own deps - Phase 1b stale-worktree criterion: replace "created more than 7 days ago" (not determinable via git worktree list) with "last commit on the branch is more than 7 days old AND branch has no commits ahead of origin/main", using `git log -1 --format=%ci <branch>` * fix: address Round 3 Greptile review feedback * fix: move deps-audit stash to Phase 0, before npm commands modify manifests * fix: capture flaky-detection loop output to per-run files for comparison * fix: always require confirmation for stale worktree removal * fix: use parsed threshold in baseline.json, guard --compare-only on first run * fix(deps-audit): track stash creation to avoid operating on wrong entry When Phase 0 stash push is a no-op (manifests unchanged), Phase 7 was calling stash drop/pop on the wrong entry. Track STASH_CREATED exit code and branch on it: use git checkout when no stash exists. * fix(test-health): use mktemp for flaky-run directory to avoid concurrent corruption Replace hardcoded /tmp/test-health-runs/ with mktemp -d so parallel sessions get isolated directories. Add cleanup at end of analysis. * fix(bench-check): add save-baseline verdict path, fix em-dash, use explicit commit paths Add 4th verdict path for --save-baseline when baseline already exists. Replace corrupted em-dash character in N/A string. Change commit command to use explicit file paths per project convention. * docs(roadmap): update Phase 5 TypeScript migration with accurate progress Phase 5 was listed as "2 of 7 complete" with outdated pre-Phase 3 file paths. Updated to reflect actual state: 32 of 269 source modules migrated (~12%). Steps 5.3-5.5 now list exact migrated/remaining files with verified counts (5.3=8, 5.4=54, 5.5=175, total=237 JS-only files). Added note about 14 stale .js counterparts of already-migrated .ts files needing deletion. * fix: deps-audit success path should keep npm changes, not revert (#565) When STASH_CREATED=1 and tests pass, the npm audit fix changes are good — no action needed. Previously it ran git checkout to discard them, which undid the successful fix. * fix: bench-check use git add + diff --cached to detect new files (#565) git diff --quiet ignores untracked files, so on the first run when baseline.json and history.ndjson are newly created, the commit was skipped. Stage first with git add, then check with --cached. * fix: housekeep require confirmation before branch deletion (#565) Branch deletion now asks for user confirmation before each delete, consistent with worktree removal in Phase 1c. * fix: scope git diff --cached to bench-check files only (#565) * fix: use json-summary reporter to match coverage-summary.json output (#565) * fix: capture stash ref by name to avoid position-based targeting (#565) * fix: remove unreachable Phase 5 subphases since source-repo guard always skips (#565) * fix: use dynamic threshold variable in bench-check Phase 6 report template (#565) * fix: address open review items in maintenance skills (#565) - bench-check: add timeout 300 wrappers to all 4 benchmark invocations with exit code 124 check for timeout detection - bench-check: add explicit COMPARE_ONLY guard at Phase 5 entry - housekeep: fix grep portability — use grep -cE instead of GNU \| syntax - test-health: add timeout 180 wrapper in flaky detection loop - test-health: fix find command -o precedence with grouping parentheses * fix: add COVERAGE_ONLY guards to Phase 2 and Phase 4 in test-health * fix: add regression skip guard to bench-check Phase 5, expand deps-audit search dirs * fix: add empty-string guard for stat size check in housekeep (#565) When both stat variants (GNU and BSD) fail, $size is empty and the arithmetic comparison errors out. Add a [ -z "$size" ] && continue guard so the loop skips files whose size cannot be determined. * fix: add BASELINE SAVED verdict path and clarify if/else-if in bench-check (#565) Phase 6: when SAVE_ONLY or first-run (no prior baseline), write a shortened report with "Verdict: BASELINE SAVED" instead of the full comparison report. Phases 1a-1d: replace ambiguous "If timeout / If non-zero" with explicit "If timeout / Else if non-zero" so the two conditions are clearly mutually exclusive. * docs(roadmap): mark Phase 4 complete, update Phase 5 progress (5 of 7) Phase 4 (Resolution Accuracy) had all 6 sub-phases merged but status still said "In Progress". Phase 5 (TypeScript Migration) had 5.3-5.5 merged via PRs #553, #554, #555, #566 but was listed with stale counts. Updated both to reflect actual state: Phase 4 complete, Phase 5 at 5/7 with 76 of 283 modules migrated (~27%). * docs(roadmap): correct Phase 5 progress — 5.3/5.4/5.5 still in progress Previous commit incorrectly marked 5.3-5.5 as complete. In reality 76 of 283 src files are .ts (~27%) while 207 remain .js (~73%). PRs #553, #554, #555, #566 migrated a first wave but left substantial work in each step: 4 leaf files, 39 core files, 159 orchestration files. Updated each step with accurate migrated/remaining counts. * fix(skill): ban untracked deferrals in /review skill The /review skill allowed replying "acknowledged as follow-up" to reviewer comments without tracking them anywhere. These deferrals get lost — nobody revisits PR comment threads after merge. Now: if a fix is genuinely out of scope, the skill must create a GitHub issue with the follow-up label before replying. The reply must include the issue link. A matching rule in the Rules section reinforces the ban. * fix(skill): add --repo flag, multi-endpoint reply for deferrals, and scope guidance (#568) * fix(skill): guard follow-up label creation before gh issue create (#568) gh issue create --label "follow-up" fails if the label doesn't exist in the repo. Add a gh label create guard step that is safe to re-run. * feat(skill): parallelize /review with one subagent per PR * fix: correct heredoc terminator indentation in review skill (#568) * fix(skill): capture gh issue create output before referencing issue number gh issue create prints the new issue URL to stdout — capture it and extract the number so reply templates can reference it unambiguously. * fix(skill): surface follow-up issues in review result format and summary table Add "Issues Created" field to the subagent result format and an "Issues" column to the Step 3 summary table, so deferred out-of-scope items are visible in the final report. * fix(skill): require verbatim rule propagation to subagents * fix(skill): align Issues field name between 2i result and Step 3 table (#568)
1 parent 4596719 commit f6c80ac

1 file changed

Lines changed: 65 additions & 29 deletions

File tree

.claude/skills/review/SKILL.md

Lines changed: 65 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -26,21 +26,30 @@ Record each PR's number, branch, base, merge status, and CI state.
2626

2727
---
2828

29-
## Step 2: Process Each PR
29+
## Step 2: Launch Parallel Subagents
3030

31-
For **each** open PR, perform the following steps in order. Process PRs one at a time to avoid cross-contamination.
31+
Each PR is independent work — **launch one Agent subagent per PR, all in parallel.** Use `isolation: "worktree"` so each agent gets its own copy of the repo with no cross-PR contamination.
3232

33-
### 2a. Switch to the PR branch
33+
Pass each agent the full PR processing instructions (Steps 2a–2i below) along with the PR number, branch, base, and current state from Step 1. The agent prompt must include **all** the rules from the Rules section at the bottom of this skill — copy them **verbatim**, do not paraphrase or summarize.
3434

35-
Ensure the working tree is clean before switching to avoid cross-PR contamination:
36-
37-
```bash
38-
if [ -n "$(git status --porcelain)" ]; then
39-
git stash push -m "pre-checkout stash"
40-
fi
4135
```
36+
For each PR, launch an Agent with:
37+
- description: "Review PR #<number>"
38+
- isolation: "worktree"
39+
- prompt: <the full PR processing instructions below, with PR details filled in>
40+
```
41+
42+
Launch **all** PR agents in a single message (one tool call per PR) so they run concurrently. Do NOT wait for one to finish before starting the next.
43+
44+
Each agent will return a result summary. Collect all results for the final summary table in Step 3.
45+
46+
---
47+
48+
## PR Processing Instructions (for each subagent)
49+
50+
The following steps are executed by each subagent for its assigned PR.
4251

43-
Then check out the PR branch:
52+
### 2a. Check out the PR branch
4453

4554
```bash
4655
gh pr checkout <number>
@@ -120,27 +129,36 @@ For **each** review comment — including minor suggestions, nits, style feedbac
120129
2. **Read the relevant code** at the file and line referenced.
121130
3. **Make the change.** Even if the comment is marked as "nit" or "suggestion" or "minor" — address it. The goal is zero outstanding comments.
122131
4. **If you disagree** with a suggestion (e.g., it would introduce a bug or contradicts project conventions), do NOT silently ignore it. Reply to the comment explaining why you chose a different approach.
123-
5. **If the fix is genuinely out of scope** for this PR (e.g., it affects a different module not touched by this PR, or requires a design decision beyond the PR's purpose), you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them.
132+
5. **If the fix is genuinely out of scope** for this PR, you MUST create a GitHub issue to track it before replying. Never reply with "acknowledged as follow-up" or "noted for later" without a tracked issue — untracked deferrals get lost and nobody will ever revisit them. "Genuinely out of scope" means the fix touches a different module not in the PR's diff, requires an architectural decision beyond the PR's mandate, or would introduce unrelated risk. Fixing a variable name, adding a null check, or adjusting a string in a file already in the diff is NOT out of scope — just do it.
124133

125134
```bash
126-
# Create a tracking issue for the deferred item
127-
gh issue create \
135+
# Ensure the follow-up label exists (safe to re-run)
136+
gh label create "follow-up" --color "0e8a16" --description "Deferred from PR review" --repo optave/codegraph 2>/dev/null || true
137+
138+
# Create a tracking issue for the deferred item and capture the issue number
139+
issue_url=$(gh issue create \
140+
--repo optave/codegraph \
128141
--title "follow-up: <concise description of what needs to be done>" \
129-
--body "$(cat <<'EOF'
130-
Deferred from PR #<number> review.
142+
--body "$(cat <<-'EOF'
143+
Deferred from PR #<number> review.
131144
132-
**Original reviewer comment:** https://github.com/optave/codegraph/pull/<number>#discussion_r<comment-id>
145+
**Original reviewer comment:** <use the correct permalink format for the comment type: inline review comment → `https://github.com/optave/codegraph/pull/<number>#discussion_r<comment-id>`, top-level review body → `https://github.com/optave/codegraph/pull/<number>#pullrequestreview-<review-id>`, issue-style comment → `https://github.com/optave/codegraph/issues/<number>#issuecomment-<comment-id>`>
133146
134-
**Context:** <why this is out of scope for the current PR and what the fix entails>
135-
EOF
147+
**Context:** <why this is out of scope for the current PR and what the fix entails>
148+
EOF
136149
)" \
137-
--label "follow-up"
150+
--label "follow-up")
151+
issue_number=$(echo "$issue_url" | grep -oE '[0-9]+$')
138152
```
139153

140-
Then reply to the reviewer comment referencing the issue:
154+
Then reply to the reviewer comment referencing the issue (using `$issue_number` captured above). Use the same reply mechanism as step 6 below — inline PR review comments use `/pulls/<number>/comments/<comment-id>/replies`, top-level review bodies and issue-style comments use `/issues/<number>/comments`:
141155
```bash
156+
# For inline PR review comments:
142157
gh api repos/optave/codegraph/pulls/<number>/comments/<comment-id>/replies \
143-
-f body="Out of scope for this PR — tracked in #<issue-number>"
158+
-f body="Out of scope for this PR — tracked in #$issue_number"
159+
# For top-level review bodies or issue-style comments:
160+
gh api repos/optave/codegraph/issues/<number>/comments \
161+
-f body="Out of scope for this PR — tracked in #$issue_number"
144162
```
145163
6. **Reply to each comment** explaining what you did. The reply mechanism depends on where the comment lives:
146164

@@ -183,7 +201,7 @@ After addressing all comments for a PR:
183201

184202
### 2g. Re-trigger reviewers
185203

186-
**Greptile:** Before re-triggering, check if your last reply to Greptile already has a positive emoji reaction (👍, ✅, 🎉, etc.) from `greptileai`. A positive reaction means Greptile is satisfied with your fix — do NOT re-trigger in that case, move on. Only re-trigger if there is no positive reaction on your last comment:
204+
**Greptile:** Before re-triggering, check if your last reply to Greptile already has a positive emoji reaction (thumbs up, check, party, etc.) from `greptileai`. A positive reaction means Greptile is satisfied with your fix — do NOT re-trigger in that case, move on. Only re-trigger if there is no positive reaction on your last comment:
187205

188206
```bash
189207
# Check reactions on your most recent comment to see if Greptile already approved
@@ -211,21 +229,39 @@ After re-triggering:
211229
1. Wait for the new reviews to come in (check after a reasonable interval).
212230
2. Fetch new comments again (repeat Step 2d).
213231
3. If there are **new** comments from Greptile or Claude, go back to Step 2e and address them.
214-
4. **Repeat this loop for a maximum of 3 rounds.** If after 3 rounds there are still actionable comments, mark the PR as "needs human review" in the summary table and move to the next PR.
232+
4. **Repeat this loop for a maximum of 3 rounds.** If after 3 rounds there are still actionable comments, mark the PR as "needs human review" in the result.
215233
5. Verify CI is still green after all changes.
216234

235+
### 2i. Return result
236+
237+
At the end of processing, the subagent MUST return a structured result with these fields so the main agent can build the summary table:
238+
239+
```
240+
PR: #<number>
241+
Branch: <branch-name>
242+
Conflicts: resolved | none
243+
CI: green | red | pending
244+
Comments Addressed: <count>
245+
Issues: <comma-separated list of #<n> follow-up issues, or "none">
246+
Reviewers Re-triggered: <list>
247+
Status: ready | needs-work | needs-human-review | skipped
248+
Notes: <any issues encountered>
249+
```
250+
217251
---
218252

219-
## Step 3: Summary
253+
## Step 3: Collect Results and Summarize
220254

221-
After processing all PRs, output a summary table:
255+
After **all** subagents complete, collect their results and output a summary table:
222256

223257
```
224-
| PR | Branch | Conflicts | CI | Comments Addressed | Reviewers Re-triggered | Status |
225-
|----|--------|-----------|----|--------------------|----------------------|--------|
226-
| #N | branch | resolved/none | green/red | N comments | greptile, claude | ready/needs-work |
258+
| PR | Branch | Conflicts | CI | Comments Addressed | Issues | Reviewers Re-triggered | Status |
259+
|----|--------|-----------|----|--------------------|--------|----------------------|--------|
260+
| #N | branch | resolved/none | green/red | N comments | #X, #Y or none | greptile, claude | ready/needs-work |
227261
```
228262

263+
If any subagent failed or returned an error, note it in the Status column as `agent-error` with the failure reason.
264+
229265
---
230266

231267
## Rules
@@ -234,7 +270,7 @@ After processing all PRs, output a summary table:
234270
- **Never force-push** unless fixing a commit message that fails commitlint. Amend + force-push is the only way to fix a pushed commit title (messages are part of the SHA). This is safe on feature branches. For all other problems, fix with a new commit.
235271
- **Address ALL comments from ALL reviewers** (Claude, Greptile, and humans), even minor/nit/optional ones. Leave zero unaddressed. Do not only respond to one reviewer and skip another.
236272
- **Always reply to comments** explaining what was done. Don't just fix silently. Every reviewer must see a reply on their feedback.
237-
- **Don't re-trigger Greptile if already approved.** If your last reply to a Greptile comment has a positive emoji reaction (👍, ✅, 🎉) from `greptileai`, it's already satisfied — skip re-triggering.
273+
- **Don't re-trigger Greptile if already approved.** If your last reply to a Greptile comment has a positive emoji reaction from `greptileai`, it's already satisfied — skip re-triggering.
238274
- **Only re-trigger Claude** if you addressed Claude's feedback specifically.
239275
- **No co-author lines** in commit messages.
240276
- **No Claude Code references** in commit messages or comments.

0 commit comments

Comments
 (0)