feat(review-pr): add per-finding confidence scoring model #17

Open
Sayt-0 wants to merge 2 commits into main from feat/confidence-scoring


@Sayt-0 Sayt-0 commented Jun 24, 2026


Summary

Adds a precise, multi-criteria confidence scoring model for the PR reviewer. Each verified finding gets a 0 to 100 confidence score, a band, and a posting disposition, computed deterministically from several signals rather than a subjective guess.

The model lives in two synchronized surfaces:

| Surface | Role |
| --- | --- |
| src/score-confidence/ (pure module + CLI + unit tests) | Single source of truth for the model |
| review-pr/agents/pr-review.yaml "Confidence Scoring" section | Strict lookup table the orchestrator applies inline (the gitignored dist/ is not available at agent runtime) |

Criteria

| Criterion | Source | Role |
| --- | --- | --- |
| Verdict (CONFIRMED / LIKELY / DISMISSED) | verifier | Primary agreement signal |
| evidence_strength (direct / circumstantial / speculative) | verifier (new field) | Pattern / snippet match strength |
| context_completeness (full / partial / none) | verifier (new field) | Whether the verifier saw the code it needed |
| Severity concordance | derived | Drafter vs verifier severity rank distance |
| Scope (in_diff and in_changed_code) | derived | Hard gate |
| Category / severity | verifier | Posting policy only (security floor, high-severity always-post) |

Scale and threshold

  • Score 0 to 100 from a precomputed verdict x evidence x context table (base: CONFIRMED 70, LIKELY 40; evidence direct / circumstantial / speculative: +18 / +8 / -4; context full / partial / none: +12 / +4 / -10), plus severity concordance (+5 / 0 / -8), with the result clamped to 0..100.
  • Bands: strong >= 80, moderate 55..79, weak 30..54, negligible < 30.
  • Default posting threshold is 55 (the moderate-band floor, so the band and the post cutoff cannot drift apart).
  • Only CONFIRMED can reach strong (LIKELY tops out at 75), a property the unit tests pin.
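The arithmetic above can be sketched as follows. This is a minimal illustration, not the module in src/score-confidence: the constants are the ones quoted in this description, but every name (scoreFinding, bandFor, POST_THRESHOLD, and so on) is hypothetical, the pairing of each modifier with its enum value is assumed from the order in which they are listed, and the concordance delta is passed in directly because the rank-distance mapping is not spelled out here.

```typescript
// Sketch only: names are hypothetical, constants are quoted from the PR text.
type Verdict = "CONFIRMED" | "LIKELY";
type Evidence = "direct" | "circumstantial" | "speculative";
type Context = "full" | "partial" | "none";
type Band = "strong" | "moderate" | "weak" | "negligible";

const BASE: Record<Verdict, number> = { CONFIRMED: 70, LIKELY: 40 };
// Assumed pairing: values listed in the same order as the enum members.
const EVIDENCE: Record<Evidence, number> = { direct: 18, circumstantial: 8, speculative: -4 };
const CONTEXT: Record<Context, number> = { full: 12, partial: 4, none: -10 };

// Default posting threshold, equal to the moderate-band floor.
const POST_THRESHOLD = 55;

function clamp(n: number): number {
  return Math.min(100, Math.max(0, n));
}

// concordanceDelta is one of the three quoted values; how severity rank
// distance selects among them is not shown in this sketch.
function scoreFinding(
  verdict: Verdict,
  evidence: Evidence,
  context: Context,
  concordanceDelta: 5 | 0 | -8,
): number {
  return clamp(BASE[verdict] + EVIDENCE[evidence] + CONTEXT[context] + concordanceDelta);
}

function bandFor(score: number): Band {
  if (score >= 80) return "strong";
  if (score >= POST_THRESHOLD) return "moderate";
  if (score >= 30) return "weak";
  return "negligible";
}
```

Under these assumptions the best LIKELY case is 40 + 18 + 12 + 5 = 75, which lands in the moderate band, while the best CONFIRMED case sums to 105 and is clamped to 100.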

Posting policy (first match wins; the 5-comment cap is applied last)

| Rule | Behavior |
| --- | --- |
| Out-of-scope / DISMISSED non-security | not surfaced |
| Security floor (CONFIRMED/LIKELY) | always inline, never auto-suppressed, cap-exempt |
| High-severity (CONFIRMED/LIKELY) | always inline, cap-exempt |
| Band strong or moderate | inline (subject to the cap) |
| Band weak, non-forced | lower-confidence summary list (not silently dropped) |
| Negligible band, verifier severity medium | summary (visibility floor) |
| DISMISSED security | dismissed-security audit line (human-reviewable) |
| Non-forced inline overflow past 5 | demoted to the summary list |
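A rough sketch of how the first-match-wins precedence and the trailing cap might be expressed. All names and the Severity values are hypothetical. It makes three assumptions that this description does not confirm: DISMISSED findings are handled before any band rule (equivalent to the listed order if they carry no band), a negligible-band finding that is neither medium severity nor forced is not surfaced, and the cap keeps non-forced inline findings in input order without counting forced ones.

```typescript
// Sketch only: names, Severity values, and the fall-through case are assumptions.
type Verdict = "CONFIRMED" | "LIKELY" | "DISMISSED";
type Band = "strong" | "moderate" | "weak" | "negligible";
type Severity = "high" | "medium" | "low";
type Disposition = "inline" | "summary" | "audit" | "not_surfaced";

interface ScoredFinding {
  verdict: Verdict;
  band: Band;          // ignored for DISMISSED findings in this sketch
  severity: Severity;  // verifier severity
  isSecurity: boolean;
  inScope: boolean;    // in_diff and in_changed_code
}

interface Decision {
  disposition: Disposition;
  forced: boolean; // forced findings are exempt from the inline cap
}

function decide(f: ScoredFinding): Decision {
  // Scope is a hard gate.
  if (!f.inScope) return { disposition: "not_surfaced", forced: false };
  if (f.verdict === "DISMISSED") {
    // Dismissed security findings stay human-reviewable via an audit line.
    return { disposition: f.isSecurity ? "audit" : "not_surfaced", forced: false };
  }
  // From here on the verdict is CONFIRMED or LIKELY.
  if (f.isSecurity) return { disposition: "inline", forced: true };
  if (f.severity === "high") return { disposition: "inline", forced: true };
  if (f.band === "strong" || f.band === "moderate") return { disposition: "inline", forced: false };
  if (f.band === "weak") return { disposition: "summary", forced: false };
  // Negligible band: the medium-severity visibility floor keeps it in the summary.
  if (f.severity === "medium") return { disposition: "summary", forced: false };
  return { disposition: "not_surfaced", forced: false }; // assumed fall-through
}

// Applied last: non-forced inline findings beyond the cap are demoted to the summary.
function applyCap(decisions: Decision[], cap: number = 5): Decision[] {
  let used = 0;
  return decisions.map((d) => {
    if (d.disposition !== "inline" || d.forced) return d;
    used += 1;
    return used <= cap ? d : { disposition: "summary" as Disposition, forced: false };
  });
}
```

Keeping the cap as a separate final pass means the per-finding decision stays a pure function of that finding alone, which is easier to pin in unit tests.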

Maps to the requested design

| Requested | Implemented |
| --- | --- |
| Verifier agreement score | verdict + drafter/verifier severity concordance |
| Pattern match strength | evidence_strength field |
| Context completeness | context_completeness field |
| Scale (high/medium/low or 0-100) | 0 to 100 plus four named bands |
| Default threshold | 55 (moderate floor) |
| Precise, multiple criteria | deterministic lookup table; six criteria; 84 unit tests pin every value |

How it was hardened

A design review (three independent lenses: calibration, security policy, LLM reproducibility) locked the constants, replacing error-prone post-hoc caps with the lookup table and removing band dead-zones. An adversarial verification pass then confirmed and fixed three defects:

| Defect | Severity | Resolution |
| --- | --- | --- |
| Verifier escalating severity could silently drop a medium-severity finding (concordance is non-monotonic in severity) | major | Medium-severity visibility floor: a still-believed medium finding is kept in the summary, never dropped |
| category was the only enum not validated, so a misspelled value silently disabled the security floor | minor | Validate category via assertEnum |
| Severity-disagreement can move a borderline finding from inline to summary | minor | Documented as intended: the finding stays visible, and confidence legitimately reflects assessor agreement (a different axis from severity) |
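To illustrate the second defect, here is one way an enum check of this kind could look. The assertEnum name comes from the table above, but its real signature is not shown in this PR page, so the signature below is a guess, and the category values are placeholders rather than the project's actual list.

```typescript
// Sketch only: the signature is assumed and the category values are hypothetical.
const CATEGORIES = ["security", "correctness", "performance", "style"] as const;
type Category = (typeof CATEGORIES)[number];

function assertEnum<T extends string>(field: string, value: unknown, allowed: readonly T[]): T {
  if (typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1) {
    return value as T;
  }
  // Failing loudly means a misspelled category can no longer bypass the security floor.
  throw new TypeError(`${field}: expected one of ${allowed.join(", ")}, got ${JSON.stringify(value)}`);
}

function parseCategory(raw: unknown): Category {
  return assertEnum("category", raw, CATEGORIES);
}
```

With a check like this, a value such as "secuirty" is rejected at parse time instead of being treated as a non-security finding.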

TS-to-prompt numeric consistency (all 18 table cells, bands, threshold, posting precedence, schema) was verified clean.

Compatibility with #15

#15 ("always post a review comment even with zero findings") and this change touch distinct regions of pr-review.yaml and are complementary: an empty inline set yields an APPROVE assessment label, the summary and audit sections still go in the review body, and the review is still posted.

Validation

| Check | Result |
| --- | --- |
| pnpm build | pass |
| pnpm test (554 tests, 84 new) | pass |
| tsc --noEmit | pass |
| biome ci | pass |
| actionlint | pass |

Note on runtime placement

The model is applied by the orchestrator prompt (mirroring the tested TS module) rather than invoked as a dist bundle at agent runtime, because dist/ is gitignored and not present in the agent's working tree. If invoking the compiled scorer at runtime is preferred, that is a larger pipeline change (staging the bundle plus a permission) and can be done separately.

Score each verified finding 0-100 from the verifier verdict, evidence
strength, context completeness, drafter/verifier severity concordance,
and scope. Bands (strong/moderate/weak/negligible) with a default
posting threshold of 55 gate inline comments; security and high-severity
CONFIRMED/LIKELY findings are always posted, weak-band findings go to a
visible lower-confidence summary instead of being dropped, and a
medium-severity floor keeps a still-believed finding visible.

The model is implemented and unit-tested in src/score-confidence
(single source of truth) and mirrored in the orchestrator prompt as a
strict lookup table. The verifier now emits evidence_strength and
context_completeness.
Comment thread src/score-confidence/index.ts Fixed
CodeQL js/insecure-temporary-file: the CLI defaulted its output to a
hardcoded /tmp path. Default to stdout instead (composable, no fixed
temp file) and write to a file only when the caller passes an explicit
output path.
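
The output behavior described in this commit could look roughly like the sketch below. It is not the CLI in src/score-confidence/index.ts: the function and interface names are hypothetical, and the sinks are injected so the destination logic can be shown without depending on Node APIs.

```typescript
// Sketch only: names are hypothetical; real code would pass process.stdout.write
// and fs.writeFileSync (or equivalents) as the two sinks.
interface OutputSinks {
  stdout: (text: string) => void;
  writeFile: (path: string, text: string) => void;
}

function emitResult(json: string, sinks: OutputSinks, outputPath?: string): "stdout" | "file" {
  if (outputPath === undefined || outputPath === "") {
    // Default: write to stdout so there is no fixed temp file and output can be piped.
    sinks.stdout(json + "\n");
    return "stdout";
  }
  // A file is written only when the caller names the path explicitly.
  sinks.writeFile(outputPath, json + "\n");
  return "file";
}
```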
@Sayt-0 Sayt-0 enabled auto-merge (squash) June 24, 2026 21:10
@Sayt-0 Sayt-0 requested a review from derekmisler June 24, 2026 21:10

@docker-agent docker-agent left a comment


Assessment: 🟢 APPROVE

The confidence scoring model looks correct. The lookup table arithmetic, band boundaries, clamping, posting policy precedence, enum validation, and CLI wiring all check out. The TypeScript module and the YAML mirror are consistent. No bugs were found in the code introduced by this PR.
