docker · Sayt-0 · Jun 24, 2026 · Jun 24, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -43,6 +43,11 @@ Anything else here (workflows under `.github/workflows/`, scripts, tests) exists
 │   │   ├── index.ts                 # CLI entry → bundled to dist/filter-diff.js
 │   │   ├── filter-diff.ts           # Core filterDiff() pure function + applyFilter() I/O wrapper.
 │   │   └── __tests__/
+│   ├── score-confidence/            # Per-finding confidence scoring for the PR review pipeline.
+│   │   ├── index.ts                 # CLI entry → bundled to dist/score-confidence.js
+│   │   ├── score-confidence.ts      # Core scoreFinding()/scoreFindings() pure functions + posting policy.
+│   │   │                            #   Source of truth for the model mirrored in pr-review.yaml.
+│   │   └── __tests__/
 │   ├── score-risk/                  # Per-file risk scoring for the PR review pipeline.
 │   │   ├── index.ts                 # CLI entry → bundled to dist/score-risk.js
 │   │   ├── score-risk.ts            # Core scoreFiles() pure function.
@@ -167,6 +172,7 @@ The action runs untrusted input (PR titles, bodies, comments, diffs) through an
   - `pull_request` action `review_requested` when `github.event.requested_reviewer.login == 'docker-agent'`
   - `@docker-agent` mentions on PR/issue comments — these run the `.github/actions/mention-reply` handler (sets `should-reply` and builds the context prompt) and then the `review-pr/mention-reply` sub-action (referenced from a pinned SHA, not present as a local path on every commit). The `pr-review-mention-reply.yaml` agent handles the actual reply.
 - Diffs over 1500 lines are **chunked at file boundaries** in `review-pr/action.yml` (see "Split diff into chunks"). Per-file **risk scoring** (security paths, line counts, error-handling patterns) prioritizes verifier attention.
+- Per-finding **confidence scoring** assigns each verified finding a precise 0–100 score (band: strong/moderate/weak/negligible) from the verifier's `verdict`, `evidence_strength`, and `context_completeness`, plus drafter↔verifier severity concordance and scope. `src/score-confidence/score-confidence.ts` is the **single source of truth** for the model (weights, bands, threshold, posting policy); the "Confidence Scoring" section of `review-pr/agents/pr-review.yaml` mirrors it as a strict lookup table so the orchestrator can apply it inline (the gitignored `dist/` is not available at agent runtime). Change one, change both — the unit tests pin every value. Security and high-severity CONFIRMED/LIKELY findings are always posted regardless of score; weak-band findings are surfaced in a summary rather than silently dropped.
 - Stale review threads on lines no longer in the diff are auto-resolved via GraphQL `resolveReviewThread`. Threads with no `<!-- docker-agent-review -->` marker are never touched.
 
 ### Workflows (`.github/workflows/`)

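Reviewer note: the AGENTS.md entry above describes the scoring model only in prose, so here is a minimal sketch of what a `scoreFinding()`-style pure function could look like. The weights, band cutoffs, posting threshold, the `UNLIKELY`/`DISMISSED` verdict names, the severity levels, and the field names `drafterSeverity`, `verifierSeverity`, and `isSecurity` are all hypothetical placeholders; the real values live in `src/score-confidence/score-confidence.ts` and are not shown in this diff. The scope signal is omitted because the diff does not say how it is weighed.

```typescript
// Hypothetical sketch only: none of the numbers below come from the PR.
type Verdict = "CONFIRMED" | "LIKELY" | "UNLIKELY" | "DISMISSED"; // last two are assumed names
type EvidenceStrength = "direct" | "circumstantial" | "speculative";
type ContextCompleteness = "full" | "partial" | "none";
type Severity = "high" | "medium" | "low"; // assumed levels
type Band = "strong" | "moderate" | "weak" | "negligible";

interface VerifiedFinding {
  verdict: Verdict;
  evidence_strength: EvidenceStrength;
  context_completeness: ContextCompleteness;
  drafterSeverity: Severity;
  verifierSeverity: Severity;
  isSecurity: boolean;
}

// Assumed weights, chosen so the maximum total is exactly 100.
const VERDICT_POINTS: Record<Verdict, number> = { CONFIRMED: 50, LIKELY: 35, UNLIKELY: 10, DISMISSED: 0 };
const EVIDENCE_POINTS: Record<EvidenceStrength, number> = { direct: 25, circumstantial: 15, speculative: 5 };
const CONTEXT_POINTS: Record<ContextCompleteness, number> = { full: 15, partial: 8, none: 0 };
const CONCORDANCE_POINTS = 10; // awarded when drafter and verifier agree on severity
const POST_THRESHOLD = 60; // assumed cutoff for posting an inline comment

function bandFor(score: number): Band {
  if (score >= 80) return "strong";
  if (score >= 60) return "moderate";
  if (score >= 30) return "weak";
  return "negligible";
}

function scoreFinding(f: VerifiedFinding): { score: number; band: Band; post: boolean } {
  const raw =
    VERDICT_POINTS[f.verdict] +
    EVIDENCE_POINTS[f.evidence_strength] +
    CONTEXT_POINTS[f.context_completeness] +
    (f.drafterSeverity === f.verifierSeverity ? CONCORDANCE_POINTS : 0);
  const score = Math.max(0, Math.min(100, raw));

  // Posting floor: security and high-severity CONFIRMED/LIKELY findings
  // are posted regardless of score.
  const actionable = f.verdict === "CONFIRMED" || f.verdict === "LIKELY";
  const floor = actionable && (f.isSecurity || f.verifierSeverity === "high");

  return { score, band: bandFor(score), post: floor || score >= POST_THRESHOLD };
}
```

Keeping the function pure (no I/O, no clock, no randomness) is what would let unit tests pin every value and let the lookup table in `pr-review.yaml` be checked against it by hand.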
diff --git a/review-pr/README.md b/review-pr/README.md
@@ -284,6 +284,8 @@ but the error check happens after this line accesses `user.ID`.
 
 Consider moving the nil check before accessing user properties.
 
+confidence: strong (92/100)
+
 <!-- docker-agent-review -->
 ```
 
@@ -298,9 +300,19 @@ When no issues are found:
 ### Review Pipeline
 
 ```
-AGENTS.md + PR Diff → Drafter (hypotheses) → Verifier (confirm) → Post Comments
+AGENTS.md + PR Diff → Drafter (hypotheses) → Verifier (confirm + evidence signals)
+                    → Confidence score (0–100) → Post Comments
 ```
 
+Each verified finding gets a precise **confidence score** (0–100) and a band
+(strong / moderate / weak / negligible), computed deterministically from the
+verifier's verdict, evidence strength, and context completeness, plus the
+drafter↔verifier severity agreement. High-confidence findings are posted as
+inline comments (labelled with their confidence); lower-confidence findings are
+listed separately rather than dropped. Security and high-severity findings with a
+CONFIRMED or LIKELY verdict are always posted regardless of score. The model is implemented and unit-tested in
+[`src/score-confidence/`](../src/score-confidence/score-confidence.ts).
+
 ### Learning System
 
 When you reply to a review comment:

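Reviewer note: the README change shows the label format (`confidence: strong (92/100)`) and says lower-confidence findings are listed separately. A minimal sketch of that rendering step might look like the following. The `ScoredFinding` shape and the function names `confidenceLabel`, `renderInlineComment`, and `partitionFindings` are assumptions for illustration, not the PR's actual API; only the label text and the `<!-- docker-agent-review -->` marker are taken from the diff.

```typescript
// Hypothetical sketch: type and function names are illustrative assumptions.
type Band = "strong" | "moderate" | "weak" | "negligible";

interface ScoredFinding {
  body: string;   // the review comment text
  score: number;  // integer 0-100
  band: Band;
  post: boolean;  // result of the posting policy
}

// Matches the label format shown in the README example.
function confidenceLabel(f: Pick<ScoredFinding, "band" | "score">): string {
  return `confidence: ${f.band} (${f.score}/100)`;
}

// Inline comment body: text, confidence label, then the review marker.
function renderInlineComment(f: ScoredFinding): string {
  return `${f.body}\n\n${confidenceLabel(f)}\n\n<!-- docker-agent-review -->`;
}

// Split findings into those posted inline and those listed separately.
function partitionFindings(findings: ScoredFinding[]): { inline: ScoredFinding[]; listed: ScoredFinding[] } {
  const inline: ScoredFinding[] = [];
  const listed: ScoredFinding[] = [];
  for (const f of findings) {
    (f.post ? inline : listed).push(f);
  }
  return { inline, listed };
}
```

Every finding ends up in exactly one of the two lists, which mirrors the stated intent that nothing is silently dropped.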
diff --git a/review-pr/agents/evals/confidence-scoring-1.json b/review-pr/agents/evals/confidence-scoring-1.json
@@ -0,0 +1,29 @@
+{
+  "id": "f0c1e2d3-4a5b-6c7d-8e9f-0a1b2c3d4e5f",
+  "title": "Confidence scoring - per-finding score, band, and security floor (run 1)",
+  "evals": {
+    "setup": "apk add --no-cache github-cli",
+    "relevance": [
+      "The agent ran 'echo $GITHUB_ACTIONS' before performing the review to detect the output mode",
+      "The agent output the review to the console as formatted markdown instead of posting via gh api",
+      "The drafter response is valid JSON containing a 'findings' array, a 'summary' field, and 'review_complete' set to true",
+      "At least one finding flags the missing redirect_uri validation as a security concern",
+      "The verifier returned a JSON response with a 'verdicts' array containing one verdict per finding, and each verdict includes an 'evidence_strength' value (direct, circumstantial, or speculative) and a 'context_completeness' value (full, partial, or none)",
+      "Each finding posted in the console output is labelled with a confidence band (strong, moderate, weak, or negligible) and a numeric score out of 100",
+      "The security finding about redirect_uri validation is surfaced in the review regardless of its confidence score (security findings are never auto-suppressed)",
+      "The review assessment label is '🔴 CRITICAL' or '🟡 NEEDS ATTENTION' because there is at least one confirmed or likely security/high-severity finding"
+    ]
+  },
+  "messages": [
+    {
+      "message": {
+        "agentName": "",
+        "message": {
+          "role": "user",
+          "content": "Review the following PR.\n\n## PR Information\n- **Title**: Add optional redirect URI to OAuth authorization flow\n- **Author**: jeanlaurent\n- **Branch**: custom-redirect-url → main\n- **Files Changed**: 6\n\n## PR Description\nAdds an optional redirect_uri field to GetAuthorizationURLRequest so callers can override the default OAuth callback URL. This allows apps to use custom URI schemes (e.g., myapp://auth/callback) for the OIDC login flow.\n\n### Changes\n- proto: Added optional redirect_uri field to GetAuthorizationURLRequest\n- auth/oidc: AuthorizationURL() accepts a redirectURI parameter, falls back to configured default when empty\n- auth/service: Reads redirect_uri from the request and passes it through\n- generated code: Regenerated Go and TypeScript protobuf files\n\n## Diff\n\nNote: Generated protobuf files (auth.pb.go, auth_pb.ts) are omitted — only hand-written code is shown.\n\n```diff\ndiff --git a/api/auth/v1/auth.proto b/api/auth/v1/auth.proto\nindex df6bf369..54dfc78a 100644\n--- a/api/auth/v1/auth.proto\n+++ b/api/auth/v1/auth.proto\n@@ -25,6 +25,11 @@ message GetAuthorizationURLRequest {\n   // Optional state parameter for CSRF protection.\n   // If not provided, the server will generate one.\n   optional string state = 1;\n+\n+  // Optional redirect URI for the OAuth callback.\n+  // If not provided, the server will use the configured default redirect URI.\n+  // This allows mobile apps to use custom URI schemes (e.g., myapp://auth/callback).\n+  optional string redirect_uri = 2;\n }\n \n // GetAuthorizationURLResponse is the response message containing the authorization URL.\n@@ -53,6 +58,10 @@ message GetLogoutURLResponse {\n message ExchangeTokenRequest {\n   // The authorization code received from the OIDC provider.\n   string code = 1;\n+\n+  // Optional redirect URI that was used in the authorization request.\n+  // Must match the redirect_uri used in GetAuthorizationURL for the OAuth flow to succeed.\n+  optional 
string redirect_uri = 2;\n }\n \ndiff --git a/backend/internal/platformd/auth/oidc.go b/backend/internal/platformd/auth/oidc.go\nindex 0e14ad7e..c4c96499 100644\n--- a/backend/internal/platformd/auth/oidc.go\n+++ b/backend/internal/platformd/auth/oidc.go\n@@ -65,9 +65,14 @@ func NewOIDCClient(ctx context.Context, cfg *Config) (*OIDCClient, error) {\n }\n \n // AuthorizationURL builds the authorization URL for the OIDC login flow.\n-func (c *OIDCClient) AuthorizationURL(state string) string {\n+// If redirectURI is provided, it will be used instead of the configured default.\n+func (c *OIDCClient) AuthorizationURL(state string, redirectURI string) string {\n \tcfg := c.oauth2Config\n-\tcfg.RedirectURL = c.redirectURI\n+\tif redirectURI != \"\" {\n+\t\tcfg.RedirectURL = redirectURI\n+\t} else {\n+\t\tcfg.RedirectURL = c.redirectURI\n+\t}\n \treturn cfg.AuthCodeURL(state)\n }\n \n@@ -92,10 +97,16 @@ type TokenResponse struct {\n }\n \n // ExchangeCode exchanges an authorization code for tokens.\n-func (c *OIDCClient) ExchangeCode(ctx context.Context, code string) (*TokenResponse, error) {\n+// If redirectURI is provided, it will be used instead of the configured default.\n+// The redirect URI must match the one used in the authorization request.\n+func (c *OIDCClient) ExchangeCode(ctx context.Context, code string, redirectURI string) (*TokenResponse, error) {\n \t// Set the redirect URI for this specific exchange\n \tcfg := c.oauth2Config\n-\tcfg.RedirectURL = c.redirectURI\n+\tif redirectURI != \"\" {\n+\t\tcfg.RedirectURL = redirectURI\n+\t} else {\n+\t\tcfg.RedirectURL = c.redirectURI\n+\t}\n \n \ttoken, err := cfg.Exchange(ctx, code)\n \tif err != nil {\n\ndiff --git a/backend/internal/platformd/auth/service.go b/backend/internal/platformd/auth/service.go\nindex c2a95279..e3e355c9 100644\n--- a/backend/internal/platformd/auth/service.go\n+++ b/backend/internal/platformd/auth/service.go\n@@ -82,8 +82,11 @@ func (s *Service) GetAuthorizationURL(\n \t\t}\n \t}\n 
\n-\t// Build the authorization URL using the configured redirect URI\n-\tauthURL := s.oidcClient.AuthorizationURL(state)\n+\t// Get redirect URI from request, or use configured default\n+\tredirectURI := msg.GetRedirectUri()\n+\n+\t// Build the authorization URL\n+\tauthURL := s.oidcClient.AuthorizationURL(state, redirectURI)\n \n \treturn connect.NewResponse(&authv1.GetAuthorizationURLResponse{\n \t\tAuthorizationUrl: authURL,\n@@ -138,8 +141,11 @@ func (s *Service) ExchangeToken(\n \t\treturn nil, connect.NewError(connect.CodeInvalidArgument, ErrCodeRequired)\n \t}\n \n-\t// Exchange the code for Docker tokens using the configured redirect URI\n-\ttokenResp, err := s.oidcClient.ExchangeCode(ctx, code)\n+\t// Get redirect URI from request, or use configured default\n+\tredirectURI := msg.GetRedirectUri()\n+\n+\t// Exchange the code for Docker tokens\n+\ttokenResp, err := s.oidcClient.ExchangeCode(ctx, code, redirectURI)\n \tif err != nil {\n \t\tif errors.Is(err, ErrTokenExchange) {\n \t\t\treturn nil, connect.NewError(connect.CodeInvalidArgument, err)\n```",
+          "created_at": "2026-02-18T14:00:00-05:00"
+        }
+      }
+    }
+  ]
+}
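
Reviewer note: the eval's relevance criteria require one verdict per finding, each carrying an `evidence_strength` (direct, circumstantial, or speculative) and a `context_completeness` (full, partial, or none). A shape check along the following lines could catch a malformed verifier response before scoring. This is a sketch under assumptions: `checkVerifierResponse` is a hypothetical helper, and the real verifier payload may carry additional fields that are not shown in this diff.

```typescript
// Hypothetical helper: validates only the fields named in the eval criteria.
const EVIDENCE_VALUES = new Set<unknown>(["direct", "circumstantial", "speculative"]);
const CONTEXT_VALUES = new Set<unknown>(["full", "partial", "none"]);

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of human-readable problems; an empty list means the shape is acceptable.
function checkVerifierResponse(response: unknown, findingCount: number): string[] {
  const verdicts: unknown = isRecord(response) ? response["verdicts"] : undefined;
  if (!Array.isArray(verdicts)) {
    return ["response has no 'verdicts' array"];
  }

  const problems: string[] = [];
  if (verdicts.length !== findingCount) {
    problems.push(`expected ${findingCount} verdicts, got ${verdicts.length}`);
  }

  verdicts.forEach((v: unknown, i: number) => {
    if (!isRecord(v)) {
      problems.push(`verdicts[${i}] is not an object`);
      return;
    }
    if (!EVIDENCE_VALUES.has(v["evidence_strength"])) {
      problems.push(`verdicts[${i}].evidence_strength is invalid`);
    }
    if (!CONTEXT_VALUES.has(v["context_completeness"])) {
      problems.push(`verdicts[${i}].context_completeness is invalid`);
    }
  });
  return problems;
}
```

Returning a list of problems rather than throwing on the first one makes it easier to report every malformed verdict in a single pass.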