MCP hardening, rerank god-class dampening, config/IaC labeling + roadmap sync #9

Merged: denfry merged 3 commits into main from chore/mcp-hardening-and-rerank on Jun 14, 2026

denfry (Owner) commented Jun 14, 2026

Roadmap follow-through across three chunks (Distribution intentionally out of scope — needs maintainer accounts/creds). No commit touches .skill_version stamps.

A — Roadmap sync + docs

  • docs/ROADMAP.md: M10 (MCP bridge) marked shipped (was planned); reconciled the technical (M0–M10) vs product (M0–M13) milestone numbering instead of claiming one is canonical.
  • "Trust model in 60 seconds" callout, identical in README.md and docs/SECURITY.md (closes PRODUCT_UPGRADE_PLAN §7).
  • docs/ARCHITECTURE.md §8: the schema_version claim is now true (was asserted but unimplemented).
  • PRODUCT_UPGRADE_PLAN.md + CHANGELOG.md updated to mark only what actually shipped.

B — MCP hardening

  • Every tool payload (success and the no-index/error path) wrapped in a stable envelope {schema_version: 1, tool: <name>}.
  • Golden snapshots for all 7 tools (tests/golden/mcp_*.json + tests/test_mcp_golden.py); schema_version/tool asserted explicitly so a golden can't freeze a wrong contract version.
  • Fix: the MCP server failed to import on mcp>=1.27 + pydantic>=2.10 (FastMCP auto-built a structured-output schema from the -> str return annotation and raised at import time). Tools now register as unstructured (structured_output=False where supported; older mcp detected and unaffected) — same text-content wire contract.
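The envelope described in chunk B can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `envelope` helper, the `search_tool` function, and the choice to merge the envelope fields into the top level of the payload (rather than nesting the payload under a key) are all assumptions.

```python
import json
from typing import Any

SCHEMA_VERSION = 1  # contract version named in the PR description


def envelope(tool: str, payload: dict[str, Any]) -> str:
    """Serialize a tool result with the envelope fields attached (hypothetical helper)."""
    # Envelope keys are applied last so a payload cannot override them.
    body = {**payload, "schema_version": SCHEMA_VERSION, "tool": tool}
    return json.dumps(body, sort_keys=True)


def search_tool(query: str, index_ready: bool) -> str:
    """Hypothetical tool: the success path and the no-index path share one envelope."""
    if not index_ready:
        return envelope("search", {"error": "no index found"})
    return envelope("search", {"query": query, "results": []})
```

Asserting `schema_version` and `tool` separately from the golden file, as the PR does, means a regenerated snapshot cannot silently accept a changed contract version.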

C — Large features (behind the benchmark gate)

  • Reranker: dampened the god-class in_degree tiebreak. The linear bonus (which saturated at in_degree 10) is replaced by a logarithmic curve with a lower cap. Validated as no-regression on the public benchmark (Recall@k / MRR / nDCG unchanged), plus a targeted regression test that fails under the old linear rule. Real-repo gain is still tracked under M12.5.
  • Config/IaC labeling: Dockerfile/Containerfile, Terraform (.tf/.tfvars), HCL, INI (.ini/.cfg/.conf/.properties), Makefiles now get a real language label (Tier-C, FTS-only). Already indexed as unknown text; labeling surfaces them in stats and lets agents scope to config.
  • Typed framework edges (M13): design spec docs/superpowers/specs/2026-06-14-typed-framework-edges-design.md — the documented-first deliverable the repo requires before this risky graph work can land (schema, confidence/provenance, resolver architecture, benchmark gate, phasing). Implementation not started.
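To make the reranker change concrete, here is a small sketch of a linear bonus that saturates at in_degree 10 next to a logarithmic bonus with a lower cap. The caps, the log scale, and the function names are hypothetical; the PR does not state the real constants.

```python
import math

OLD_CAP = 0.10    # hypothetical cap for the old linear rule
NEW_CAP = 0.05    # hypothetical lower cap for the dampened rule
LOG_SCALE = 1000  # hypothetical in_degree at which the log curve reaches its cap


def linear_bonus(in_degree: int) -> float:
    """Old-style rule: grows linearly and saturates at in_degree 10."""
    return OLD_CAP * min(in_degree, 10) / 10


def log_bonus(in_degree: int) -> float:
    """Dampened rule: logarithmic growth toward a lower cap."""
    scaled = math.log1p(max(in_degree, 0)) / math.log1p(LOG_SCALE)
    return NEW_CAP * min(scaled, 1.0)


def final_score(base: float, in_degree: int, bonus=log_bonus) -> float:
    """Base relevance plus the centrality tiebreak."""
    return base + bonus(in_degree)
```

Under these assumed constants, a relevant match with base score 0.80 and in_degree 2 loses to a god class with base score 0.75 and in_degree 100 under `linear_bonus`, but wins under `log_bonus`. That is the shape of behaviour a targeted regression test of this kind would pin down.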

Verification

  • 360 passed, 6 skipped, 1 failed — the single failure is the pre-existing libtorchcodec/FFmpeg environment issue (fails identically on main, confirmed via git stash).
  • Coverage gate 82.48% ≥ 80%; ruff clean.
  • Commits ordered C → B → A so each stays green/bisectable (the MCP search golden depends on the rerank change).

🤖 Generated with Claude Code

denfry and others added 3 commits June 14, 2026 12:37
Roadmap chunk C (large features, landed behind the benchmark gate).

- rerank: replace the linear in_degree bonus (saturated by in_degree=10, gave
  100-caller "god classes" the full bonus) with a logarithmic curve + lower cap,
  so centrality stays a tiebreak instead of floating god classes above relevant
  low-degree matches. Validated as no-regression on the public benchmark
  (Recall@k/MRR/nDCG unchanged) plus a targeted regression test.
- discovery/classify: label Dockerfile/Containerfile, Terraform (.tf/.tfvars),
  HCL, INI (.ini/.cfg/.conf/.properties) and Makefiles (Tier-C, FTS-only). These
  were already FTS-indexed as unknown text; labeling surfaces them in stats and
  lets agents scope searches to config.
- docs: typed-framework-edges design spec (M13 documented-first deliverable);
  LANGUAGES.md Tier-C row updated.
- regenerate tests/golden/search_token.json (one score shifted; order unchanged).
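The config/IaC labeling in this commit amounts to a lookup by filename and extension. A minimal sketch follows; only the `".tf": "terraform"` entry is visible in the reviewed diff, so every other label string, both table names, and the `classify` signature are assumptions.

```python
from pathlib import PurePosixPath

# Extension -> label. Only ".tf" -> "terraform" appears in the diff excerpt;
# the remaining label strings are illustrative guesses.
EXTENSION_LABELS = {
    ".tf": "terraform",
    ".tfvars": "terraform",
    ".hcl": "hcl",
    ".ini": "ini",
    ".cfg": "ini",
    ".conf": "ini",
    ".properties": "ini",
}

# Files recognized by exact name rather than by extension (hypothetical labels).
FILENAME_LABELS = {
    "Dockerfile": "dockerfile",
    "Containerfile": "dockerfile",
    "Makefile": "makefile",
}


def classify(path: str) -> str:
    """Return a language label for a path, or "unknown" if nothing matches."""
    p = PurePosixPath(path)
    if p.name in FILENAME_LABELS:
        return FILENAME_LABELS[p.name]
    return EXTENSION_LABELS.get(p.suffix.lower(), "unknown")
```

Because these are Tier-C labels, the label only changes how a file is reported and filtered; chunking and full-text indexing stay the same as for unknown text.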

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… on mcp>=1.27

Roadmap chunk B (MCP hardening).

- wrap every tool payload (success and error) in {schema_version: 1, tool: <name>};
  closes the docs/MCP.md follow-ups and makes the ARCHITECTURE.md schema_version
  claim true.
- golden snapshots for all 7 tools (tests/golden/mcp_*.json + test_mcp_golden.py);
  schema_version/tool asserted explicitly so a golden can't freeze a wrong version.
- fix: MCP server failed to import on mcp>=1.27 + pydantic>=2.10 (FastMCP built a
  structured-output schema from the `-> str` return annotation and raised). Register
  tools as unstructured (structured_output=False where supported; older mcp detected).
- golden_utils: mask package_version so the healthcheck golden survives version bumps.
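One way to pass `structured_output=False` only where it is supported is to inspect the decorator factory's signature before calling it. The sketch below shows that idea with stand-in functions; `supports_kwarg`, `unstructured_kwargs`, and both stubs are hypothetical, and the repository may detect older mcp versions differently.

```python
import inspect
from typing import Any, Callable

_KEYWORD_KINDS = (
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.KEYWORD_ONLY,
)


def supports_kwarg(func: Callable[..., Any], name: str) -> bool:
    """Return True if func can be called with the keyword argument `name`."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False  # signature not introspectable; assume unsupported
    param = params.get(name)
    if param is not None and param.kind in _KEYWORD_KINDS:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def unstructured_kwargs(tool_factory: Callable[..., Any]) -> dict[str, Any]:
    """Extra kwargs for tool registration, empty when the flag is unknown."""
    if supports_kwarg(tool_factory, "structured_output"):
        return {"structured_output": False}
    return {}


# Stand-ins for a newer and an older decorator factory (not the real SDK).
def new_style_tool(name=None, structured_output=None): ...
def old_style_tool(name=None): ...
```

With the real SDK, the factory passed in would presumably be the `tool` method of the FastMCP instance, and the returned dict would be splatted into the registration call.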

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Roadmap chunk A (roadmap sync + docs).

- docs/ROADMAP.md: M10 MCP bridge marked shipped (was "planned"); reconcile the
  technical vs product milestone numbering instead of claiming one is canonical.
- "Trust model in 60 seconds" callout, identical in README.md and docs/SECURITY.md.
- PRODUCT_UPGRADE_PLAN.md: mark shipped items (schema_version, golden snapshots,
  in_degree dampening, config/IaC labeling, trust-model doc).
- CHANGELOG.md: Unreleased entries for chunks A/B/C and the MCP import fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
denfry merged commit 479d1e0 into main on Jun 14, 2026 (10 checks passed)
denfry deleted the chore/mcp-hardening-and-rerank branch on June 14, 2026 09:41

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44e03d4a98

# Config / IaC (Tier C: line-chunk + FTS, no tree-sitter spec). These were already
# indexed as unknown-language text; labeling them surfaces infra files in `stats`
# and lets agents scope searches to config without a tree-sitter grammar.
".tf": "terraform",

P2: Include line-parser labels in stats

Adding these Tier-C labels does not make them visible in codebase-index stats/MCP index_stats: stats_payload() builds its per-language list only from repo.treesitter_coverage(), whose SQL filters WHERE f.parser = 'treesitter', while parser_for('terraform') and the other new config labels return line. In a repo with .tf/Dockerfile/INI files, the files are counted only in the aggregate and the new languages are omitted from the advertised per-language stats, so the shipped config/IaC labeling is incomplete unless stats also aggregates line-parser languages.
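The gap described above can be reproduced with a toy schema. In this sketch the `files` table, its columns, and both function names are hypothetical; only the `parser = 'treesitter'` filter and the `line` parser value come from the review comment.

```python
import sqlite3


def demo_db() -> sqlite3.Connection:
    """Build an in-memory index with one tree-sitter file and three line-parsed files."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT, language TEXT, parser TEXT)")
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [
            ("app.py", "python", "treesitter"),
            ("main.tf", "terraform", "line"),
            ("vars.tfvars", "terraform", "line"),
            ("Dockerfile", "dockerfile", "line"),
        ],
    )
    return conn


def treesitter_only(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Mirrors the filtered query: line-parsed languages never appear."""
    rows = conn.execute(
        "SELECT f.language, COUNT(*) FROM files AS f "
        "WHERE f.parser = 'treesitter' "
        "GROUP BY f.language ORDER BY f.language"
    )
    return list(rows)


def all_parsers(conn: sqlite3.Connection) -> list[tuple[str, str, int]]:
    """One possible fix: group by language and parser without the filter."""
    rows = conn.execute(
        "SELECT f.language, f.parser, COUNT(*) FROM files AS f "
        "GROUP BY f.language, f.parser ORDER BY f.language, f.parser"
    )
    return list(rows)
```

Keeping `parser` in the grouped output lets a stats consumer still distinguish tree-sitter coverage from FTS-only languages.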
