fix(retrieval): repair RRF fusion, ranking, FTS recall, and freshness#11

Merged
denfry merged 2 commits into main from fix/retrieval-ranking-fusion-schema-v2
Jun 14, 2026
Conversation

@denfry denfry commented Jun 14, 2026

Owner

Summary

Ten correctness/ranking fixes across the hybrid retrieval pipeline, found in a deep read of retrieval/, graph/, storage/, and indexer/freshness. Plus a schema bump (v1→v2) so symbol names are FTS-indexed.

Ranking & fusion

  • RRF rescaled by k — fused scores were ~w/k (≈0.017), an order of magnitude below the reranker's bounded bonuses, so rerank was silently the primary ranker and RRF a tiebreak. Scaling by k is a pure monotonic transform (order unchanged) that puts fused scores and bonuses on the same O(1) scale.
  • Coarse (path, line-bucket) fusion key — different retrievers report different line ranges for the same place, so the old exact (path, start, end) key almost never coincided and cross-source agreement never fired. agreeing_sources now counted at file granularity.
  • Scale-invariant confidence (relative gap, not absolute thresholds).
  • Per-file diversification — ≤3 hits/file on a page; overflow pushed to the tail, nothing dropped. With bucketing this removes the "same small file returned six times at different line slivers" noise (visible in the regenerated search_token.json golden).
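The first two bullets can be sketched roughly as follows. This is a minimal illustration, not the project's code: `fuse`, `fusion_key`, and the input shape are hypothetical names, `RRF_K = 60` is inferred from the ≈0.017 figure above (1/60), and the 40-line bucket width is taken from the review comment on this PR.

```python
from __future__ import annotations

from collections import defaultdict

RRF_K = 60         # assumed: 1/60 matches the ~0.017 unscaled score quoted above
BUCKET_LINES = 40  # assumed bucket width (mentioned in the review comment)


def fusion_key(path: str, start_line: int) -> tuple[str, int]:
    """Coarse key: hits starting in the same line bucket of a file merge."""
    return (path, start_line // BUCKET_LINES)


def fuse(
    rankings: dict[str, list[tuple[str, int]]],
    weights: dict[str, float] | None = None,
) -> list[tuple[tuple[str, int], float, int]]:
    """Weighted reciprocal-rank fusion, scaled by k.

    `rankings` maps a source name to its ranked (path, start_line) hits.
    Returns (key, score, agreeing_sources) tuples, best score first.
    """
    weights = weights or {}
    scores: dict[tuple[str, int], float] = defaultdict(float)
    sources_by_file: dict[str, set[str]] = defaultdict(set)
    for source, hits in rankings.items():
        w = weights.get(source, 1.0)
        seen: set[tuple[str, int]] = set()
        for rank, (path, start) in enumerate(hits, start=1):
            key = fusion_key(path, start)
            if key in seen:
                continue  # only a source's best rank per bucket contributes
            seen.add(key)
            # Plain RRF would add w / (k + rank); multiplying by k keeps the
            # ordering but moves a rank-1 hit from ~w/k up to ~w.
            scores[key] += RRF_K * w / (RRF_K + rank)
            sources_by_file[path].add(source)  # agreement counted per file
    fused = [(key, s, len(sources_by_file[key[0]])) for key, s in scores.items()]
    fused.sort(key=lambda item: item[1], reverse=True)
    return fused
```

With two sources that both rank a hit in the first bucket of the same file at position 1, that bucket scores 2 × 60/61 and reports two agreeing sources, even though the reported start lines differ.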

Recall & indexing

  • FTS stopword filtering: how/does/the/… are dropped before the MATCH, so NL queries no longer AND-in filler that code chunks never contain.
  • Symbol names FTS-indexed — denormalized chunks.symbol_names mirrored verbatim by the sync triggers (external-content-safe; a live join could corrupt the index after a symbol cascade). Bumps SCHEMA_VERSION 1→2 — older indexes stay readable; index/update detect the mismatch and rebuild.
  • Damped centrality fallback — symbols with a non-unique name never get a resolved in_degree; they now get a half-capped bonus from a name-reference count. Precise in_degree still wins.
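As a rough illustration of the stopword bullet, the sketch below builds an FTS5 MATCH expression after dropping filler words. The function name `build_match` and the stopword list are assumptions (the PR only names how/does/the), and the fallback for all-filler queries is a design choice of this sketch, not something the PR states.

```python
import re

# Illustrative list only; the PR names how/does/the and elides the rest.
STOPWORDS = frozenset({"how", "does", "the", "a", "an", "is", "in", "of", "to", "what"})


def build_match(query: str) -> str:
    """Build an FTS5 MATCH expression from a natural-language query."""
    tokens = re.findall(r"[A-Za-z0-9_]+", query.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    if not kept:
        kept = tokens  # all filler: keep the words rather than match nothing
    # Double-quote each token so FTS5 reads it as a string, not query syntax.
    return " AND ".join(f'"{t}"' for t in kept)


print(build_match("how does auth work"))  # "auth" AND "work"
```

Without the filter, the same query would require every chunk to contain "how" and "does" as well, which code rarely does.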

Correctness

  • Word-boundary test demotion: contest/, latest.py, testimonials.tsx are no longer treated as tests.
  • Language-aware import resolution: import './base' from a .ts file resolves to base.ts, not a same-named base.py.
  • Content-aware freshness — a bare touch (mtime only, identical bytes) is a no-op for update, so it no longer reports the index stale.
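One way to get the word-boundary behaviour from the first bullet is a regex that only accepts "test" or "tests" as a whole path token. The pattern and the name `looks_like_test` are hypothetical; the actual reranker may use different separators or rules.

```python
import re

# "test"/"tests" must sit between path separators, dots, underscores,
# hyphens, or the ends of the string, so substrings such as "contest"
# or "latest" do not count.
_TEST_TOKEN = re.compile(r"(?:^|[/\\._-])tests?(?=$|[/\\._-])", re.IGNORECASE)


def looks_like_test(path: str) -> bool:
    """Return True when a path contains 'test' or 'tests' as a whole token."""
    return _TEST_TOKEN.search(path) is not None


for p in ("contest/", "latest.py", "testimonials.tsx", "tests/test_auth.py"):
    print(p, looks_like_test(p))
```

The first three paths print False and the last prints True, which matches the false positives listed above.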

Cleanup

  • Removed the dead legacy lexical path in searchers.py (fts_response/fts_search/second Candidate/_confidence/_fallbacks/_trim).

Migration

Schema v1→v2 (added column). Rebuild is automatic on the next index/update; no user action needed. Recorded in CHANGELOG.md (Unreleased).
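A mismatch check of this kind might look like the sketch below. It assumes the version lives in SQLite's `PRAGMA user_version`, which the PR does not confirm; `peek_version` and `needs_rebuild` are hypothetical names for illustration.

```python
import sqlite3
from pathlib import Path

SCHEMA_VERSION = 2  # current version after this PR


def peek_version(path) -> int:
    """Read the stored schema version without modifying the database.

    Assumption: the version is kept in PRAGMA user_version.
    """
    uri = Path(path).resolve().as_uri() + "?mode=ro"  # read-only open
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute("PRAGMA user_version").fetchone()[0]
    finally:
        conn.close()


def needs_rebuild(path) -> bool:
    """True when the on-disk index was written by a different schema version."""
    return peek_version(path) != SCHEMA_VERSION
```

Opening read-only keeps the peek from creating or upgrading anything, so an older index stays readable until index/update decides to rebuild it.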

Testing

  • 373 passed, 6 skipped. Tests added/updated for every change; goldens regenerated (search_token.json, mcp_search_code.json only).
  • ruff + mypy clean on all changed modules.
  • Note: test_real_local_embed_shape fails locally due to a torch/torchcodec native-DLL issue unrelated to this change (no edited module is involved); excluded from the tally above.

🤖 Generated with Claude Code

Ten correctness/ranking fixes across the hybrid retrieval pipeline, plus a
schema bump for symbol-name FTS indexing.

- fusion: scale RRF by k so fused scores are O(1) (rerank bonuses were ~10x
  the entire fused signal, making rerank the de-facto ranker); merge on a
  coarse (path, line-bucket) key so cross-source hits actually reinforce, and
  count agreeing_sources at file granularity.
- pipeline: scale-invariant relative-gap confidence; per-file diversification
  (<=3 hits/file on a page, overflow pushed to the tail, nothing dropped).
- searchers: drop stopwords before building the FTS MATCH (NL queries like
  "how does auth work" no longer AND-in filler and zero out recall); remove the
  dead legacy lexical path (fts_response/fts_search/second Candidate/etc.).
- schema/storage: denormalized chunks.symbol_names mirrored by the FTS triggers
  (external-content-safe), so symbol names are searchable. Bumps SCHEMA_VERSION
  1 -> 2; older indexes stay readable and index/update rebuild on mismatch.
- rerank: word-boundary test demotion (no more contest/latest false positives);
  damped name-reference fallback for symbols with no resolved in_degree.
- graph: language-aware import resolution (prefer the importer's own extension).
- freshness: content-aware (mtime -> size -> sha) so a bare touch isn't stale.
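The mtime -> size -> sha cascade in the last item could be sketched like this. `FileRecord`, `sha256_of`, and `is_stale` are hypothetical names, and the choice of SHA-256 is an assumption, since the commit message only says "sha".

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """What the index remembered about a file at indexing time (assumed shape)."""
    mtime_ns: int
    size: int
    sha256: str


def sha256_of(path) -> str:
    """Hash a file's bytes in 64 KiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()


def is_stale(path, rec: FileRecord) -> bool:
    """Cheapest check first: mtime, then size, then content hash."""
    st = os.stat(path)
    if st.st_mtime_ns == rec.mtime_ns and st.st_size == rec.size:
        return False  # nothing observable changed
    if st.st_size != rec.size:
        return True   # a different size means different bytes
    return sha256_of(path) != rec.sha256  # touched: compare actual content
```

The hash is only computed when the mtime moved but the size did not, so an untouched tree costs one `stat` per file.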

Tests added/updated for every change; goldens regenerated. ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be8417c74d

# schema.sql is applied with IF NOT EXISTS (old tables/triggers persist).


def peek_schema_version(path: Path | str) -> int:

P1: Keep enable_vectors on Database

Adding this top-level helper here closes the Database class before the existing enable_vectors method, so enable_vectors is now nested inside peek_schema_version and Database no longer has that attribute. Any embeddings-enabled build or vector search path that calls db.enable_vectors() (for example indexing with embeddings.enabled = true or query-time vector search) will now raise AttributeError before creating/loading the vector tables.
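The failure mode is easy to reproduce in isolation. The snippet below is a stripped-down reconstruction with stub bodies, not the repository's code; only the names come from the review comment.

```python
class Database:
    def _guard_version(self) -> None:
        pass


# A module-level def at column 0 ends the class body above.
def peek_schema_version(path: str) -> int:
    return 0

    # Still indented four spaces, so Python parses this as a function nested
    # inside peek_schema_version (and unreachable after the return), not as
    # a method of Database.
    def enable_vectors(self) -> None:
        pass


print(hasattr(Database, "enable_vectors"))  # False
```

Nothing here is a syntax error, which is why the problem only surfaces as an AttributeError at call time or as a type-checker complaint.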

Comment on lines +54 to +55
    if (source, key) in seen:
        continue

P2: Preserve distinct symbols in fusion buckets

Because the duplicate check keys only on (source, path, 40-line bucket), symbol-only results for multiple matching definitions in the same small file/window are discarded after the first one, not merely down-ranked. For example, two matching functions in the same 40-line bucket from the symbol retriever collapse to one result, so --mode symbol can lose valid definitions; include a symbol-specific discriminator or limit this dedupe to repeated chunk-style hits.
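One possible shape for the suggested discriminator is sketched below. The `Hit` type, its `symbol` field, and the helper names are hypothetical; the merged change may have resolved this differently or not at all.

```python
from typing import NamedTuple, Optional

BUCKET_LINES = 40  # bucket width mentioned in the review comment


class Hit(NamedTuple):
    source: str
    path: str
    start_line: int
    symbol: Optional[str] = None  # assumed: set only for symbol-retriever hits


def dedupe_key(hit: Hit) -> tuple:
    """(source, path, bucket) plus the symbol name as a discriminator.

    Chunk-style hits carry symbol=None, so they still collapse per bucket.
    """
    return (hit.source, hit.path, hit.start_line // BUCKET_LINES, hit.symbol)


def dedupe(hits: list[Hit]) -> list[Hit]:
    """Keep the first hit per key, preserving the input order."""
    seen = set()
    kept = []
    for hit in hits:
        key = dedupe_key(hit)
        if key in seen:
            continue
        seen.add(key)
        kept.append(hit)
    return kept
```

With this key, two distinct functions defined in the same 40-line window both survive, while repeated chunk slivers from one source in that window still reduce to a single entry.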

The previous commit inserted module-level `peek_schema_version` between
`_guard_version` and `enable_vectors`, which silently nested `enable_vectors`
inside it — so `Database.enable_vectors` no longer existed. Local mypy passed
only on a stale cache and the vector tests were skipped (no sqlite_vec locally),
so CI was first to catch it (`service.py:86: "Database" has no attribute
"enable_vectors"`). Move `peek_schema_version` after the class.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@denfry denfry merged commit 77f867c into main Jun 14, 2026
10 checks passed
@denfry denfry deleted the fix/retrieval-ranking-fusion-schema-v2 branch June 14, 2026 17:53