fix(retrieval): repair RRF fusion, ranking, FTS recall, and freshness by denfry · Pull Request #11 · denfry/codebase-index

denfry · 2026-06-14T17:36:04Z

Summary

Ten correctness/ranking fixes across the hybrid retrieval pipeline, found in a deep read of retrieval/, graph/, storage/, and indexer/freshness. Plus a schema bump (v1→v2) so symbol names are FTS-indexed.

Ranking & fusion

RRF rescaled by k — fused scores were ~w/k (≈0.017), an order of magnitude below the reranker's bounded bonuses, so rerank was silently the primary ranker and RRF a tiebreak. Scaling by k is a pure monotonic transform (order unchanged) that puts fused scores and bonuses on the same O(1) scale.
Coarse (path, line-bucket) fusion key — different retrievers report different line ranges for the same place, so the old exact (path, start, end) key almost never coincided and cross-source agreement never fired. agreeing_sources is now counted at file granularity.
Scale-invariant confidence (relative gap, not absolute thresholds).
Per-file diversification — ≤3 hits/file on a page; overflow pushed to the tail, nothing dropped. With bucketing this removes the "same small file returned six times at different line slivers" noise (visible in the regenerated search_token.json golden).
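The fusion changes above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function and constant names are hypothetical, k = 60 is an assumption (it is consistent with the quoted 1/k ≈ 0.017), and the 40-line bucket width is taken from the review comment further down.

```python
from __future__ import annotations

from collections import defaultdict

K = 60             # assumed RRF constant; 1/60 ~= 0.017 matches the figure in the PR
BUCKET_LINES = 40  # assumed bucket width (the review comment mentions 40 lines)

Hit = tuple[str, int]  # (path, start_line); a hypothetical minimal hit shape


def fusion_key(path: str, start_line: int) -> tuple[str, int]:
    """Coarse key: nearby line ranges in the same file share one bucket."""
    return (path, start_line // BUCKET_LINES)


def fuse(
    ranked: dict[str, list[Hit]],
    weights: dict[str, float] | None = None,
    k: int = K,
):
    """Reciprocal rank fusion with each contribution multiplied by k.

    Plain RRF adds w / (k + rank); multiplying by k gives w * k / (k + rank),
    which is close to w for top ranks and preserves the ordering.
    """
    weights = weights or {}
    scores: dict[tuple[str, int], float] = defaultdict(float)
    agreeing: dict[str, set[str]] = defaultdict(set)  # counted per file
    for source, hits in ranked.items():
        w = weights.get(source, 1.0)
        seen: set[tuple[str, int]] = set()
        for rank, (path, start) in enumerate(hits, start=1):
            key = fusion_key(path, start)
            if key in seen:
                continue  # one contribution per source per bucket
            seen.add(key)
            scores[key] += w * k / (k + rank)
            agreeing[path].add(source)
    ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranking, agreeing


ranking, agreeing = fuse({
    "fts": [("a.py", 10), ("b.py", 5)],
    "vector": [("a.py", 30), ("c.py", 100)],
})
```

Here the two retrievers report different start lines for a.py, but both land in bucket 0, so their contributions add up and a.py ranks first with two agreeing sources.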

Recall & indexing

FTS stopword filtering — how/does/the/… are dropped before the MATCH, so NL queries no longer AND-in filler that code chunks never contain.
Symbol names FTS-indexed — denormalized chunks.symbol_names mirrored verbatim by the sync triggers (external-content-safe; a live join could corrupt the index after a symbol cascade). Bumps SCHEMA_VERSION 1→2 — older indexes stay readable; index/update detect the mismatch and rebuild.
Damped centrality fallback — symbols with a non-unique name never get a resolved in_degree; they now get a half-capped bonus from a name-reference count. Precise in_degree still wins.
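A rough sketch of the stopword filtering described above, under stated assumptions: the stopword set is illustrative (the PR only names how/does/the), build_match_query is a hypothetical name, and the fallback for an all-stopword query is a design choice of this sketch rather than something the PR states. FTS5 treats whitespace-separated terms as an implicit AND, which is why filler words can zero out recall.

```python
import re

# Illustrative stopword set; the PR only names "how", "does", "the" explicitly.
STOPWORDS = frozenset({"how", "does", "the", "a", "an", "is", "to", "of", "in"})


def build_match_query(query: str) -> str:
    """Build an FTS5 MATCH expression with filler words removed.

    Each surviving token is double-quoted so FTS5 reads it as a plain
    string rather than as query syntax.
    """
    tokens = re.findall(r"[a-z0-9_]+", query.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    if not kept:
        kept = tokens  # assumption: keep the original tokens over an empty MATCH
    return " ".join(f'"{t}"' for t in kept)


match_expr = build_match_query("how does auth work")
```

For the query "how does auth work" this yields the expression `"auth" "work"`, so only the two content words are ANDed together.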

Correctness

Word-boundary test demotion — contest/, latest.py, testimonials.tsx are no longer treated as tests.
Language-aware import resolution — import './base' from a .ts file resolves to base.ts, not a same-named base.py.
Content-aware freshness — a bare touch (mtime only, identical bytes) is a no-op for update, so it no longer reports the index stale.
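The word-boundary test demotion can be illustrated with a small token check. This is a sketch under assumptions: looks_like_test and the token set are hypothetical, and including "spec"/"specs" is a guess of this example, not something the PR states.

```python
import re

# Assumed marker tokens; only "test"-style names are confirmed by the PR text.
_TEST_TOKENS = frozenset({"test", "tests", "spec", "specs"})


def looks_like_test(path: str) -> bool:
    """Return True only when a whole token of the path is a test marker.

    Splitting on non-alphanumerics means "contest" and "latest" stay single
    tokens and never match, unlike a substring check for "test".
    """
    tokens = re.split(r"[^a-z0-9]+", path.lower())
    return any(token in _TEST_TOKENS for token in tokens)
```

A limitation of this simple split is that camel-case names such as FooTest.java are not recognized; handling those would need an extra case-boundary split.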

Cleanup

Removed the dead legacy lexical path in searchers.py (fts_response/fts_search/second Candidate/_confidence/_fallbacks/_trim).

Migration

Schema v1→v2 (added column). Rebuild is automatic on the next index/update; no user action needed. Recorded in CHANGELOG.md (Unreleased).

Testing

373 passed, 6 skipped. Tests added/updated for every change; goldens regenerated (search_token.json, mcp_search_code.json only).
ruff + mypy clean on all changed modules.
Note: test_real_local_embed_shape fails locally due to a torch/torchcodec native-DLL issue unrelated to this change (no edited module is involved); excluded from the tally above.

🤖 Generated with Claude Code

Ten correctness/ranking fixes across the hybrid retrieval pipeline, plus a schema bump for symbol-name FTS indexing.

- fusion: scale RRF by k so fused scores are O(1) (rerank bonuses were ~10x the entire fused signal, making rerank the de-facto ranker); merge on a coarse (path, line-bucket) key so cross-source hits actually reinforce, and count agreeing_sources at file granularity.
- pipeline: scale-invariant relative-gap confidence; per-file diversification (<=3 hits/file on a page, overflow pushed to the tail, nothing dropped).
- searchers: drop stopwords before building the FTS MATCH (NL queries like "how does auth work" no longer AND-in filler and zero out recall); remove the dead legacy lexical path (fts_response/fts_search/second Candidate/etc.).
- schema/storage: denormalized chunks.symbol_names mirrored by the FTS triggers (external-content-safe), so symbol names are searchable. Bumps SCHEMA_VERSION 1 -> 2; older indexes stay readable and index/update rebuild on mismatch.
- rerank: word-boundary test demotion (no more contest/latest false positives); damped name-reference fallback for symbols with no resolved in_degree.
- graph: language-aware import resolution (prefer the importer's own extension).
- freshness: content-aware (mtime -> size -> sha) so a bare touch isn't stale.

Tests added/updated for every change; goldens regenerated. ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
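The mtime -> size -> sha check order named in the commit message could look roughly like this. It is a sketch under assumptions: FileRecord, record_for, and is_stale are hypothetical names, SHA-256 is an assumed hash choice, and treating an unchanged mtime as "fresh" without further checks is a simplification of this example.

```python
import hashlib
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FileRecord:
    """What the index is assumed to remember about a file (hypothetical)."""
    mtime_ns: int
    size: int
    sha256: str


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_for(path: Path) -> FileRecord:
    st = path.stat()
    return FileRecord(st.st_mtime_ns, st.st_size, sha256_of(path))


def is_stale(path: Path, record: FileRecord) -> bool:
    """Cheapest check first: mtime, then size, then content hash."""
    st = path.stat()
    if st.st_mtime_ns == record.mtime_ns:
        return False  # untouched since indexing
    if st.st_size != record.size:
        return True   # touched and a different length: certainly changed
    # Touched, same size: only the bytes can tell a bare touch from an edit.
    return sha256_of(path) != record.sha256
```

The hash is computed only when the mtime differs and the size matches, so a bare touch costs one read of the file and an untouched file costs a single stat call.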

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be8417c74d


chatgpt-codex-connector · 2026-06-14T17:41:07Z

+        # schema.sql is applied with IF NOT EXISTS (old tables/triggers persist).
+
+
+def peek_schema_version(path: Path | str) -> int:


Keep enable_vectors on Database

Adding this top-level helper here closes the Database class before the existing enable_vectors method, so enable_vectors is now nested inside peek_schema_version and Database no longer has that attribute. Any embeddings-enabled build or vector search path that calls db.enable_vectors() (for example indexing with embeddings.enabled = true or query-time vector search) will now raise AttributeError before creating/loading the vector tables.
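The failure mode described in this comment is easy to reproduce in isolation. The classes below are stand-ins, not the project's Database: a module-level def placed between two methods ends the class body, and the still-indented method that follows becomes a local function of the new helper.

```python
class Broken:
    def _guard_version(self) -> None:
        pass


# A top-level def here closes the class body above.
def peek_schema_version_broken(path: str) -> int:
    return 2  # stand-in value for illustration

    # Still indented, so this is now a (dead) function nested inside the
    # helper, not a method of Broken.
    def enable_vectors(self) -> None:
        pass


class Fixed:
    def _guard_version(self) -> None:
        pass

    def enable_vectors(self) -> None:
        pass


# Helper moved after the class, so the class body stays intact.
def peek_schema_version(path: str) -> int:
    return 2  # stand-in value for illustration


broken_has_method = hasattr(Broken, "enable_vectors")  # False
fixed_has_method = hasattr(Fixed, "enable_vectors")    # True
```

Because the file still parses and imports cleanly, the problem only surfaces when something calls the missing method or a type checker inspects the class.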


chatgpt-codex-connector · 2026-06-14T17:41:07Z

+            if (source, key) in seen:
+                continue


Preserve distinct symbols in fusion buckets

Because the duplicate check keys only on (source, path, 40-line bucket), symbol-only results for multiple matching definitions in the same small file/window are discarded after the first one, not merely down-ranked. For example, two matching functions in the same 40-line bucket from the symbol retriever collapse to one result, so --mode symbol can lose valid definitions; include a symbol-specific discriminator or limit this dedupe to repeated chunk-style hits.
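One way to apply the suggested discriminator is sketched below. The Hit shape, field names, and dedupe function are hypothetical and only illustrate the idea of adding the symbol name to the key; the PR's actual data model is not shown in this thread.

```python
from dataclasses import dataclass
from typing import Optional

BUCKET_LINES = 40  # bucket width mentioned in the review comment


@dataclass(frozen=True)
class Hit:
    source: str
    path: str
    start_line: int
    symbol: Optional[str] = None  # set only for symbol-retriever hits (assumed)


def dedupe_key(hit: Hit) -> tuple:
    """(source, path, bucket) plus the symbol name as a discriminator.

    Chunk-style hits carry symbol=None and still collapse per bucket;
    distinct symbols in the same bucket get distinct keys.
    """
    return (hit.source, hit.path, hit.start_line // BUCKET_LINES, hit.symbol)


def dedupe(hits: list[Hit]) -> list[Hit]:
    seen: set[tuple] = set()
    kept: list[Hit] = []
    for hit in hits:
        key = dedupe_key(hit)
        if key in seen:
            continue
        seen.add(key)
        kept.append(hit)
    return kept


kept = dedupe([
    Hit("symbol", "auth.py", 3, "login"),
    Hit("symbol", "auth.py", 20, "logout"),  # same bucket, different symbol
    Hit("fts", "auth.py", 5),
    Hit("fts", "auth.py", 12),               # same bucket, chunk-style repeat
])
```

With this key both symbol definitions survive, while the repeated chunk-style hit from the same source and bucket is still dropped.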


The previous commit inserted module-level `peek_schema_version` between `_guard_version` and `enable_vectors`, which silently nested `enable_vectors` inside it, so `Database.enable_vectors` no longer existed.

Local mypy passed only on a stale cache and the vector tests were skipped (no sqlite_vec locally), so CI was first to catch it (`service.py:86: "Database" has no attribute "enable_vectors"`).

Move `peek_schema_version` after the class.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

chatgpt-codex-connector Bot reviewed Jun 14, 2026


denfry merged commit 77f867c into main Jun 14, 2026
10 checks passed

denfry deleted the fix/retrieval-ranking-fusion-schema-v2 branch June 14, 2026 17:53

denfry merged 2 commits into main from fix/retrieval-ranking-fusion-schema-v2
