Skip to content

feat(router): make KNN a first-class classifier with a persisted, curated corpus#10652

Draft
richiejp wants to merge 1 commit into
mudler:masterfrom
richiejp:feat/knn-first-class-router
Draft

feat(router): make KNN a first-class classifier with a persisted, curated corpus#10652
richiejp wants to merge 1 commit into
mudler:masterfrom
richiejp:feat/knn-first-class-router

Conversation

@richiejp

@richiejp richiejp commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Description

Promote KNN search to a primary request router rather than just a
cache for the classifier.

This is WIP

Notes for Reviewers

Add classifier: knn — similarity-weighted voting over labelled
example prompts. Unlike score/colbert it needs no classifier model:
label knowledge lives in a corpus seeded and curated through the
admin API, so routing decisions are deterministic, auditable, and
grounded in graded experience rather than a model's opinion.

Epistemic gate: corpus entries below knn.similarity_threshold cannot
vote; when none clears it the classifier activates no labels and the
router uses the fallback — a prompt unlike all labelled experience is
treated as undecidable, not guessed. Decisions record
nearest_similarity (also on fallback rows) so admins can see how far
the nearest labelled experience was; the Routing tab explains
out-of-corpus fallbacks and shows per-label corpus counts.

Persistence: one JSONL file per router under
/router-corpus (text, labels, vector, embedder
fingerprint). The file is the source of truth; the local-store index
is rebuilt from it at classifier build time and stays a pure
in-memory index. Entries recorded under a different embedding model
re-embed on load. Also corrects the docs' false claim that
local-store collections persist — the embedding cache never survived
restarts (and still doesn't); the corpus does.

Corpus input is API-only by design (entries may contain example user
content): POST /api/router/{name}/corpus seeds (labels validated
against declared policies, embedded server-side, indexed
immediately), GET .../corpus/stats inspects — label counts only,
entry texts are never returned by any surface — DELETE .../corpus
wipes. Admin-gated like the sibling router endpoints, and exposed as
MCP tools (seed_router_corpus / get_router_corpus_stats /
clear_router_corpus) in both the httpapi and inproc clients with
coverage-test route mappings.

Plumbing: VectorStore gains SearchK (top-K was hardcoded to 1);
local-store gets InsertBatch/Delete as optional fast paths;
RouterConfig gains a knn block (embedding_model, k,
similarity_threshold, vote_threshold, store_name) with meta-registry
fields; the classifier dropdown now offers knn and the
previously-missing colbert; embedding_cache is ignored (with a
warning) for knn — it IS an embedding-KNN lookup; the stale
/api/instructions intelligent-routing entry is rewritten (it
described a classifier that no longer exists); swagger regenerated.

Tests: KNN vote/gate specs with hand-computed vote shares, corpus
manager suite (restart reload without re-embedding, fingerprint
re-embed, dedupe, hostile store names), middleware specs (corpus
routing, gate fallback, config validation, cache-wrap refusal),
corpus endpoint specs pinning the texts-never-returned contract, MCP
catalog + route-mapping gates, and a Playwright spec for corpus
stats and the out-of-corpus decision detail.

Assisted-by: Claude:claude-fable-5 [Claude Code]

Signed commits

  • Yes, I signed my commits.

…ated corpus

Add `classifier: knn` — similarity-weighted voting over labelled
example prompts. Unlike score/colbert it needs no classifier model:
label knowledge lives in a corpus seeded and curated through the
admin API, so routing decisions are deterministic, auditable, and
grounded in graded experience rather than a model's opinion.

Epistemic gate: corpus entries below knn.similarity_threshold cannot
vote; when none clears it the classifier activates no labels and the
router uses the fallback — a prompt unlike all labelled experience is
treated as undecidable, not guessed. Decisions record
nearest_similarity (also on fallback rows) so admins can see how far
the nearest labelled experience was; the Routing tab explains
out-of-corpus fallbacks and shows per-label corpus counts.

Persistence: one JSONL file per router under
<data path>/router-corpus (text, labels, vector, embedder
fingerprint). The file is the source of truth; the local-store index
is rebuilt from it at classifier build time and stays a pure
in-memory index. Entries recorded under a different embedding model
re-embed on load. Also corrects the docs' false claim that
local-store collections persist — the embedding cache never survived
restarts (and still doesn't); the corpus does.

Corpus input is API-only by design (entries may contain example user
content): POST /api/router/{name}/corpus seeds (labels validated
against declared policies, embedded server-side, indexed
immediately), GET .../corpus/stats inspects — label counts only,
entry texts are never returned by any surface — DELETE .../corpus
wipes. Admin-gated like the sibling router endpoints, and exposed as
MCP tools (seed_router_corpus / get_router_corpus_stats /
clear_router_corpus) in both the httpapi and inproc clients with
coverage-test route mappings.

Plumbing: VectorStore gains SearchK (top-K was hardcoded to 1);
local-store gets InsertBatch/Delete as optional fast paths;
RouterConfig gains a knn block (embedding_model, k,
similarity_threshold, vote_threshold, store_name) with meta-registry
fields; the classifier dropdown now offers knn and the
previously-missing colbert; embedding_cache is ignored (with a
warning) for knn — it IS an embedding-KNN lookup; the stale
/api/instructions intelligent-routing entry is rewritten (it
described a classifier that no longer exists); swagger regenerated.

Tests: KNN vote/gate specs with hand-computed vote shares, corpus
manager suite (restart reload without re-embedding, fingerprint
re-embed, dedupe, hostile store names), middleware specs (corpus
routing, gate fallback, config validation, cache-wrap refusal),
corpus endpoint specs pinning the texts-never-returned contract, MCP
catalog + route-mapping gates, and a Playwright spec for corpus
stats and the out-of-corpus decision detail.

Assisted-by: Claude:claude-fable-5 [Claude Code]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant