feat(l3): dedicated l3Llm config slot for abstraction pass by chiefmojo · Pull Request #1959 · MemTensor/MemOS

chiefmojo · 2026-06-21T23:47:31Z

Problem

L3 abstraction currently runs on the main llm. On cheap models (e.g. gemini-2.5-flash-lite) the pass over-extracts and the JSON output gets truncated, producing "'constraints'/'inference' must be an array" errors. The pass is quality-sensitive and runs off the turn-response path, so a slower but more capable model here can improve world-model quality with no impact on turn latency.

Changes

Adds a top-level l3Llm config slot using the same SkillEvolverSchema shape as skillEvolver. When blank (the default), behavior is identical to today: deps.l3Llm ?? deps.llm falls through to the main LLM. When set, the L3 clustering → abstraction subscriber uses the dedicated client.

l3Llm:
  provider: openai_compatible
  endpoint: https://openrouter.ai/api/v1
  model: anthropic/claude-sonnet-4-5
  apiKey: sk-or-...
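
The fallthrough described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `LlmClient`, the reduced `PipelineDeps` shape, and `resolveL3Client` are hypothetical names introduced here, and only the `deps.l3Llm ?? deps.llm` expression comes from the PR description.

```typescript
// Hypothetical, reduced shapes for illustration; the real PipelineDeps has more fields.
interface LlmClient {
  name: string;
}

interface PipelineDeps {
  llm: LlmClient;
  // null or undefined when the l3Llm slot is left blank (assumed representation)
  l3Llm?: LlmClient | null;
}

// Pick the client for the L3 abstraction pass:
// the dedicated client if configured, otherwise the main LLM.
function resolveL3Client(deps: PipelineDeps): LlmClient {
  return deps.l3Llm ?? deps.llm;
}

const mainLlm: LlmClient = { name: "gemini-2.5-flash-lite" };
const dedicatedLlm: LlmClient = { name: "anthropic/claude-sonnet-4-5" };

// Blank slot: L3 inherits the main LLM.
const inherited = resolveL3Client({ llm: mainLlm, l3Llm: null }).name;
// Configured slot: L3 uses the dedicated client.
const overridden = resolveL3Client({ llm: mainLlm, l3Llm: dedicatedLlm }).name;

console.log(inherited, overridden);
```

Using `??` rather than `||` means only `null` and `undefined` trigger the fallback, which matches the "null when unconfigured" behavior the tests describe.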

defaults.ts — blank default (inherits main llm)
schema.ts — l3Llm: SkillEvolverSchema entry
memory-core.ts — bootstrap client (mirrors reflectLlm path; errors log, don't crash)
deps.ts / types.ts — PipelineDeps.l3Llm and PipelineHandle.l3Llm fields
orchestrator.ts — threads the field through to the handle
orchestrator.test.ts — two tests: dedicated client threads through, null when unconfigured
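
The "errors log, don't crash" bootstrap behavior might look roughly like the sketch below. Everything here is an assumption made for illustration: `LlmSlotConfig`, `bootstrapOptionalClient`, the factory and logger parameters, the rule that an empty `provider` or `model` counts as blank, and returning `null` on failure. The PR only states that the bootstrap mirrors the reflectLlm path and that the handle field is null when unconfigured.

```typescript
// Hypothetical config shape, based on the fields in the YAML example.
interface LlmSlotConfig {
  provider: string;
  endpoint: string;
  model: string;
  apiKey: string;
}

interface LlmClient {
  model: string;
}

type ClientFactory = (cfg: LlmSlotConfig) => LlmClient;
type Logger = (message: string) => void;

// Build the optional dedicated client. Returns null when the slot is blank
// or when client creation throws; in both cases the caller can fall back to the main LLM.
function bootstrapOptionalClient(
  cfg: LlmSlotConfig,
  createClient: ClientFactory,
  logError: Logger,
): LlmClient | null {
  // Assumption: an empty provider or model means "not configured".
  if (cfg.provider.trim() === "" || cfg.model.trim() === "") {
    return null;
  }
  try {
    return createClient(cfg);
  } catch (err) {
    // Log and continue instead of crashing startup.
    const reason = err instanceof Error ? err.message : String(err);
    logError(`l3Llm bootstrap failed: ${reason}`);
    return null;
  }
}

const logs: string[] = [];
const blankCfg: LlmSlotConfig = { provider: "", endpoint: "", model: "", apiKey: "" };
const setCfg: LlmSlotConfig = {
  provider: "openai_compatible",
  endpoint: "https://openrouter.ai/api/v1",
  model: "anthropic/claude-sonnet-4-5",
  apiKey: "sk-or-...",
};

const fromBlank = bootstrapOptionalClient(blankCfg, (c) => ({ model: c.model }), (m) => logs.push(m));
const fromSet = bootstrapOptionalClient(setCfg, (c) => ({ model: c.model }), (m) => logs.push(m));
const fromFailure = bootstrapOptionalClient(
  setCfg,
  () => {
    throw new Error("bad endpoint");
  },
  (m) => logs.push(m),
);
```

In this sketch a failed bootstrap and a blank slot are indistinguishable to the caller, so the L3 subscriber keeps working on the main LLM either way.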

Config example

Operators running a budget model as their main llm can now point L3 at a stronger model:

llm:
  provider: openai_compatible
  model: gemini-2.5-flash-lite  # cheap, fast, good for scoring

l3Llm:
  provider: openai_compatible
  model: anthropic/claude-sonnet-4-5  # slow ok — L3 is async
  apiKey: sk-or-...

L3 abstraction runs on the main `llm`, which on cheap models (gemini-2.5-flash-lite) over-extracts and truncates the JSON, producing "'constraints'/'inference' must be an array" failures — Violet logged 180 such failures and produced only 2 world-model facts total. Add an `l3Llm` config slot (same SkillEvolverSchema shape as `skillEvolver`). Blank inherits the main `llm` (zero behavior change); set explicitly to run the clustering → world-model pass on a stronger model. L3 is async/off the turn-response path, so a slower-but-correct model has no impact on companion latency. Wiring mirrors skillEvolver: schema + blank default + secret redaction + bootstrap client + PipelineDeps/PipelineHandle field + orchestrator handle passthrough, consumed at the L3 subscriber attach (`deps.l3Llm ?? deps.llm`).

chiefmojo wants to merge 1 commit into MemTensor:dev-20260604-v2.0.19 from chiefmojo:feat/l3-dedicated-llm
