8 changes: 8 additions & 0 deletions .console/log.md
@@ -1,5 +1,13 @@
# Log

## 2026-05-21 — Update ADR-0003 to reference CI design

Added "Related" section to ADR-0003 documenting the relationship between
tiered cognition and the continuous improvement schema: trace data compatibility
(LineageAttempt.replay_metadata feeds cognition_summary), refinement as a
bounded-cognition amortization strategy, and the explicit non-introduction of
a CognitionTier enum (consistent with ADR-0003 D1 / ADR-0002 G1).

## 2026-05-21 — Wire CI coordinator into board_worker call-site

board_worker/main.py: after planning, check bundle.proposal.continuous_improvement.
35 changes: 35 additions & 0 deletions docs/architecture/adr/0003-tiered-cognition-experimental-rails.md
@@ -260,6 +260,41 @@ Out of scope:
cross-file reasoning, novel architecture decisions — frontier
cognition continues to matter for those.

## Related

### Continuous improvement schema (2026-05-21)

The continuous improvement extension (see
[docs/design/continuous-improvement/design.md](../../design/continuous-improvement/design.md))
introduces evaluation-driven refinement as a complementary axis to tiered
cognition. Where tiered cognition asks *which model should run this node?*,
the CI schema asks *did this execution improve the target metric, and should
we retry with a variation?*

The two interact in two concrete ways:

**Trace data.** Each CI attempt produces a `LineageAttempt` with a
`replay_metadata` dict that includes `runtime_binding_model`. This is exactly
the per-invocation provenance that ADR-0003 D2 targets: it feeds directly into
`cognition_summary.nodes_by_model` once that telemetry lands. A CI run with
`max_attempts=3` across two model tiers gives three comparable data points with
consistent goal text and evaluation criteria, which is the paired-run evidence
G4 requires before a routing rule is safe to write.
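
As a minimal sketch of that data flow, the snippet below tallies attempts by the model recorded in `replay_metadata`. Only `LineageAttempt`, `replay_metadata`, `runtime_binding_model`, and `nodes_by_model` come from the text above; the dataclass shape, the `attempt_index` and `score` fields, the `summarize_nodes_by_model` helper, and the model names are assumptions made for illustration, not the project's actual API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LineageAttempt:
    """Hypothetical stand-in for the CI schema's attempt record."""
    attempt_index: int
    score: float
    replay_metadata: dict = field(default_factory=dict)


def summarize_nodes_by_model(attempts):
    """Count attempts per recorded model (a nodes_by_model-style tally)."""
    counts = Counter()
    for attempt in attempts:
        # Attempts without a recorded binding are grouped under "unknown".
        model = attempt.replay_metadata.get("runtime_binding_model", "unknown")
        counts[model] += 1
    return dict(counts)


# Three attempts across two (made-up) model tiers, as in max_attempts=3.
attempts = [
    LineageAttempt(0, 0.61, {"runtime_binding_model": "frontier-model"}),
    LineageAttempt(1, 0.58, {"runtime_binding_model": "local-model"}),
    LineageAttempt(2, 0.64, {"runtime_binding_model": "local-model"}),
]
nodes_by_model = summarize_nodes_by_model(attempts)
```

Because every attempt shares the same goal text and evaluation criteria, a tally like this can sit next to the per-attempt scores when comparing tiers.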

**Refinement as a bounded-cognition strategy.** CI's `RefinementPolicy` is
structurally similar to the "plan once, execute many" amortization pattern
described in the Context section. A single OC evaluation run on a strong model
sets the `EvaluationSpec` baseline and strategy; each attempt can then use a
cheaper or local model, with `ImprovementStrategy.constraints` propagated into
the `WorkerHandoff`. The scoring loop provides the feedback signal that lets
bounded-cognition attempts improve without re-engaging frontier planning for
every retry.
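
The loop described above might look roughly like the following sketch. It assumes a higher score is better and that the baseline is computed once before the loop; the `RefinementPolicy` fields other than `max_attempts`, the `refine` function, and the `run_attempt` callback are hypothetical and only illustrate the "plan once, execute many" shape.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RefinementPolicy:
    """Hypothetical policy; only max_attempts is named in the ADR text."""
    max_attempts: int = 3


def refine(
    baseline: float,
    policy: RefinementPolicy,
    run_attempt: Callable[[int, float], float],
) -> Tuple[float, List[float]]:
    """Run a bounded number of attempts and keep the best score.

    The baseline comes from a single up-front evaluation; planning is
    not repeated inside the loop. Assumes higher scores are better.
    """
    best = baseline
    history: List[float] = []
    for index in range(policy.max_attempts):
        score = run_attempt(index, best)  # e.g. a cheaper or local model
        history.append(score)
        if score > best:
            best = score
    return best, history


# Toy worker returning fixed scores in place of a real model call.
toy_scores = [0.55, 0.70, 0.65]
best, history = refine(
    0.60,
    RefinementPolicy(max_attempts=3),
    lambda index, _best: toy_scores[index],
)
```

The callback receives the current best score so that an attempt can condition on it; a real worker would also receive the strategy constraints through its handoff.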

The CI schema does not introduce a `CognitionTier` enum — consistent with
ADR-0003 D1 and ADR-0002 G1. Tier selection remains a workflow-level concern;
the CI spec carries `strategy.constraints` (policy) and the runtime binding
carries the actual model used (observability).

## Why this ADR exists at all

The architecture is unusually close to enabling tiered-cognition