AIN-303 · routing_outcomes.source (0036) by hizrianraz · Pull Request #99 · ainfera-ai/api

hizrianraz · 2026-05-29T06:07:10Z

source discriminator (prod|synthetic|shadow) enforcing INVARIANT 1's synthetic wall in SQL. Additive, reversible.

Note

Low Risk
Schema-only additive migration with safe backfill default; no application or routing logic changes in this diff.

Overview
Adds a source column on routing_outcomes so prod routing-policy refits can exclude synthetic and shadow rows in SQL (INVARIANT 1).

The migration is additive: NOT NULL with server_default='prod' (existing rows treated as prod traffic), a CHECK on prod / synthetic / shadow, and ix_routing_outcomes_source for filtered corpus reads. Downgrade drops the index, constraint, and column.
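The semantics described above can be sketched against an in-memory SQLite database. This is a stand-in for illustration only, not the actual 0036 Alembic migration: the `reward` column, the `upgrade` and `prod_corpus_ids` helper names, and the use of SQLite are assumptions. Only the `source` column, its default, the three allowed values, and the index name come from the PR description. The downgrade path is not shown.

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    """Add the source discriminator; pre-existing rows take the 'prod' default."""
    conn.execute(
        "ALTER TABLE routing_outcomes ADD COLUMN source TEXT NOT NULL "
        "DEFAULT 'prod' CHECK (source IN ('prod', 'synthetic', 'shadow'))"
    )
    conn.execute(
        "CREATE INDEX ix_routing_outcomes_source ON routing_outcomes (source)"
    )


def prod_corpus_ids(conn: sqlite3.Connection) -> list:
    """Hypothetical refit read: only rows tagged 'prod' are returned."""
    rows = conn.execute(
        "SELECT id FROM routing_outcomes WHERE source = 'prod' ORDER BY id"
    )
    return [r[0] for r in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal shape of the table before the migration.
    conn.execute(
        "CREATE TABLE routing_outcomes (id INTEGER PRIMARY KEY, reward REAL)"
    )
    conn.executemany(
        "INSERT INTO routing_outcomes (reward) VALUES (?)", [(0.4,), (0.9,)]
    )
    upgrade(conn)

    # New rows may declare a non-prod source explicitly.
    conn.execute(
        "INSERT INTO routing_outcomes (reward, source) VALUES (0.7, 'synthetic')"
    )
    conn.execute(
        "INSERT INTO routing_outcomes (reward, source) VALUES (0.2, 'shadow')"
    )

    # Any value outside the three allowed ones violates the CHECK.
    try:
        conn.execute(
            "INSERT INTO routing_outcomes (reward, source) VALUES (0.5, 'staging')"
        )
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    return prod_corpus_ids(conn), rejected
```

In this sketch the two pre-existing rows are the only ones the prod read returns, and the out-of-vocabulary insert is rejected by the database itself rather than by application code.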

Reviewed by Cursor Bugbot for commit 0e9ceb1. Bugbot is set up for automated code reviews on this repo.

…l (0036)

Adds a `source` discriminator (prod|synthetic|shadow, NOT NULL default 'prod', CHECK + index) so the synthetic cold-start loop's rows can never feed a prod routing-policy promotion: prod refits filter source='prod'. Existing 147 real rows backfill to 'prod'. Additive; Disc #12 intact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

linear-code · 2026-05-29T06:07:14Z

AIN-303 [Labs] Synthetic cold-start — bootstrap methodology on Spark Day 0 (planted-solution validation)

Bootstrap the L14.2 methodology on synthetic data the day Spark lands (2026-05-29)

Lets the cadence (judge → LinUCB refit → replay → promote) start running Day 0, BEFORE real agent traffic, and run daily through the SG-incorp/payments wait window until real volume arrives post-launch. Resolves the cold-start chicken-and-egg (no volume → no moat) for the waiting period.

Design (canonical): ainfera-ai vault methodology/synthetic-coldstart-2026-05-28.md. Aulë prompt: ainfera-os prompts/synthetic-coldstart-spark-day0-2026-05-29.md. Generator validated locally: 24 agents, 8 living models, 4000 outcomes, 60 cells, ~48% completion + a planted answer key.

What it is / is NOT (Disc #12 + research integrity)

IS FOR: pipeline validation, q_prior calibration, cell coverage, exploration warmup.
IS NOT FOR: production policy promotion, the preprint headline number, or any moat/traction claim. Production promote = REAL outcomes only. Synthetic deltas are circular (we generate/judge/route/score against our own baseline) → pipeline QA, not science.

Planted-solution design (why it's rigorous)

Synthetic world has a KNOWN ground-truth optimal routing (planted_optimum.json answer key). Logging policy is deliberately suboptimal so the bandit has room to learn. SANITY GATE: learned policy must converge toward the planted optimum. Pipeline bug → caught Day 0 on Spark, not after weeks of real traffic.
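A minimal sketch of how such a sanity gate could be computed, under stated assumptions: the layout of planted_optimum.json is not given here, so this assumes it maps each policy cell to its optimal model, and that the learned policy can be exported in the same shape. The function names and cell/model labels are hypothetical. The 0.70 default mirrors the starting target named in the validation-gate wave.

```python
from typing import Dict


def convergence_rate(learned: Dict[str, str], planted: Dict[str, str]) -> float:
    """Fraction of covered cells where the learned choice equals the planted optimum.

    A cell counts as covered when the learned policy has an entry for it.
    """
    covered = [cell for cell in planted if cell in learned]
    if not covered:
        return 0.0
    hits = sum(1 for cell in covered if learned[cell] == planted[cell])
    return hits / len(covered)


def sanity_gate(
    learned: Dict[str, str], planted: Dict[str, str], threshold: float = 0.70
) -> bool:
    """True when the learned policy agrees with the answer key often enough."""
    return convergence_rate(learned, planted) >= threshold


# Hypothetical answer key and learned policy (cell -> chosen model).
planted = {"c1": "model_a", "c2": "model_b", "c3": "model_a", "c4": "model_c"}
learned = {"c1": "model_a", "c2": "model_b", "c3": "model_c"}

rate = convergence_rate(learned, planted)  # 2 of 3 covered cells match
passed = sanity_gate(learned, planted)     # below 0.70, so the gate fails
```

Measuring agreement only over covered cells keeps an under-explored policy from being penalized for cells it has never seen; whether that matches the intended gate definition is a design choice the issue leaves open.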

Provenance separation (protects the clean corpus we just purged)

Every synthetic row tagged source='synthetic_coldstart_v1', synthetic=TRUE, seed, affinity_version, judge_model='planted_oracle'.
Lands in its OWN table routing_outcomes_synthetic (mirrors shape) — NEVER the real routing_outcomes (the clean 101-row moat corpus stays pristine).
Cadence runs WARMUP mode against synthetic; production active_policy_version promotes on real only. Droppable when real volume arrives.
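The provenance rules above could be checked before load with something like the following sketch. The tag names and fixed values come from the list above; representing rows as dicts and the `provenance_errors` helper are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Tags whose values are fixed for every synthetic row.
FIXED_TAGS: Dict[str, Any] = {
    "source": "synthetic_coldstart_v1",
    "synthetic": True,
    "judge_model": "planted_oracle",
}
# Tags that must be present but vary per generator run.
REQUIRED_PRESENT = ("seed", "affinity_version")


def provenance_errors(row: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the row is tagged."""
    errors = []
    for key, expected in FIXED_TAGS.items():
        if row.get(key) != expected:
            errors.append(f"{key}: expected {expected!r}, got {row.get(key)!r}")
    for key in REQUIRED_PRESENT:
        if row.get(key) is None:
            errors.append(f"{key}: missing")
    return errors


# Hypothetical rows: one fully tagged, one with a wrong judge and no seed.
good_row = {
    "source": "synthetic_coldstart_v1",
    "synthetic": True,
    "seed": 42,
    "affinity_version": "v1",
    "judge_model": "planted_oracle",
}
bad_row = {**good_row, "judge_model": "other_judge"}
del bad_row["seed"]

good_errors = provenance_errors(good_row)
bad_errors = provenance_errors(bad_row)
```

Running a check like this over the generator output before insertion would surface untagged rows early, which is the point of the "verify provenance" step in the waves.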

Waves (Aulë, from Spark Day 0)

Additive migration: CREATE routing_outcomes_synthetic (mirror + source/synthetic/affinity/difficulty/completed/hallucinated/refused cols + judge_queue/policy_cell indexes). Alembic, not hand-applied. Depends on AIN-301 (training_runs + ainfera_labs).
Load generator output (seed 42, n≥4000) → synthetic table only; verify provenance + cell coverage.
Point cadence at synthetic in warmup mode (planted_oracle rows pre-labeled).
Validation gate: convergence report vs planted_optimum (starting target: 70% of covered cells); write a training_runs row with cadence='synthetic_warmup'.
Run daily through the wait window; flip to real corpus post-launch; keep synthetic as regression fixture; drop when stable.

Acceptance

routing_outcomes_synthetic exists (additive); real corpus untouched (still 101)
≥4000 synthetic outcomes, all provenance-tagged, ~60 cells
cadence green on synthetic; convergence report produced; training_runs row written
production promote proven to ignore synthetic

Depends on: AIN-301 (Labs DB), AIN-294 (Spark provisioning). Blocks: meaningful AIN-288 delta on real (synthetic is not the delta).


hizrianraz merged commit 5a78650 into main May 29, 2026

hizrianraz deleted the hizrianraz/ain-303-outcome-source branch May 29, 2026 06:07
