
AIN-303 · routing_outcomes.source (0036) #99

Merged
hizrianraz merged 1 commit into main from hizrianraz/ain-303-outcome-source on May 29, 2026

@hizrianraz hizrianraz commented May 29, 2026

Adds a `source` discriminator (prod|synthetic|shadow) that enforces INVARIANT 1's synthetic wall in SQL. Additive and reversible.


Note

Low Risk
Schema-only additive migration with safe backfill default; no application or routing logic changes in this diff.

Overview
Adds a source column on routing_outcomes so prod routing-policy refits can exclude synthetic and shadow rows in SQL (INVARIANT 1).

The migration is additive: NOT NULL with server_default='prod' (existing rows treated as prod traffic), a CHECK on prod / synthetic / shadow, and ix_routing_outcomes_source for filtered corpus reads. Downgrade drops the index, constraint, and column.
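
The column semantics described above can be sketched with the standard-library `sqlite3` module. This is an in-memory stand-in, not the Alembic migration in this PR: the two-column table shape and the `upgrade`/`demo` helpers are assumptions made for illustration, and only the column name, default, allowed values, and index name come from the description.

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    """Add the source discriminator; existing rows pick up the 'prod' default."""
    conn.execute(
        "ALTER TABLE routing_outcomes "
        "ADD COLUMN source TEXT NOT NULL DEFAULT 'prod' "
        "CHECK (source IN ('prod', 'synthetic', 'shadow'))"
    )
    conn.execute(
        "CREATE INDEX ix_routing_outcomes_source ON routing_outcomes (source)"
    )


def demo() -> dict:
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal shape; the real table has more columns.
    conn.execute(
        "CREATE TABLE routing_outcomes (id INTEGER PRIMARY KEY, reward REAL)"
    )
    conn.executemany(
        "INSERT INTO routing_outcomes (reward) VALUES (?)", [(0.1,), (0.9,)]
    )
    upgrade(conn)
    # A tagged synthetic row is accepted.
    conn.execute(
        "INSERT INTO routing_outcomes (reward, source) VALUES (0.5, 'synthetic')"
    )
    # A value outside the CHECK list is rejected by the database itself.
    try:
        conn.execute(
            "INSERT INTO routing_outcomes (reward, source) VALUES (0.5, 'staging')"
        )
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    counts = dict(
        conn.execute(
            "SELECT source, COUNT(*) FROM routing_outcomes GROUP BY source"
        )
    )
    conn.close()
    return {"counts": counts, "rejected": rejected}
```

In this sketch the two pre-existing rows read back as `prod` without an explicit backfill statement, which is the behavior the description relies on for existing rows.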

Reviewed by Cursor Bugbot for commit 0e9ceb1.

…l (0036)

Adds a `source` discriminator (prod|synthetic|shadow, NOT NULL default 'prod',
CHECK + index) so the synthetic cold-start loop's rows can never feed a prod
routing-policy promotion — prod refits filter source='prod'. Existing 147 real
rows backfill to 'prod'. Additive; Disc #12 intact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hizrianraz hizrianraz merged commit 5a78650 into main May 29, 2026
linear-code Bot commented May 29, 2026

AIN-303 [Labs] Synthetic cold-start — bootstrap methodology on Spark Day 0 (planted-solution validation)

Bootstrap the L14.2 methodology on synthetic data the day Spark lands (2026-05-29)

Lets the cadence (judge → LinUCB refit → replay → promote) start running Day 0, BEFORE real agent traffic, and run daily through the SG-incorp/payments wait window until real volume arrives post-launch. Resolves the cold-start chicken-and-egg (no volume → no moat) for the waiting period.

Design (canonical): ainfera-ai vault methodology/synthetic-coldstart-2026-05-28.md. Aulë prompt: ainfera-os prompts/synthetic-coldstart-spark-day0-2026-05-29.md. Generator validated locally: 24 agents, 8 living models, 4000 outcomes, 60 cells, ~48% completion + a planted answer key.

What it is / is NOT (Disc #12 + research integrity)

  • IS FOR: pipeline validation, q_prior calibration, cell coverage, exploration warmup.
  • IS NOT FOR: production policy promotion, the preprint headline number, or any moat/traction claim. Production promote = REAL outcomes only. Synthetic deltas are circular (we generate/judge/route/score against our own baseline) → pipeline QA, not science.

Planted-solution design (why it's rigorous)

Synthetic world has a KNOWN ground-truth optimal routing (planted_optimum.json answer key). Logging policy is deliberately suboptimal so the bandit has room to learn. SANITY GATE: learned policy must converge toward the planted optimum. Pipeline bug → caught Day 0 on Spark, not after weeks of real traffic.
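
A toy version of the planted-solution idea, as a rough sketch only. It swaps the LinUCB refit for a plain per-cell mean-reward estimate, and the cell names, model names, reward probabilities, and the `simulate_logs`/`greedy_policy` helpers are all hypothetical. The format of planted_optimum.json is not shown in this issue, so a simple cell-to-model mapping is assumed.

```python
import random
from collections import defaultdict


def simulate_logs(planted, models, pulls_per_cell, seed):
    """Log outcomes under a uniform-random (deliberately suboptimal) policy."""
    rng = random.Random(seed)
    logs = []
    for cell, best in planted.items():
        for _ in range(pulls_per_cell):
            model = rng.choice(models)
            # Assumed reward gap: the planted-optimal model succeeds more often.
            p_success = 0.8 if model == best else 0.3
            reward = 1.0 if rng.random() < p_success else 0.0
            logs.append((cell, model, reward))
    return logs


def greedy_policy(logs):
    """Pick, per cell, the model with the highest mean logged reward."""
    totals = defaultdict(lambda: [0.0, 0])  # (cell, model) -> [sum, count]
    for cell, model, reward in logs:
        totals[(cell, model)][0] += reward
        totals[(cell, model)][1] += 1
    best = {}
    for (cell, model), (total, count) in totals.items():
        mean = total / count
        if cell not in best or mean > best[cell][1]:
            best[cell] = (model, mean)
    return {cell: model for cell, (model, _) in best.items()}


# Hypothetical answer key: cell -> optimal model.
PLANTED = {"code/easy": "model_a", "code/hard": "model_b", "chat/easy": "model_c"}
MODELS = ["model_a", "model_b", "model_c"]

learned = greedy_policy(simulate_logs(PLANTED, MODELS, pulls_per_cell=600, seed=42))
```

Because the answer key is known in advance, any disagreement between `learned` and `PLANTED` points at the pipeline rather than at the data, which is the property the sanity gate depends on.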

Provenance separation (protects the clean corpus we just purged)

  • Every synthetic row tagged source='synthetic_coldstart_v1', synthetic=TRUE, seed, affinity_version, judge_model='planted_oracle'.
  • Lands in its OWN table routing_outcomes_synthetic (mirrors shape) — NEVER the real routing_outcomes (the clean 101-row moat corpus stays pristine).
  • Cadence runs WARMUP mode against synthetic; production active_policy_version promotes on real only. Droppable when real volume arrives.
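
The provenance tags listed above lend themselves to a per-row check before loading. A minimal sketch, assuming rows arrive as plain dicts; the `provenance_errors` helper and the example values for `seed` and `affinity_version` are hypothetical, while the tag names and required values are taken from the list above.

```python
# Tags that must carry an exact value on every synthetic row.
REQUIRED_VALUES = {
    "source": "synthetic_coldstart_v1",
    "synthetic": True,
    "judge_model": "planted_oracle",
}
# Tags that must be present, whatever their value.
REQUIRED_PRESENT = ("seed", "affinity_version")


def provenance_errors(row: dict) -> list:
    """Return human-readable problems; an empty list means the row is tagged."""
    errors = []
    for key, expected in REQUIRED_VALUES.items():
        if row.get(key) != expected:
            errors.append(f"{key}: expected {expected!r}, got {row.get(key)!r}")
    for key in REQUIRED_PRESENT:
        if row.get(key) is None:
            errors.append(f"{key}: missing")
    return errors


good_row = {
    "source": "synthetic_coldstart_v1",
    "synthetic": True,
    "judge_model": "planted_oracle",
    "seed": 42,
    "affinity_version": "v1",  # hypothetical value
}
bad_row = dict(good_row, source="prod", seed=None)
```

Running such a check over the whole generator output before the load step would be one way to support the "all provenance-tagged" acceptance item.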

Waves (Aulë, from Spark Day 0)

  1. Additive migration: CREATE routing_outcomes_synthetic (mirror + source/synthetic/affinity/difficulty/completed/hallucinated/refused cols + judge_queue/policy_cell indexes). Alembic, not hand-applied. Depends on AIN-301 (training_runs + ainfera_labs).
  2. Load generator output (seed 42, n≥4000) → synthetic table only; verify provenance + cell coverage.
  3. Point cadence at synthetic in warmup mode (planted_oracle rows pre-labeled).
  4. Validation gate: convergence report vs planted_optimum (start target 70% of covered cells); training_runs row cadence='synthetic_warmup'.
  5. Run daily through the wait window; flip to real corpus post-launch; keep synthetic as regression fixture; drop when stable.
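
The validation gate in wave 4 could be expressed as a small pure function. This is a sketch under assumptions: policies are represented as cell-to-model dicts, "covered" is read as "cells present in both the learned policy and the answer key", and the `convergence_report` name and report fields are hypothetical. Only the 70% starting target comes from the list above.

```python
def convergence_report(learned: dict, planted: dict, target: float = 0.70) -> dict:
    """Compare a learned policy against the planted optimum on covered cells."""
    covered = [cell for cell in planted if cell in learned]
    matches = [cell for cell in covered if learned[cell] == planted[cell]]
    agreement = len(matches) / len(covered) if covered else 0.0
    return {
        "covered_cells": len(covered),
        "matching_cells": len(matches),
        "agreement": agreement,
        "passed": agreement >= target,
    }


planted = {"c1": "model_a", "c2": "model_b", "c3": "model_c", "c4": "model_a"}
# Three of four covered cells agree: 0.75, above the 0.70 target.
report_ok = convergence_report(
    {"c1": "model_a", "c2": "model_b", "c3": "model_c", "c4": "model_b"}, planted
)
# Only three cells covered and two agree: about 0.67, below the target.
report_low = convergence_report(
    {"c1": "model_a", "c2": "model_b", "c3": "model_a"}, planted
)
```

Measuring agreement only over covered cells keeps sparse coverage from being counted as disagreement; coverage itself would need its own check.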

Acceptance

  • routing_outcomes_synthetic exists (additive); real corpus untouched (still 101)
  • ≥4000 synthetic outcomes, all provenance-tagged, ~60 cells
  • cadence green on synthetic; convergence report produced; training_runs row written
  • production promote proven to ignore synthetic
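
One way the last acceptance item might be demonstrated is a filtered corpus read over mixed rows. The sketch below uses the `source` discriminator from PR #99 with in-memory SQLite; the `model` and `reward` columns and both helper names are assumptions for illustration, not the real schema or refit code.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical routing_outcomes table with mixed-source rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE routing_outcomes ("
        "id INTEGER PRIMARY KEY, "
        "model TEXT NOT NULL, "
        "reward REAL NOT NULL, "
        "source TEXT NOT NULL DEFAULT 'prod' "
        "CHECK (source IN ('prod', 'synthetic', 'shadow')))"
    )
    conn.executemany(
        "INSERT INTO routing_outcomes (model, reward, source) VALUES (?, ?, ?)",
        [
            ("model_a", 1.0, "prod"),
            ("model_b", 0.0, "synthetic"),
            ("model_a", 0.5, "shadow"),
            ("model_b", 1.0, "prod"),
        ],
    )
    return conn


def prod_refit_corpus(conn: sqlite3.Connection) -> list:
    """Return only the rows a prod refit is allowed to see."""
    return conn.execute(
        "SELECT id, model, reward FROM routing_outcomes "
        "WHERE source = 'prod' ORDER BY id"
    ).fetchall()


rows = prod_refit_corpus(build_demo_db())
```

Keeping the filter in the query, rather than in application code after the read, means synthetic and shadow rows never reach the refit at all.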

Depends on: AIN-301 (Labs DB), AIN-294 (Spark provisioning). Blocks: meaningful AIN-288 delta on real (synthetic is not the delta).

@hizrianraz hizrianraz deleted the hizrianraz/ain-303-outcome-source branch May 29, 2026 06:07