wesleysimplicio/simplicio-loop

🔁 simplicio-tasks — The Universal Looping AI Orchestrator

simplicio-loop

11 skills · 5 source adapters · 11 runtimes · 48 extension points · up to 96% fewer tokens

TL;DR · 11 Skills · Source Adapters · 11 Runtimes · The Loop · Token Economy · Capture Engine · Install

🌍 Languages:
🇬🇧 English | 🇧🇷 Português | 🇪🇸 Español | 🇫🇷 Français | 🇩🇪 Deutsch | 🇮🇹 Italiano | 🇯🇵 日本語 | 🇰🇷 한국어 | 🇨🇳 简体中文 | 🇷🇺 Русский | 🇵🇱 Polski | 🇹🇷 Türkçe | 🇳🇱 Nederlands | 🇮🇳 हिन्दी | 🇸🇦 العربية


⚡ TL;DR

simplicio-tasks is a runtime-agnostic super-plugin — one autonomous looping orchestrator (invoked as /simplicio-tasks) plus five satellite skills — that turns any strong LLM (Claude, Codex, Copilot, Gemini, Cursor, local models) into a self-driving worker. You point it at a body of work — "finish all the open issues", "clear the CI queue", "drain the Jira board" — and it runs the whole lifecycle on its own:

discover → understand → decide → act → verify → correct → record → repeat

It discovers work from any source (GitHub Issues, Jira, Azure DevOps, agentsview sessions, and more), dedups, auto-scales an agent fleet to your machine, implements each item through a quality loop that runs the code (not just compiles it), opens PRs, resolves CI/review feedback, merges, and keeps watching 24/7 for new work — all behind safety gates and a hard cost kill-switch.

/simplicio-tasks finish all open issues
→ identity + pre-flight (kill-switch, auth, watcher)
→ discover 50 issues · dedup · build dependency DAG
→ autoscale fleet = 14 · pipeline implement→review→merge
→ each item: read body+ACs → orient code → plan → edit → run → verify → PR
→ merge · close with evidence · rollback if main breaks
→ keep looping every ~2 min until the queue is dry (evidence-gated, never a false "done")

Three things make it different: it is a super-plugin of focused skills, it runs the same protocol on 11 runtimes, and it does all of this with aggressive, honest token economy.

simplicio-loop, the whole system at a glance: 6 skills (the orchestrator core plus 5 satellites), 5 accelerators, 48 extension points, 11 runtimes, up to 96% fewer tokens


📘 Official capability record (v3.10.2)

The complete, official roster of what simplicio-tasks ships — every capability below is real, runnable, and tested (python3 scripts/check.py: claims-audit 4/4 + 28 tests). Each links to its deep section and its worker.

| Capability | What it does | Proof / worker | Details |
|---|---|---|---|
| 🎬 Video evidence (video_evidence) | Records the real browser session as moving proof a UI change works (Playwright, default); renders a deterministic captioned MP4 with hyperframes for an explicit explainer request (/simplicio-tasks make a video of screen X) | scripts/video_evidence.py · BLOCKED (never fake-pass) without the toolchain | § Video evidence |
| 🧠 Attempt memory + stall detector | A durable run-journal (.orchestrator/loop/journal.jsonl) + a stall detector so the loop changes strategy instead of oscillating; incremental triage (since) reads only the delta each turn | scripts/loop_journal.py · selftest 9/9 | § Anti-oscillation |
| 🧭 Repo conventions (repo_conventions) | Learns the repo's own playbook: mines git history + merged PRs + static config into .orchestrator/conventions.json so every new branch/commit/PR mirrors the team's established style; worktree-per-item isolation is the default | scripts/repo_conventions.py · selftest 19/19 | § The full flow |
| 🔒 Fail-closed safety gate (action_gate) | A PreToolUse/git-pre-push hook that mechanically blocks force-push, history rewrite, mass-delete, destructive DDL, infra teardown, and secret-laden commits/pushes (Step 5 made executable, not prose) | hooks/action_gate.py · selftest 15/15 | § Safety |
| 🔬 Local verification | A test suite (worker selftests + an e2e of the loop driver proving evidence-gated exit) + a claims-audit (referenced scripts exist · counts consistent · _bundle ≡ source), all local, no paid CI | scripts/check.py · scripts/claims_audit.py · tests/ | § Tests & local checks |
| Honest savings | The savings line is now evidence-gated, not mandatory: a number is shown only with a measured receipt (clamp/signatures/cache/deterministic_edit/ledger); never fabricated | token-economy contract | § Token economy |

Two loop modes make termination explicit: converge (a single hard task — ends on the evidence-gated <promise> or a stall escalation) vs drain (a queue — ends when the source re-query stays empty K rounds). Both still obey the universal exits (promise+evidence, max_iterations, budget, STOP).
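
As a minimal sketch of the drain-mode rule above, the tracker below counts consecutive empty re-queries and reports when the queue has stayed empty for K rounds. The class name `DrainTracker`, its fields, and the value chosen for K are hypothetical; the README does not show how the loop implements this.

```python
from dataclasses import dataclass


@dataclass
class DrainTracker:
    """Hypothetical helper: counts consecutive empty re-queries of the source."""
    empty_rounds_needed: int = 3  # K; an assumed value, not a documented default
    empty_streak: int = 0

    def observe(self, ready_items: int) -> bool:
        """Record one re-query result; return True once the queue stayed empty K rounds."""
        if ready_items == 0:
            self.empty_streak += 1
        else:
            self.empty_streak = 0  # new work resets the streak
        return self.empty_streak >= self.empty_rounds_needed


tracker = DrainTracker(empty_rounds_needed=2)
results = [tracker.observe(n) for n in (4, 0, 1, 0, 0)]
print(results)  # [False, False, False, False, True]
```

A single empty round is not enough to stop: work that arrives between polls resets the streak, which is why the example only drains on the last observation.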

Loop scoring across this line of work: 7.5 (strong design, unproven) → 9 (attempt memory + anti-oscillation) → 9.5 (reproducible local proof) → ~10 (enforced safety + complete loop semantics). The verification infra now catches the project's own regressions as it grows.


🧠 The 11 skills & accelerators

The orchestrator core + five satellites + five accelerators/integrations. Each satellite is optional — when loaded, the orchestrator delegates to it (richer + cheaper); when absent, the inline protocol covers 100%. Accelerators are auto-detected — present = used, absent = LLM fallback.

| # | Capability | Absorbs | What it does | Token impact |
|---|---|---|---|---|
| 1 | 🔁 simplicio-tasks | | The orchestrator loop: 48 extension points, dual-path router, self-audit convergence | Core |
| 2 | ♾️ simplicio-loop | ralph-loop | Hardened Ralph loop: evidence-gated <promise> exit, max_iterations cap | Loop drive |
| 3 | 🧱 simplicio-orient | rtk + caveman | Terminal-first execution, output-reduction catalog, tee-cache, signatures-read | L0 deterministic |
| 4 | 🔥 simplicio-review | thermos | Parallel adversarial review on distinct rubrics → deduped verdict | Quality gate |
| 5 | 🗜️ simplicio-compress | caveman | Output + memory compression, fail-closed transform_guard | 40-60% fewer |
| 6 | 🎓 simplicio-learn | teaching | Post-run retrospective → durable, deduped lessons in memory | Smarter each run |
| 7 | 🧭 Understand Anything | Egonex-AI | Knowledge graph orient: semantic search, guided tours, dependency graph | L0 zero tokens |
| 8 | 📊 agentsview | kenn-io | Session analytics, cost tracking, stalled-session discovery | L1 SQL only |
| 9 | LMCache | LMCache | KV cache between loop turns: 40-70% TTFT reduction on local models | GPU time ↓ |
| 10 | 🗜️ Simplicio capture engine | engine/simplicio_engine.py (native, stdlib-only; savings-schema compatible with the OSS headroom project) | Transparent capture proxy: forwards to the real provider, measures + deterministically compresses, writes proxy_savings.json | deterministic |
| 11 | 🎬 video_evidence | Playwright (default) · hyperframes (on request) | Records the real session as moving proof of a UI change (Playwright); renders a deterministic captioned MP4 explainer with hyperframes when the video IS the deliverable | Evidence producer |

Each skill lives under .claude/skills/; each accelerator has a reference doc under .claude/skills/simplicio-tasks/references/ (the video producer: video-evidence.md, worker scripts/video_evidence.py).


📡 Source adapters

The orchestrator discovers work from any source via pluggable adapters. Each exposes six verbs: list_ready, get_details, claim, update_status, attach_evidence, close.

| Source | Adapter | Purpose |
|---|---|---|
| GitHub Issues/PRs | gh CLI (native) | Primary work-item source |
| Jira / Asana / ClickUp / Linear / Notion | host connector | Board/project management |
| Trello / Azure DevOps | az boards adapter | Azure work tracking |
| agentsview sessions | scripts/agentsview_adapter.py | Stalled session recovery + cost observability |
| Local files / CI queue | filesystem / CI API | Internal work tracking |

See each adapter's reference doc under .claude/skills/simplicio-tasks/references/.
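
To make the six-verb contract concrete, here is a rough sketch of what an adapter interface could look like. Only the verb names come from this README; the argument types, return types, the `InMemoryAdapter` class, and the rule that `close` refuses to run without attached evidence are assumptions made for illustration.

```python
from typing import Dict, List, Protocol


class SourceAdapter(Protocol):
    """The six verbs named above; signatures are assumptions."""
    def list_ready(self) -> List[str]: ...
    def get_details(self, item_id: str) -> Dict[str, str]: ...
    def claim(self, item_id: str) -> bool: ...
    def update_status(self, item_id: str, status: str) -> None: ...
    def attach_evidence(self, item_id: str, evidence_path: str) -> None: ...
    def close(self, item_id: str) -> None: ...


class InMemoryAdapter:
    """Toy adapter backed by a dict, for illustration only."""

    def __init__(self, items: Dict[str, Dict[str, str]]) -> None:
        self.items = items

    def list_ready(self) -> List[str]:
        return [i for i, d in self.items.items() if d["status"] == "ready"]

    def get_details(self, item_id: str) -> Dict[str, str]:
        return dict(self.items[item_id])

    def claim(self, item_id: str) -> bool:
        if self.items[item_id]["status"] != "ready":
            return False  # already claimed or closed by someone else
        self.items[item_id]["status"] = "claimed"
        return True

    def update_status(self, item_id: str, status: str) -> None:
        self.items[item_id]["status"] = status

    def attach_evidence(self, item_id: str, evidence_path: str) -> None:
        self.items[item_id]["evidence"] = evidence_path

    def close(self, item_id: str) -> None:
        if "evidence" not in self.items[item_id]:
            raise ValueError("refusing to close without evidence")
        self.items[item_id]["status"] = "closed"


adapter: SourceAdapter = InMemoryAdapter({"42": {"title": "Fix login", "status": "ready"}})
print(adapter.list_ready())  # ['42']
print(adapter.claim("42"))   # True
print(adapter.claim("42"))   # False
```

Keeping the interface this small is what lets a tracker-specific adapter stay thin: everything above the six verbs can be shared across sources.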


🌐 11 runtimes, one protocol

One universal skill core + one set of hooks drives every runtime. An adapter is thin: it tells a runtime where to load the skills, how to arm the loop, and how to bind native speed. The skill names no runtime; the runtime detects the skill.

| Runtime | Skill load | Loop drive | Native bind |
|---|---|---|---|
| Claude Code | .claude/skills/ + plugin | Stop hook | MCP |
| Codex | AGENTS.md | self-paced | MCP / adapter |
| VS Code (Copilot) | copilot-instructions.md | tasks | MCP |
| Cursor | .cursor-plugin/ | stop+afterAgentResponse | MCP / rules |
| Antigravity | rules / AGENTS.md | self-paced | MCP |
| Kiro | .kiro/steering/ | specs | MCP |
| OpenCode | AGENTS.md | self-paced | MCP |
| Gemini | GEMINI.md | self-paced | MCP / adapter |
| Aider | CONVENTIONS.md | self-paced | none (LLM fallback) |
| Hermes | native recall | native loop | native |
| OpenClaw | plugin SDK | native scheduler | native |

The promise: same protocol, same gates, same safety on all 11 — only the speed differs. orient_clamp.py (token economy) works on every runtime with zero wiring. See adapters/MATRIX.md.


🗺️ The full flow — from demand to delivery

Every layer the orchestrator acts on, in order — from reading the demand (issues, tasks, assigns) to delivering merged, evidenced work, then looping 24/7 for more.

flowchart TD
  subgraph SRC["1 · Demand sources (any adapter)"]
    direction LR
    S1["GitHub Issues / PRs / CI"]
    S2["Jira · Azure DevOps · Linear · ClickUp · Notion · agentsview · Understand Anything (orient)"]
    S3["Assigns · TODO/FIXME · CVE · local files · LMCache (inference accelerator)"]
  end
  SRC --> PF
  subgraph PF["2 · Pre-flight gates"]
    direction LR
    P1["cost kill-switch budget · agentsview cost check"]
    P2["source auth + scopes"]
    P3["arm 24/7 watcher"]
  end
  PF --> DISC
  subgraph DISC["3 · Discover + normalize"]
    direction LR
    D1["source_adapter: list metadata only"]
    D2["normalize to canonical schema"]
    D3["dedup id+title+fingerprint+branch/PR"]
    D4["dependency DAG"]
  end
  DISC --> INTK
  subgraph INTK["4 · Deep intake (per item)"]
    direction LR
    I1["body + ALL comments"]
    I2["extract acceptance criteria"]
    I3["orient code · signatures-only reads or Understand Anything knowledge graph"]
    I4["plan + AC checklist + complexity"]
  end
  INTK --> RT{"5 · Route"}
  RT -->|"small and every item complexity at most 3"| FAST["Fast-path: solo, one targeted test"]
  RT -->|"large queue or any medium+"| POOL
  subgraph POOL["6 · Continuous worker pool (autoscaled, conflict-aware)"]
    direction LR
    W1["claim · branch · worktree if overlap"]
    W2["deterministic_edit"]
    W3["quality loop: edit-lint-test-fix"]
  end
  FAST --> QG
  POOL --> QG
  subgraph QG["7 · Quality gates"]
    direction LR
    Q1["AC gate = real DoD"]
    Q2["WORKS not just compiles · web_verify (Playwright) · video_evidence (Playwright recording · hyperframes on request)"]
    Q3["adversarial review · thermos rubrics"]
  end
  QG --> SG
  subgraph SG["8 · Safety gates (non-negotiable)"]
    direction LR
    G1["secret-scan"]
    G2["irreversible-op human gate"]
    G3["4-state verdict · attestation"]
  end
  SG --> DEL
  subgraph DEL["9 · Deliver"]
    direction LR
    L1["commit · push · Draft PR"]
    L2["close in-source + evidence"]
    L3["verify reality, not self-report"]
  end
  DEL --> FB
  subgraph FB["10 · Feedback loop to merge-ready"]
    direction LR
    F1["CI fail -> fix root cause"]
    F2["review comments -> adjust"]
    F3["branch behind main -> additive rebase"]
  end
  FB -->|"merged and closed"| DONE(["done + evidence + measured savings (only if a receipt exists)"])
  WATCH["11 · 24/7 watcher · simplicio-loop evidence-gated promise · max-iterations cap · cost kill-switch · LMCache KV cache warm"]
  FB -. "poll new work / comments / checks" .-> WATCH
  DONE -. "idle until new work" .-> WATCH
  WATCH -. "re-feed the goal" .-> DISC

🔁 The loop

The Evidence-Gated Loop is the core mechanism. It re-feeds the same goal each turn so the agent sees its own prior work. Exit is ONLY via:

  1. Evidence-gated <promise>: the turn that emits the promise MUST also carry concrete proof (passing test, merged PR, closed-item re-query). A promise with no evidence is ignored.
  2. max_iterations cap: hard safety backstop.
  3. Budget kill-switch: daily_usd_ceiling halts the loop when spent.
  4. STOP signal: .orchestrator/STOP or channel command.

Between turns, LMCache (when available) caches the KV state so re-feed costs near-zero prefill.
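
The four exits listed above can be sketched as one decision function. This is an illustration under assumptions, not the logic of hooks/loop_stop.py: the `TurnState` fields, the order in which exits are checked, and the returned labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TurnState:
    """Hypothetical snapshot of one loop turn; field names are illustrative."""
    iteration: int
    promise_emitted: bool = False
    evidence: List[str] = field(default_factory=list)  # e.g. a test log path or merged PR URL
    spent_usd: float = 0.0
    stop_requested: bool = False


def exit_reason(state: TurnState, max_iterations: int, daily_usd_ceiling: float) -> Optional[str]:
    """Return why the loop should stop, or None to re-feed the goal."""
    if state.stop_requested:
        return "stop-signal"
    if state.spent_usd >= daily_usd_ceiling:
        return "budget"
    if state.promise_emitted and state.evidence:
        return "promise+evidence"
    if state.iteration >= max_iterations:
        return "max-iterations"
    return None  # a bare promise without evidence does not end the loop


bare = TurnState(iteration=3, promise_emitted=True)
proven = TurnState(iteration=3, promise_emitted=True, evidence=["tests.log"])
print(exit_reason(bare, max_iterations=50, daily_usd_ceiling=10.0))    # None
print(exit_reason(proven, max_iterations=50, daily_usd_ceiling=10.0))  # promise+evidence
```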

🧠 Attempt memory + stall detector (anti-oscillation)

A re-feed loop that remembers nothing oscillates — try X, fail, try X again — until the cap burns. simplicio-loop keeps a durable run-journal (.orchestrator/loop/journal.jsonl, append-only: iteration · action · hypothesis · gate · error-fingerprint) and a stall detector (scripts/loop_journal.py, deterministic + model-free):

  • Error fingerprint — the failing gate output is reduced to a stable hash with line numbers, paths, hex/uuids, timestamps and durations normalized away, so the same bug is recognized across turns even when the incidental text differs.
  • Stall = K identical-fingerprint failures in a row (default K=3). A changing fingerprint means the loop is moving (PROGRESS); the same one K times means it is spinning (STALLED).
  • On STALLED the loop does not re-feed the same goal — it names the dead-end actions to avoid, then switches strategy or escalates to the human gate with the fingerprint.
  • loop_journal.py resume is read at the top of every turn, so a fresh process continues without re-deriving prior attempts (real resume) and never retries a known dead-end.
loop_journal.py resume                       # what was tried + dead-ends to avoid
loop_journal.py record --iteration N --action "" --gate fail --gate-output test.log
loop_journal.py stall --k 3 --exit-code      # PROGRESS → re-feed · STALLED → switch/escalate
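
The fingerprint and stall rules described above can be sketched in a few lines. The regular expressions below are illustrative guesses at what "normalized away" could mean; they are not the rules used by scripts/loop_journal.py, and the function names `fingerprint` and `is_stalled` are hypothetical.

```python
import hashlib
import re
from typing import List

# Applied in order, most specific first. Illustrative patterns only.
_NORMALIZERS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*"), "<ts>"),
    (re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"), "<uuid>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"[\w.\-]*(?:/[\w.\-]+)+"), "<path>"),
    (re.compile(r"\b\d+(?:\.\d+)?\s?(?:ms|s)\b"), "<dur>"),
    (re.compile(r"\bline \d+\b"), "line <n>"),
    (re.compile(r":\d+\b"), ":<n>"),
]


def fingerprint(gate_output: str) -> str:
    """Reduce failing gate output to a short hash that ignores incidental details."""
    text = gate_output
    for pattern, token in _NORMALIZERS:
        text = pattern.sub(token, text)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def is_stalled(fingerprints: List[str], k: int = 3) -> bool:
    """True when the last k failures share one fingerprint (the loop is spinning)."""
    if len(fingerprints) < k:
        return False
    return len(set(fingerprints[-k:])) == 1


run1 = 'File "/home/ci/app/test_login.py", line 12: AssertionError at 0x7f3a took 0.31s'
run2 = 'File "/tmp/run2/app/test_login.py", line 48: AssertionError at 0x1b2c took 1.2s'
print(fingerprint(run1) == fingerprint(run2))  # True
print(is_stalled([fingerprint(run1)] * 3))     # True
```

Both sample outputs normalize to the same text, so the same bug is recognized even though the path, line number, address, and duration differ between runs.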

🎬 Video evidence — Playwright by default, hyperframes on request

The loop produces demo videos as proof a change works — two engines, one video_evidence extension point (worker scripts/video_evidence.py, contract references/video-evidence.md):

  1. Default — the normal evidence flow uses Playwright. After a UI change, video_evidence records the real browser session driving the screen (Playwright native video → .webm, → .mp4 with FFmpeg) — the strongest "works, not just compiles" receipt (Step 4b) and a valid evidence-gated <promise>.

    python3 scripts/video_evidence.py verify --url http://localhost:3000/login \
        --name login-demo --expect "Sign in" --issue 42 [--upload --pr 42]
  2. On request — a personalized explainer uses hyperframes. When the deliverable IS a video ("make an explainer video of screen X"), the orchestrator renders a deterministic, captioned slideshow of the web_verify screenshots with hyperframes (by HeyGen — "same input, same frames, same output", CI-reproducible, no API keys, local render via headless Chrome + FFmpeg).

    /simplicio-tasks make an explainer video of the system login screen
    → detect: video-creation request → web_verify captures the screens
    → video_evidence verify --engine hyperframes → deterministic MP4 → attached to the PR
    

Either engine: a video that never recorded/rendered yields BLOCKED, never a fake pass. Evidence is always a file path + boolean verdict — never video bytes in context (token economy).
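
The "file path + boolean verdict" shape and the BLOCKED rule can be illustrated as follows. This is a sketch under assumptions: the `Evidence` record, the `judge_recording` function, and the use of a third FAIL status are hypothetical and not taken from scripts/video_evidence.py.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    status: str          # "PASS", "FAIL" or "BLOCKED"
    path: Optional[str]  # file path to the video; the bytes never enter the context
    verdict: bool        # True only for a real recording whose check passed


def judge_recording(video_path: str, check_passed: bool, toolchain_available: bool) -> Evidence:
    """Hypothetical verdict rule: no toolchain or no recording means BLOCKED, never PASS."""
    if not toolchain_available:
        return Evidence("BLOCKED", None, False)
    if not os.path.isfile(video_path) or os.path.getsize(video_path) == 0:
        return Evidence("BLOCKED", None, False)
    return Evidence("PASS" if check_passed else "FAIL", video_path, check_passed)


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "login-demo.mp4")  # never written
    print(judge_recording(missing, check_passed=True, toolchain_available=True).status)  # BLOCKED
```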


📊 Token economy

| Technique | Savings |
|---|---|
| deterministic_edit (L0) | 100% of edit tokens (file written mechanically, never by LLM) |
| Terminal-first execution | Facts from shell, not LLM hallucination |
| Output-reduction catalog | Caps per command type (CAP_ERRORS=20, CAP_WARNINGS=10, CAP_LIST=20), applied by orient_clamp.py |
| Tee+CCR cache on failure | Never re-run a failed command; read the cached output |
| Signatures-only reads | simplicio-cli signatures <file>: 870-line file → 65 lines (93% saved), bodies stripped |
| simplicio-compress | Terse prose + one-time memory compaction |
| orient_clamp.py | Clamp + tee on every shell command, zero wiring |
| Native response cache | Repeated deterministic (temp=0) request → served from cache, skips the LLM call (100% on hit); simplicio-cli cache, on by default (SIMPLICIO_CACHE=0 to disable) |
| Simplicio capture proxy + MCP | 60-95% fewer tokens on tool outputs via a transparent compression daemon |
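
As an illustration of the output-reduction caps in the table, the sketch below keeps at most N lines of each kind and appends a note for what was dropped. The cap values are the ones quoted above; the way lines are classified (a substring match on "error" or "warning") and the `clamp_output` name are assumptions, not the behavior of orient_clamp.py.

```python
from typing import Dict, List

# Caps quoted in the table; the classification rule below is an assumption.
CAPS: Dict[str, int] = {"error": 20, "warning": 10, "list": 20}


def clamp_output(lines: List[str], caps: Dict[str, int] = CAPS) -> List[str]:
    """Keep at most caps[kind] lines of each kind and note how many were dropped."""
    kept: List[str] = []
    seen = {kind: 0 for kind in caps}
    for line in lines:
        lower = line.lower()
        kind = "error" if "error" in lower else "warning" if "warning" in lower else "list"
        seen[kind] += 1
        if seen[kind] <= caps[kind]:
            kept.append(line)
    for kind, count in seen.items():
        if count > caps[kind]:
            kept.append(f"[clamped {count - caps[kind]} more {kind} lines]")
    return kept


noisy = [f"warning: unused variable v{i}" for i in range(25)] + ["error: build failed"]
clamped = clamp_output(noisy)
print(len(clamped))  # 12
print(clamped[-1])   # [clamped 15 more warning lines]
```

The single error line survives the clamp while the repetitive warnings are cut to ten, which is the point of capping per kind rather than truncating the whole output.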

Savings only count on a verified-correct outcome. Baseline = the cheapest sensible non-orchestrated path to the same result. Savings reporting is evidence-gated, not mandatory: a savings figure is shown only when a turn actually ran an economy-producing command and the number traces to a measured receipt (clamp tee, signatures-read, cache hit, deterministic_edit, savings_ledger). No measured economy → no savings line; the orchestrator never fabricates a baseline or a percentage. See references/token-economy.md.
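
The evidence-gating rule for the savings line can be sketched as a function that returns nothing unless a measured receipt exists. The function name, its parameters, and the output format are hypothetical; only the rule itself (no receipt or no verified outcome means no savings line) comes from the paragraph above.

```python
from typing import Optional


def savings_line(baseline: Optional[int], actual: Optional[int], receipt: Optional[str],
                 outcome_verified: bool, unit: str = "tokens") -> Optional[str]:
    """Return a savings line only when it traces to a measured receipt."""
    if not outcome_verified or not receipt:
        return None  # no verified outcome or no receipt: say nothing
    if baseline is None or actual is None or baseline <= 0:
        return None  # never invent a baseline
    saved = baseline - actual
    pct = 100.0 * saved / baseline
    return f"saved {saved} {unit} ({pct:.0f}%) [receipt: {receipt}]"


# The signatures-read figures quoted in the table: 870 lines reduced to 65.
print(savings_line(870, 65, "signatures-read", True, unit="lines"))
# saved 805 lines (93%) [receipt: signatures-read]
print(savings_line(870, 65, None, True))  # None
```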

🔎 Running simplicio-tasks: economy vs measurement (per runtime)

Two different things happen when you call simplicio-tasks, and they behave differently per runtime:

  • Economy — compression, output clamps, signatures-only reads, deterministic_edit — applies every time the skill runs and loads simplicio-orient / simplicio-compress, on any runtime. It is the skill's behavior plus the hooks (strongest where hooks exist: orient_clamp.py auto-clamps on Claude and Cursor; elsewhere it is instruction-driven).
  • Measurement — the Token Monitor's live numbers — only counts traffic that flows through the capture proxy.

| Runtime | Economy (skill) | Measurement (monitor) |
|---|---|---|
| Hermes | | automatic: already routed through the proxy (base_url → :8788) |
| Claude | ✓ (skill + hooks) | ✗ by default: Claude talks to api.anthropic.com directly; measured only once routed (simplicio-cli wrap claude, or ANTHROPIC_BASE_URL → http://127.0.0.1:8788) |
| Codex | ✓ (skill) | ✗ by default: simplicio-cli init codex adds the MCP tools but does not route LLM traffic; measured with simplicio-cli wrap codex or an OpenAI base-url pointing at the proxy |

So: the savings happen on every runtime; the monitor tallies them automatically on Hermes, and on Claude/Codex after a one-time routing step (simplicio-cli wrap … / base-url → :8788). Without routing, the economy still applies — the monitor just won't count those tokens. scripts/simplicio-economy.sh wire does this routing for OpenAI-compatible clients at install time.

📈 Simplicio Token Monitor

The monitor is a view of the savings that you open on demand; only the capture proxy is always on:

  • Capture proxy: always-on (the one auto-started service; the wired clients need it reachable). It silently captures + measures Claude + Codex + Hermes in the background.
  • Web dashboard: http://127.0.0.1:9090, with a real-time token chart, savings gauge, the LLMs/runtimes and 141/144 providers (98%) we intercept, and a live proxy log. It opens once on the first install so you see it works; after that it is on-demand. Re-open it any of these ways:
    • simplicio-loop dashboard — works from anywhere after the pip install (no repo path needed); simplicio-loop dashboard --stop to close, --no-browser to just start the server.
    • bash scripts/simplicio-economy.sh monitor (repo checkout) · … monitor stop to close.
    • just ask the agent — "open the token dashboard".
  • Menu-bar / tray widget — live tokens saved in the system tray (macOS rumps · Windows/Linux pystray). On-demand: bash scripts/simplicio-economy.sh tray · … tray stop.

Install auto-starts only the capture proxy (macOS launchd · Linux systemd · Windows Startup). The dashboard opens once on a fresh install (marker-guarded — a re-install/update never reopens it; opt out with SIMPLICIO_NO_DASHBOARD=1), and the tray never opens by itself — nothing is forced to stay open. Manage the stack: scripts/simplicio-economy.sh {status|up|monitor|tray|wire}. After install, capture runs without invoking the loop — see references/token-capture.md.

🛠️ The capture engine — one native module, every command

engine/simplicio_engine.py is the native Simplicio capture engine (stdlib-only, fail-open) — a full reimplementation of the upstream headroom surface with no external dependency. Run any command via the scripts/simplicio-engine wrapper (e.g. simplicio-engine doctor):

| Command | What it does |
|---|---|
| proxy | the transparent capture proxy: routes each model to its real provider, compresses + measures + caches (no model swap) |
| doctor | proxy reachability + lifetime savings |
| cache | native response cache (stats/clear): a repeated deterministic request is served from cache, skipping the LLM call |
| signatures | signatures-only view of a source file (bodies stripped, ~93% fewer tokens to read code) |
| semantic | reversible extractive (semantic-lite) compression |
| kompress | ONNX semantic token-pruning via the real kompress-v2-base model |
| detect | content-type detection + smart per-block routing |
| rag | TF-IDF (or --ml embedding) retrieval over the CCR memory store |
| memory | CCR compress-cache-retrieve store (remember/recall/forget/list/stats) |
| mcp | native stdio MCP server (compress / retrieve / stats tools) |
| init / wrap | register Simplicio into a client (Claude / Codex / Copilot / OpenClaw) · run a client with capture routing |
| report / audit / capture / evals | savings report · audit a tree for compression opportunity · dry-run a request · compression regression gate |

🧠 Optional real ML models — pip install "simplicio-loop[onnx]"

Four real, public (Apache-2.0) ONNX models run natively — the same models the upstream uses. Without the extra, the deterministic stdlib path covers everything; models download on first use.

| Model | Command | Use |
|---|---|---|
| kompress-v2-base | simplicio-cli kompress | semantic token pruning |
| technique-router-onnx | simplicio-cli router | technique routing |
| all-MiniLM-L6-v2-onnx | simplicio-cli embed · rag --ml | embeddings + semantic RAG |
| siglip-image-encoder-onnx | simplicio-cli image | image-compression content verifier |

⚙️ Native Rust performance core (optional)

rust/ ships four crates ported + rebranded from the upstream (Apache-2.0; NOTICE credits it): simplicio-core (compressors + smart-crusher), simplicio-py (PyO3 bindings), simplicio-proxy (axum reverse proxy), simplicio-parity (Rust↔Python parity harness). Build with maturin — the Python engine works fully without them; the crates only add native speed.


🏛️ Design pillars (in detail)

Four mechanisms sustain the orchestration power:

| Pillar | Focus | Lives in |
|---|---|---|
| DAG + pipeline | parallelism by dependency, staged per item | references/orchestration.md (Step 3 pool + pipeline) |
| Isolation by worktree | parallel edits without corrupting the tree, merge-gated | references/orchestration.md |
| Adversarial verify | panel of skeptics before "delivered" | references/quality-safety-delivery.md · skill simplicio-review |
| Loop budget cap | anti-infinite-loop, dual exit | references/standing-loop-247.md · skill simplicio-loop |
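
The "parallelism by dependency" pillar can be illustrated with the standard library's `graphlib`: items whose dependencies are all done form a wave that workers could take in parallel. This is a sketch of the general technique, not the scheduler described in references/orchestration.md; the `ready_batches` name and the issue ids are made up for the example.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def ready_batches(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group items into waves; every item in a wave has all its dependencies done."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on a dependency cycle
    batches: List[List[str]] = []
    while sorter.is_active():
        wave = sorted(sorter.get_ready())
        batches.append(wave)
        sorter.done(*wave)  # a real pool would mark items done as workers finish
    return batches


# Hypothetical issue ids: #3 depends on #1 and #2, #2 depends on #1, #4 is independent.
graph = {"#3": {"#1", "#2"}, "#2": {"#1"}, "#4": set()}
print(ready_batches(graph))  # [['#1', '#4'], ['#2'], ['#3']]
```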

🚀 Install & use

git clone https://github.com/wesleysimplicio/simplicio-loop
cd simplicio-loop

# install for your runtime (omit <runtime> to auto-detect)
bash scripts/install.sh <runtime> [--global] [--minimal]        # macOS / Linux
pwsh scripts/install.ps1 <runtime> [-Global]                    # Windows
# <runtime> ∈ claude codex vscode cursor antigravity kiro opencode gemini aider hermes openclaw

Install is complete by default — it installs everything. One command sets up the whole stack: the two loop operators (simplicio-mapper + simplicio-cli, auto-handling PEP 668 / externally-managed Python and symlinking the binaries onto PATH), the full Python stack (the package + the [onnx] models backend: onnxruntime + huggingface_hub + tokenizers + pillow, so simplicio-cli kompress/router/embed/image work), the 6 skills + hooks with the loop's Stop hook wired, and the always-on capture proxy with Claude + Codex + Hermes routed and measured in the background. The dashboard opens once on a fresh install, then it's on-demand (simplicio-loop dashboard / simplicio-economy.sh monitor); the menu-bar tray never opens by itself — nothing is forced to stay open. Pass --minimal only for headless/CI to skip the heavy deps + the machine services. Verify any time: bash scripts/simplicio-economy.sh status.

Update

bash scripts/update.sh [<runtime>]    # git pull → reinstall skills/hooks/operators → restart services

update.sh stashes local edits, fast-forwards main, reinstalls from the fresh source, restarts the launchd/systemd services so they run the new code, and prints the live stack + savings.

Doctor — verify + repair

python3 scripts/doctor.py            # report the whole stack (REQUIRED vs OPTIONAL)
python3 scripts/doctor.py --repair   # install/wire what's fixable; make everything operational
# also: bash scripts/simplicio-economy.sh doctor [--repair]

doctor separates REQUIRED (python3, the two loop operators, the 6 skills, the loop hooks, the capture proxy — --repair installs/wires them) from OPTIONAL accelerators (the ONNX models backend, the native Rust core, the tray dep). Missing an optional piece is never a failure and never blocks — the Python engine + the deterministic path cover everything; the exit code is 0 as long as every REQUIRED item is healthy.

Or, on Claude Code / Cursor, install it straight from the latest GitHub release (no marketplace):

gh release download --repo wesleysimplicio/simplicio-loop --archive tar.gz
tar xzf simplicio-loop-*.tar.gz && cd simplicio-loop-*/
bash scripts/install.sh claude    # or: bash scripts/install.sh cursor

Then:

/simplicio-tasks finish all the open issues

The only requirement is python3 on PATH (skills, hooks, and installer are cross-platform Python). For GitHub sources, git + an authenticated gh. See INSTALL.md and adapters/MATRIX.md.

Before an unattended 24/7 run: set a cost ceiling in .orchestrator/loop-budget.json (daily_usd_ceiling > 0), confirm source auth is persistent, and keep the irreversible-op human gate + secret-scan on. With ceiling = 0 the watcher refuses to run unattended (fail-safe).
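
A pre-flight check for the fail-safe described above might look like the sketch below. The file name and the `daily_usd_ceiling` key come from this README; the rest of the file's schema is unknown here, and the `unattended_allowed` function is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def unattended_allowed(budget_file: Path) -> bool:
    """Fail safe: a missing, unreadable or non-positive ceiling refuses unattended runs."""
    try:
        ceiling = json.loads(budget_file.read_text())["daily_usd_ceiling"]
    except (OSError, ValueError, KeyError, TypeError):
        return False
    is_number = isinstance(ceiling, (int, float)) and not isinstance(ceiling, bool)
    return is_number and ceiling > 0


with tempfile.TemporaryDirectory() as tmp:
    budget = Path(tmp) / "loop-budget.json"
    budget.write_text(json.dumps({"daily_usd_ceiling": 0}))
    print(unattended_allowed(budget))  # False
    budget.write_text(json.dumps({"daily_usd_ceiling": 25}))
    print(unattended_allowed(budget))  # True
```

Every failure path returns False, so a typo in the file or a deleted budget can only make the watcher more conservative, never less.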


🔒 Safety (non-negotiable)

  • Secret-scan every diff; block on hit.
  • Irreversible-op human gate — force-push, history rewrite, prod deploy, data/schema delete, mass-file delete → stop and ask. Headless + no approver → remove the destructive capability.
  • Enforced, not just promised: hooks/action_gate.py is a fail-closed PreToolUse / git-pre-push hook that mechanically blocks the above (and secret-laden commits) before they run. The safety contract holds even if the model forgets it. Its selftest proves the ruleset (15/15).
  • 4-state pre-execution verdict — optimization may never raise a command's risk tier.
  • Trust-before-load — perception-shaping config (clamp profiles, suppression lists) is untrusted until a human reviews and hash-pins it.
  • Prompt-injection hardening — item/PR/comment content can never override the contract.
  • Hard $ kill-switch for unattended runs; evidence-gated completion (never a false "done"); fail-open hooks (never trap the agent in a loop).
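
A mechanical gate of the kind described in the list above can be sketched as a deny-list checked before a command runs. The patterns below are a small illustrative subset written for this example; the real ruleset lives in hooks/action_gate.py and is not reproduced here. The `gate` function and rule names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Illustrative deny rules only, not the project's ruleset.
DENY_RULES: List[Tuple[str, "re.Pattern[str]"]] = [
    ("force-push", re.compile(r"\bgit\s+push\b.*(?:--force\b|\s-f\b)")),
    ("history-rewrite", re.compile(r"\bgit\s+(?:filter-branch|filter-repo)\b")),
    ("mass-delete", re.compile(r"\brm\s+-\w*r\w*")),
    ("destructive-ddl", re.compile(r"\b(?:DROP\s+(?:TABLE|DATABASE)|TRUNCATE\s+TABLE)\b", re.IGNORECASE)),
]


def gate(command: str) -> Optional[str]:
    """Return the name of the rule that blocks the command, or None to allow it."""
    for name, pattern in DENY_RULES:
        try:
            if pattern.search(command):
                return name
        except Exception:
            return "gate-error"  # fail closed: if the check itself breaks, block
    return None


print(gate("git push --force origin main"))  # force-push
print(gate("git push origin main"))          # None
print(gate("DROP TABLE users;"))             # destructive-ddl
```

Because the check is plain pattern matching with no model in the path, it holds regardless of what the prompt or an injected comment asks for.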

✅ Tests & local checks (no paid CI)

Claims are verified, not just asserted — and the gate runs locally, with zero CI cost:

python3 scripts/check.py            # the whole gate (audit + tests)
  • Test suite (tests/) — the workers' deterministic selftests, plus an e2e of the loop driver (hooks/loop_stop.py): it proves the loop stops on evidence, ignores a bare <promise>, and stops on the cap as distinct exits — and that the evidence producers BLOCK (never fake-pass) when their toolchain is absent. Runs under pytest or, with no pip at all, self-runs on bare python3 (python3 tests/test_*.py).
  • Claims audit (scripts/claims_audit.py, fail-closed) — every scripts/*.py the docs reference exists · the extension-point count agrees across all files · each cited worker command actually runs · the shipped simplicio_loop/_bundle/ skills are byte-identical to source.
  • Wire it as a git pre-push hook to keep main honest for free:
    printf '#!/bin/sh\npython3 scripts/check.py\n' > .git/hooks/pre-push && chmod +x .git/hooks/pre-push

pip install "simplicio-loop[dev]" adds pytest for nicer output; it is never required.


📄 License

MIT
