[docs] Sandbox runtime metering — scoped-resource design#4783
Draft
mmabrouk wants to merge 2 commits into
Design for metering agent-runner (Daytona) sandbox wall-time as a billable per-minute meter: capture runtime on the runner span, charge per-org in the tracing worker via check_entitlements, soft-gate at the invocation edge, and report to Stripe. Includes research grounding and an implementation checklist.
…ents Pivot the sandbox runtime metering design after verifying Daytona's API: there is no API-reachable per-sandbox usage/cost (dashboard-only, JWT-only /usage endpoint, 48h lag), and our sandboxes are ephemeral, so a Daytona-pull cron cannot be the billing source. Measurement stays in the runner; Daytona labels are repurposed for audit/reconciliation only. Reframe around a configurable, scoped resource (entitlement): gate folded into the cached auth check, trusted post-run report joined to the scope recorded at the gate (no new secret, attribution from the run record, idempotent on sandbox_id), project-scoped by default with user/agent scopes as later rungs.
Context
We pay Daytona by the minute for the ephemeral VM that runs an agent, but that runtime never reaches our pricing surface — no meter, no per-plan limit, nothing reported to Stripe. This design adds sandbox wall-time as a first-class billable dimension, modeled as a configurable, scoped resource so it also lays the first rail for "limit usage per project / agent / user."
Documentation only. Adds a design under `docs/designs/sandbox-runtime-metering/` (proposal, research, tasks). No code or schema changes.
The model
A resource is a named, scoped entitlement with a quota, which is exactly what the EE `check_entitlements` / `Quota` / `Scope` / `meters` layer already provides. A request declares the resource it's about to consume; the system checks entitlement at the same cached point it already checks auth, and only then runs; the consumed minutes are booked after. Shipped project-scoped by default (`Scope.PROJECT`), with `USER` available today and a new `AGENT` scope as an explicit phase 2.
What changed since the first draft (and why)
The first draft pulled runtime out of the OTel trace pipeline. We then explored "tag sandboxes, let a cron pull usage from Daytona, never touch the run path." Verifying Daytona's API killed that approach for our workload:
- The `/organizations/:id/usage` endpoint is live quota snapshots, not cost; it is org/region-scoped and currently JWT-only, not API-key callable (open issue daytonaio/daytona#4643).
- Our sandboxes are ephemeral (… `finally`), so a `list()` cron finds them already gone, and there's no `startedAt`/`stoppedAt` to reconstruct runtime from.
So measurement stays in the runner (the only component that observes a full lifetime), and Daytona labels are repurposed for audit / leak-detection / reconciliation, not billing.
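As an illustration of why the runner is the natural measurement point, here is a minimal, language-neutral sketch written in Python (the actual runner is the TypeScript `runRivet()`; this is not its code). It brackets the whole sandbox lifetime with a monotonic clock and reports the elapsed milliseconds even when the run fails. The names `metered_run`, `billable_minutes`, and the four callbacks are hypothetical, and rounding up to whole minutes is an assumed billing rule that the design does not specify.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def billable_minutes(runtime_ms: int) -> int:
    """Round wall-time up to whole minutes (assumed rule, not from the design)."""
    return -(-runtime_ms // 60_000)  # integer ceiling division


def metered_run(
    start: Callable[[], str],
    run: Callable[[str], T],
    destroy: Callable[[str], None],
    report: Callable[[str, int], None],
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Bracket start -> destroy (warmup included) and report the runtime in ms.

    All callbacks are hypothetical stand-ins for the runner's own functions.
    """
    t0 = clock()  # taken before start() so warmup is counted
    sandbox_id: Optional[str] = None
    try:
        sandbox_id = start()
        return run(sandbox_id)
    finally:
        # If start() itself failed there is no sandbox to destroy or bill.
        if sandbox_id is not None:
            try:
                destroy(sandbox_id)
            finally:
                # Report even if the run or the teardown raised.
                report(sandbox_id, round((clock() - t0) * 1000))
```

Injecting `clock` keeps the sketch testable; a monotonic clock is used so that system clock adjustments cannot produce negative or inflated runtimes.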
Design (three insertion points + an audit cron)
- Gate: `check_entitlements(resource, cache=True)` folded into the cached auth check: "authenticated and entitled to run?" Returns 429 once the project is over its monthly minute budget. Records `run_id → resolved scope` for attribution.
- Measure: `services/agent/src/engines/rivet.ts::runRivet()` already brackets `SandboxAgent.start()` → `destroySandbox()` (warmup included); capture `runtimeMs` and tag the sandbox with `labels`.
- Account: a post-run report of `(run_id, sandbox_id, minutes)` goes to an internal endpoint authed as the agent service's existing credential (not the admin key). Attribution comes from the run record (never the payload), so it can't bill arbitrary tenants and adds no new secret; idempotent on `sandbox_id`; charges via the same atomic, fail-open `check_entitlements` every meter uses.
- Audit cron: `list()` non-deleted sandboxes by label to flag orphaned/leaked VMs and sanity-check the 48h-lagged dashboard. Not a billing source.
Everything else is the well-worn "add a counter" path (`extend-meters`): one enum member, a `Quota(scope=PROJECT, period=MONTHLY, strict=True)` per plan, one Alembic enum migration (template exists), add the slug to `REPORTS` so the existing meters→Stripe cron flushes it (project rows roll up per org via `organization_id`), and `/billing/usage` surfaces it.
Semantics worth flagging
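To make the gate and accounting behaviour concrete, here is a minimal in-memory sketch, assuming a single process and no caching. `MinuteMeter`, `gate`, and `report` are hypothetical names for illustration only; they are not the real `check_entitlements` signature, and the atomic, fail-open, and cached aspects of the real layer are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class MinuteMeter:
    """Hypothetical stand-in for the metering store; not the EE API."""

    monthly_quota: int
    booked: dict[str, int] = field(default_factory=dict)     # scope -> minutes
    run_scope: dict[str, str] = field(default_factory=dict)  # run_id -> scope
    seen_sandboxes: set[str] = field(default_factory=set)    # idempotency keys

    def gate(self, run_id: str, scope: str) -> int:
        """Admit (200) or reject (429) based on the last-booked value."""
        if self.booked.get(scope, 0) >= self.monthly_quota:
            return 429
        # Record the resolved scope so the later report cannot choose it.
        self.run_scope[run_id] = scope
        return 200

    def report(self, run_id: str, sandbox_id: str, minutes: int) -> bool:
        """Book minutes once per sandbox; returns False if nothing was booked."""
        scope = self.run_scope.get(run_id)  # attribution from the run record
        if scope is None or sandbox_id in self.seen_sandboxes:
            return False  # unknown run, or a retried report for the same sandbox
        self.seen_sandboxes.add(sandbox_id)
        self.booked[scope] = self.booked.get(scope, 0) + minutes
        return True
```

With `MinuteMeter(monthly_quota=10)`, a run admitted at 0 booked minutes that then reports 12 minutes is still billed in full, a retried report for the same `sandbox_id` is ignored, and the next `gate` call for that scope returns 429.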
Post-paid: a run already in flight finishes and is billed; the gate reads the last-booked value, so this is a soft, slightly-lagged budget guardrail, not a hard real-time cutoff (`strict` bounds overshoot to one run).
Files
- `proposal.md`: the resource model, the Daytona verdict, the gate/measure/account flow, registry/Stripe/DB steps, reconciliation cron, risks.
- `research.md`: grounding in current metering/billing/sandbox code (file:line) and the cited Daytona API findings.
- `tasks.md`: ordered checklist + open inputs (per-plan numbers, join-key/store, internal report auth, #4643 tracking, phase-2 `Scope.AGENT`).
Notes
`tasks.md`: the `run_id → scope` join store (durable vs Redis) and the exact existing internal credential for the report endpoint.
🤖 Generated with Claude Code
https://claude.ai/code/session_01MdaZVVA8e9LHk2ZrsEJEBj