From e0258f3bf9c5d8260ca63df93646fc27eb4314bb Mon Sep 17 00:00:00 2001 From: acossa Date: Fri, 8 May 2026 19:34:27 +0200 Subject: [PATCH 1/5] docs(readme): publish WorkIt README --- README.md | 847 +++++++++++++++++++++++++----------------------------- 1 file changed, 387 insertions(+), 460 deletions(-) diff --git a/README.md b/README.md index 73f1885..d5ab1d8 100644 --- a/README.md +++ b/README.md @@ -5,14 +5,20 @@ SPDX-License-Identifier: Apache-2.0 # WorkIt -Structured concurrency for TypeScript systems that need owned async work, cancellation, cleanup, limits, and observability. +Structured concurrency for TypeScript systems that need owned async work: +bounded parallelism, cancellation, cleanup, retries, timeouts, budgets, +backpressure, worker offload, and observable task lifecycles. -Native `Promise` remains appropriate for one-off async values. WorkIt is intended for async work that needs ownership. +Native `Promise` is still the right tool for one asynchronous value. WorkIt is +for the next step: a request, batch, agent run, provider race, stream, or +background operation where related async tasks must live, fail, cancel, and +clean up together. [![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE) [![Node](https://img.shields.io/badge/node-%3E%3D20.11-brightgreen)](package.json) [![Runtime deps](https://img.shields.io/badge/runtime%20dependencies-0-brightgreen)](package.json) -[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen)](#verified-evidence) +[![Verification](https://img.shields.io/badge/verify-green-brightgreen)](#verified-evidence) +[![Article benches](https://img.shields.io/badge/article%20benches-19%2F19-brightgreen)](benchmarks/articles/) ## Install @@ -20,605 +26,526 @@ Native `Promise` remains appropriate for one-off async values. WorkIt is intende npm install @workit/core ``` -WorkIt currently targets Node.js server runtimes. 
Browser and edge runtimes resolve to an explicit unsupported-runtime boundary. +WorkIt targets Node.js server runtimes today. Browser and edge runtimes resolve +to an explicit unsupported-runtime boundary. -## Guide - -Use WorkIt by choosing the smallest primitive that owns the work you need: - -| Need | Use | -| --- | --- | -| One operation with child tasks | `group(async (task) => ...)` | -| A few task functions together | `run.all()`, `run.race()`, `run.any()`, `run.series()` | -| Bounded concurrency with ordered results | `run.pool(concurrency, tasks)` | -| Batch transforms over items | `work(items).inParallel(n).do(fn)` | -| Retry, timeout, fallback, or resource cleanup | `run.retry()`, `run.timeout()`, `run.fallback()`, `run.bracket()` | -| Streaming batches with backpressure | `work(items).stream()` | -| Producer-consumer coordination | `@workit/core/channel` | -| Snapshot diagnosis | `@workit/core/diagnostics` | -| AI tool/budget helper contracts | `@workit/core/ai` | -| OpenTelemetry bridge | `@workit/core/otel` | -| CPU or non-cooperative work boundary | `@workit/core/worker` | - -The main rule is: keep native `Promise` for single async values, and use WorkIt when the operation needs ownership, cancellation, cleanup, bounded concurrency, budgets, diagnostics, or observable task events. 
- -## Example +## Quick Start ```ts -import { group } from "@workit/core"; - -const result = await group(async (task) => { - const profile = task(async (ctx) => { - return await fetchProfile({ signal: ctx.signal }); - }, { name: "profile.load" }); - - const account = task(async (ctx) => { - return await fetchAccount({ signal: ctx.signal }); - }, { name: "account.load" }); - - task.background(async (ctx) => { - ctx.defer(() => flushAuditBuffer()); - await writeAuditEvent({ signal: ctx.signal }); - }, { name: "audit.write" }); +import { work } from "@workit/core"; - return { - profile: await profile, - account: await account, - }; -}); +const doubled = await work([1, 2, 3]) + .inParallel(2) + .do(async (value, _ctx) => value * 2); ``` -If an owned foreground task fails, WorkIt cancels sibling work, preserves the cancellation reason, and runs registered cleanup before the scope closes. - -## Why WorkIt Exists - -JavaScript promises model async values. They do not model ownership. - -In production systems, async work often needs more than `Promise.all()`: - -| Requirement | Raw Promise | WorkIt | -| --- | --- | --- | -| Owned task tree | Manual implementation | Provided by scope model | -| Cancel siblings on failure | Manual implementation | Scope cancellation | -| Typed cancellation reason | Manual implementation | `CancelReason` | -| Cleanup before scope closes | Manual implementation | `ctx.defer()` | -| Bounded concurrency | External helper or custom queue | `run.pool()` and `work().inParallel()` | -| Retry and timeout composition | Manual implementation | Task wrappers | -| Budget accounting | Manual implementation | Context budgets | -| Diagnostics | Manual implementation | Snapshot diagnostics | -| Safe telemetry export | Manual | Opt-in | - -Typical use cases include backend orchestration, agent task trees, RAG ingestion, provider races, batch processing, streaming transcription, worker offload, and cancellation-safe tool execution. 
- -## Use Cases And Non-Goals +The context parameter is available when the task needs cancellation, progress, +budgets, or scoped resources. It can be ignored for plain transformations. -Use WorkIt when: +## Why Ownership Matters -- multiple async tasks belong to one operation -- child failures should cancel sibling work -- cleanup must run before returning -- provider calls need timeout, retry, fallback, or racing -- batch work needs bounded concurrency and partial-result policy -- tool execution needs token, cost, or call budgets -- task events must be observable without leaking provider internals +Consider this batch helper: -Do not use WorkIt when: - -- a single `await fetch()` is enough -- you only need a tiny local semaphore -- you need distributed rate limiting or cluster reservoirs -- you need browser or edge runtime support today -- your task body cannot cooperate with cancellation and cannot be moved to a worker - -## Guarantees - -The Node.js runtime is designed around these contracts: - -- every non-detached task belongs to exactly one scope -- scopes wait for owned children before closing -- scope cancellation propagates through `AbortSignal` -- non-background child failure cancels sibling work -- deferred cleanup runs in last-in, first-out order -- retry sleeps and rate-limit waits remove abort listeners -- idempotency handles are pruned after task settlement -- `run.any()` and `run.race()` preserve parent cancellation reasons -- cleanup failures emit typed cleanup events -- budgets inherit through scope context with explicit shadowing -- telemetry export is opt-in, sampled, bounded, and circuit-broken -- worker offload rejects remote URLs, inline URLs, traversal paths, functions, symbols, and class instances - -## Verified Evidence - -The repository contains executable gates for runtime behavior, package behavior, supply-chain policy, and scale smoke tests. 
-Current verification evidence:
-
-| Gate | Result |
-| --- | --- |
-| Unit tests | 214 tests passing |
-| Coverage | 100% statements, branches, functions, lines |
-| Runtime dependencies | 0 production dependencies |
-| Public API exports | 7 locked package export paths |
-| Public bundle | 29,255 B minified / 9,694 B gzip |
-| Core group import | 14,175 B minified / 4,835 B gzip |
-| Soak | 100,000 logical tasks, bounded concurrency |
-| Stream memory | 1,000,000 logical items, bounded producer growth |
-| Exporter stress | 100,000 events with bounded queue |
-| Package consumer | ESM, CJS, TypeScript, framework fixtures |
-| Security | headers, no-network, vulnerability, SBOM, release-policy gates |
-
-Run the full gate:
+```ts
+type BatchEvent<T> =
+  | { type: "item:started"; item: T; attempt: number }
+  | { type: "item:retried"; item: T; attempt: number; error: unknown }
+  | { type: "item:completed"; item: T };
+
+type BatchOptions<T, R> = {
+  concurrency: number;
+  retries: number;
+  timeoutMs: number;
+  signal: AbortSignal;
+  events: { emit(event: BatchEvent<T>): void };
+  run: (item: T, options: { signal: AbortSignal }) => Promise<R>;
+};
+
+function backoffMs(attempt: number): number {
+  return Math.min(1000 * 2 ** (attempt - 1), 10_000);
+}

-```sh
-npm run verify
-```

+function sleep(ms: number, signal?: AbortSignal): Promise<void> {
+  if (signal?.aborted) return Promise.reject(signal.reason);

-Run coverage:

+  return new Promise((resolve, reject) => {
+    let settled = false;

-```sh
-npm run test:coverage
-```

+    const finish = (): void => {
+      if (settled) return;
+      settled = true;
+      signal?.removeEventListener("abort", abort);
+      resolve();
+    };

-Run public proof validation:

+    const abort = (): void => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      reject(signal?.reason);
+    };

-```sh
-npm run check:public-proof
-```

+    const timer = setTimeout(finish, ms);
+    signal?.addEventListener("abort", abort, { once: true });
+  });
+}

-Machine-readable reviewer
evidence lives in `benchmarks/public-proof.json`. The public proof gate keeps that artifact aligned with the README, benchmark fixtures, migration guides, and runtime matrix.

+async function runBatch<T, R>(
+  items: readonly T[],
+  options: BatchOptions<T, R>
+): Promise<R[]> {
+  const results = new Array<R>(items.length);
+  let nextIndex = 0;
+
+  async function worker(): Promise<void> {
+    while (!options.signal.aborted) {
+      const index = nextIndex++;
+      if (index >= items.length) return;
+
+      const item = items[index];
+      for (let attempt = 1; attempt <= options.retries + 1; attempt++) {
+        const timeout = AbortSignal.timeout(options.timeoutMs);
+        const signal = AbortSignal.any([options.signal, timeout]);
+
+        try {
+          options.events.emit({ type: "item:started", item, attempt });
+          results[index] = await options.run(item, { signal });
+          options.events.emit({ type: "item:completed", item });
+          break;
+        } catch (error) {
+          if (attempt > options.retries || options.signal.aborted) throw error;
+          options.events.emit({ type: "item:retried", item, attempt, error });
+          await sleep(backoffMs(attempt), options.signal);
+        }
+      }
+    }
+  }

-## Core API

+  await Promise.all(
+    Array.from({ length: options.concurrency }, () => worker())
+  );

-```ts
-import {
-  group,
-  run,
-  work,
-  renderTree,
-  createBudget,
-  createContextKey,
-  CostBudget,
-  TokenBudget,
-  TelemetryBudget,
-} from "@workit/core";
-```

+  return results;
+}
+```

-| Export | Purpose |
-| --- | --- |
-| `group()` | Opens an owned task scope. |
-| `run` | Task combinators: all, race, any, pool, retry, timeout, fallback, bracket, bounded shields, and execution helpers. |
-| `work()` | Batch builder with concurrency, retry, timeout, filtering, mapping, error policy, and streaming. |
-| `renderTree()` | Stable text rendering for scope snapshots. |
-| `createContextKey()` | Typed context keys. |
-| `createBudget()` | Typed cooperative budget keys.
| +It covers bounded parallelism, timeout, parent cancellation, ordered results, +typed events, and retry backoff. The lifecycle is still split across the queue, +timeout signals, retry loop, event sink, and caller. Adding sibling +cancellation, cleanup, budgets, partial results, or diagnostics extends the +same ownership protocol in several places. -## Run Helpers +With WorkIt, the ownership boundary is the API: ```ts -import { run } from "@workit/core"; - -const fastest = await run.race([ - run.timeout(callPrimary, "800ms"), - run.timeout(callReplica, "800ms"), -]); - -const resilient = run.fallback( - run.retry(callProvider, { times: 3, backoff: "exponential" }), - callBackupProvider -); +import { work } from "@workit/core"; -const batch = await run.pool(8, inputs.map((input) => async (ctx) => { - return await processInput(input, ctx.signal); -})); +const results = await work(items) + .inParallel(8) + .withRetry(3) + .withTimeout("5s") + .do(async (item, ctx) => { + ctx.report({ message: `processing ${item.id}` }); + return apiCall(item, { signal: ctx.signal }); + }); ``` -`run.race()` and `run.any()` cancel losing work. `run.pool()` preserves result order while bounding concurrency. - -## Retry Policy +This gives the batch one lifecycle contract: + +- at most 8 items run at once +- transient failures retry with cancel-aware backoff +- each item has a 5 second timeout +- every task receives the same cancellation model through `ctx.signal` +- progress is a typed runtime event +- queued and active work stop together when the owner is cancelled + +## What WorkIt Replaces + +WorkIt does not replace promises as values. It replaces repeated lifecycle +orchestration around promises. 
+ +| Existing pattern | Real limitation | Ownership contract | WorkIt primitive | +|---|---|---:|---| +| Hand-written concurrency queue | Queue, retry, timeout, and caller cancellation each own part of the lifecycle | no single owner | `work().inParallel()` / `run.pool()` | +| Manual scope object with cancellation tokens | Works until every new feature must reimplement the same lifecycle rules | local convention | `group()` / `run.*` | +| Provider race with `Promise.race()` | Losing calls keep running unless each branch is wired to cancellation | no | `run.race()` | +| Retry loop with delayed backoff | Cancellation has to be remembered in every sleep and retry branch | no | `run.retry()` | +| Request fan-out with `Promise.all()` | Sibling cancellation and cleanup are not part of the value contract | no | `group()` / `run.all()` | +| Manual `try/finally` cleanup | Cleanup can hang or obscure the original failure without an explicit policy | partial | `run.bracket()` | +| Async iterator prefetch | Producer control and consumer demand are easy to separate accidentally | partial | `work().stream()` | +| Ad hoc token or cost counters | Nested work can charge the wrong owner without a shared context contract | partial | context budgets | +| CPU loop with `AbortController` | Cooperative signals cannot preempt the main thread | no | `offload()` | + +WorkIt's ownership contract is the combination of scope, cancellation reason, +child task set, defer stack, context, and event stream. + +## Mental Model + +WorkIt creates a scope tree. A scope owns its tasks. When a foreground task +fails, times out, or is cancelled, the scope cancels sibling work, waits for +owned children, runs cleanup, emits lifecycle events, and then settles. 
+ +```mermaid +flowchart TD + A[scope created] --> B[foreground tasks start] + A --> C[background tasks start] + B --> D{failure, timeout, or cancel} + D -- no --> E[foreground values settle] + D -- yes --> F[cancel siblings with typed reason] + E --> G[await owned background tasks] + F --> G + G --> H[run defer stack and bracket cleanup] + H --> I[scope closes] + A -. explicit escape .-> J[detached task] + J -. not awaited by scope .-> K[external owner required] +``` -Retry defaults are resilience-oriented, not micro-benchmark-oriented. +Rules: -Both `run.retry(task, 3)` and `work(items).withRetry(3)` normalize to: +1. Every task runs inside exactly one scope. +2. A scope owns cancellation, cleanup, context, child tasks, and events. +3. Cancelling a scope aborts its signal and propagates a typed reason. +4. A scope cannot close while non-detached children are still pending. +5. `background` is owned and delays closure. +6. `detached` is explicit and transfers ownership to the caller. -```ts -{ - times: 3, - backoff: "exponential", - initialDelay: 100, - maxDelay: 30_000, - jitter: true, -} -``` +## Core API -`times` is the maximum number of attempts including the first attempt. With `times: 3`, WorkIt can run one initial attempt and two retries. 
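The attempt accounting above can be sketched numerically. This is a deterministic sketch of the documented defaults (exponential backoff from a 100 ms initial delay, capped at 30 s); jitter is omitted here for reproducibility even though the real defaults enable it, and `backoffDelayMs` is an illustrative name, not WorkIt API.

```ts
// Exponential backoff schedule: delay doubles per retry, capped at maxDelay.
function backoffDelayMs(
  attempt: number,
  initialDelayMs = 100,
  maxDelayMs = 30_000
): number {
  return Math.min(initialDelayMs * 2 ** (attempt - 1), maxDelayMs);
}

// With times: 3 there is one initial attempt and up to two retries,
// so at most two sleeps occur: before retry 1 and before retry 2.
const schedule = [1, 2].map((retry) => backoffDelayMs(retry));
// schedule → [100, 200]
```

The cap matters for long-running supervision: by the twentieth retry the raw exponential would be in the tens of thousands of seconds, but the schedule stays pinned at 30 s.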
+| Need | Use | +|---|---| +| One owned operation with child tasks | `group(async (task) => ...)` | +| Batch work over items | `work(items)` | +| Bounded parallel task functions | `run.pool(concurrency, tasks)` | +| Safer `Promise.all` / `race` / `any` | `run.all()`, `run.race()`, `run.any()` | +| Retry, timeout, fallback, hedge | `run.retry()`, `run.timeout()`, `run.fallback()`, `run.hedge()` | +| Resource safety | `run.bracket()` | +| Critical sections | `run.uncancellable()` | +| Backpressured streams | `work(items).stream()` | +| Producer-consumer coordination | `@workit/core/channel` | +| Worker-thread hard boundary | `@workit/core/worker` | +| Diagnostics and snapshots | `@workit/core/diagnostics` | +| OpenTelemetry bridge | `@workit/core/otel` | +| Agent helper contracts | `@workit/core/ai` | -Use the numeric form for production calls where brief backoff is desired: +## Common Use Cases -```ts -const resilient = run.retry(callProvider, 3); -``` +These are short entry points. The full narrative and benchmark discussion live +in [`articles/`](articles/). 
-Use an explicit zero-delay policy for local operations, tests, or fast in-memory retries: +### Owned Request Fan-Out ```ts -const fast = run.retry(callLocalCache, { - times: 3, - initialDelay: "0ms", - maxDelay: "0ms", - jitter: false, -}); +import { group } from "@workit/core"; -const output = await work(items) - .withRetry({ - times: 3, - initialDelay: "0ms", - maxDelay: "0ms", - jitter: false, - }) - .do(async (item) => processItem(item)); -``` +const response = await group(async (task) => { + const profile = task((ctx) => fetchProfile({ signal: ctx.signal })); + const account = task((ctx) => fetchAccount({ signal: ctx.signal })); -Use `retryIf` to keep retry policy explicit: + task.background(async (ctx) => { + ctx.defer(() => flushAuditBuffer()); + await writeAuditEvent({ signal: ctx.signal }); + }); -```ts -const providerCall = run.retry(callProvider, { - times: 4, - backoff: "exponential", - initialDelay: "200ms", - maxDelay: "5s", - retryIf: (err) => isTransientProviderError(err), + return { profile: await profile, account: await account }; }); ``` -Do not use retry to hide deterministic validation errors. Reject those at the boundary. +If `profile` fails, the account and audit tasks are cancelled. The audit cleanup +runs before the scope closes. -## Resource Safety +### Provider Race ```ts import { run } from "@workit/core"; -await run.bracket( - async () => await openConnection(), - async (connection, ctx) => { - return await connection.query("select 1", { signal: ctx.signal }); - }, - async (connection) => { - await connection.close(); - } -); +const result = await run.race([ + run.timeout((ctx) => primary.generate({ signal: ctx.signal }), "5s"), + run.timeout((ctx) => backup.generate({ signal: ctx.signal }), "5s"), +]); ``` -`run.bracket()` releases exactly once on success, error, timeout, and cancellation. +The first success wins. Losing branches receive `CancelReason { kind: +"race_lost" }`. 
-## Bounded Uncancellable Sections +### Retry With Timeout ```ts import { run } from "@workit/core"; -const commit = run.uncancellable(async (ctx) => { - await writeFinalReceipt({ signal: ctx.signal }); -}, { timeout: "2s" }); +const receipt = await run.retry( + (ctx) => + run.timeout( + (timeoutCtx) => + chargeCustomer(invoice, { + signal: AbortSignal.any([ctx.signal, timeoutCtx.signal]), + }), + "5s" + ), + { retries: 3 } +); ``` -`run.uncancellable()` delays parent cancellation while the protected body runs, but it does not hide cancellation. If the parent was cancelled during the shielded section, WorkIt rethrows the original cancellation after the section completes. +The retry policy, timeout, and caller cancellation share one owned execution +path instead of living in separate helper layers. -JavaScript cannot forcibly stop non-cooperative in-process work. For hard CPU boundaries, use worker offload with a timeout. - -## Work Builder +### Backpressured Stream ```ts import { work } from "@workit/core"; -const output = await work(documents) +for await (const summary of work(documents()) .inParallel(8) - .withRetry({ times: 3, backoff: "exponential" }) - .withTimeout("5s") - .filter((doc) => doc.enabled) - .onError("collect") - .do(async (doc, ctx) => { - return await embedDocument(doc, { signal: ctx.signal }); - }); + .map((doc, ctx) => summarize(doc, { signal: ctx.signal })) + .stream()) { + if (summary.enough) break; +} ``` -The builder defaults to sequential, fail-fast execution. Concurrency and partial-result policy are explicit. +Breaking the loop cancels remaining in-flight work and stops pulling from the +producer. 
-## Budgets And Context +### Budgeted Agent Work ```ts -import { CostBudget, TokenBudget, group, run } from "@workit/core"; - -await run.context.with(CostBudget, { spent: 0, limit: 100, unit: "USD" }, async () => - run.context.with(TokenBudget, { spent: 0, limit: 10_000, unit: "tokens" }, async () => - group(async (task) => { - await task(async (ctx) => { - ctx.consume(CostBudget, 25); - ctx.consume(TokenBudget, 1_200); - return await callModel({ signal: ctx.signal }); - }); +import { CostBudget, TokenBudget, run, work } from "@workit/core"; + +await run.context.with(CostBudget, { spent: 0, limit: 0.50, unit: "USD" }, () => + run.context.with(TokenBudget, { spent: 0, limit: 100_000, unit: "tokens" }, () => + work(chunks).inParallel(8).do(async (chunk, ctx) => { + ctx.consume(TokenBudget, chunk.estimatedTokens); + ctx.consume(CostBudget, chunk.estimatedCost); + return embed(chunk, { signal: ctx.signal }); }) ) ); ``` -Budget snapshots exposed to consumers are readonly. Mutation happens through `ctx.consume()`. +Budget overrun cancels the scope that installed the budget. -## Diagnostics +### Worker Boundary ```ts -import { diagnoseSnapshot } from "@workit/core/diagnostics"; +import { offload } from "@workit/core/worker"; -const report = diagnoseSnapshot(scope.status(), { - staleTaskMs: 30_000, - events: recentEvents, -}); +const result = await offload( + new URL("./cpu-worker.js", import.meta.url), + "compute", + input, + { timeout: "2s" } +); ``` -Diagnostics are subpath-only to keep the root runtime small. Reports identify old pending tasks, cleanup timeouts, cancelling scopes, and pending child scopes. - -## Channels - -```ts -import { createChannel } from "@workit/core/channel"; +`AbortController` cannot preempt a tight CPU loop on the main thread. Worker +offload gives CPU-bound or plugin-like work a hard timeout boundary. 
-const channel = createChannel({ capacity: 16 }); +## Worker Offload Boundary -await channel.send("item"); -const item = await channel.receive(); -``` +`offload()` is an execution boundary and a structured-clone boundary. -Channels provide bounded in-process backpressure with close and cancellation semantics. +Accepted worker inputs include primitives, arrays, plain objects, `Map`, `Set`, +dates, regexps, buffers, and typed arrays. -## AI Helpers +Rejected worker inputs include class instances, functions, symbols, custom +prototype objects, inline or remote module URLs, and parent directory segments +in worker paths. -```ts -import { runAgent, streamLLM } from "@workit/core/ai"; - -const result = await runAgent(async (agent) => { - return await agent.tool("search", { q: "structured concurrency" }, async (input, ctx) => { - return await search(input.q, { signal: ctx.signal }); - }, { - timeout: "5s", - tokens: 12, - toolCalls: 1, - }); -}); -``` +When `timeout` fires, WorkIt terminates the worker thread. This is different +from cooperative `AbortSignal` cancellation inside the main JavaScript thread. -The AI subpath supplies contracts and structured execution helpers only. It does not import OpenAI, Anthropic, cloud SDKs, HTTP clients, or provider runtimes. 
+## Runnable Samples -## Observability +| Sample | What it demonstrates | +|---|---| +| [`samples/progress-parallel.sample.js`](samples/progress-parallel.sample.js) | progress events during bounded parallel work | +| [`samples/race-providers.sample.js`](samples/race-providers.sample.js) | provider race with loser cancellation | +| [`samples/no-orphan.sample.js`](samples/no-orphan.sample.js) | owned background work waits before scope close | +| [`samples/streaming-summarizer.sample.js`](samples/streaming-summarizer.sample.js) | streaming summarization with early stop | +| [`samples/embed-bisection.sample.js`](samples/embed-bisection.sample.js) | bad-batch bisection for embedding pipelines | +| [`samples/supervision.sample.js`](samples/supervision.sample.js) | supervised long-lived work | +| [`samples/worker-offload.sample.js`](samples/worker-offload.sample.js) | worker timeout against non-cooperative CPU work | +| [`samples/budget-rag.sample.js`](samples/budget-rag.sample.js) | request-scoped cost budget | +| [`samples/logging-otel-bridge.sample.js`](samples/logging-otel-bridge.sample.js) | local events bridged to telemetry | -```ts -import { attachTelemetryExporter } from "@workit/core/observability"; - -const attachment = attachTelemetryExporter(scope, async (event) => { - await telemetry.write(event); -}, { - sampling: { mode: "errors_and_slow", slowThresholdMs: 2_000 }, - circuitBreaker: { failureThreshold: 3, openForMs: 60_000 }, - sanitize: (event) => event, -}); +## Verified Evidence -attachment.unsubscribe(); -``` +WorkIt claims are tied to executable gates. The benchmark timings below are +representative captured runs; the gates assert semantic invariants and budget +thresholds, not exact milliseconds. -The root event bus is local and dependency-free. Exporting events is explicit, sampled, bounded, sanitized, and circuit-broken. 
+| Evidence | Current result | +|---|---:| +| Unit tests | 214 passing | +| Coverage gate | 100% statements, branches, functions, lines | +| Runtime dependencies | 0 | +| Article benchmark suite | 19/19 passing | +| Core group import | 14,175 B minified / 4,835 B gzip | +| Public bundle | 29,255 B minified / 9,694 B gzip | +| Stream gate | 1,000,000 logical items with bounded producer growth | +| Soak gate | 100,000 logical tasks with bounded concurrency | +| Exporter stress | 100,000 events with bounded queue | -OpenTelemetry integration is opt-in: +Representative article-benchmark results: -```ts -import { attachOpenTelemetry } from "@workit/core/otel"; -``` +| Benchmark | Baseline | WorkIt | +|---|---:|---:| +| Provider race losers after winner | losers continue until their sleeps finish | losers cancelled in scope close | +| Retry after cancellation | 7 extra attempts, 622 ms latency | 0 extra attempts, 1 ms latency | +| Context `.with()` over 5,000 keys | 31.68 ms | 0.014 ms | +| 1B-row source, take 25 | 281 items pulled | 40 items pulled | +| Sampling volume | 1,300 events | 36 events | -`@opentelemetry/api` is an optional peer dependency so the root package can stay dependency-free. Install it only when using the OTel subpath: +Run the main verification gate: ```sh -npm install @opentelemetry/api +npm run verify ``` -If the peer is missing and `attachOpenTelemetry()` needs the default OTel API, WorkIt throws: +`npm run verify` runs type-checking, header and test hygiene, unit tests, +security checks, vulnerability audit, SBOM validation, API and bundle-size +locks, runtime benchmarks, stream and soak gates, exporter stress, +package-consumer fixtures, public-proof validation, worker-contract checks, +release-policy checks, and `npm pack --dry-run`. 
-```txt -To use @workit/core/otel, install: -npm install @opentelemetry/api +Run the article benchmark suite: + +```sh +npm run bench:articles ``` -You may also pass explicit `tracer` and `meter` objects for tests or custom OTel wiring. +Run the curated publication evidence suite: -## Worker Offload Boundary - -```ts -import { offload } from "@workit/core/worker"; - -const result = await offload( - new URL("./cpu-worker.js", import.meta.url), - "fibonacci", - 42, - { timeout: "2s" } -); +```sh +npm run test:evidence ``` -Worker modules must be local application-controlled files. WorkIt rejects remote URLs, inline URLs, empty module references, and parent directory segments before the worker imports anything. - -Accepted worker inputs include primitives, arrays, plain objects, `Map`, `Set`, `Date`, `RegExp`, `ArrayBuffer`, `SharedArrayBuffer`, and typed array views. +Run verification commands sequentially when they depend on `dist/`. Some gates, +including `npm run test:coverage` and `npm run verify`, rebuild or clean the +compiled artifacts. Running them in parallel with `npm run bench:articles` can +delete `dist/` while benchmark processes are importing it. -Rejected worker inputs include functions, symbols, class instances, objects with custom prototypes, remote module URLs, inline module URLs, and traversal paths. +Machine-readable reviewer evidence lives in +[`benchmarks/public-proof.json`](benchmarks/public-proof.json), the article +benchmark capture lives in +[`benchmarks/results/articles.latest.json`](benchmarks/results/articles.latest.json), +and the public claim ledger lives in +[`evidence/claims.json`](evidence/claims.json). 
-## Examples Index +## Security And Release Integrity -Samples run against the compiled package: +| Guarantee | Enforcement | +|---|---| +| Runtime core has no production dependencies | package metadata and security gate | +| Core does not import networking modules | static no-network gate | +| Published package includes an SBOM | CycloneDX SBOM generation and validation | +| Release workflow uses provenance controls | release-policy gate | +| Public API and bundle size are locked | API and size gates | +| Consumer fixtures install the package artifact | package-consumer gate | -```sh -npm run sample:1b -npm run sample:concurrency -npm run sample:cancel -npm run sample:timeout -npm run sample:no-orphan -npm run sample:all -npm run sample:agent -npm run sample:race -npm run sample:rag -npm run sample:batch -npm run sample:stream -npm run sample:embed100k -npm run sample:bisection -npm run sample:stt-disconnect -npm run sample:worker -npm run sample:aws -npm run sample:azure -npm run sample:next -npm run sample:otel -npm run sample:logging -``` +Additional repository controls include pinned dev dependencies, vulnerability +audit, SHA-pinned GitHub Actions, OSSF Scorecard workflow, CODEOWNERS, +Dependabot, and signed release tag policy. -| Sample | Scenario | -| --- | --- | -| `sample:all` | Safer `Promise.all()` replacement with sibling cancellation and cleanup. | -| `sample:concurrency` | Bounded parallelism with budget consumption. | -| `sample:cancel` | Typed cancellation reason propagation. | -| `sample:timeout` | Timeout-driven cancellation. | -| `sample:no-orphan` | Scope ownership preventing orphaned child work. | -| `sample:agent` | Agent-style task tree cancellation. | -| `sample:race` | Provider race with loser cancellation. | -| `sample:rag` | RAG-style budgeted work. | -| `sample:batch` | Batch upload with retry and partial-result handling. | -| `sample:stream` | Streaming summarizer with bounded work. 
| -| `sample:embed100k` | 100,000 logical embedding tasks. | -| `sample:bisection` | Batch bisection for partial provider failures. | -| `sample:stt-disconnect` | Speech-to-text cancellation on disconnect. | -| `sample:worker` | Worker offload for CPU/non-cooperative work. | -| `sample:aws` | AWS Lambda-shaped handler. | -| `sample:azure` | Azure Functions-shaped handler. | -| `sample:next` | Next.js route-shaped handler. | -| `sample:otel` | OpenTelemetry adapter use. | -| `sample:logging` | Logging-to-telemetry bridge. | - -## Benchmarks And Reproducibility - -WorkIt benchmark claims are tied to executable gates in the repository. They are release checks, not synthetic marketing numbers. - -| Command | What it validates | -| --- | --- | -| `npm run check:benchmark` | Basic runtime throughput for `group()` and `run.all()`. | -| `npm run check:1b` | One-billion logical item streaming shape with bounded concurrency. | -| `npm run check:stream-memory` | Slow-consumer stream memory ceiling and producer backpressure. | -| `npm run check:soak` | 100,000 logical task runtime soak. | -| `npm run check:exporter-stress` | Bounded telemetry exporter behavior under high event volume. | -| `npm run check:package-consumer` | Installed package behavior across ESM, CJS, TypeScript, framework fixtures, browser bundle split, and Cloudflare dry-run unsupported boundary. | -| `npm run check:claims` | Executable claim fixtures derived from review findings. | - -Run all public proof gates: +Security reports should follow [`SECURITY.md`](SECURITY.md). 
-```sh -npm run verify -``` +## Runtime Support -Run only the public proof artifact gate: +Supported: -```sh -npm run check:public-proof -``` +- Node.js `>=20.11` +- ESM consumers +- CommonJS consumers +- strict TypeScript consumers +- AWS Lambda-shaped handlers +- Azure Functions-shaped handlers +- Next.js route-shaped handlers +- Express, Fastify, tRPC, and Vercel AI SDK fixtures -The static proof artifact is `benchmarks/public-proof.json`. It records evidence commands, benchmark fixture thresholds, migration-guide coverage, and runtime matrix rows. +Unsupported today: -When comparing WorkIt with another library, keep the comparison scoped: +- browser client runtime +- Cloudflare Workers +- Next.js Edge / Vercel Edge -- compare raw throughput only for equivalent operations -- include cancellation, cleanup, and ownership when those are part of the requirement -- report Node.js version, operating system, CPU, command, iteration count, concurrency, and heap flags -- use `--expose-gc` for memory gates that require explicit garbage collection -- do not compare a structured runtime against a semaphore without naming the semantic difference +Unsupported runtimes resolve to an explicit unsupported boundary. An edge-safe +context runtime is future work. -## WorkIt Compared With Common Alternatives +## When To Use Alternatives -| Tool | Primary model | Use it when | Use WorkIt when | -| --- | --- | --- | --- | -| Native `Promise` | Async value | One async value or a small local composition is enough. | The operation needs ownership, cancellation, cleanup, or diagnostics. | -| `p-limit` | Local concurrency limiter | You only need a tiny semaphore. | Bounded work also needs scope ownership, cancellation, retry, timeout, or task events. | -| `p-map` | Concurrent mapping | You need a focused map helper. | Mapping also needs retry, timeout, stream policy, budgets, or partial-result contracts. 
| -| RxJS | Observable transformation graph | You are modeling rich event streams and operators. | You are modeling owned async task lifecycles. | -| Bottleneck | Rate limiting and reservoirs | You need distributed or reservoir-based rate limiting. | You need local structured concurrency and lifecycle control. | +| Tool | Prefer it when | Prefer WorkIt when | +|---|---|---| +| Native `Promise` | One async value is enough | Work needs ownership, cancellation, cleanup, or diagnostics | +| Manual scope object | The lifecycle is local and small enough to audit in one file | The lifecycle becomes a reusable cross-module contract | +| `p-limit` | You only need a tiny semaphore | Bounded work also needs lifecycle semantics | +| `p-map` | You need a focused concurrent map | Mapping needs retry, timeout, stream policy, or partial results | +| RxJS | You are modelling rich event streams | You are modelling owned async task lifecycles | +| Bottleneck | You need distributed reservoirs or rate limits | You need local structured concurrency | +| Effection | You want structured concurrency via operations/generators | You want plain `async`/`await` task functions | +| Effect-TS | You want a full effect system | You want owned async work without migrating to a DSL | -WorkIt is not a replacement for every async library. It is a structured-concurrency runtime for owned work. The correct choice depends on whether lifecycle semantics are part of the problem. +These comparisons are about ownership and composition. Some libraries expose +cancellation hooks or queue controls; WorkIt's claim is that cancellation, +cleanup, retry, timeout, budgets, backpressure, and diagnostics compose under +one owner. ## Migration Notes -### From native Promise - -Keep native promises for simple async values. Use WorkIt when the work needs ownership, cancellation, cleanup, bounded concurrency, budgets, diagnostics, or observability. +These are orientation notes, not codemods. 
Keep the old tool when it owns the +problem better. ### From p-limit -Use `run.pool()` when bounded concurrency also needs scope ownership and cancellation. Keep `p-limit` for a tiny standalone semaphore. +Use `run.pool()` or `work(items).inParallel(n)` when the semaphore also needs +sibling cancellation, retry, timeout, cleanup, progress, or partial-result +policy under one owner. ### From p-map -Use `work(items).inParallel(n).do(fn)` when mapping needs retry, timeout, item-level error policy, or streaming. +Use `work(items).inParallel(n).do(fn)` for concurrent maps that need the same +lifecycle semantics as the caller. Keep `p-map` for a small one-file map where +concurrency is the only concern. ### From RxJS -Keep RxJS for rich observable transformation graphs. Use WorkIt for owned async work and task lifecycle control. +Keep RxJS for rich observable graphs. Use WorkIt when the problem is owned task +lifecycle: request fan-out, provider racing, agent tools, bounded streams, or +cleanup around async work. ### From Bottleneck -Keep Bottleneck for distributed rate limits and reservoirs. Use WorkIt for local structured concurrency. +Keep Bottleneck for distributed reservoirs and external rate-limit state. Use +WorkIt for local process ownership where bounded concurrency must compose with +cancel, retry, timeout, budgets, and diagnostics. -## Runtime Support - -Supported: - -- Node.js `>=20.11` -- ESM consumers -- CommonJS consumers -- strict TypeScript consumers -- AWS Lambda-shaped handlers -- Azure Functions-shaped handlers -- Next.js route-shaped handlers -- Express, Fastify, tRPC, and Vercel AI SDK fixtures - -Unsupported today: - -- browser client runtime -- Cloudflare Workers -- Next.js Edge / Vercel Edge - -Unsupported runtimes resolve to an explicit unsupported boundary. 
- -## Security And Release Integrity +## Documentation -The repository includes gates for: - -- author and SPDX headers -- no runtime network clients in core -- no install lifecycle scripts -- pinned dev dependencies -- production vulnerability audit -- SBOM validation -- API surface lock -- bundle-size lock -- package-consumer fixtures -- release provenance workflow validation -- SHA-pinned GitHub Actions -- OSSF Scorecard workflow -- CODEOWNERS -- Dependabot -- signed release tag policy +| Resource | Purpose | +|---|---| +| [`articles/`](articles/) | Narrative articles with examples and benchmark discussion | +| [`benchmarks/articles/`](benchmarks/articles/) | Reproducible article benchmark suite | +| [`evidence/`](evidence/) | Machine-readable claim ledger and evidence policy | +| [`tests/evidence/`](tests/evidence/) | Curated publication evidence proofs | +| [`samples/`](samples/) | Runnable examples against the compiled package | +| [`SECURITY.md`](SECURITY.md) | Security reporting and release integrity policy | ## Contributing -Please read `CONTRIBUTING.md` before opening a pull request. +Please read [`CONTRIBUTING.md`](CONTRIBUTING.md) before opening a pull request. Before submitting code: ```sh npm run verify npm run test:coverage +npm run bench:articles +npm run test:evidence ``` -Bug reports should include the WorkIt version, Node.js version, reproduction code, and whether the failure occurs from source or the installed package. +Run these commands sequentially. Several verification commands clean and rebuild +`dist/`, while article benchmarks import the compiled package artifact. -Security reports should follow `SECURITY.md`. +Bug reports should include the WorkIt version, Node.js version, reproduction +code, and whether the failure occurs from source or the installed package. ## License -Apache-2.0. See `LICENSE`. +Apache-2.0. See [`LICENSE`](LICENSE). 
From 3572a331eb602c9ee72e731c5525333f3ee21f93 Mon Sep 17 00:00:00 2001 From: acossa Date: Fri, 8 May 2026 19:34:32 +0200 Subject: [PATCH 2/5] build(publication): add evidence and article scripts --- package.json | 2 ++ 1 file changed, 2 insertions(+) diff --git a/package.json b/package.json index 344989c..974c6ad 100644 --- a/package.json +++ b/package.json @@ -98,6 +98,8 @@ "check:worker-contract": "node scripts/check-worker-contract-docs.mjs", "check:release-policy": "npm run build && node scripts/check-release-provenance.mjs", "check:release": "npm run build && node scripts/check-release-provenance.mjs --registry-dry-run", + "bench:articles": "node benchmarks/articles/run-all.mjs", + "test:evidence": "node tests/evidence/run-all.mjs", "pack:dry": "npm pack --dry-run --json", "sample:1b": "npm run build && node samples/1b-stream.sample.js", "sample:concurrency": "npm run build && node samples/concurrency-budget.sample.js", From 2e4db3be178d96699939e91a69ffb78e181c9b5f Mon Sep 17 00:00:00 2001 From: acossa Date: Fri, 8 May 2026 19:34:37 +0200 Subject: [PATCH 3/5] bench(articles): add reproducible article benchmarks --- .gitignore | 2 + .../articles/01-run-all-vs-promise-all.mjs | 147 +++ .../articles/02-run-race-vs-promise-race.mjs | 95 ++ .../articles/03-run-any-vs-promise-any.mjs | 95 ++ benchmarks/articles/04-pool-vs-semaphore.mjs | 122 ++ benchmarks/articles/05-retry-on-cancel.mjs | 115 ++ .../articles/06-hedge-tied-requests.mjs | 74 ++ benchmarks/articles/07-worker-hard-kill.mjs | 106 ++ .../articles/08-uncancellable-shield.mjs | 191 +++ benchmarks/articles/09-stream-1b-lazy.mjs | 132 +++ .../articles/10-stream-slow-consumer.mjs | 85 ++ benchmarks/articles/11-channel-contract.mjs | 131 +++ .../articles/12-bracket-vs-try-finally.mjs | 196 ++++ .../13-budget-atomicity-and-cancel.mjs | 125 ++ .../articles/14-context-overlay-perf.mjs | 93 ++ benchmarks/articles/15-core-zero-network.mjs | 71 ++ .../articles/16-sampling-and-aggregation.mjs | 94 ++ 
.../articles/17-cardinality-safe-metrics.mjs | 67 ++ .../articles/18-diagnostics-finding-codes.mjs | 127 ++ benchmarks/articles/19-agent-scope.mjs | 172 +++ benchmarks/articles/README.md | 101 ++ benchmarks/articles/lib/baselines.mjs | 131 +++ benchmarks/articles/lib/spinner.mjs | 21 + benchmarks/articles/package.json | 31 + benchmarks/articles/run-all.mjs | 51 + benchmarks/results/articles.latest.json | 1037 +++++++++++++++++ 26 files changed, 3612 insertions(+) create mode 100644 benchmarks/articles/01-run-all-vs-promise-all.mjs create mode 100644 benchmarks/articles/02-run-race-vs-promise-race.mjs create mode 100644 benchmarks/articles/03-run-any-vs-promise-any.mjs create mode 100644 benchmarks/articles/04-pool-vs-semaphore.mjs create mode 100644 benchmarks/articles/05-retry-on-cancel.mjs create mode 100644 benchmarks/articles/06-hedge-tied-requests.mjs create mode 100644 benchmarks/articles/07-worker-hard-kill.mjs create mode 100644 benchmarks/articles/08-uncancellable-shield.mjs create mode 100644 benchmarks/articles/09-stream-1b-lazy.mjs create mode 100644 benchmarks/articles/10-stream-slow-consumer.mjs create mode 100644 benchmarks/articles/11-channel-contract.mjs create mode 100644 benchmarks/articles/12-bracket-vs-try-finally.mjs create mode 100644 benchmarks/articles/13-budget-atomicity-and-cancel.mjs create mode 100644 benchmarks/articles/14-context-overlay-perf.mjs create mode 100644 benchmarks/articles/15-core-zero-network.mjs create mode 100644 benchmarks/articles/16-sampling-and-aggregation.mjs create mode 100644 benchmarks/articles/17-cardinality-safe-metrics.mjs create mode 100644 benchmarks/articles/18-diagnostics-finding-codes.mjs create mode 100644 benchmarks/articles/19-agent-scope.mjs create mode 100644 benchmarks/articles/README.md create mode 100644 benchmarks/articles/lib/baselines.mjs create mode 100644 benchmarks/articles/lib/spinner.mjs create mode 100644 benchmarks/articles/package.json create mode 100644 benchmarks/articles/run-all.mjs 
create mode 100644 benchmarks/results/articles.latest.json diff --git a/.gitignore b/.gitignore index 595202c..5debc4a 100644 --- a/.gitignore +++ b/.gitignore @@ -28,6 +28,8 @@ dist-cjs/ build/ out/ lib/ +!benchmarks/articles/lib/ +!benchmarks/articles/lib/** coverage/ .nyc_output/ mnt/ diff --git a/benchmarks/articles/01-run-all-vs-promise-all.mjs b/benchmarks/articles/01-run-all-vs-promise-all.mjs new file mode 100644 index 0000000..8023478 --- /dev/null +++ b/benchmarks/articles/01-run-all-vs-promise-all.mjs @@ -0,0 +1,147 @@ +/** + * Bench 01 -- Promise.all vs run.all. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: 3 tasks. A succeeds at 50ms. B fails at 30ms. C succeeds at 100ms. + * + * Native Promise.all: rejects at 30ms. A and C ARE NOT cancelled -- their + * bodies keep running and "settle silently" past the rejection. + * + * run.all: rejects at 30ms. A and C are cancelled at the AbortSignal + * boundary, their defer cleanups run, and the outer promise does not resolve + * until cleanup has completed. + * + * Output: JSON record with timestamps proving who actually stopped working. 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, makeProbe, naiveSleep, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "01-run-all-vs-promise-all", native: null, workit: null }; + +// --- Native Promise.all -------------------------------------------------- +{ + const clock = makeClock(); + const A = makeProbe("A"); + const B = makeProbe("B"); + const C = makeProbe("C"); + + const a = (async () => { + A.startedAt = clock.t(); + await naiveSleep(50); + A.settledAt = clock.t(); + A.settledAs = "fulfilled"; + return "A"; + })(); + const b = (async () => { + B.startedAt = clock.t(); + await naiveSleep(30); + B.settledAt = clock.t(); + B.settledAs = "rejected"; + throw new Error("B failed"); + })(); + const c = (async () => { + C.startedAt = clock.t(); + await naiveSleep(100); + C.settledAt = clock.t(); + C.settledAs = "fulfilled"; + return "C"; + })(); + + let outerRejectedAt = -1; + try { + await Promise.all([a, b, c]); + } catch (e) { + outerRejectedAt = clock.t(); + } + + // Wait long enough for the "ghost" tasks to settle. + await naiveSleep(150); + + result.native = { + outerRejectedAt, + A, B, C, + losersStillRanForMs: { + A: A.settledAt - outerRejectedAt, + C: C.settledAt - outerRejectedAt, + }, + losersWereCancelled: false, + }; +} + +// --- WorkIt run.all ------------------------------------------------------ +{ + const clock = makeClock(); + const A = makeProbe("A"); + const B = makeProbe("B"); + const C = makeProbe("C"); + + const taskA = async (ctx) => { + A.startedAt = clock.t(); + ctx.defer(() => { A.deferRanAt = clock.t(); }); + ctx.signal.addEventListener("abort", () => { A.signalAbortedAt = clock.t(); }, { once: true }); + try { + await sleep(50, ctx.signal); + A.settledAt = clock.t(); A.settledAs = "fulfilled"; + return "A"; + } catch (err) { + A.settledAt = clock.t(); + A.settledAs = err instanceof CancellationError ? 
"cancelled" : "rejected"; + throw err; + } + }; + const taskB = async () => { + B.startedAt = clock.t(); + await sleep(30); + B.settledAt = clock.t(); B.settledAs = "rejected"; + throw new Error("B failed"); + }; + const taskC = async (ctx) => { + C.startedAt = clock.t(); + ctx.defer(() => { C.deferRanAt = clock.t(); }); + ctx.signal.addEventListener("abort", () => { C.signalAbortedAt = clock.t(); }, { once: true }); + try { + await sleep(100, ctx.signal); + C.settledAt = clock.t(); C.settledAs = "fulfilled"; + return "C"; + } catch (err) { + C.settledAt = clock.t(); + C.settledAs = err instanceof CancellationError ? "cancelled" : "rejected"; + throw err; + } + }; + + let outerRejectedAt = -1; + let cancelReasonKind = null; + try { + await run.all([taskA, taskB, taskC]); + } catch (e) { + outerRejectedAt = clock.t(); + } + + // After the outer promise settles, defers must already have run. + result.workit = { + outerRejectedAt, + A, B, C, + losersWereCancelled: A.settledAs === "cancelled" && C.settledAs === "cancelled", + cancellationLatencyFromBFailure: { + A: A.settledAt - B.settledAt, + C: C.settledAt - B.settledAt, + }, + deferRanBeforeOuterReject: { + A: A.deferRanAt > 0 && A.deferRanAt <= outerRejectedAt, + C: C.deferRanAt > 0 && C.deferRanAt <= outerRejectedAt, + }, + }; + + // Invariants -- fail loudly if WorkIt regresses. 
+ assert.equal(A.settledAs, "cancelled", "A must be cancelled by run.all sibling failure"); + assert.equal(C.settledAs, "cancelled", "C must be cancelled by run.all sibling failure"); + assert.ok(A.deferRanAt > 0, "A.defer must run"); + assert.ok(C.deferRanAt > 0, "C.defer must run"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/02-run-race-vs-promise-race.mjs b/benchmarks/articles/02-run-race-vs-promise-race.mjs new file mode 100644 index 0000000..25c61f2 --- /dev/null +++ b/benchmarks/articles/02-run-race-vs-promise-race.mjs @@ -0,0 +1,95 @@ +/** + * Bench 02 -- Promise.race vs run.race. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: 3 provider calls. Anthropic at 10ms, OpenAI at 50ms, Gemini at 80ms. + * + * Native Promise.race: resolves with Anthropic at 10ms; OpenAI and Gemini + * keep running and ARE STILL BILLING. + * + * run.race: resolves with Anthropic at 10ms and cancels OpenAI + Gemini at + * the AbortSignal boundary so the underlying fetch can abort. 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, makeProbe, naiveSleep, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "02-run-race-vs-promise-race", native: null, workit: null }; + +// --- Native Promise.race ------------------------------------------------ +{ + const clock = makeClock(); + const probes = { openai: makeProbe("openai"), anthropic: makeProbe("anthropic"), gemini: makeProbe("gemini") }; + + const make = (name, ms) => (async () => { + probes[name].startedAt = clock.t(); + await naiveSleep(ms); + probes[name].settledAt = clock.t(); + probes[name].settledAs = "fulfilled"; + return { provider: name }; + })(); + + const winner = await Promise.race([make("openai", 50), make("anthropic", 10), make("gemini", 80)]); + const winnerSettledAt = clock.t(); + + await naiveSleep(120); // let the "ghosts" settle + + result.native = { + winner: winner.provider, + winnerSettledAt, + probes, + losersStillRanForMs: { + openai: probes.openai.settledAt - winnerSettledAt, + gemini: probes.gemini.settledAt - winnerSettledAt, + }, + losersWereCancelled: false, + }; +} + +// --- WorkIt run.race ---------------------------------------------------- +{ + const clock = makeClock(); + const probes = { openai: makeProbe("openai"), anthropic: makeProbe("anthropic"), gemini: makeProbe("gemini") }; + + const make = (name, ms) => async (ctx) => { + probes[name].startedAt = clock.t(); + ctx.defer(() => { probes[name].deferRanAt = clock.t(); }); + ctx.signal.addEventListener("abort", () => { probes[name].signalAbortedAt = clock.t(); }, { once: true }); + try { + await sleep(ms, ctx.signal); + probes[name].settledAt = clock.t(); + probes[name].settledAs = "fulfilled"; + return { provider: name }; + } catch (err) { + probes[name].settledAt = clock.t(); + probes[name].settledAs = err instanceof CancellationError ? 
"cancelled" : "rejected"; + probes[name].cancelReasonKind = err instanceof CancellationError ? err.reason.kind : null; + throw err; + } + }; + + const winner = await run.race([make("openai", 50), make("anthropic", 10), make("gemini", 80)]); + const winnerSettledAt = clock.t(); + + result.workit = { + winner: winner.provider, + winnerSettledAt, + probes, + losersWereCancelled: probes.openai.settledAs === "cancelled" && probes.gemini.settledAs === "cancelled", + cancelReasonKindForLosers: { + openai: probes.openai.cancelReasonKind ?? null, + gemini: probes.gemini.cancelReasonKind ?? null, + }, + }; + + assert.equal(winner.provider, "anthropic"); + assert.equal(probes.openai.settledAs, "cancelled"); + assert.equal(probes.gemini.settledAs, "cancelled"); + assert.equal(probes.openai.cancelReasonKind, "race_lost"); + assert.equal(probes.gemini.cancelReasonKind, "race_lost"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/03-run-any-vs-promise-any.mjs b/benchmarks/articles/03-run-any-vs-promise-any.mjs new file mode 100644 index 0000000..6a3b399 --- /dev/null +++ b/benchmarks/articles/03-run-any-vs-promise-any.mjs @@ -0,0 +1,95 @@ +/** + * Bench 03 -- Promise.any vs run.any. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: 3 tasks. A fails at 30ms. B succeeds at 50ms. C succeeds at 100ms. + * + * Native Promise.any: resolves with B at 50ms. C keeps running and bills. + * + * run.any: resolves with B at 50ms. C is cancelled at the AbortSignal + * boundary, defer cleanups run before the outer promise resolves. 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, makeProbe, naiveSleep, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "03-run-any-vs-promise-any", native: null, workit: null }; + +// --- Native Promise.any ------------------------------------------------ +{ + const clock = makeClock(); + const A = makeProbe("A"), B = makeProbe("B"), C = makeProbe("C"); + + const make = (probe, kind, ms) => (async () => { + probe.startedAt = clock.t(); + await naiveSleep(ms); + probe.settledAt = clock.t(); + probe.settledAs = kind; + if (kind === "rejected") throw new Error(`${probe.name} failed`); + return probe.name; + })(); + + const winner = await Promise.any([make(A, "rejected", 30), make(B, "fulfilled", 50), make(C, "fulfilled", 100)]); + const winnerSettledAt = clock.t(); + + await naiveSleep(150); + + result.native = { + winner, + winnerSettledAt, + A, B, C, + cStillRanForMs: C.settledAt - winnerSettledAt, + losersWereCancelled: false, + }; +} + +// --- WorkIt run.any ---------------------------------------------------- +{ + const clock = makeClock(); + const A = makeProbe("A"), B = makeProbe("B"), C = makeProbe("C"); + + const make = (probe, kind, ms) => async (ctx) => { + probe.startedAt = clock.t(); + ctx.defer(() => { probe.deferRanAt = clock.t(); }); + ctx.signal.addEventListener("abort", () => { probe.signalAbortedAt = clock.t(); }, { once: true }); + try { + await sleep(ms, ctx.signal); + probe.settledAt = clock.t(); + probe.settledAs = kind === "rejected" ? 
"rejected" : "fulfilled"; + if (kind === "rejected") throw new Error(`${probe.name} failed`); + return probe.name; + } catch (err) { + probe.settledAt = clock.t(); + if (err instanceof CancellationError) { + probe.settledAs = "cancelled"; + probe.cancelReasonKind = err.reason.kind; + } + throw err; + } + }; + + const winner = await run.any([ + make(A, "rejected", 30), + make(B, "fulfilled", 50), + make(C, "fulfilled", 100), + ]); + const winnerSettledAt = clock.t(); + + result.workit = { + winner, + winnerSettledAt, + A, B, C, + cWasCancelled: C.settledAs === "cancelled", + cancelLatencyForC: C.settledAt - winnerSettledAt, + deferRanForC: C.deferRanAt > 0 && C.deferRanAt <= winnerSettledAt, + }; + + assert.equal(winner, "B"); + assert.equal(C.settledAs, "cancelled", "C must be cancelled when run.any picks B"); + assert.ok(C.deferRanAt > 0, "C.defer must run before run.any resolves"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/04-pool-vs-semaphore.mjs b/benchmarks/articles/04-pool-vs-semaphore.mjs new file mode 100644 index 0000000..713222a --- /dev/null +++ b/benchmarks/articles/04-pool-vs-semaphore.mjs @@ -0,0 +1,122 @@ +/** + * Bench 04 -- p-limit-style semaphore vs run.pool. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: 10 items. Item index 3 throws at 20ms. Every other item takes + * 100ms. Concurrency 4. + * + * Semaphore baseline: when the failing item throws, queued items KEEP RUNNING. + * The semaphore has no cancellation. We measure how many items still ran. + * + * run.pool: first throw cancels queued and in-flight. We measure that the + * outer promise rejects fast and that the in-flight items got + * AbortSignal aborts. 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { + makeClock, makeProbe, naiveSleep, sleep, pLimitLike, jsonReplacer, +} from "./lib/baselines.mjs"; + +const ITEMS = 10; +const CONCURRENCY = 4; +const FAILING_INDEX = 3; +const result = { bench: "04-pool-vs-semaphore", native: null, workit: null }; + +// --- p-limit-style baseline --------------------------------------------- +{ + const clock = makeClock(); + const limit = pLimitLike(CONCURRENCY); + const probes = Array.from({ length: ITEMS }, (_, i) => makeProbe(`item-${i}`)); + + const tasks = probes.map((probe, i) => limit(async () => { + probe.startedAt = clock.t(); + if (i === FAILING_INDEX) { + await naiveSleep(20); + probe.settledAt = clock.t(); probe.settledAs = "rejected"; + throw new Error(`item-${i} failed`); + } + await naiveSleep(100); + probe.settledAt = clock.t(); probe.settledAs = "fulfilled"; + return i; + })); + + // Use Promise.all: rejects on first throw. Other tasks keep running. + let outerRejectedAt = -1; + try { + await Promise.all(tasks); + } catch (e) { + outerRejectedAt = clock.t(); + } + + // Settle: drain everything else by awaiting allSettled so probes capture. 
+ await Promise.allSettled(tasks); + + const ran = probes.filter((p) => p.settledAt > -1); + result.native = { + outerRejectedAt, + started: probes.filter((p) => p.startedAt > -1).length, + fulfilledAfterRejection: probes.filter((p) => p.settledAs === "fulfilled" && p.settledAt > outerRejectedAt).length, + cancelled: 0, + longestPostRejectionRunMs: Math.max(...ran.map((p) => p.settledAt - outerRejectedAt), 0), + probes, + }; +} + +// --- WorkIt run.pool ---------------------------------------------------- +{ + const clock = makeClock(); + const probes = Array.from({ length: ITEMS }, (_, i) => makeProbe(`item-${i}`)); + + const tasks = probes.map((probe, i) => async (ctx) => { + probe.startedAt = clock.t(); + ctx.defer(() => { probe.deferRanAt = clock.t(); }); + ctx.signal.addEventListener("abort", () => { probe.signalAbortedAt = clock.t(); }, { once: true }); + try { + if (i === FAILING_INDEX) { + await sleep(20, ctx.signal); + probe.settledAt = clock.t(); probe.settledAs = "rejected"; + throw new Error(`item-${i} failed`); + } + await sleep(100, ctx.signal); + probe.settledAt = clock.t(); probe.settledAs = "fulfilled"; + return i; + } catch (err) { + probe.settledAt = clock.t(); + if (err instanceof CancellationError) { + probe.settledAs = "cancelled"; + probe.cancelReasonKind = err.reason.kind; + } + throw err; + } + }); + + let outerRejectedAt = -1; + try { + await run.pool(CONCURRENCY, tasks); + } catch (e) { + outerRejectedAt = clock.t(); + } + + result.workit = { + outerRejectedAt, + started: probes.filter((p) => p.startedAt > -1).length, + fulfilledAfterRejection: probes.filter((p) => p.settledAs === "fulfilled" && p.settledAt > outerRejectedAt).length, + cancelled: probes.filter((p) => p.settledAs === "cancelled").length, + notStarted: probes.filter((p) => p.startedAt === -1).length, + cancelReasonKindsForCancelled: [...new Set(probes.filter((p) => p.cancelReasonKind).map((p) => p.cancelReasonKind))], + probes, + }; + + // Invariants + 
assert.equal(result.workit.fulfilledAfterRejection, 0, "no item must complete after first rejection"); + assert.ok( + result.workit.cancelled + 1 + result.workit.notStarted === ITEMS - probes.filter((p) => p.settledAs === "fulfilled").length, + "every started, non-failing item must be cancelled", + ); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/05-retry-on-cancel.mjs b/benchmarks/articles/05-retry-on-cancel.mjs new file mode 100644 index 0000000..a7fdd37 --- /dev/null +++ b/benchmarks/articles/05-retry-on-cancel.mjs @@ -0,0 +1,115 @@ +/** + * Bench 05 -- signal-unaware retry loop vs run.retry under cancellation. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: a body that throws on every attempt. We trigger an external + * cancel mid-retry and measure two things: + * + * 1. How long the cancel takes to actually stop the loop. + * 2. How many extra attempts run after the cancel was requested. + * + * Baseline retry loop: signal-unaware sleep. Cancel is observed only between + * attempts, after the next sleep completes. Extra attempts can happen. + * + * run.retry: sleep is signal-aware. Cancel rejects the sleep, exits the + * loop, settles as cancelled (not failed). 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, naiveSleep, sleep, signalUnawareRetryLike, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "05-retry-on-cancel", native: null, workit: null }; + +// --- Signal-unaware retry baseline -------------------------------------- +{ + const clock = makeClock(); + let attemptsAfterCancel = 0; + let cancelRequestedAt = -1; + let outerSettledAt = -1; + const controller = new AbortController(); + + setTimeout(() => { + cancelRequestedAt = clock.t(); + controller.abort(); + }, 50); + + try { + await signalUnawareRetryLike(async (attempt) => { + if (cancelRequestedAt > 0 && attempt > 1) attemptsAfterCancel++; + // Body itself doesn't observe the abort -- that's the point. + await naiveSleep(20); + throw new Error(`attempt ${attempt} failed`); + }, { retries: 8, minDelay: 50 }); + } catch { + outerSettledAt = clock.t(); + } + + result.native = { + cancelRequestedAt, + outerSettledAt, + cancelLatencyMs: outerSettledAt - cancelRequestedAt, + attemptsAfterCancel, + settledAs: "rejected", + signalAwareSleep: false, + }; +} + +// --- WorkIt run.retry --------------------------------------------------- +{ + const clock = makeClock(); + let attemptsAfterCancel = 0; + let cancelRequestedAt = -1; + let outerSettledAt = -1; + let settledAs = "pending"; + let cancelReasonKind = null; + + const wrapped = run.retry(async (ctx) => { + if (cancelRequestedAt > 0 && ctx.attempt > 1) attemptsAfterCancel++; + await sleep(20, ctx.signal); + throw new Error(`attempt ${ctx.attempt} failed`); + }, { times: 8, initialDelay: "50ms", jitter: false, backoff: "fixed" }); + + // Drive it through a scope so we can cancel from outside. 
+ let scopeRef; + const promise = run.scope(async (scope) => { + scopeRef = scope; + await scope.spawn(wrapped, { name: "retried-call" }); + }, { name: "retry-bench" }); + + setTimeout(() => { + cancelRequestedAt = clock.t(); + scopeRef.cancel({ kind: "manual", tag: "external-cancel" }); + }, 50); + + try { + await promise; + outerSettledAt = clock.t(); + settledAs = "fulfilled"; + } catch (err) { + outerSettledAt = clock.t(); + if (err instanceof CancellationError) { + settledAs = "cancelled"; + cancelReasonKind = err.reason.kind; + } else { + settledAs = "rejected"; + } + } + + result.workit = { + cancelRequestedAt, + outerSettledAt, + cancelLatencyMs: outerSettledAt - cancelRequestedAt, + attemptsAfterCancel, + settledAs, + cancelReasonKind, + signalAwareSleep: true, + }; + + assert.equal(settledAs, "cancelled", "run.retry must settle as cancelled, not failed, on parent cancel"); + assert.equal(attemptsAfterCancel, 0, "no further attempts after cancel was observed"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/06-hedge-tied-requests.mjs b/benchmarks/articles/06-hedge-tied-requests.mjs new file mode 100644 index 0000000..b23fa36 --- /dev/null +++ b/benchmarks/articles/06-hedge-tied-requests.mjs @@ -0,0 +1,74 @@ +/** + * Bench 06 -- run.hedge tied-request behavior. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: the body sleeps a configurable amount each call. With + * { after: "50ms", max: 3 }, run.hedge fires up to two more attempts at + * 50ms intervals. Whichever finishes first wins; the others cancel. + * + * We run two scenarios: + * + * slow: body sleeps 200ms. Hedges should fire at ~50ms and ~100ms. + * First completion at ~200ms. Two losers cancelled. + * fast: body sleeps 30ms. The first call wins before hedge timer. + * No hedges fired. No losers. 
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "06-hedge-tied-requests", slow: null, fast: null }; + +async function runScenario(name, bodyMs, hedgeOpts) { + const clock = makeClock(); + const fired = []; // attempt-start timestamps + const settled = []; // { kind, t } + let attemptCounter = 0; + + const hedged = run.hedge(async (ctx) => { + const id = ++attemptCounter; + fired.push({ id, t: clock.t() }); + try { + await sleep(bodyMs, ctx.signal); + settled.push({ id, kind: "fulfilled", t: clock.t() }); + return id; + } catch (err) { + if (err instanceof CancellationError) { + settled.push({ id, kind: "cancelled", t: clock.t(), reason: err.reason.kind }); + } else { + settled.push({ id, kind: "rejected", t: clock.t() }); + } + throw err; + } + }, hedgeOpts); + + const winner = await run.scope(async (scope) => { + return await scope.spawn(hedged, { name: `hedge-${name}` }); + }); + + return { + scenario: name, + bodyMs, + hedgeOpts, + winner, + attemptsFired: fired.length, + fired, + settled, + losersCancelled: settled.filter((s) => s.kind === "cancelled").length, + }; +} + +result.slow = await runScenario("slow", 200, { after: "50ms", max: 3 }); +result.fast = await runScenario("fast", 30, { after: "50ms", max: 3 }); + +// Invariants +assert.ok(result.slow.attemptsFired >= 2, "slow scenario must fire at least one hedge"); +assert.ok(result.slow.attemptsFired <= 3, "max bound must hold"); +assert.equal(result.slow.losersCancelled, result.slow.attemptsFired - 1, "every loser must be cancelled"); +assert.equal(result.fast.attemptsFired, 1, "fast scenario must not fire a hedge"); +assert.equal(result.fast.losersCancelled, 0); + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/07-worker-hard-kill.mjs b/benchmarks/articles/07-worker-hard-kill.mjs new file 
mode 100644
index 0000000..7b865d5
--- /dev/null
+++ b/benchmarks/articles/07-worker-hard-kill.mjs
@@ -0,0 +1,106 @@
+/**
+ * Bench 07 -- main-thread cooperative attempt vs offload({ timeout }).
+ *
+ * @author Admilson B. F. Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Workload: a non-cooperative CPU spinner that runs for 5 seconds and
+ * writes a late-marker file *after* the loop completes.
+ *
+ * Native main-thread attempt: an `AbortController` cannot stop the loop.
+ * The signal aborts. The loop ignores it. The marker file IS written. The
+ * "abort" is a lie.
+ *
+ * WorkIt offload({ timeout: "200ms" }): the worker thread is terminated by
+ * the host. The promise rejects with TimeoutError. The marker file does NOT
+ * exist on disk. CI gates the build on this.
+ */
+
+import assert from "node:assert/strict";
+import { existsSync, rmSync } from "node:fs";
+import os from "node:os";
+import path from "node:path";
+
+import { run } from "../../dist/index.js";
+import { offload } from "../../dist/worker/index.js";
+import { makeClock, jsonReplacer } from "./lib/baselines.mjs";
+
+const spinnerURL = new URL("./lib/spinner.mjs", import.meta.url);
+const SPIN_MS = 5_000;
+const TIMEOUT_MS = 200;
+
+const result = { bench: "07-worker-hard-kill", native: null, workit: null };
+
+// --- Native main-thread "abort" attempt ---------------------------------
+// We import the spinner here as a normal module, run it on the main thread,
+// and trigger an AbortController after TIMEOUT_MS. The signal CANNOT stop the
+// busy loop because there is no await boundary inside it. The late-marker IS
+// written.
+{
+  const clock = makeClock();
+  const markerPath = path.join(os.tmpdir(), `workit-bench-07-native-${process.pid}-${Date.now()}.marker`);
+  if (existsSync(markerPath)) rmSync(markerPath);
+  const controller = new AbortController();
+  let abortRequestedAt = -1;
+  let abortVisibleAt = -1;
+  controller.signal.addEventListener("abort", () => { abortVisibleAt = clock.t(); }, { once: true });
+  setTimeout(() => { abortRequestedAt = clock.t(); controller.abort(); }, TIMEOUT_MS);
+
+  const { spin } = await import(spinnerURL.href);
+  const finalState = spin({ durationMs: SPIN_MS, markerPath });
+  const completedAt = clock.t();
+
+  // The spinner blocked the event loop, so the TIMEOUT_MS timer is long
+  // overdue. Yield once so it can fire and the (late) abort timestamps get
+  // recorded instead of staying at -1.
+  await new Promise((r) => setTimeout(r, 0));
+
+  result.native = {
+    abortRequestedAt,
+    abortVisibleAt,
+    completedAt,
+    bodyCompleted: finalState.completed,
+    elapsedMs: finalState.elapsedMs,
+    markerExists: existsSync(markerPath),
+  };
+  if (existsSync(markerPath)) rmSync(markerPath);
+}
+
+// --- WorkIt offload({ timeout }) ----------------------------------------
+{
+  const clock = makeClock();
+  const markerPath = path.join(os.tmpdir(), `workit-bench-07-workit-${process.pid}-${Date.now()}.marker`);
+  if (existsSync(markerPath)) rmSync(markerPath);
+
+  let rejectedAt = -1;
+  let rejectionClass = null;
+
+  const task = offload(spinnerURL, "spin", { durationMs: SPIN_MS, markerPath }, { timeout: `${TIMEOUT_MS}ms` });
+
+  try {
+    await run.scope(async (scope) => scope.spawn(task, { name: "spinner" }));
+    rejectedAt = clock.t();
+    rejectionClass = "fulfilled";
+  } catch (err) {
+    rejectedAt = clock.t();
+    rejectionClass = err?.constructor?.name ?? "Unknown";
+  }
+
+  // Give the worker a generous grace window in case the OS termination is
+  // racing the marker write. If hard-kill works, the marker still does not
+  // appear because the host thread tore the worker down at TIMEOUT_MS.
+  const markerExistsAfterRejection = existsSync(markerPath);
+  await new Promise((r) => setTimeout(r, 800));
+
+  result.workit = {
+    timeoutMs: TIMEOUT_MS,
+    rejectedAt,
+    rejectionClass,
+    markerExistsAfterRejection,
+    markerExistsAfterGrace: existsSync(markerPath),
+  };
+  if (existsSync(markerPath)) rmSync(markerPath);
+
+  // Invariants
+  assert.equal(rejectionClass, "TimeoutError", "offload must reject with TimeoutError when its timeout fires");
+  assert.equal(result.workit.markerExistsAfterRejection, false, "late marker must NOT exist after offload timeout");
+  assert.equal(result.workit.markerExistsAfterGrace, false, "late marker must NOT appear during grace window either");
+}
+
+process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n");
diff --git a/benchmarks/articles/08-uncancellable-shield.mjs b/benchmarks/articles/08-uncancellable-shield.mjs
new file mode 100644
index 0000000..493b957
--- /dev/null
+++ b/benchmarks/articles/08-uncancellable-shield.mjs
@@ -0,0 +1,191 @@
+/**
+ * Bench 08 -- run.uncancellable shield contract.
+ *
+ * @author Admilson B. F. Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Three scenarios prove the shield's runtime contract:
+ *
+ *   A. parent_cancel_during_body -- parent scope cancels mid-body. The body
+ *      runs to completion and observes its OWN signal (not the parent's).
+ *      After the body returns, the shield rethrows the original cancel.
+ *
+ *   B. shield_timeout -- shield's own timeout fires while the body is inside.
+ *      The body sees a TimeoutError on its own signal.
+ *
+ *   C. nested_shields -- outer scope cancels while two nested shields are
+ *      active. Both bodies complete. The outer cancel reason is preserved
+ *      and rethrown after the bodies finish.
+ *
+ * Note: run.uncancellable(body, opts) returns a TaskFn -- it must be spawned
+ * into a scope. We use scope.spawn(...) below.
+ */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "08-uncancellable-shield" }; + +// --- Scenario A -- parent cancel during body ----------------------------- +{ + const clock = makeClock(); + let bodyStartedAt = -1; + let bodyCompletedAt = -1; + let bodyObservedAbort = false; + let outerSettledAt = -1; + let outerSettledAs = "pending"; + let outerCancelReasonKind = null; + let parentCancelRequestedAt = -1; + + const shielded = run.uncancellable(async (ctx) => { + bodyStartedAt = clock.t(); + try { + await sleep(120, ctx.signal); + bodyCompletedAt = clock.t(); + } catch { + bodyObservedAbort = true; + } + }, { timeout: "1s" }); + + let scopeRef; + const promise = run.scope(async (scope) => { + scopeRef = scope; + await scope.spawn(shielded, { name: "shielded" }); + }, { name: "scenario-A" }); + + setTimeout(() => { + parentCancelRequestedAt = clock.t(); + scopeRef.cancel({ kind: "manual", tag: "outer-cancel" }); + }, 30); + + try { + await promise; + outerSettledAt = clock.t(); + outerSettledAs = "fulfilled"; + } catch (err) { + outerSettledAt = clock.t(); + if (err instanceof CancellationError) { + outerSettledAs = "cancelled"; + outerCancelReasonKind = err.reason.kind; + } else { + outerSettledAs = "rejected"; + } + } + + result.A_parent_cancel_during_body = { + parentCancelRequestedAt, + bodyStartedAt, + bodyCompletedAt, + bodyObservedAbort, + outerSettledAt, + outerSettledAs, + outerCancelReasonKind, + bodyOutlivedCancelByMs: bodyCompletedAt - parentCancelRequestedAt, + }; + + assert.ok(bodyStartedAt >= 0, "shielded body must run"); + assert.ok(bodyCompletedAt > 0, "body must complete naturally inside the shield"); + assert.equal(bodyObservedAbort, false, "body's signal must NOT see the parent cancel"); + assert.equal(outerSettledAs, "cancelled", "outer scope must rethrow the original cancel after 
body"); + assert.equal(outerCancelReasonKind, "manual", "cancel reason must be preserved"); + assert.ok(bodyCompletedAt > parentCancelRequestedAt, "body must outlive the cancel request"); +} + +// --- Scenario B -- shield timeout while body is inside ------------------- +{ + const clock = makeClock(); + let bodyStartedAt = -1; + let bodyObservedAbort = false; + let bodyAbortReasonClass = null; + let outerSettledAt = -1; + let outerSettledClass = null; + + const shielded = run.uncancellable(async (ctx) => { + bodyStartedAt = clock.t(); + try { + // Body sleeps longer than the shield's own timeout + await sleep(2_000, ctx.signal); + } catch (err) { + bodyObservedAbort = true; + bodyAbortReasonClass = err?.constructor?.name ?? "Unknown"; + throw err; + } + }, { timeout: "100ms" }); + + try { + await run.scope(async (scope) => { + await scope.spawn(shielded, { name: "shielded-timeout" }); + }); + outerSettledAt = clock.t(); + outerSettledClass = "fulfilled"; + } catch (err) { + outerSettledAt = clock.t(); + outerSettledClass = err?.constructor?.name ?? 
"Unknown"; + } + + result.B_shield_timeout = { + bodyStartedAt, + bodyObservedAbort, + bodyAbortReasonClass, + outerSettledAt, + outerSettledClass, + }; + + assert.equal(bodyObservedAbort, true, "body must see the shield's own timeout on its signal"); + assert.equal(bodyAbortReasonClass, "TimeoutError", "abort reason inside body must be TimeoutError"); +} + +// --- Scenario C -- nested shields preserve outer cancel reason ----------- +{ + const clock = makeClock(); + let innerCompletedAt = -1; + let outerInnerCompletedAt = -1; + let outerSettledAt = -1; + let outerCancelReasonKind = null; + let parentCancelRequestedAt = -1; + + const inner = run.uncancellable(async (ctxInner) => { + await sleep(80, ctxInner.signal); + innerCompletedAt = clock.t(); + }, { timeout: "1s" }); + + const outerShield = run.uncancellable(async (ctxOuter) => { + // ctxOuter has the shield-local signal; the inner shield wraps again + await inner(ctxOuter); + outerInnerCompletedAt = clock.t(); + }, { timeout: "1s" }); + + let scopeRef; + const promise = run.scope(async (scope) => { + scopeRef = scope; + await scope.spawn(outerShield, { name: "shielded-nested" }); + }, { name: "scenario-C" }); + + setTimeout(() => { + parentCancelRequestedAt = clock.t(); + scopeRef.cancel({ kind: "manual", tag: "scenario-c-cancel" }); + }, 20); + + try { + await promise; + } catch (err) { + outerSettledAt = clock.t(); + if (err instanceof CancellationError) outerCancelReasonKind = err.reason.kind; + } + + result.C_nested_shields = { + parentCancelRequestedAt, + innerCompletedAt, + outerInnerCompletedAt, + outerSettledAt, + outerCancelReasonKind, + }; + + assert.ok(innerCompletedAt > 0, "innermost body must complete"); + assert.ok(outerInnerCompletedAt > 0, "outer shield body must complete after inner"); + assert.equal(outerCancelReasonKind, "manual", "outer cancel reason preserved through both shields"); + assert.ok(outerSettledAt > parentCancelRequestedAt, "outer must rethrow only after both shields finish"); 
+} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/09-stream-1b-lazy.mjs b/benchmarks/articles/09-stream-1b-lazy.mjs new file mode 100644 index 0000000..eb9e19e --- /dev/null +++ b/benchmarks/articles/09-stream-1b-lazy.mjs @@ -0,0 +1,132 @@ +/** + * Bench 09 -- work(asyncIter).inParallel(N).map().stream() lazy producer. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: a 1,000,000,000-row async generator. The consumer takes 25 items + * then `break`s. The invariant the article claims: + * + * produced <= TAKE + CONCURRENCY (the producer paused as soon as the + * inflight slots were full) + * maxActive <= CONCURRENCY (hard concurrency cap) + * active === 0 after break (every in-flight slot was cancelled + * or completed cleanly) + * + * Comparison shape: the "naive eager" baseline pulls items as fast as the + * source yields and starts a worker per item up to a buffer cap. It does + * NOT pause the producer when the consumer breaks -- every prefetched item + * keeps running until completion. We measure how many items it produced + * before the consumer broke. + */ + +import assert from "node:assert/strict"; +import { work } from "../../dist/index.js"; +import { makeClock, jsonReplacer } from "./lib/baselines.mjs"; + +const TOTAL = 1_000_000_000; +const TAKE = 25; +const CONCURRENCY = 16; + +const result = { bench: "09-stream-1b-lazy", naive: null, workit: null }; + +// --- Naive eager pre-buffer baseline ------------------------------------ +{ + const clock = makeClock(); + let produced = 0; + let active = 0; + let maxActive = 0; + const PREFETCH = 256; // typical "queue ahead" knob + + async function* virtualBillion() { + for (let i = 0; i < TOTAL; i++) { produced++; yield i; } + } + const iter = virtualBillion(); + + // Pre-fill PREFETCH inflight workers; do NOT pause when consumer breaks. 
+ const inflight = new Map(); + let nextIdx = 0; + let done = false; + + async function refill() { + while (!done && inflight.size < PREFETCH) { + const next = await iter.next(); + if (next.done) { done = true; break; } + const idx = nextIdx++; + active++; if (active > maxActive) maxActive = active; + const p = (async () => { + await Promise.resolve(); // simulate trivial async work + active--; + return { idx, value: next.value * 2 }; + })(); + inflight.set(idx, p); + } + } + + await refill(); + const taken = []; + while (taken.length < TAKE && inflight.size > 0) { + const winner = await Promise.race(inflight.values()); + inflight.delete(winner.idx); + taken.push(winner.value); + await refill(); + } + + // Even after the consumer "breaks", the prefetched ones keep running. + await Promise.allSettled(inflight.values()); + const settledAt = clock.t(); + + result.naive = { + settledAt, + consumed: taken.length, + produced, + prefetch: PREFETCH, + maxActive, + activeAfter: active, + }; +} + +// --- WorkIt work().inParallel(N).map().stream() ------------------------- +{ + const clock = makeClock(); + let produced = 0; + let active = 0; + let maxActive = 0; + + async function* virtualBillion() { + for (let i = 0; i < TOTAL; i++) { produced++; yield i; } + } + + const taken = []; + for await (const value of work(virtualBillion()) + .inParallel(CONCURRENCY) + .map(async (n) => { + active++; if (active > maxActive) maxActive = active; + await Promise.resolve(); + active--; + return n * 2; + }) + .stream()) { + taken.push(value); + if (taken.length === TAKE) break; + } + const settledAt = clock.t(); + + result.workit = { + settledAt, + consumed: taken.length, + produced, + concurrency: CONCURRENCY, + maxActive, + activeAfter: active, + producedBound: TAKE + CONCURRENCY, + }; + + // Invariants + assert.equal(taken.length, TAKE, "must consume exactly TAKE items"); + assert.ok(produced <= TAKE + CONCURRENCY, `produced (${produced}) must be <= TAKE + CONCURRENCY (${TAKE + 
CONCURRENCY})`); + assert.ok(maxActive <= CONCURRENCY, `maxActive (${maxActive}) must be <= CONCURRENCY (${CONCURRENCY})`); + assert.equal(active, 0, "no in-flight slots may remain after break"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/10-stream-slow-consumer.mjs b/benchmarks/articles/10-stream-slow-consumer.mjs new file mode 100644 index 0000000..392d982 --- /dev/null +++ b/benchmarks/articles/10-stream-slow-consumer.mjs @@ -0,0 +1,85 @@ +/** + * Bench 10 -- slow consumer pauses the producer. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: a fast producer (yields every microtask), a 16-wide map, a slow + * consumer (~5 ms per item). + * + * Without backpressure: producer races ahead, prefetched items pile up, heap + * grows linearly with the producer rate x consumer lag. + * + * With WorkIt's stream(): the consumer's await pause holds the producer at + * `inflight + N` items at a time. We measure produced vs consumed over time. 
+ */ + +import assert from "node:assert/strict"; +import { work } from "../../dist/index.js"; +import { makeClock, jsonReplacer } from "./lib/baselines.mjs"; + +const SOURCE_SIZE = 5_000; +const CONCURRENCY = 16; +const CONSUME_DELAY_MS = 5; +const TAKE = 200; + +const result = { bench: "10-stream-slow-consumer", workit: null }; + +// --- WorkIt -- slow consumer --------------------------------------------- +{ + const clock = makeClock(); + let produced = 0; + let active = 0; + let maxActive = 0; + let producedAtFirstConsume = -1; + let producedAtLastConsume = -1; + + async function* source() { + for (let i = 0; i < SOURCE_SIZE; i++) { produced++; yield i; } + } + + const consumed = []; + for await (const value of work(source()) + .inParallel(CONCURRENCY) + .map(async (n) => { + active++; if (active > maxActive) maxActive = active; + await Promise.resolve(); + active--; + return n * 10; + }) + .stream()) { + if (producedAtFirstConsume < 0) producedAtFirstConsume = produced; + consumed.push(value); + producedAtLastConsume = produced; + await new Promise((r) => setTimeout(r, CONSUME_DELAY_MS)); + if (consumed.length === TAKE) break; + } + const elapsedMs = clock.t(); + + result.workit = { + sourceSize: SOURCE_SIZE, + take: TAKE, + concurrency: CONCURRENCY, + consumeDelayMs: CONSUME_DELAY_MS, + consumed: consumed.length, + produced, + producedAtFirstConsume, + producedAtLastConsume, + maxActive, + activeAfterBreak: active, + elapsedMs, + producerOvershoot: produced - consumed.length, + producerOvershootBound: CONCURRENCY + 1, // map slot + buffered next + }; + + // Invariants + assert.equal(consumed.length, TAKE, "must consume exactly TAKE items"); + assert.ok(maxActive <= CONCURRENCY, `maxActive (${maxActive}) must be <= CONCURRENCY`); + assert.ok( + produced - consumed.length <= CONCURRENCY + 1, + `producer overshoot (${produced - consumed.length}) must stay within CONCURRENCY + 1`, + ); + assert.equal(active, 0, "all in-flight slots cancelled or settled"); +} + 
+process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/11-channel-contract.mjs b/benchmarks/articles/11-channel-contract.mjs new file mode 100644 index 0000000..11448d6 --- /dev/null +++ b/benchmarks/articles/11-channel-contract.mjs @@ -0,0 +1,131 @@ +/** + * Bench 11 -- createChannel contract. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Five scenarios prove the channel's runtime contract: + * + * A. capacity_backpressure -- send blocks when the channel is full; + * receive unblocks the next pending send. + * B. close_drains -- close() lets buffered values drain; then + * async iteration ends with done=true. + * C. close_rejects_pending -- pending sends after close() reject with + * ChannelClosedError carrying the reason. + * D. signal_cancels_receive -- a receive that is awaiting on an empty + * channel rejects when its signal aborts. + * E. capacity_validation -- createChannel rejects 0/-1/0.5/NaN at + * construction. + */ + +import assert from "node:assert/strict"; +import { ChannelClosedError, createChannel } from "../../dist/channel/index.js"; +import { makeClock, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "11-channel-contract" }; + +// --- A -- capacity_backpressure ------------------------------------------ +{ + const clock = makeClock(); + const ch = createChannel({ capacity: 2 }); + await ch.send("a"); + await ch.send("b"); // channel is now full + + let thirdSendStartedAt = clock.t(); + let thirdSendCompletedAt = -1; + let thirdSettledBeforeReceive = false; + let thirdSettled = false; + const third = ch.send("c").then(() => { + thirdSettled = true; + thirdSendCompletedAt = clock.t(); + }); + + // One microtask turn is enough to catch an incorrectly unblocked send + // without making the bench depend on wall-clock timer jitter. 
+ await Promise.resolve(); + thirdSettledBeforeReceive = thirdSettled; + + // Receive one -- third send must complete shortly after. + const r1 = await ch.receive(); + await third; + const completedAfterReceiveBy = thirdSendCompletedAt - thirdSendStartedAt; + + result.A_capacity_backpressure = { + capacity: 2, + sizeAfterTwoSends: 2, + thirdSendStartedAt, + thirdSendCompletedAt, + thirdSettledBeforeReceive, + firstReceived: r1, + thirdSendUnblockedWithinMs: completedAfterReceiveBy, + }; + + assert.equal(thirdSettledBeforeReceive, false, "third send must remain pending while channel is full"); + assert.equal(thirdSettled, true, "third send must complete after a receive frees a slot"); + assert.deepEqual(r1, { done: false, value: "a" }); +} + +// --- B -- close_drains --------------------------------------------------- +{ + const ch = createChannel({ capacity: 8 }); + await ch.send(1); + await ch.send(2); + await ch.send(3); + ch.close(); + + const collected = []; + for await (const v of ch) collected.push(v); + result.B_close_drains = { + collected, + iterationEndedCleanly: true, + }; + assert.deepEqual(collected, [1, 2, 3], "buffered values must drain after close()"); +} + +// --- C -- close_rejects_pending ------------------------------------------ +{ + const ch = createChannel({ capacity: 1 }); + await ch.send("x"); // fills the buffer + let rejectedClass = null; + let rejectionReason = null; + + const pending = ch.send("y").catch((err) => { + rejectedClass = err?.constructor?.name ?? "Unknown"; + rejectionReason = err instanceof ChannelClosedError ? 
err.reason : null; + }); + ch.close({ tag: "shutdown" }); + await pending; + + result.C_close_rejects_pending = { rejectedClass, rejectionReason }; + assert.equal(rejectedClass, "ChannelClosedError"); + assert.deepEqual(rejectionReason, { tag: "shutdown" }); +} + +// --- D -- signal_cancels_receive ----------------------------------------- +{ + const ch = createChannel({ capacity: 1 }); + const ctrl = new AbortController(); + + let rejectedClass = null; + const pending = ch.receive({ signal: ctrl.signal }).catch((err) => { + rejectedClass = err?.constructor?.name ?? "Unknown"; + }); + setTimeout(() => ctrl.abort(new Error("user-aborted")), 20); + await pending; + + result.D_signal_cancels_receive = { rejectedClass }; + assert.ok(rejectedClass !== null, "pending receive must reject when signal aborts"); +} + +// --- E -- capacity_validation -------------------------------------------- +{ + const rejected = []; + for (const bad of [0, -1, 0.5, Number.NaN, Number.POSITIVE_INFINITY]) { + try { createChannel({ capacity: bad }); rejected.push({ bad, accepted: true }); } + catch (err) { rejected.push({ bad, error: err?.constructor?.name ?? "Error" }); } + } + result.E_capacity_validation = { rejected }; + for (const r of rejected) assert.ok(r.error !== undefined, `capacity ${r.bad} must be rejected`); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/12-bracket-vs-try-finally.mjs b/benchmarks/articles/12-bracket-vs-try-finally.mjs new file mode 100644 index 0000000..9e9efb1 --- /dev/null +++ b/benchmarks/articles/12-bracket-vs-try-finally.mjs @@ -0,0 +1,196 @@ +/** + * Bench 12 -- try/finally vs run.bracket under cancellation, error, and + * hanging cleanup paths. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Five scenarios prove the bracket contract: + * + * A. success -- [acquire, use, release], release runs once. + * B. 
use_throws -- release runs once with the resource; + * the original error propagates. + * C. acquire_throws -- release does NOT run. + * D. parent_cancel_during_use -- release runs, outer rejects with + * CancellationError carrying the parent's reason. + * E. hanging_release -- `try/finally` would deadlock the caller forever; + * run.bracket bounds the cleanup with + * CleanupOpts.timeout and surfaces the timeout. + */ + +import assert from "node:assert/strict"; +import { CancellationError, run } from "../../dist/index.js"; +import { makeClock, sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "12-bracket-vs-try-finally" }; + +// --- A -- success --------------------------------------------------------- +{ + const order = []; + const out = await run.scope(async (scope) => scope.spawn(run.bracket( + async () => { order.push("acquire"); return { id: "RES-A" }; }, + async (res) => { order.push("use"); return res.id + ":used"; }, + async (res) => { order.push(`release:${res.id}`); }, + ))); + result.A_success = { order, out, releaseCount: order.filter((s) => s.startsWith("release")).length }; + assert.deepEqual(order, ["acquire", "use", "release:RES-A"]); + assert.equal(out, "RES-A:used"); +} + +// --- B -- use throws ------------------------------------------------------ +{ + const order = []; + let caughtMessage = null; + try { + await run.scope(async (scope) => scope.spawn(run.bracket( + async () => { order.push("acquire"); return { id: "RES-B" }; }, + async () => { order.push("use"); throw new Error("use-failed"); }, + async (res) => { order.push(`release:${res.id}`); }, + ))); + } catch (err) { caughtMessage = err?.message ?? 
null; } + + result.B_use_throws = { + order, + caughtMessage, + releaseRanWithResource: order.includes("release:RES-B"), + }; + assert.deepEqual(order, ["acquire", "use", "release:RES-B"]); + assert.equal(caughtMessage, "use-failed"); +} + +// --- C -- acquire throws -------------------------------------------------- +{ + const order = []; + let caughtMessage = null; + try { + await run.scope(async (scope) => scope.spawn(run.bracket( + async () => { order.push("acquire"); throw new Error("acquire-failed"); }, + async () => { order.push("use"); }, + async () => { order.push("release"); }, + ))); + } catch (err) { caughtMessage = err?.message ?? null; } + + result.C_acquire_throws = { + order, + caughtMessage, + releaseRan: order.includes("release"), + }; + assert.deepEqual(order, ["acquire"]); + assert.equal(caughtMessage, "acquire-failed"); +} + +// --- D -- parent cancel during use ---------------------------------------- +{ + const clock = makeClock(); + const order = []; + let releasedAt = -1; + let outerSettledClass = null; + let outerCancelReasonKind = null; + + let scopeRef; + const promise = run.scope(async (scope) => { + scopeRef = scope; + await scope.spawn(run.bracket( + async () => { order.push("acquire"); return { id: "RES-D" }; }, + async (res, ctx) => { order.push("use"); await sleep(200, ctx.signal); return res.id; }, + async (res) => { order.push(`release:${res.id}`); releasedAt = clock.t(); }, + )); + }); + + setTimeout(() => scopeRef.cancel({ kind: "manual", tag: "parent-cancel" }), 30); + + try { await promise; } catch (err) { + outerSettledClass = err?.constructor?.name ?? 
"Unknown"; + if (err instanceof CancellationError) outerCancelReasonKind = err.reason.kind; + } + + result.D_parent_cancel_during_use = { + order, + releasedAt, + outerSettledClass, + outerCancelReasonKind, + }; + assert.ok(order.includes("release:RES-D"), "release must run on parent cancel"); + assert.equal(outerSettledClass, "CancellationError"); + assert.equal(outerCancelReasonKind, "manual"); +} + +// --- E -- hanging release: try/finally vs run.bracket --------------------- +{ + // Naive try/finally with a hanging cleanup function. + // We give it 250ms to bail out via a manual timeout race; if the bracket-style + // timeout were in place this would never run forever. The test of the + // The baseline below verifies the absence of bounded cleanup in raw try/finally. + const clock = makeClock(); + let nativeOuterSettledAt = -1; + let nativeOuterSettledClass = null; + let nativeReleaseCompleted = false; + const nativeRace = await Promise.race([ + (async () => { + try { + try { + /* use */ return "value"; + } finally { + // Hanging cleanup -- never resolves. + await new Promise(() => {}); + nativeReleaseCompleted = true; + } + } catch (e) { return e; } + })().then( + (v) => ({ outcome: "fulfilled", at: clock.t(), value: v }), + (e) => ({ outcome: "rejected", at: clock.t(), error: e?.message ?? 
null }), + ), + sleep(250).then(() => ({ outcome: "still_pending_after_250ms", at: clock.t() })), + ]); + nativeOuterSettledAt = nativeRace.at; + nativeOuterSettledClass = nativeRace.outcome; + + // run.bracket with hanging release + CleanupOpts.timeout + const clock2 = makeClock(); + let bracketSettledAt = -1; + let bracketSettledClass = null; + let cleanupEvents = []; + try { + await run.scope(async (scope) => { + scope.onEvent((e) => { + if (e.type === "task:cleanup_timeout" || e.type === "task:cleanup_failed") { + cleanupEvents.push(e.type); + } + }); + await scope.spawn(run.bracket( + async () => ({ id: "RES-E" }), + async () => "value", + async () => { await new Promise(() => {}); }, // hanging cleanup + { timeout: "150ms" }, // bounded + )); + }); + bracketSettledAt = clock2.t(); + bracketSettledClass = "fulfilled"; + } catch (err) { + bracketSettledAt = clock2.t(); + bracketSettledClass = err?.constructor?.name ?? "Unknown"; + } + + result.E_hanging_release = { + native: { + outerSettledAt: nativeOuterSettledAt, + outcome: nativeOuterSettledClass, + releaseCompleted: nativeReleaseCompleted, + }, + workit: { + cleanupTimeoutMs: 150, + bracketSettledAt, + bracketSettledClass, + cleanupEventsObserved: cleanupEvents, + }, + }; + + assert.equal(nativeOuterSettledClass, "still_pending_after_250ms", + "native try/finally with hanging cleanup must NOT settle within 250ms"); + assert.ok(bracketSettledAt < 250, + `run.bracket must settle within the cleanup timeout; got ${bracketSettledAt}ms`); + assert.ok(cleanupEvents.includes("task:cleanup_timeout"), + "task:cleanup_timeout event must fire when cleanup exceeds the bound"); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/13-budget-atomicity-and-cancel.mjs b/benchmarks/articles/13-budget-atomicity-and-cancel.mjs new file mode 100644 index 0000000..d534cca --- /dev/null +++ b/benchmarks/articles/13-budget-atomicity-and-cancel.mjs @@ -0,0 +1,125 @@ +/** + * 
Bench 13 -- budget atomicity, owning-scope cancel, snapshot immutability. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Three scenarios: + * + * A. atomic_concurrent_charges + * 100 sibling tasks each consume 0.01 from a 1.00 CostBudget. Final spent + * must be exactly 1.00 with no double-charge or lost-update. + * + * B. owning_scope_cancellation_at_depth + * A budget is set at scope depth 0. A nested chain spawns a task at + * depth 5. That deep task tries to consume an amount that exceeds the + * budget. The scope at depth 0 (the OWNER of the budget) must be the one + * to cancel with reason kind "budget". + * + * C. caller_object_immutability + * The plain object passed into run.context.with(CostBudget, {...}) is + * never mutated by the engine. After charges, the CALLER's reference is + * still { spent: 0, ... }. + */ + +import assert from "node:assert/strict"; +import { CancellationError, ContextBagImpl, CostBudget, group, run } from "../../dist/index.js"; +import { makeClock, jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "13-budget-atomicity-and-cancel" }; + +// --- A -- atomic concurrent charges -------------------------------------- +{ + const N = 100; + const PER = 0.01; + const callerBudget = { spent: 0, limit: 1.0, unit: "USD" }; + const ctx = new ContextBagImpl().with(CostBudget, callerBudget); + + await group(async (task) => { + const handles = []; + for (let i = 0; i < N; i++) { + handles.push(task(async (c) => { c.consumeCost(PER); }, { name: `charge-${i}` })); + } + await Promise.all(handles); + }, { context: ctx }); + + const liveBudget = ctx.get(CostBudget); + result.A_atomic_concurrent_charges = { + siblings: N, + perCharge: PER, + finalSpent: liveBudget.spent, + expectedSpent: N * PER, + callerObjectSpentAfter: callerBudget.spent, + }; + assert.ok(Math.abs(liveBudget.spent - N * PER) < 1e-9, + `expected exactly ${N * PER}, got ${liveBudget.spent}`); +} + +// --- B -- owning-scope 
cancellation at depth ----------------------------- +{ + const clock = makeClock(); + const callerBudget = { spent: 0, limit: 1.0, unit: "USD" }; + const ctx = new ContextBagImpl().with(CostBudget, callerBudget); + + let outerCancelKind = null; + let outerSettledAt = -1; + let chargeAttemptedAtDepth = -1; + + try { + await group(async (taskD0) => { // depth 0 -- OWNS budget + await taskD0(async () => { + await group(async (taskD1) => { // depth 1 + await taskD1(async () => { + await group(async (taskD2) => { // depth 2 + await taskD2(async () => { + await group(async (taskD3) => { // depth 3 + await taskD3(async () => { + await group(async (taskD4) => { // depth 4 + await taskD4(async (cD5) => { // depth 5 + chargeAttemptedAtDepth = 5; + cD5.consumeCost(2.0); // exceeds 1.0 limit + }); + }); + }); + }); + }); + }); + }); + }); + }); + }, { context: ctx }); + } catch (err) { + outerSettledAt = clock.t(); + if (err instanceof CancellationError) outerCancelKind = err.reason.kind; + } + + result.B_owning_scope_cancel = { + chargeAttemptedAtDepth, + outerSettledAt, + outerCancelKind, + }; + assert.equal(outerCancelKind, "budget", "owning scope must cancel with kind='budget'"); + assert.equal(chargeAttemptedAtDepth, 5); +} + +// --- C -- caller object immutability ------------------------------------- +{ + const callerBudget = { spent: 0, limit: 1.0, unit: "USD" }; + const ctx = new ContextBagImpl().with(CostBudget, callerBudget); + + await group(async (task) => { + await task(async (c) => { c.consumeCost(0.25); }); + await task(async (c) => { c.consumeCost(0.25); }); + }, { context: ctx }); + + const liveBudget = ctx.get(CostBudget); + result.C_caller_immutability = { + callerSpentAfter: callerBudget.spent, + liveSpentAfter: liveBudget.spent, + callerLimitAfter: callerBudget.limit, + }; + assert.equal(callerBudget.spent, 0, "caller's input object must not be mutated"); + assert.ok(Math.abs(liveBudget.spent - 0.5) < 1e-9, "live snapshot must reflect the actual spend"); 
+} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/14-context-overlay-perf.mjs b/benchmarks/articles/14-context-overlay-perf.mjs new file mode 100644 index 0000000..c93703f --- /dev/null +++ b/benchmarks/articles/14-context-overlay-perf.mjs @@ -0,0 +1,93 @@ +/** + * Bench 14 -- naive Map-clone context vs WorkIt overlay context. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: pre-fill a context bag with 5,000 keys. Then call .with(key, val) + * 100 times in succession (typical agent stack depth x repeated overrides). + * + * Naive baseline: every .with() clones the underlying Map. O(N) per call; + * the chain is O(M·N) for M calls over N keys. + * + * WorkIt overlay: every .with() returns a child bag that points to the parent + * and stores a single-key delta. O(1) per .with(). Lookup walks the chain. + * + * The article's claim: representative clone-vs-overlay timing for 100x5000. + * That bound is documented in `npm run check:context-performance`. + */ + +import assert from "node:assert/strict"; +import { ContextBagImpl, createContextKey } from "../../dist/index.js"; +import { jsonReplacer } from "./lib/baselines.mjs"; + +const KEYS = 5_000; +const WITH_CALLS = 100; +const result = { bench: "14-context-overlay-perf", naive: null, workit: null }; + +// --- Pre-create keys and values shared across both baselines ------------ +const keys = Array.from({ length: KEYS }, (_, i) => createContextKey(`k-${i}`)); +const seedValues = Array.from({ length: KEYS }, (_, i) => `v-${i}`); +const overrideKey = keys[Math.floor(KEYS / 2)]; + +// --- Naive Map-clone context (inline implementation) -------------------- +{ + class NaiveBag { + constructor(map) { this.map = map ?? 
new Map(); }
+    get(key) { return this.map.get(key); }
+    with(key, value) {
+      const next = new Map(this.map); // O(N) clone every time
+      next.set(key, value);
+      return new NaiveBag(next);
+    }
+  }
+  let bag = new NaiveBag();
+  for (let i = 0; i < KEYS; i++) bag = bag.with(keys[i], seedValues[i]);
+
+  const t0 = performance.now();
+  let cur = bag;
+  for (let i = 0; i < WITH_CALLS; i++) cur = cur.with(overrideKey, `override-${i}`);
+  const elapsedMs = performance.now() - t0;
+
+  result.naive = {
+    keys: KEYS,
+    withCalls: WITH_CALLS,
+    elapsedMs: Number(elapsedMs.toFixed(3)),
+    perCallMs: Number((elapsedMs / WITH_CALLS).toFixed(4)),
+    deepestLookup: cur.get(overrideKey),
+  };
+}
+
+// --- WorkIt overlay context ---------------------------------------------
+{
+  let bag = new ContextBagImpl();
+  for (let i = 0; i < KEYS; i++) bag = bag.with(keys[i], seedValues[i]);
+
+  const t0 = performance.now();
+  let cur = bag;
+  for (let i = 0; i < WITH_CALLS; i++) cur = cur.with(overrideKey, `override-${i}`);
+  const elapsedMs = performance.now() - t0;
+
+  result.workit = {
+    keys: KEYS,
+    withCalls: WITH_CALLS,
+    elapsedMs: Number(elapsedMs.toFixed(3)),
+    perCallMs: Number((elapsedMs / WITH_CALLS).toFixed(4)),
+    deepestLookup: cur.get(overrideKey),
+    speedupVsNaive: Number((result.naive.elapsedMs / Math.max(elapsedMs, 1e-6)).toFixed(0)),
+  };
+
+  // The published gate is < 10ms; we assert that upper bound here.
+  assert.ok(elapsedMs < 10,
+    `WorkIt overlay context must complete 100 .with() over 5000 keys in under 10ms; got ${elapsedMs.toFixed(3)}ms`);
+
+  // Correctness -- both bags must resolve the deepest override the same way.
+  assert.equal(result.workit.deepestLookup, result.naive.deepestLookup);
+
+  // We also assert the naive baseline is at least 10x slower so the bench is
+  // meaningful (not tied to a specific number, since hardware varies).
+ assert.ok(result.naive.elapsedMs > elapsedMs * 10, + `naive baseline (${result.naive.elapsedMs}ms) must be >=10x slower than overlay (${elapsedMs.toFixed(3)}ms)`); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/15-core-zero-network.mjs b/benchmarks/articles/15-core-zero-network.mjs new file mode 100644 index 0000000..fe02280 --- /dev/null +++ b/benchmarks/articles/15-core-zero-network.mjs @@ -0,0 +1,71 @@ +/** + * Bench 15 -- zero networking imports in the published core bundle. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * The production gate `npm run check:no-network` runs over the `src/` tree. + * This bench reproduces the same property over the *built artifact* -- the + * exact files a consumer installs from npm -- across the core surface and + * its non-network subpaths. + * + * Subpaths intentionally excluded: + * - dist/observability (opt-in exporter bridge; uses no network either, + * but it's the network *seam* and we don't want to + * tie the article's claim to it staying empty) + * - dist/otel (opt-in OpenTelemetry bridge; uses the user's + * tracer/meter object) + * - dist/worker (uses node:worker_threads, not networking) + */ + +import assert from "node:assert/strict"; +import { readdir, readFile } from "node:fs/promises"; +import path from "node:path"; +import url from "node:url"; +import { jsonReplacer } from "./lib/baselines.mjs"; + +const repoRoot = path.resolve(path.dirname(url.fileURLToPath(import.meta.url)), "..", ".."); +const distRoot = path.join(repoRoot, "dist"); + +const FORBIDDEN = [ + { name: "import 'node:http'", pattern: /["']node:http["']/ }, + { name: "import 'node:https'", pattern: /["']node:https["']/ }, + { name: "import 'http'", pattern: /from\s+["']http["']/ }, + { name: "import 'https'", pattern: /from\s+["']https["']/ }, + { name: "global fetch(...)", pattern: /\bfetch\s*\(/ }, +]; + +const EXCLUDED_DIRS = new 
Set(["observability", "otel", "worker"]);
+
+const result = { bench: "15-core-zero-network", filesScanned: 0, hits: [], excluded: [...EXCLUDED_DIRS] };
+
+async function walk(dir, relRoot = "") {
+  let entries;
+  try { entries = await readdir(dir, { withFileTypes: true }); }
+  catch { return; }
+  for (const entry of entries) {
+    const rel = relRoot ? `${relRoot}/${entry.name}` : entry.name;
+    const full = path.join(dir, entry.name);
+    if (entry.isDirectory()) {
+      if (relRoot === "" && EXCLUDED_DIRS.has(entry.name)) continue;
+      await walk(full, rel);
+      continue;
+    }
+    if (!/\.(?:js|cjs|mjs)$/.test(entry.name)) continue;
+    result.filesScanned++;
+    const text = await readFile(full, "utf8");
+    for (const { name, pattern } of FORBIDDEN) {
+      if (pattern.test(text)) result.hits.push({ file: rel, kind: name });
+    }
+  }
+}
+
+await walk(distRoot);
+
+result.passed = result.hits.length === 0;
+
+process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n");
+
+assert.equal(result.hits.length, 0,
+  `core network imports leaked into dist/: ${JSON.stringify(result.hits)}`);
+assert.ok(result.filesScanned > 0, "expected at least one file to be scanned (dist/ must be built)");
diff --git a/benchmarks/articles/16-sampling-and-aggregation.mjs b/benchmarks/articles/16-sampling-and-aggregation.mjs
new file mode 100644
index 0000000..c3ab4a9
--- /dev/null
+++ b/benchmarks/articles/16-sampling-and-aggregation.mjs
@@ -0,0 +1,94 @@
+/**
+ * Bench 16 -- sampling reduction in exported event volume.
+ *
+ * @author Admilson B. F. Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Workload: 100 root scopes. Each spawns 5 child tasks. 93% of scopes
+ * complete fast and successfully; 5% are slow (>= slowThresholdMs); the
+ * remaining 2% fail.
+ * + * We compare two sampling policies attached via `attachTelemetryExporter`: + * + * `mode: "all"` -- every TaskEvent reaches the exporter (raw firehose) + * `mode: "errors_and_slow"` -- only the errored or slow traces reach the exporter + * + * The article's claim is a 20x reduction at 5% slow/errored. We assert + * at least 5x to keep the bench tolerant to platform jitter; in practice + * the measured factor is significantly higher. + * + * The "200 tasks -> 1 summary record" claim from the article is a separate + * proof -- it lives in the production gate `npm run check:exporter-stress` + * because `attachScopeSummaryExporter` requires the `scope:opened` event, + * which fires before user code can attach inside `run.scope`. + */ + +import assert from "node:assert/strict"; +import { run } from "../../dist/index.js"; +import { attachTelemetryExporter } from "../../dist/observability/index.js"; +import { sleep, jsonReplacer } from "./lib/baselines.mjs"; + +const ROOTS = 100; +const TASKS_PER_ROOT = 5; +const SLOW_RATE = 0.05; +const ERROR_RATE = 0.02; +const SLOW_MS = 60; +const FAST_MS = 5; + +async function workload(scope, slow, willFail) { + const handles = []; + for (let i = 0; i < TASKS_PER_ROOT; i++) { + handles.push(scope.spawn(async (ctx) => { + await sleep(slow ? 
SLOW_MS : FAST_MS, ctx.signal); + if (willFail && i === TASKS_PER_ROOT - 1) throw new Error("synthetic-fail"); + return i; + }, { name: `child-${i}`, kind: "io" })); + } + await Promise.allSettled(handles); + if (willFail) throw new Error("root-fail"); +} + +async function runMany(sampling) { + const taskEvents = []; + + const completions = []; + for (let i = 0; i < ROOTS; i++) { + const slow = i < ROOTS * SLOW_RATE; + const willFail = !slow && i < ROOTS * (SLOW_RATE + ERROR_RATE); + completions.push((async () => { + try { + await run.scope(async (scope) => { + attachTelemetryExporter(scope, (e) => { taskEvents.push(e); }, { sampling }); + await workload(scope, slow, willFail); + }); + } catch { /* expected for the willFail traces */ } + })()); + } + await Promise.all(completions); + return { taskEvents: taskEvents.length }; +} + +const result = { bench: "16-sampling-and-aggregation" }; + +result.unsampled_per_task = await runMany({ mode: "all" }); +result.errors_and_slow_per_task = await runMany({ mode: "errors_and_slow", slowThresholdMs: SLOW_MS - 5 }); + +result.reduction_factor = +( + result.unsampled_per_task.taskEvents / Math.max(1, result.errors_and_slow_per_task.taskEvents) +).toFixed(2); + +result.workload = { + rootScopes: ROOTS, + tasksPerRoot: TASKS_PER_ROOT, + slowRate: SLOW_RATE, + errorRate: ERROR_RATE, + slowMs: SLOW_MS, +}; + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); + +// Invariants +assert.ok( + result.errors_and_slow_per_task.taskEvents < result.unsampled_per_task.taskEvents / 5, + `errors_and_slow must reduce task-event volume at least 5x (got ${result.reduction_factor}x)`, +); diff --git a/benchmarks/articles/17-cardinality-safe-metrics.mjs b/benchmarks/articles/17-cardinality-safe-metrics.mjs new file mode 100644 index 0000000..b0564bd --- /dev/null +++ b/benchmarks/articles/17-cardinality-safe-metrics.mjs @@ -0,0 +1,67 @@ +/** + * Bench 17 -- cardinality-safe metric exporter. + * + * @author Admilson B. F. 
Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Workload: emit a small batch of metric points through the cardinality-safe + * wrapper. Some points use bounded labels (`task.kind` enum, `outcome` enum). + * Others try to smuggle unbounded labels (`task.id` UUIDs, free-form + * `error.message`). + * + * The wrapper must: + * - accept bounded labels when they appear in the allowedLabels list + * - reject (or strip) unbounded labels not in the allowed list + */ + +import assert from "node:assert/strict"; +import { createCardinalitySafeMetricExporter } from "../../dist/observability/index.js"; +import { jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "17-cardinality-safe-metrics", emitted: [], rejected: [] }; + +const allowedLabels = ["task.kind", "outcome", "scope.name"]; + +const safeExporter = createCardinalitySafeMetricExporter( + (point) => { result.emitted.push(point); }, + { allowedLabels }, +); + +const candidates = [ + { name: "task.duration", value: 12, labels: { "task.kind": "io", "outcome": "succeeded" } }, + { name: "task.duration", value: 35, labels: { "task.kind": "llm", "outcome": "failed" } }, + // Unbounded label task.id - must be rejected or stripped. + { name: "task.duration", value: 99, labels: { "task.kind": "tool", "task.id": "uuid-abc" } }, + // Free-form text in label value. + { name: "task.duration", value: 17, labels: { "task.kind": "io", "error.message": "EHOSTUNREACH at 10.0.0.42 retrying" } }, + // Out-of-enum value for a bounded label. + { name: "task.duration", value: 21, labels: { "task.kind": "evil" } }, +]; + +for (const candidate of candidates) { + try { + await safeExporter(candidate); + } catch (err) { + result.rejected.push({ + name: candidate.name, + labels: candidate.labels, + errorClass: err?.constructor?.name ?? "Unknown", + errorMessage: err?.message ?? 
null,
+    });
+  }
+}
+
+result.summary = {
+  emittedCount: result.emitted.length,
+  rejectedCount: result.rejected.length,
+  emittedKinds: result.emitted.map((p) => p.labels?.["task.kind"]).filter(Boolean),
+};
+
+process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n");
+
+// Invariants
+assert.ok(result.rejected.length >= 1, "at least one unbounded-label point must be rejected or stripped");
+assert.ok(
+  result.emitted.every((p) => Object.keys(p.labels ?? {}).every((k) => allowedLabels.includes(k))),
+  "every emitted metric must only carry allowed labels",
+);
diff --git a/benchmarks/articles/18-diagnostics-finding-codes.mjs b/benchmarks/articles/18-diagnostics-finding-codes.mjs
new file mode 100644
index 0000000..9c1ff06
--- /dev/null
+++ b/benchmarks/articles/18-diagnostics-finding-codes.mjs
@@ -0,0 +1,127 @@
+/**
+ * Bench 18 -- diagnoseSnapshot finding codes.
+ *
+ * @author Admilson B. F. Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Workload: hand-craft four ScopeSnapshot inputs that each trigger one of the
+ * stable finding codes the diagnostics subpath emits:
+ *
+ *   old_pending_task    -- a task that has been running well past staleTaskMs
+ *   pending_child_scope -- a child scope still active when the parent should
+ *                          be closing
+ *   scope_cancelling    -- a scope that has begun cancelling but is not yet
+ *                          closed
+ *   cleanup_timeout     -- the diagnosis surfaces a recent
+ *                          task:cleanup_timeout event in the bounded window
+ *
+ * The bench asserts each code can be produced and that the report's status
+ * flips to needs_attention when there are findings.
+ */ + +import assert from "node:assert/strict"; +import { diagnoseSnapshot } from "../../dist/diagnostics/index.js"; +import { jsonReplacer } from "./lib/baselines.mjs"; + +const NOW = 1_000_000_000; +const STALE_MS = 30_000; + +function makeTask(over) { + return { + id: "task-1", + name: "io", + kind: "io", + status: "running", + attempt: 1, + startedAt: NOW - 60_000, // very old by default + ...over, + }; +} + +function makeSnapshot(over) { + return { + id: "scope-1", + name: "root", + status: "running", + startedAt: NOW - 60_000, + pendingCount: 0, + completedCount: 0, + failedCount: 0, + cancelledCount: 0, + tasks: [], + scopes: [], + ...over, + }; +} + +const result = { bench: "18-diagnostics-finding-codes", scenarios: {} }; + +// 1. Healthy snapshot +{ + const report = diagnoseSnapshot(makeSnapshot({ pendingCount: 0 }), { now: NOW, staleTaskMs: STALE_MS }); + result.scenarios.healthy = { status: report.status, findingCodes: report.findings.map((f) => f.code) }; + assert.equal(report.status, "ok"); + assert.equal(report.findings.length, 0); +} + +// 2. Old pending task +{ + const snap = makeSnapshot({ + pendingCount: 1, + tasks: [makeTask({ status: "pending" })], + }); + const report = diagnoseSnapshot(snap, { now: NOW, staleTaskMs: STALE_MS }); + result.scenarios.old_pending_task = { status: report.status, findingCodes: report.findings.map((f) => f.code) }; + assert.equal(report.status, "needs_attention"); + assert.ok(report.findings.some((f) => f.code === "old_pending_task")); +} + +// 3. Cancelling scope +{ + const snap = makeSnapshot({ status: "cancelling" }); + const report = diagnoseSnapshot(snap, { now: NOW, staleTaskMs: STALE_MS }); + result.scenarios.scope_cancelling = { status: report.status, findingCodes: report.findings.map((f) => f.code) }; + assert.equal(report.status, "needs_attention"); + assert.ok(report.findings.some((f) => f.code === "scope_cancelling")); +} + +// 4. 
Pending child scope +{ + const child = makeSnapshot({ + id: "scope-child", name: "child", status: "running", pendingCount: 1, + tasks: [makeTask({ id: "task-c", status: "pending" })], + }); + const snap = makeSnapshot({ scopes: [child], pendingCount: 1 }); + const report = diagnoseSnapshot(snap, { now: NOW, staleTaskMs: STALE_MS }); + result.scenarios.pending_child_scope = { + status: report.status, + findingCodes: report.findings.map((f) => f.code), + }; + assert.equal(report.status, "needs_attention"); + assert.ok(report.findings.some((f) => f.code === "pending_child_scope")); + // The recursion should also find the child's old pending task. + assert.ok(report.findings.some((f) => f.code === "old_pending_task")); +} + +// 5. Cleanup timeout via the events window +{ + const snap = makeSnapshot(); + const events = [ + { + type: "task:cleanup_timeout", + taskId: "task-cleanup", + timeoutMs: 250, + durationMs: 252, + at: NOW - 1_000, + }, + ]; + const report = diagnoseSnapshot(snap, { now: NOW, staleTaskMs: STALE_MS, events }); + result.scenarios.cleanup_timeout = { + status: report.status, + findingCodes: report.findings.map((f) => f.code), + }; + assert.equal(report.status, "needs_attention"); + assert.ok(report.findings.some((f) => f.code === "cleanup_timeout")); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/19-agent-scope.mjs b/benchmarks/articles/19-agent-scope.mjs new file mode 100644 index 0000000..6ee0d2f --- /dev/null +++ b/benchmarks/articles/19-agent-scope.mjs @@ -0,0 +1,172 @@ +/** + * Bench 19 -- runAgent / AgentScope contract. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Five scenarios prove the agent loop's runtime contract: + * + * A. tool_events -- every agent.tool() call brackets the body with + * replayable started/succeeded events; agentId + * stays stable; seq is sequential; `at` is + * monotonically non-decreasing. + * B. 
tool_calls_budget -- AgentToolCalls is charged exactly once per + * call; reaching the cap rejects with + * BudgetExceededError tagged with the budget + * key. + * C. tokens_budget -- OpenAITokens is charged via { tokens: N }; + * final spent equals the sum of the calls. + * D. parent_cancel -- when the parent scope cancels mid-tool, the + * tool body's ctx.signal aborts and the + * outer settles as CancellationError with the + * parent's reason. + * E. replayable_log -- the events array on AgentRunResult is a + * complete, ordered, type-discriminated trace + * of the run. + */ + +import assert from "node:assert/strict"; +import { + BudgetExceededError, + CancellationError, + run, +} from "../../dist/index.js"; +import { + AgentToolCalls, + OpenAITokens, + runAgent, +} from "../../dist/ai/index.js"; +import { jsonReplacer } from "./lib/baselines.mjs"; + +const result = { bench: "19-agent-scope" }; + +// --- A -- tool events bracket execution ---------------------------------- +{ + const r = await runAgent(async (agent) => { + const v = await agent.tool("calc", 3, async (x) => x * x); + return v; + }); + const toolEvents = r.events.filter((e) => /^agent:tool_/.test(e.type)); + result.A_tool_events = { + finalResult: r.result, + eventCount: r.events.length, + eventTypesInOrder: r.events.map((e) => e.type), + toolName: toolEvents[0]?.tool ?? 
null, + seqs: r.events.map((e) => e.seq), + monotonicAt: r.events.every((e, i) => i === 0 || e.at >= r.events[i - 1].at), + sameAgentId: r.events.every((e) => e.agentId === r.events[0].agentId), + }; + assert.equal(r.result, 9); + assert.equal(toolEvents[0].type, "agent:tool_started"); + assert.equal(toolEvents[1].type, "agent:tool_succeeded"); + assert.equal(toolEvents[0].tool, "calc"); + assert.deepEqual(result.A_tool_events.seqs, [1, 2, 3, 4]); + assert.ok(result.A_tool_events.monotonicAt); + assert.ok(result.A_tool_events.sameAgentId); +} + +// --- B -- AgentToolCalls budget hard cap --------------------------------- +{ + let outerError = null; + let evidence = null; + try { + await run.context.with(AgentToolCalls, { spent: 0, limit: 1, unit: "tool_calls" }, async () => { + await runAgent(async (agent) => { + await agent.tool("first", 0, async () => "ok", { toolCalls: 1 }); + await agent.tool("second", 0, async () => "ok", { toolCalls: 1 }); // overflow + }); + }); + } catch (err) { + outerError = err; + evidence = { + class: err?.constructor?.name, + budgetKey: err?.budgetKey, + limit: err?.limit, + attempted: err?.attempted, + }; + } + result.B_tool_calls_budget = evidence; + assert.ok(outerError instanceof BudgetExceededError); + assert.equal(outerError.budgetKey, "AgentToolCalls"); +} + +// --- C -- OpenAITokens budget consumed via tool opts --------------------- +{ + let final = null; + await run.context.with(OpenAITokens, { spent: 0, limit: 1000, unit: "tokens" }, async () => { + await runAgent(async (agent) => { + await agent.tool("a", 1, async () => "ok", { tokens: 50 }); + await agent.tool("b", 2, async () => "ok", { tokens: 25 }); + }); + final = run.context.budget(OpenAITokens); + }); + result.C_tokens_budget = { final }; + assert.equal(final.spent, 75); +} + +// --- D -- parent cancel during tool propagates to tool body's signal ----- +{ + let outerError = null; + let toolSignalAborted = false; + const winner = await Promise.race([ + (async () => { 
+ try { + await runAgent(async (agent, ctx) => { + const p = agent.tool("slow", 0, async (_, c) => { + await new Promise((res, rej) => { + if (c.signal.aborted) return rej(c.signal.reason); + c.signal.addEventListener("abort", () => { + toolSignalAborted = true; + rej(c.signal.reason); + }, { once: true }); + }); + }); + setTimeout(() => ctx.scope.cancel({ kind: "manual", tag: "user-stop" }), 10); + await p; + }); + } catch (err) { outerError = err; } + return "done"; + })(), + new Promise((r) => setTimeout(() => r("TIMEOUT"), 800)), + ]); + result.D_parent_cancel = { + winner, + toolSignalAborted, + outerErrorClass: outerError?.constructor?.name ?? null, + cancelReasonKind: outerError instanceof CancellationError ? outerError.reason.kind : null, + cancelReasonTag: outerError instanceof CancellationError && outerError.reason.kind === "manual" + ? outerError.reason.tag : null, + }; + assert.equal(winner, "done", "must not hit the 800ms watchdog"); + assert.equal(toolSignalAborted, true, "tool body's signal must observe the parent cancel"); + assert.ok(outerError instanceof CancellationError, "outer must reject with CancellationError"); + assert.equal(result.D_parent_cancel.cancelReasonKind, "manual"); +} + +// --- E -- replayable event log on a 3-tool run --------------------------- +{ + const r = await runAgent(async (agent) => { + await agent.tool("plan", "goal", async () => ["fetch", "summarize"]); + await agent.tool("fetch", "https", async () => ""); + await agent.tool("summarize","", async () => "tl;dr"); + return "done"; + }); + result.E_replayable_log = { + eventCount: r.events.length, + eventTypes: r.events.map((e) => e.type), + seqs: r.events.map((e) => e.seq), + monotonicAt: r.events.every((e, i) => i === 0 || e.at >= r.events[i - 1].at), + sameAgentId: r.events.every((e) => e.agentId === r.events[0].agentId), + toolStartedNames: r.events.filter((e) => e.type === "agent:tool_started").map((e) => e.tool), + toolSucceededNames:r.events.filter((e) => e.type 
=== "agent:tool_succeeded").map((e) => e.tool), + }; + assert.equal(r.result, "done"); + // 1 agent:started + 3 x (tool_started + tool_succeeded) + 1 agent:completed = 8 + assert.equal(r.events.length, 8); + assert.deepEqual(result.E_replayable_log.toolStartedNames, ["plan", "fetch", "summarize"]); + assert.deepEqual(result.E_replayable_log.toolSucceededNames, ["plan", "fetch", "summarize"]); + assert.ok(result.E_replayable_log.monotonicAt); + assert.deepEqual(result.E_replayable_log.seqs, [1, 2, 3, 4, 5, 6, 7, 8]); +} + +process.stdout.write(JSON.stringify(result, jsonReplacer, 2) + "\n"); diff --git a/benchmarks/articles/README.md b/benchmarks/articles/README.md new file mode 100644 index 0000000..a4af7a4 --- /dev/null +++ b/benchmarks/articles/README.md @@ -0,0 +1,101 @@ + + +# Article-Series Benchmarks + +Self-contained, side-by-side runtime benches that back the WorkIt runtime claims made in the [WorkIt article series](../../articles/). Zero external dependencies -- the promise-helper baselines are inlined as minimal implementations of the unsignaled behavior patterns being compared in [`lib/baselines.mjs`](lib/baselines.mjs). + +## Why this folder exists + +The articles compare WorkIt to native promises and small local baselines that model common promise-helper patterns. Runtime claims that are not reproducible from the published package stay out of the series. + +Each bench in this folder runs both implementations against the same workload, captures timestamps for every observable event (`startedAt`, `settledAt`, `signalAbortedAt`, `deferRanAt`), and emits one JSON record. If a WorkIt invariant regresses, the bench's `assert.*` calls fail the run. + +## What's measured + +| File | Article section | What it verifies | +|---|---|---| +| [`01-run-all-vs-promise-all.mjs`](01-run-all-vs-promise-all.mjs) | `run.all` | Native `Promise.all` lets losers run for the full latency past rejection. 
`run.all` cancels them at the AbortSignal boundary; defer cleanups run before the outer promise rejects. | +| [`02-run-race-vs-promise-race.mjs`](02-run-race-vs-promise-race.mjs) | `run.race` | Native `Promise.race` leaves losing fetches running. `run.race` aborts losers and tags them with `CancelReason.kind === "race_lost"`. | +| [`03-run-any-vs-promise-any.mjs`](03-run-any-vs-promise-any.mjs) | `run.any` | Native `Promise.any` keeps slower siblings running after the first success. `run.any` cancels remaining siblings, defer runs. | +| [`04-pool-vs-semaphore.mjs`](04-pool-vs-semaphore.mjs) | `run.pool` | A `p-limit`-style semaphore lets queued items keep running after one throws. `run.pool` cancels queued and in-flight on first failure. | +| [`05-retry-on-cancel.mjs`](05-retry-on-cancel.mjs) | `run.retry` | A signal-unaware retry loop can run extra attempts after cancel was requested. `run.retry`'s sleep is signal-aware and the task settles as `cancelled`, not `failed`. | +| [`06-hedge-tied-requests.mjs`](06-hedge-tied-requests.mjs) | `run.hedge` | `run.hedge` fires extra attempts only after the configured `after` interval, bounds them by `max`, and cancels every non-winning attempt. The fast scenario fires no hedge at all. | +| [`07-worker-hard-kill.mjs`](07-worker-hard-kill.mjs) | `offload` (article 03) | A non-cooperative CPU spin loop on the main thread cannot be aborted; the late-marker file is written. Inside `offload({ timeout })`, the worker thread is terminated by the host; the marker file does **not** exist on disk. The bench `stat()`s the marker file. | +| [`08-uncancellable-shield.mjs`](08-uncancellable-shield.mjs) | `run.uncancellable` (article 03) | Three scenarios: (A) parent cancel during body -- body completes, then original cancel rethrows; (B) shield's own timeout -- body sees `TimeoutError` on its local signal; (C) nested shields -- outer cancel reason preserved through both layers. 
| +| [`09-stream-1b-lazy.mjs`](09-stream-1b-lazy.mjs) | `work().stream()` (article 04) | An eager prefetch buffer pulls 281 items from a 1B source for 25 consumed on the captured run. `work().inParallel(16).map().stream()` produces <= 41 items (TAKE + CONCURRENCY) and respects the cap. | +| [`10-stream-slow-consumer.mjs`](10-stream-slow-consumer.mjs) | backpressure (article 04) | A slow consumer (~5 ms per item) holds the producer to <= `CONCURRENCY + 1` items of overshoot. Tracked: producer pacing, max active, post-break in-flight count. | +| [`11-channel-contract.mjs`](11-channel-contract.mjs) | `createChannel` (article 04) | Five contract scenarios: capacity backpressure (third send blocks on a 2-cap channel), close drains buffered values, close rejects pending sends with `ChannelClosedError`, signal cancels a pending receive, capacity validation rejects 0/-1/0.5/NaN/Infinity. | +| [`12-bracket-vs-try-finally.mjs`](12-bracket-vs-try-finally.mjs) | `run.bracket` (article 05) | Five scenarios: success, `use` throws, `acquire` throws (release does not run), parent cancel during `use`, hanging release. Native try/finally with a hanging cleanup never settles; `run.bracket` with `{ timeout }` settles within the bound and emits `task:cleanup_timeout`. | +| [`13-budget-atomicity-and-cancel.mjs`](13-budget-atomicity-and-cancel.mjs) | budgets (article 05) | 100 sibling charges of 0.01 land at exactly 1.00 (no double-charge). A budget set at scope depth 0 cancels with `kind: "budget"` even when the overrun happens at depth 5. The caller's input object is never mutated by the engine. | +| [`14-context-overlay-perf.mjs`](14-context-overlay-perf.mjs) | context overlay (article 05) | The inline Map-clone baseline takes ~32 ms for 100 `.with()` over 5,000 keys on representative runs. The WorkIt overlay is well under the <10 ms gate and at least 10x faster than the inline baseline. Same lookup result. 
| +| [`15-core-zero-network.mjs`](15-core-zero-network.mjs) | zero-network gate (article 06) | Static walk over `dist/` (excluding the explicit `observability`, `otel`, `worker` subpaths) finds zero matches for `node:http`, `node:https`, raw `http`/`https` imports, or `fetch(...)`. Same property as the production gate, applied to the published artifact. | +| [`16-sampling-and-aggregation.mjs`](16-sampling-and-aggregation.mjs) | sampling (article 06) | 100 root scopes x 5 child tasks. `mode: "all"` exports 1,300 events; `mode: "errors_and_slow"` exports 36 -- a ~36x reduction at 5% slow + 2% errored. Asserts >= 5x. | +| [`17-cardinality-safe-metrics.mjs`](17-cardinality-safe-metrics.mjs) | cardinality (article 06) | The cardinality-safe metric exporter rejects metric points whose label keys are not in the `allowedLabels` allow-list. `task.id` UUIDs and free-form `error.message` are caught at runtime. | +| [`18-diagnostics-finding-codes.mjs`](18-diagnostics-finding-codes.mjs) | diagnostics (article 06) | Five scenarios: healthy snapshot stays `ok`; `old_pending_task`, `scope_cancelling`, `pending_child_scope`, and `cleanup_timeout` (via the events window) each flip the report to `needs_attention`. | +| [`19-agent-scope.mjs`](19-agent-scope.mjs) | `runAgent` / `AgentScope` (article 07) | Five scenarios: tool events bracket execution with stable agentId and monotonic seq; `AgentToolCalls` budget overflow rejects with `BudgetExceededError` keyed `"AgentToolCalls"`; `OpenAITokens` charges via `{ tokens: N }` land at exact spent; parent scope cancel propagates into the tool body's `ctx.signal` with `CancelReason { kind: "manual", tag }`; replayable event log on a 3-tool run has 8 ordered events. 
| + +## Running + +From the repo root: + +```sh +npm run build # produces dist/ +node benchmarks/articles/run-all.mjs # full suite, JSON to stdout +node benchmarks/articles/01-run-all-vs-promise-all.mjs # one bench +``` + +Or from this folder: + +```sh +npm run bench # full suite +npm run bench:run-all # bench 01 +npm run bench:run-race # bench 02 +npm run bench:run-any # bench 03 +npm run bench:pool # bench 04 +npm run bench:retry # bench 05 +npm run bench:hedge # bench 06 +npm run bench:hard-kill # bench 07 +npm run bench:uncancellable # bench 08 +npm run bench:stream-1b # bench 09 +npm run bench:stream-slow # bench 10 +npm run bench:channel # bench 11 +npm run bench:bracket # bench 12 +npm run bench:budget # bench 13 +npm run bench:context # bench 14 +npm run bench:no-network # bench 15 +npm run bench:sampling # bench 16 +npm run bench:cardinality # bench 17 +npm run bench:diagnostics # bench 18 +npm run bench:agent # bench 19 +``` + +The bench folder has its own `package.json` and **does not appear in the main package's dependency graph**. It is run-only -- clone the repo, run. + +## Output shape + +Each individual bench prints one JSON object to stdout with the structure: + +```jsonc +{ + "bench": "01-run-all-vs-promise-all", + "native": { /* timings + flags for the native baseline */ }, + "workit": { /* timings + flags for the WorkIt impl */ } +} +``` + +The runner (`run-all.mjs`) wraps every individual report in a `benches[]` array along with `wallMs` and `exitCode`, so a regression is discoverable both by `assert` failure inside the bench and by the runner's exit code. + +## How the baselines stay honest + +`lib/baselines.mjs` contains: + +- `pLimitLike(N)` -- local semaphore baseline. It models a minimal queue without automatic sibling-failure cancellation. Queued items keep running unless the caller clears the queue. +- `signalUnawareRetryLike(fn, opts)` -- local retry baseline with `setTimeout`-based delay. 
It intentionally does not observe an abort signal. +- `promiseTimeoutLike(promise, ms)` -- local timeout baseline that wraps a Promise. It intentionally does not abort the underlying work. +- `naiveSleep(ms)` -- signal-unaware sleep used inside native baselines on purpose. +- `sleep(ms, signal)` -- signal-aware sleep used by the WorkIt-side bodies. + +These mirror the specific unsignaled behavior patterns under comparison. Some current `p-*` packages expose cancellation hooks; the article's claim is about ownership and composition, not that every helper is incapable of cancellation in every configuration. If an upstream library closes one of these gaps for the exact scenario being tested, the file is updated and so is the article. diff --git a/benchmarks/articles/lib/baselines.mjs b/benchmarks/articles/lib/baselines.mjs new file mode 100644 index 0000000..8c7e585 --- /dev/null +++ b/benchmarks/articles/lib/baselines.mjs @@ -0,0 +1,131 @@ +/** + * Minimal behavioral baselines for unsignaled promise-helper patterns. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + * + * Each baseline below avoids a runtime dependency. The point is to compare the + * unsignaled semantics the article calls out, not to claim these are complete + * clones of the corresponding npm packages: + * + * p-limit-style semaphore -- no automatic sibling-failure propagation. + * retry loop -- retry counter with delay. Sleep is not signal-aware. + * timeout wrapper -- wraps a Promise with a timeout. Returns a Promise, + * not a composable TaskFn. + * + * These behavioral baselines are not bug-for-bug clones; they are the smallest + * implementation of the specific behavior under comparison. If an upstream + * library feature closes one of these gaps for the exact scenario being tested, + * update this file and the article together. + */ + +/** + * pLimitLike(N) -- semaphore. Returns a wrapper `(fn) => Promise` that + * runs at most N concurrently. Has no cancellation. 
If a wrapped fn rejects, + * queued ones still run. + */ +export function pLimitLike(N) { + let active = 0; + const queue = []; + const drain = () => { + while (active < N && queue.length > 0) { + const next = queue.shift(); + active++; + Promise.resolve() + .then(next.run) + .then( + (value) => { active--; next.resolve(value); drain(); }, + (err) => { active--; next.reject(err); drain(); }, + ); + } + }; + return (fn) => + new Promise((resolve, reject) => { + queue.push({ run: fn, resolve, reject }); + drain(); + }); +} + +/** + * signalUnawareRetryLike(fn, { retries, minDelay }) -- retries on rejection up to `retries` + * times. Sleep between attempts uses raw setTimeout -- NOT signal-aware. If the + * caller wants to abort, the in-flight sleep does not see it. + */ +export async function signalUnawareRetryLike(fn, { retries = 3, minDelay = 100 } = {}) { + let lastErr; + for (let attempt = 1; attempt <= retries; attempt++) { + try { + return await fn(attempt); + } catch (err) { + lastErr = err; + if (attempt === retries) break; + await new Promise((r) => setTimeout(r, minDelay)); + } + } + throw lastErr; +} + +/** + * promiseTimeoutLike(promise, ms) -- rejects with TimeoutError after ms. Does NOT + * abort the underlying work. The promise keeps running. There is no signal + * to thread, so any I/O the body started continues. + */ +export class PTimeoutError extends Error { + constructor(ms) { + super(`Promise timed out after ${ms} milliseconds`); + this.name = "TimeoutError"; + } +} +export function promiseTimeoutLike(promise, ms) { + return new Promise((resolve, reject) => { + const t = setTimeout(() => reject(new PTimeoutError(ms)), ms); + promise.then( + (v) => { clearTimeout(t); resolve(v); }, + (e) => { clearTimeout(t); reject(e); }, + ); + }); +} + +/** Stamped now() relative to a t0 captured at module-import time. 
*/
+export function makeClock() {
+  const t0 = Date.now();
+  return {
+    t: () => Date.now() - t0,
+    fmt: () => `t=${(Date.now() - t0).toString().padStart(5)}ms`,
+  };
+}
+
+/** signal-aware sleep used by the WorkIt-side bench bodies. */
+export function sleep(ms, signal) {
+  return new Promise((resolve, reject) => {
+    if (signal?.aborted) return reject(signal.reason); // already-aborted fast path
+    const timer = setTimeout(resolve, ms);
+    signal?.addEventListener("abort", () => {
+      clearTimeout(timer); reject(signal.reason);
+    }, { once: true });
+  });
+}
+
+/** signal-UNAWARE sleep -- used inside native-baseline bodies on purpose. */
+export function naiveSleep(ms) {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+/** Track when a body actually settled so we can prove who kept running. */
+export function makeProbe(name) {
+  return {
+    name,
+    startedAt: -1,
+    settledAt: -1,
+    settledAs: "pending", // "fulfilled" | "rejected" | "cancelled" | "pending"
+    signalAbortedAt: -1,
+    deferRanAt: -1,
+  };
+}
+
+export function jsonReplacer(_key, value) {
+  if (value instanceof Error) {
+    return { name: value.name, message: value.message };
+  }
+  return value;
+}
diff --git a/benchmarks/articles/lib/spinner.mjs b/benchmarks/articles/lib/spinner.mjs
new file mode 100644
index 0000000..a12deec
--- /dev/null
+++ b/benchmarks/articles/lib/spinner.mjs
@@ -0,0 +1,21 @@
+/**
+ * Non-cooperative CPU spinner used by bench 07.
+ *
+ * @author Admilson B. F. Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * The spin loop is intentionally signal-unaware. If the worker thread is
+ * terminated before `durationMs` elapses, the late-marker file is never
+ * written. The bench verifies the kill by checking the filesystem.
+ */ +import { writeFileSync } from "node:fs"; + +export function spin(opts) { + const { durationMs, markerPath } = opts; + const start = Date.now(); + while (Date.now() - start < durationMs) { + Math.sqrt(Math.random() * 1e6); + } + writeFileSync(markerPath, "late-marker-written-by-worker"); + return { completed: true, elapsedMs: Date.now() - start }; +} diff --git a/benchmarks/articles/package.json b/benchmarks/articles/package.json new file mode 100644 index 0000000..070c939 --- /dev/null +++ b/benchmarks/articles/package.json @@ -0,0 +1,31 @@ +{ + "name": "@workit/articles-bench", + "version": "0.0.0", + "private": true, + "type": "module", + "description": "Self-contained side-by-side benches for the WorkIt article series. Zero external dependencies; promise-helper baselines are inlined.", + "author": "Admilson B. F. Cossa", + "license": "Apache-2.0", + "scripts": { + "bench": "node run-all.mjs", + "bench:run-all": "node 01-run-all-vs-promise-all.mjs", + "bench:run-race": "node 02-run-race-vs-promise-race.mjs", + "bench:run-any": "node 03-run-any-vs-promise-any.mjs", + "bench:pool": "node 04-pool-vs-semaphore.mjs", + "bench:retry": "node 05-retry-on-cancel.mjs", + "bench:hedge": "node 06-hedge-tied-requests.mjs", + "bench:hard-kill": "node 07-worker-hard-kill.mjs", + "bench:uncancellable": "node 08-uncancellable-shield.mjs", + "bench:stream-1b": "node 09-stream-1b-lazy.mjs", + "bench:stream-slow": "node 10-stream-slow-consumer.mjs", + "bench:channel": "node 11-channel-contract.mjs", + "bench:bracket": "node 12-bracket-vs-try-finally.mjs", + "bench:budget": "node 13-budget-atomicity-and-cancel.mjs", + "bench:context": "node 14-context-overlay-perf.mjs", + "bench:no-network": "node 15-core-zero-network.mjs", + "bench:sampling": "node 16-sampling-and-aggregation.mjs", + "bench:cardinality": "node 17-cardinality-safe-metrics.mjs", + "bench:diagnostics": "node 18-diagnostics-finding-codes.mjs", + "bench:agent": "node 19-agent-scope.mjs" + } +} diff --git 
a/benchmarks/articles/run-all.mjs b/benchmarks/articles/run-all.mjs new file mode 100644 index 0000000..f73a7d1 --- /dev/null +++ b/benchmarks/articles/run-all.mjs @@ -0,0 +1,51 @@ +/** + * benchmarks/articles/run-all.mjs -- runs every bench in the folder and emits + * one consolidated JSON report to stdout. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { spawn } from "node:child_process"; +import { readdir } from "node:fs/promises"; +import path from "node:path"; +import url from "node:url"; + +const here = path.dirname(url.fileURLToPath(import.meta.url)); +const files = (await readdir(here)) + .filter((f) => /^\d{2}-.*\.mjs$/.test(f)) + .sort(); + +const summary = { author: "Admilson B. F. Cossa", spdxLicense: "Apache-2.0", benches: [] }; + +for (const file of files) { + const t0 = Date.now(); + const out = await new Promise((resolve, reject) => { + const child = spawn(process.execPath, [path.join(here, file)], { stdio: ["ignore", "pipe", "pipe"] }); + let stdout = "", stderr = ""; + child.stdout.on("data", (b) => { stdout += b.toString(); }); + child.stderr.on("data", (b) => { stderr += b.toString(); }); + child.on("error", reject); + child.on("exit", (code) => resolve({ code, stdout, stderr })); + }); + const wallMs = Date.now() - t0; + let parsed = null; + try { parsed = JSON.parse(out.stdout); } catch { /* leave null */ } + summary.benches.push({ + file, + exitCode: out.code, + wallMs, + stderr: out.stderr.trim() || null, + report: parsed, + }); + if (out.code !== 0) { + process.stderr.write(`FAIL ${file} (exit ${out.code})\n${out.stderr}\n`); + } +} + +const failures = summary.benches.filter((b) => b.exitCode !== 0).length; +summary.passed = summary.benches.length - failures; +summary.failed = failures; + +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(failures > 0 ? 
1 : 0); diff --git a/benchmarks/results/articles.latest.json b/benchmarks/results/articles.latest.json new file mode 100644 index 0000000..c0c0b80 --- /dev/null +++ b/benchmarks/results/articles.latest.json @@ -0,0 +1,1037 @@ +{ + "author": "Admilson B. F. Cossa", + "spdxLicense": "Apache-2.0", + "benches": [ + { + "file": "01-run-all-vs-promise-all.mjs", + "exitCode": 0, + "wallMs": 295, + "stderr": null, + "report": { + "bench": "01-run-all-vs-promise-all", + "native": { + "outerRejectedAt": 34, + "A": { + "name": "A", + "startedAt": 0, + "settledAt": 64, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "B": { + "name": "B", + "startedAt": 0, + "settledAt": 33, + "settledAs": "rejected", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "C": { + "name": "C", + "startedAt": 0, + "settledAt": 110, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "losersStillRanForMs": { + "A": 30, + "C": 76 + }, + "losersWereCancelled": false + }, + "workit": { + "outerRejectedAt": 33, + "A": { + "name": "A", + "startedAt": 1, + "settledAt": 32, + "settledAs": "cancelled", + "signalAbortedAt": 32, + "deferRanAt": 33 + }, + "B": { + "name": "B", + "startedAt": 1, + "settledAt": 32, + "settledAs": "rejected", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "C": { + "name": "C", + "startedAt": 1, + "settledAt": 32, + "settledAs": "cancelled", + "signalAbortedAt": 32, + "deferRanAt": 33 + }, + "losersWereCancelled": true, + "cancellationLatencyFromBFailure": { + "A": 0, + "C": 0 + }, + "deferRanBeforeOuterReject": { + "A": true, + "C": true + } + } + } + }, + { + "file": "02-run-race-vs-promise-race.mjs", + "exitCode": 0, + "wallMs": 220, + "stderr": null, + "report": { + "bench": "02-run-race-vs-promise-race", + "native": { + "winner": "anthropic", + "winnerSettledAt": 13, + "probes": { + "openai": { + "name": "openai", + "startedAt": 0, + "settledAt": 61, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + 
"deferRanAt": -1 + }, + "anthropic": { + "name": "anthropic", + "startedAt": 0, + "settledAt": 13, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "gemini": { + "name": "gemini", + "startedAt": 0, + "settledAt": 92, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + } + }, + "losersStillRanForMs": { + "openai": 48, + "gemini": 79 + }, + "losersWereCancelled": false + }, + "workit": { + "winner": "anthropic", + "winnerSettledAt": 17, + "probes": { + "openai": { + "name": "openai", + "startedAt": 1, + "settledAt": 17, + "settledAs": "cancelled", + "signalAbortedAt": 17, + "deferRanAt": 17, + "cancelReasonKind": "race_lost" + }, + "anthropic": { + "name": "anthropic", + "startedAt": 2, + "settledAt": 16, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": 17 + }, + "gemini": { + "name": "gemini", + "startedAt": 2, + "settledAt": 17, + "settledAs": "cancelled", + "signalAbortedAt": 17, + "deferRanAt": 17, + "cancelReasonKind": "race_lost" + } + }, + "losersWereCancelled": true, + "cancelReasonKindForLosers": { + "openai": "race_lost", + "gemini": "race_lost" + } + } + } + }, + { + "file": "03-run-any-vs-promise-any.mjs", + "exitCode": 0, + "wallMs": 342, + "stderr": null, + "report": { + "bench": "03-run-any-vs-promise-any", + "native": { + "winner": "B", + "winnerSettledAt": 57, + "A": { + "name": "A", + "startedAt": 0, + "settledAt": 41, + "settledAs": "rejected", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "B": { + "name": "B", + "startedAt": 0, + "settledAt": 57, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "C": { + "name": "C", + "startedAt": 0, + "settledAt": 103, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + "cStillRanForMs": 46, + "losersWereCancelled": false + }, + "workit": { + "winner": "B", + "winnerSettledAt": 63, + "A": { + "name": "A", + "startedAt": 1, + "settledAt": 32, + "settledAs": "rejected", + 
"signalAbortedAt": 63, + "deferRanAt": 32 + }, + "B": { + "name": "B", + "startedAt": 1, + "settledAt": 63, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": 63 + }, + "C": { + "name": "C", + "startedAt": 1, + "settledAt": 63, + "settledAs": "cancelled", + "signalAbortedAt": 63, + "deferRanAt": 63, + "cancelReasonKind": "race_lost" + }, + "cWasCancelled": true, + "cancelLatencyForC": 0, + "deferRanForC": true + } + } + }, + { + "file": "04-pool-vs-semaphore.mjs", + "exitCode": 0, + "wallMs": 417, + "stderr": null, + "report": { + "bench": "04-pool-vs-semaphore", + "native": { + "outerRejectedAt": 24, + "started": 10, + "fulfilledAfterRejection": 9, + "cancelled": 0, + "longestPostRejectionRunMs": 293, + "probes": [ + { + "name": "item-0", + "startedAt": 0, + "settledAt": 101, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-1", + "startedAt": 0, + "settledAt": 101, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-2", + "startedAt": 0, + "settledAt": 101, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-3", + "startedAt": 0, + "settledAt": 24, + "settledAs": "rejected", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-4", + "startedAt": 24, + "settledAt": 132, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-5", + "startedAt": 101, + "settledAt": 209, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-6", + "startedAt": 101, + "settledAt": 209, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-7", + "startedAt": 101, + "settledAt": 209, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-8", + "startedAt": 132, + "settledAt": 240, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + 
}, + { + "name": "item-9", + "startedAt": 209, + "settledAt": 317, + "settledAs": "fulfilled", + "signalAbortedAt": -1, + "deferRanAt": -1 + } + ] + }, + "workit": { + "outerRejectedAt": 32, + "started": 4, + "fulfilledAfterRejection": 0, + "cancelled": 3, + "notStarted": 6, + "cancelReasonKindsForCancelled": [ + "sibling_failed" + ], + "probes": [ + { + "name": "item-0", + "startedAt": 1, + "settledAt": 32, + "settledAs": "cancelled", + "signalAbortedAt": 32, + "deferRanAt": 32, + "cancelReasonKind": "sibling_failed" + }, + { + "name": "item-1", + "startedAt": 1, + "settledAt": 32, + "settledAs": "cancelled", + "signalAbortedAt": 32, + "deferRanAt": 32, + "cancelReasonKind": "sibling_failed" + }, + { + "name": "item-2", + "startedAt": 1, + "settledAt": 32, + "settledAs": "cancelled", + "signalAbortedAt": 32, + "deferRanAt": 32, + "cancelReasonKind": "sibling_failed" + }, + { + "name": "item-3", + "startedAt": 1, + "settledAt": 31, + "settledAs": "rejected", + "signalAbortedAt": 32, + "deferRanAt": 32 + }, + { + "name": "item-4", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-5", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-6", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-7", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-8", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + }, + { + "name": "item-9", + "startedAt": -1, + "settledAt": -1, + "settledAs": "pending", + "signalAbortedAt": -1, + "deferRanAt": -1 + } + ] + } + } + }, + { + "file": "05-retry-on-cancel.mjs", + "exitCode": 0, + "wallMs": 824, + "stderr": null, + "report": { + "bench": "05-retry-on-cancel", 
+ "native": { + "cancelRequestedAt": 65, + "outerSettledAt": 687, + "cancelLatencyMs": 622, + "attemptsAfterCancel": 7, + "settledAs": "rejected", + "signalAwareSleep": false + }, + "workit": { + "cancelRequestedAt": 63, + "outerSettledAt": 64, + "cancelLatencyMs": 1, + "attemptsAfterCancel": 0, + "settledAs": "cancelled", + "cancelReasonKind": "manual", + "signalAwareSleep": true + } + } + }, + { + "file": "06-hedge-tied-requests.mjs", + "exitCode": 0, + "wallMs": 310, + "stderr": null, + "report": { + "bench": "06-hedge-tied-requests", + "slow": { + "scenario": "slow", + "bodyMs": 200, + "hedgeOpts": { + "after": "50ms", + "max": 3 + }, + "winner": 1, + "attemptsFired": 3, + "fired": [ + { + "id": 1, + "t": 1 + }, + { + "id": 2, + "t": 61 + }, + { + "id": 3, + "t": 108 + } + ], + "settled": [ + { + "id": 1, + "kind": "fulfilled", + "t": 216 + }, + { + "id": 2, + "kind": "cancelled", + "t": 216, + "reason": "race_lost" + }, + { + "id": 3, + "kind": "cancelled", + "t": 216, + "reason": "race_lost" + } + ], + "losersCancelled": 2 + }, + "fast": { + "scenario": "fast", + "bodyMs": 30, + "hedgeOpts": { + "after": "50ms", + "max": 3 + }, + "winner": 1, + "attemptsFired": 1, + "fired": [ + { + "id": 1, + "t": 1 + } + ], + "settled": [ + { + "id": 1, + "kind": "fulfilled", + "t": 31 + } + ], + "losersCancelled": 0 + } + } + }, + { + "file": "07-worker-hard-kill.mjs", + "exitCode": 0, + "wallMs": 6097, + "stderr": null, + "report": { + "bench": "07-worker-hard-kill", + "native": { + "abortRequestedAt": -1, + "abortVisibleAt": -1, + "completedAt": 5002, + "bodyCompleted": true, + "elapsedMs": 5000, + "markerExists": true + }, + "workit": { + "timeoutMs": 200, + "rejectedAt": 211, + "rejectionClass": "TimeoutError", + "markerExistsAfterRejection": false, + "markerExistsAfterGrace": false + } + } + }, + { + "file": "08-uncancellable-shield.mjs", + "exitCode": 0, + "wallMs": 390, + "stderr": null, + "report": { + "bench": "08-uncancellable-shield", + 
"A_parent_cancel_during_body": { + "parentCancelRequestedAt": 44, + "bodyStartedAt": 2, + "bodyCompletedAt": 122, + "bodyObservedAbort": false, + "outerSettledAt": 123, + "outerSettledAs": "cancelled", + "outerCancelReasonKind": "manual", + "bodyOutlivedCancelByMs": 78 + }, + "B_shield_timeout": { + "bodyStartedAt": 0, + "bodyObservedAbort": true, + "bodyAbortReasonClass": "TimeoutError", + "outerSettledAt": 108, + "outerSettledClass": "TimeoutError" + }, + "C_nested_shields": { + "parentCancelRequestedAt": 31, + "innerCompletedAt": 93, + "outerInnerCompletedAt": 93, + "outerSettledAt": 93, + "outerCancelReasonKind": "manual" + } + } + }, + { + "file": "09-stream-1b-lazy.mjs", + "exitCode": 0, + "wallMs": 70, + "stderr": null, + "report": { + "bench": "09-stream-1b-lazy", + "naive": { + "settledAt": 3, + "consumed": 25, + "produced": 281, + "prefetch": 256, + "maxActive": 1, + "activeAfter": 0 + }, + "workit": { + "settledAt": 3, + "consumed": 25, + "produced": 40, + "concurrency": 16, + "maxActive": 1, + "activeAfter": 0, + "producedBound": 41 + } + } + }, + { + "file": "10-stream-slow-consumer.mjs", + "exitCode": 0, + "wallMs": 3195, + "stderr": null, + "report": { + "bench": "10-stream-slow-consumer", + "workit": { + "sourceSize": 5000, + "take": 200, + "concurrency": 16, + "consumeDelayMs": 5, + "consumed": 200, + "produced": 215, + "producedAtFirstConsume": 16, + "producedAtLastConsume": 215, + "maxActive": 1, + "activeAfterBreak": 0, + "elapsedMs": 3126, + "producerOvershoot": 15, + "producerOvershootBound": 17 + } + } + }, + { + "file": "11-channel-contract.mjs", + "exitCode": 0, + "wallMs": 78, + "stderr": null, + "report": { + "bench": "11-channel-contract", + "A_capacity_backpressure": { + "capacity": 2, + "sizeAfterTwoSends": 2, + "thirdSendStartedAt": 0, + "thirdSendCompletedAt": 0, + "thirdSettledBeforeReceive": false, + "firstReceived": { + "done": false, + "value": "a" + }, + "thirdSendUnblockedWithinMs": 0 + }, + "B_close_drains": { + "collected": [ 
+ 1, + 2, + 3 + ], + "iterationEndedCleanly": true + }, + "C_close_rejects_pending": { + "rejectedClass": "ChannelClosedError", + "rejectionReason": { + "tag": "shutdown" + } + }, + "D_signal_cancels_receive": { + "rejectedClass": "Error" + }, + "E_capacity_validation": { + "rejected": [ + { + "bad": 0, + "error": "RangeError" + }, + { + "bad": -1, + "error": "RangeError" + }, + { + "bad": 0.5, + "error": "RangeError" + }, + { + "bad": null, + "error": "RangeError" + }, + { + "bad": null, + "error": "RangeError" + } + ] + } + } + }, + { + "file": "12-bracket-vs-try-finally.mjs", + "exitCode": 0, + "wallMs": 531, + "stderr": null, + "report": { + "bench": "12-bracket-vs-try-finally", + "A_success": { + "order": [ + "acquire", + "use", + "release:RES-A" + ], + "out": "RES-A:used", + "releaseCount": 1 + }, + "B_use_throws": { + "order": [ + "acquire", + "use", + "release:RES-B" + ], + "caughtMessage": "use-failed", + "releaseRanWithResource": true + }, + "C_acquire_throws": { + "order": [ + "acquire" + ], + "caughtMessage": "acquire-failed", + "releaseRan": false + }, + "D_parent_cancel_during_use": { + "order": [ + "acquire", + "use", + "release:RES-D" + ], + "releasedAt": 40, + "outerSettledClass": "CancellationError", + "outerCancelReasonKind": "manual" + }, + "E_hanging_release": { + "native": { + "outerSettledAt": 265, + "outcome": "still_pending_after_250ms", + "releaseCompleted": false + }, + "workit": { + "cleanupTimeoutMs": 150, + "bracketSettledAt": 156, + "bracketSettledClass": "fulfilled", + "cleanupEventsObserved": [ + "task:cleanup_timeout" + ] + } + } + } + }, + { + "file": "13-budget-atomicity-and-cancel.mjs", + "exitCode": 0, + "wallMs": 69, + "stderr": null, + "report": { + "bench": "13-budget-atomicity-and-cancel", + "A_atomic_concurrent_charges": { + "siblings": 100, + "perCharge": 0.01, + "finalSpent": 1, + "expectedSpent": 1, + "callerObjectSpentAfter": 0 + }, + "B_owning_scope_cancel": { + "chargeAttemptedAtDepth": 5, + "outerSettledAt": 2, + 
"outerCancelKind": "budget" + }, + "C_caller_immutability": { + "callerSpentAfter": 0, + "liveSpentAfter": 0.5, + "callerLimitAfter": 1 + } + } + }, + { + "file": "14-context-overlay-perf.mjs", + "exitCode": 0, + "wallMs": 676, + "stderr": null, + "report": { + "bench": "14-context-overlay-perf", + "naive": { + "keys": 5000, + "withCalls": 100, + "elapsedMs": 31.253, + "perCallMs": 0.3125, + "deepestLookup": "override-99" + }, + "workit": { + "keys": 5000, + "withCalls": 100, + "elapsedMs": 0.011, + "perCallMs": 0.0001, + "deepestLookup": "override-99", + "speedupVsNaive": 2741 + } + } + }, + { + "file": "15-core-zero-network.mjs", + "exitCode": 0, + "wallMs": 57, + "stderr": null, + "report": { + "bench": "15-core-zero-network", + "filesScanned": 14, + "hits": [], + "excluded": [ + "observability", + "otel", + "worker" + ], + "passed": true + } + }, + { + "file": "16-sampling-and-aggregation.mjs", + "exitCode": 0, + "wallMs": 199, + "stderr": null, + "report": { + "bench": "16-sampling-and-aggregation", + "unsampled_per_task": { + "taskEvents": 1300 + }, + "errors_and_slow_per_task": { + "taskEvents": 36 + }, + "reduction_factor": 36.11, + "workload": { + "rootScopes": 100, + "tasksPerRoot": 5, + "slowRate": 0.05, + "errorRate": 0.02, + "slowMs": 60 + } + } + }, + { + "file": "17-cardinality-safe-metrics.mjs", + "exitCode": 0, + "wallMs": 50, + "stderr": null, + "report": { + "bench": "17-cardinality-safe-metrics", + "emitted": [ + { + "name": "task.duration", + "value": 12, + "labels": { + "task.kind": "io", + "outcome": "succeeded" + } + }, + { + "name": "task.duration", + "value": 35, + "labels": { + "task.kind": "llm", + "outcome": "failed" + } + }, + { + "name": "task.duration", + "value": 21, + "labels": { + "task.kind": "evil" + } + } + ], + "rejected": [ + { + "name": "task.duration", + "labels": { + "task.kind": "tool", + "task.id": "uuid-abc" + }, + "errorClass": "Error", + "errorMessage": "Metric label \"task.id\" is not in the allowed label set" + }, + 
{ + "name": "task.duration", + "labels": { + "task.kind": "io", + "error.message": "EHOSTUNREACH at 10.0.0.42 retrying" + }, + "errorClass": "Error", + "errorMessage": "Metric label \"error.message\" is not in the allowed label set" + } + ], + "summary": { + "emittedCount": 3, + "rejectedCount": 2, + "emittedKinds": [ + "io", + "llm", + "evil" + ] + } + } + }, + { + "file": "18-diagnostics-finding-codes.mjs", + "exitCode": 0, + "wallMs": 50, + "stderr": null, + "report": { + "bench": "18-diagnostics-finding-codes", + "scenarios": { + "healthy": { + "status": "ok", + "findingCodes": [] + }, + "old_pending_task": { + "status": "needs_attention", + "findingCodes": [ + "old_pending_task" + ] + }, + "scope_cancelling": { + "status": "needs_attention", + "findingCodes": [ + "scope_cancelling" + ] + }, + "pending_child_scope": { + "status": "needs_attention", + "findingCodes": [ + "pending_child_scope", + "old_pending_task" + ] + }, + "cleanup_timeout": { + "status": "needs_attention", + "findingCodes": [ + "cleanup_timeout" + ] + } + } + } + }, + { + "file": "19-agent-scope.mjs", + "exitCode": 0, + "wallMs": 879, + "stderr": null, + "report": { + "bench": "19-agent-scope", + "A_tool_events": { + "finalResult": 9, + "eventCount": 4, + "eventTypesInOrder": [ + "agent:started", + "agent:tool_started", + "agent:tool_succeeded", + "agent:completed" + ], + "toolName": "calc", + "seqs": [ + 1, + 2, + 3, + 4 + ], + "monotonicAt": true, + "sameAgentId": true + }, + "B_tool_calls_budget": { + "class": "BudgetExceededError", + "budgetKey": "AgentToolCalls", + "limit": 1, + "attempted": 1 + }, + "C_tokens_budget": { + "final": { + "spent": 75, + "limit": 1000, + "unit": "tokens" + } + }, + "D_parent_cancel": { + "winner": "done", + "toolSignalAborted": true, + "outerErrorClass": "CancellationError", + "cancelReasonKind": "manual", + "cancelReasonTag": "user-stop" + }, + "E_replayable_log": { + "eventCount": 8, + "eventTypes": [ + "agent:started", + "agent:tool_started", + 
"agent:tool_succeeded", + "agent:tool_started", + "agent:tool_succeeded", + "agent:tool_started", + "agent:tool_succeeded", + "agent:completed" + ], + "seqs": [ + 1, + 2, + 3, + 4, + 5, + 6, + 7, + 8 + ], + "monotonicAt": true, + "sameAgentId": true, + "toolStartedNames": [ + "plan", + "fetch", + "summarize" + ], + "toolSucceededNames": [ + "plan", + "fetch", + "summarize" + ] + } + } + } + ], + "passed": 19, + "failed": 0 +} From 8c58b42e512de4d73a7257fbeefcf60a4de780b7 Mon Sep 17 00:00:00 2001 From: acossa Date: Fri, 8 May 2026 19:34:42 +0200 Subject: [PATCH 4/5] test(evidence): add publication claim proofs --- evidence/README.md | 62 +++++++ evidence/claims.json | 161 ++++++++++++++++++ .../correctness/runtime-contracts.mjs | 129 ++++++++++++++ tests/evidence/harness.mjs | 77 +++++++++ tests/evidence/lifecycle/owned-work.mjs | 123 +++++++++++++ .../performance/benchmark-contracts.mjs | 55 ++++++ tests/evidence/release/release-integrity.mjs | 75 ++++++++ tests/evidence/run-all.mjs | 69 ++++++++ tests/evidence/security/cpu-spinner.mjs | 17 ++ tests/evidence/security/worker-boundary.mjs | 80 +++++++++ 10 files changed, 848 insertions(+) create mode 100644 evidence/README.md create mode 100644 evidence/claims.json create mode 100644 tests/evidence/correctness/runtime-contracts.mjs create mode 100644 tests/evidence/harness.mjs create mode 100644 tests/evidence/lifecycle/owned-work.mjs create mode 100644 tests/evidence/performance/benchmark-contracts.mjs create mode 100644 tests/evidence/release/release-integrity.mjs create mode 100644 tests/evidence/run-all.mjs create mode 100644 tests/evidence/security/cpu-spinner.mjs create mode 100644 tests/evidence/security/worker-boundary.mjs diff --git a/evidence/README.md b/evidence/README.md new file mode 100644 index 0000000..db76f84 --- /dev/null +++ b/evidence/README.md @@ -0,0 +1,62 @@ + + +# WorkIt Claim Evidence + +`claims.json` is the publication source of truth for WorkIt claims. 
README and +articles consume this ledger; they do not invent new claim status. + +The evidence hierarchy is: + +```txt +runtime source + npm run verify +benchmarks/articles/run-all.mjs +tests/evidence/run-all.mjs +evidence/claims.json +-> README, articles +``` + +## Claim Classes + +| Class | Use | +|---|---| +| `security` | Abuse-resistance or boundary-hardening proof. | +| `correctness` | Runtime behavior invariant. | +| `lifecycle` | Cancellation, cleanup, ownership, orphan prevention. | +| `release` | Package, provenance, SBOM, public artifact, policy gate. | +| `performance` | Latency, memory, throughput, benchmark contract. | +| `product-decision` | Explicit design choice rather than a bug. | + +Do not label every adversarial proof as security. A proof is security only when +the impact and invariant are security-relevant. + +## Commands + +```sh +npm run verify +npm run bench:articles +npm run test:evidence +``` + +`benchmarks/results/articles.latest.json` stores the captured article benchmark +run used by README and articles for representative values. The benchmark +assertions remain the portable proof. + +## Evidence Stack + +| Layer | Source of truth | Role | +|---|---|---| +| Runtime | `npm run verify` | package, tests, coverage, API, size, security, and release gates | +| Article benches | `benchmarks/articles/run-all.mjs` | side-by-side behavior used in public articles | +| Captured bench run | `benchmarks/results/articles.latest.json` | representative publication values for this revision | +| Claim ledger | `evidence/claims.json` | claim IDs, class, proof path, invariant, status, and limitation | +| Evidence tests | `tests/evidence/run-all.mjs` | curated lifecycle, correctness, security, release, and performance proofs | + +## Publication Rule + +README summarizes. Articles teach. Neither invents claim status. 
Public prose +must cite one of the executable sources above, and security claims must stay +security-specific rather than using "security" as a label for every adversarial +or lifecycle proof. diff --git a/evidence/claims.json b/evidence/claims.json new file mode 100644 index 0000000..bad91c1 --- /dev/null +++ b/evidence/claims.json @@ -0,0 +1,161 @@ +{ + "author": "Admilson B. F. Cossa", + "spdxLicense": "Apache-2.0", + "artifact": "workit-claim-ledger", + "version": 1, + "authority": [ + "runtime source and npm run verify", + "benchmarks/articles/run-all.mjs", + "tests/evidence/run-all.mjs" + ], + "allowedClasses": [ + "security", + "correctness", + "lifecycle", + "release", + "performance", + "product-decision" + ], + "claims": [ + { + "id": "LIFE-001", + "title": "run.race cancels losing branches", + "class": "lifecycle", + "status": "proven", + "proof": "tests/evidence/lifecycle/owned-work.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "losing branches receive race_lost cancellation", + "limitations": "Provider billing or remote execution stops only when the underlying provider honors AbortSignal." + }, + { + "id": "LIFE-002", + "title": "run.retry stops on parent cancellation", + "class": "lifecycle", + "status": "proven", + "proof": "tests/evidence/lifecycle/owned-work.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "no extra retry attempt runs after cancellation is observed", + "limitations": "The guarantee applies to WorkIt retry sleeps and task bodies that observe ctx.signal." 
+ }, + { + "id": "LIFE-003", + "title": "run.bracket cleanup timeout is bounded and observable", + "class": "lifecycle", + "status": "proven", + "proof": "tests/evidence/lifecycle/owned-work.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "a hanging cleanup emits task:cleanup_timeout and the owner settles", + "limitations": "The timeout bounds WorkIt cleanup waiting; external systems still need their own timeout and idempotency policies." + }, + { + "id": "CORR-001", + "title": "budget inputs are immutable boundary values", + "class": "correctness", + "status": "proven", + "proof": "tests/evidence/correctness/runtime-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "caller input object is not mutated; final spent value is read from run.context.budget", + "limitations": "Applications must read runtime budget state through the context API rather than expecting mutation of the original object." + }, + { + "id": "CORR-002", + "title": "channel capacity applies producer backpressure", + "class": "correctness", + "status": "proven", + "proof": "tests/evidence/correctness/runtime-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "third send to capacity-two channel blocks until a receive drains one item", + "limitations": "This proof covers local channel capacity, not distributed queue semantics." + }, + { + "id": "CORR-003", + "title": "diagnostics report stable cleanup finding codes", + "class": "correctness", + "status": "proven", + "proof": "tests/evidence/correctness/runtime-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "cleanup timeout events produce cleanup_timeout findings", + "limitations": "Rich rendering and exporter behavior are separate optional surfaces." 
+ }, + { + "id": "CORR-004", + "title": "retry policy rejects unsafe attempt counts", + "class": "correctness", + "status": "proven", + "proof": "tests/evidence/correctness/runtime-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "unbounded retry counts are rejected at the policy boundary", + "limitations": "The cap prevents accidental huge retry policies; application-level retry budgets may be stricter." + }, + { + "id": "SEC-001", + "title": "worker offload rejects remote and executable URL schemes", + "class": "security", + "status": "proven", + "proof": "tests/evidence/security/worker-boundary.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "data, http, https, and javascript worker inputs are rejected before import", + "limitations": "Local file URLs remain application authority; WorkIt does not sandbox arbitrary trusted local code." + }, + { + "id": "SEC-002", + "title": "worker timeout terminates non-cooperative CPU work", + "class": "security", + "status": "proven", + "proof": "tests/evidence/security/worker-boundary.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "late marker is not written after offload timeout", + "limitations": "The hard boundary applies to worker offload; JavaScript cannot preempt a CPU loop on the main thread." + }, + { + "id": "REL-001", + "title": "public proof artifact exposes release evidence", + "class": "release", + "status": "proven", + "proof": "tests/evidence/release/release-integrity.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "benchmarks/public-proof.json exposes commands, fixtures, guides, and runtime matrix", + "limitations": "This verifies repository artifact shape, not an external registry publication event." 
+ }, + { + "id": "REL-002", + "title": "release policy documents signed tags and worker boundary", + "class": "release", + "status": "proven", + "proof": "tests/evidence/release/release-integrity.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "SECURITY.md documents signed tags and structured-clone worker input", + "limitations": "Documentation must remain aligned with the actual release workflow." + }, + { + "id": "REL-003", + "title": "release provenance gate verifies tag policy", + "class": "release", + "status": "proven", + "proof": "tests/evidence/release/release-integrity.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "release-provenance script contains signed-tag verification logic", + "limitations": "A dry-run gate cannot guarantee a future maintainer will sign a tag; it verifies that the repository policy gate exists." + }, + { + "id": "PERF-001", + "title": "article benchmark suite has expected executable coverage", + "class": "performance", + "status": "proven", + "proof": "tests/evidence/performance/benchmark-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "benchmarks/articles contains exactly 19 numbered benchmark scripts", + "limitations": "Coverage count does not prove a benchmark result; individual benchmark assertions do." + }, + { + "id": "PERF-002", + "title": "captured article benchmark result is machine-readable", + "class": "performance", + "status": "proven", + "proof": "tests/evidence/performance/benchmark-contracts.mjs", + "command": "npm run test:evidence", + "expectedInvariant": "benchmarks/results/articles.latest.json records 19 passing benches", + "limitations": "Timing values are representative for the captured machine and run; semantic assertions are the portable claim." 
+ } + ] +} diff --git a/tests/evidence/correctness/runtime-contracts.mjs b/tests/evidence/correctness/runtime-contracts.mjs new file mode 100644 index 0000000..d766b35 --- /dev/null +++ b/tests/evidence/correctness/runtime-contracts.mjs @@ -0,0 +1,129 @@ +/** + * Correctness evidence: budgets, channels, diagnostics, and retry policy. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { createChannel } from "../../../dist/channel/index.js"; +import { diagnoseSnapshot } from "../../../dist/diagnostics/index.js"; +import { CostBudget, run } from "../../../dist/index.js"; +import { assert, createSuite } from "../harness.mjs"; + +const suite = createSuite("correctness"); + +await suite.proof( + "CORR-001", + "budget input is immutable and runtime budget is explicit", + "caller input object is not mutated; final spent value is read from run.context.budget", + async () => { + const input = { spent: 0, limit: 100, unit: "credits" }; + let finalBudget; + + await run.context.with(CostBudget, input, async () => { + await run.scope(async (scope) => { + await Promise.all([ + scope.spawn((ctx) => { + ctx.consume(CostBudget, 25); + return "a"; + }), + scope.spawn((ctx) => { + ctx.consume(CostBudget, 25); + return "b"; + }), + ]); + }); + finalBudget = run.context.budget(CostBudget); + }); + + return { + ok: input.spent === 0 && finalBudget?.spent === 50, + input, + finalBudget, + }; + }, +); + +await suite.proof( + "CORR-002", + "channel capacity applies producer backpressure", + "third send to capacity-two channel blocks until a receive drains one item", + async () => { + const channel = createChannel({ capacity: 2 }); + await channel.send("a"); + await channel.send("b"); + + let thirdSettled = false; + const third = channel.send("c").then(() => { + thirdSettled = true; + }); + await Promise.resolve(); + const blockedBeforeReceive = thirdSettled === false; + const first = await channel.receive(); + await third; + + return { + ok: 
blockedBeforeReceive && first?.value === "a" && thirdSettled, + blockedBeforeReceive, + first, + thirdSettled, + }; + }, +); + +await suite.proof( + "CORR-003", + "diagnostics report stable finding codes", + "cleanup timeout events produce cleanup_timeout findings", + async () => { + const report = diagnoseSnapshot({ + id: "scope-evidence", + name: "evidence", + status: "closing", + startedAt: 1_000, + pendingCount: 0, + completedCount: 1, + failedCount: 0, + cancelledCount: 0, + tasks: [], + scopes: [], + }, { + now: 2_000, + events: [ + { type: "task:cleanup_timeout", taskId: "task-a", timeoutMs: 25, at: Date.now() }, + ], + }); + const codes = report.findings.map((finding) => finding.code); + + return { + ok: report.status === "needs_attention" && codes.includes("cleanup_timeout"), + status: report.status, + codes, + }; + }, +); + +await suite.proof( + "CORR-004", + "retry policy rejects unsafe attempt counts", + "unbounded retry counts are rejected at the policy boundary", + async () => { + let error; + try { + run.retry(async () => "never", { times: 1_000_000 }); + } catch (caught) { + error = caught; + } + assert(error instanceof RangeError, "unsafe retry count must throw RangeError"); + + return { + ok: /between 1 and 1000/.test(error.message), + errorClass: error.constructor.name, + message: error.message, + }; + }, +); + +const summary = suite.summary(); +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(summary.failed > 0 ? 1 : 0); diff --git a/tests/evidence/harness.mjs b/tests/evidence/harness.mjs new file mode 100644 index 0000000..5507f78 --- /dev/null +++ b/tests/evidence/harness.mjs @@ -0,0 +1,77 @@ +/** + * Publication evidence harness for WorkIt claim proofs. + * + * @author Admilson B. F. 
Cossa
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import { performance } from "node:perf_hooks";
+
+export function createSuite(area) {
+  const results = [];
+
+  return {
+    async proof(id, title, expectedInvariant, fn) {
+      const started = performance.now();
+      try {
+        const evidence = await fn();
+        const ok = evidence?.ok === true;
+        const elapsedMs = Math.round(performance.now() - started);
+        results.push({
+          id,
+          area,
+          title,
+          expectedInvariant,
+          status: ok ? "pass" : "fail",
+          elapsedMs,
+          evidence,
+        });
+        printResult(ok, id, title, elapsedMs, evidence);
+      } catch (error) {
+        const elapsedMs = Math.round(performance.now() - started);
+        const evidence = { error: error?.message ?? String(error) };
+        results.push({
+          id,
+          area,
+          title,
+          expectedInvariant,
+          status: "fail",
+          elapsedMs,
+          evidence,
+        });
+        printResult(false, id, title, elapsedMs, evidence);
+      }
+    },
+    summary() {
+      const failed = results.filter((result) => result.status !== "pass");
+      return {
+        area,
+        passed: results.length - failed.length,
+        failed: failed.length,
+        results,
+      };
+    },
+  };
+}
+
+export function assert(condition, message) {
+  if (!condition) throw new Error(message);
+}
+
+export function sleep(ms, signal) {
+  if (signal?.aborted) return Promise.reject(signal.reason);
+
+  return new Promise((resolve, reject) => {
+    const onAbort = () => {
+      clearTimeout(timer);
+      reject(signal.reason);
+    };
+    const timer = setTimeout(() => {
+      // Detach the abort listener when the sleep completes normally so a
+      // long-lived signal does not accumulate dead listeners across sleeps.
+      signal?.removeEventListener("abort", onAbort);
+      resolve();
+    }, ms);
+    signal?.addEventListener("abort", onAbort, { once: true });
+  });
+}
+
+function printResult(ok, id, title, elapsedMs, evidence) {
+  const status = ok ?
"PASS" : "FAIL"; + process.stdout.write(`${status} ${id} ${title} (${elapsedMs}ms)\n`); + process.stdout.write(` evidence: ${JSON.stringify(evidence)}\n`); +} diff --git a/tests/evidence/lifecycle/owned-work.mjs b/tests/evidence/lifecycle/owned-work.mjs new file mode 100644 index 0000000..3eec4ff --- /dev/null +++ b/tests/evidence/lifecycle/owned-work.mjs @@ -0,0 +1,123 @@ +/** + * Lifecycle evidence: owned cancellation, retry, and cleanup behavior. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { CancellationError, run } from "../../../dist/index.js"; +import { createSuite, sleep } from "../harness.mjs"; + +const suite = createSuite("lifecycle"); + +await suite.proof( + "LIFE-001", + "run.race cancels losing branches", + "losing branches receive race_lost cancellation", + async () => { + const losers = []; + const make = (name, ms) => async (ctx) => { + try { + await sleep(ms, ctx.signal); + return name; + } catch (error) { + losers.push({ + name, + className: error?.constructor?.name, + reasonKind: error instanceof CancellationError ? 
error.reason.kind : null, + }); + throw error; + } + }; + + const winner = await run.race([ + make("slow-a", 80), + make("fast", 5), + make("slow-b", 90), + ]); + + return { + ok: winner === "fast" + && losers.length === 2 + && losers.every((loser) => loser.reasonKind === "race_lost"), + winner, + losers, + }; + }, +); + +await suite.proof( + "LIFE-002", + "run.retry stops on parent cancellation", + "no extra retry attempt runs after cancellation is observed", + async () => { + let scopeRef; + let attemptsAfterCancel = 0; + let cancelRequested = false; + + const retried = run.retry(async (ctx) => { + if (cancelRequested && ctx.attempt > 1) attemptsAfterCancel++; + await sleep(20, ctx.signal); + throw new Error(`attempt ${ctx.attempt}`); + }, { times: 8, initialDelay: "50ms", jitter: false, backoff: "fixed" }); + + const promise = run.scope(async (scope) => { + scopeRef = scope; + await scope.spawn(retried, { name: "retry-proof" }); + }); + + setTimeout(() => { + cancelRequested = true; + scopeRef.cancel({ kind: "manual", tag: "evidence" }); + }, 45); + + let error; + try { + await promise; + } catch (caught) { + error = caught; + } + + return { + ok: error instanceof CancellationError + && error.reason.kind === "manual" + && attemptsAfterCancel === 0, + errorClass: error?.constructor?.name, + reasonKind: error?.reason?.kind, + attemptsAfterCancel, + }; + }, +); + +await suite.proof( + "LIFE-003", + "run.bracket cleanup timeout is bounded and observable", + "a hanging cleanup emits task:cleanup_timeout and the owner settles", + async () => { + const events = []; + const startedAt = Date.now(); + await run.scope(async (scope) => { + scope.onEvent((event) => events.push(event)); + await scope.spawn(run.bracket( + async () => "resource", + async () => "used", + async () => new Promise(() => {}), + { timeout: 10 }, + )); + }); + const elapsedMs = Date.now() - startedAt; + const cleanupTimeout = events.find((event) => event.type === "task:cleanup_timeout"); + + return { 
+ ok: Boolean(cleanupTimeout) && elapsedMs < 500, + elapsedMs, + cleanupTimeout: cleanupTimeout + ? { type: cleanupTimeout.type, timeoutMs: cleanupTimeout.timeoutMs } + : null, + }; + }, +); + +const summary = suite.summary(); +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(summary.failed > 0 ? 1 : 0); diff --git a/tests/evidence/performance/benchmark-contracts.mjs b/tests/evidence/performance/benchmark-contracts.mjs new file mode 100644 index 0000000..f93b181 --- /dev/null +++ b/tests/evidence/performance/benchmark-contracts.mjs @@ -0,0 +1,55 @@ +/** + * Performance evidence: benchmark suite metadata and captured result contract. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { readdir, readFile } from "node:fs/promises"; + +import { createSuite } from "../harness.mjs"; + +const suite = createSuite("performance"); +const root = new URL("../../../", import.meta.url); + +await suite.proof( + "PERF-001", + "article benchmark suite has the expected executable coverage", + "benchmarks/articles contains exactly 19 numbered benchmark scripts", + async () => { + const files = (await readdir(new URL("benchmarks/articles/", root))) + .filter((file) => /^\d{2}-.*\.mjs$/.test(file)) + .sort(); + + return { + ok: files.length === 19 && files[0].startsWith("01-") && files.at(-1).startsWith("19-"), + count: files.length, + first: files[0], + last: files.at(-1), + }; + }, +); + +await suite.proof( + "PERF-002", + "captured article benchmark result is machine-readable", + "benchmarks/results/articles.latest.json records 19 passing benches", + async () => { + const text = await readFile(new URL("benchmarks/results/articles.latest.json", root), "utf8"); + const result = JSON.parse(text); + + return { + ok: result.passed === 19 + && result.failed === 0 + && Array.isArray(result.benches) + && result.benches.length === 19, + passed: result.passed, + failed: result.failed, + count: result.benches?.length ?? 
0, + }; + }, +); + +const summary = suite.summary(); +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(summary.failed > 0 ? 1 : 0); diff --git a/tests/evidence/release/release-integrity.mjs b/tests/evidence/release/release-integrity.mjs new file mode 100644 index 0000000..8473f96 --- /dev/null +++ b/tests/evidence/release/release-integrity.mjs @@ -0,0 +1,75 @@ +/** + * Release evidence: public proof artifact and release policy documentation. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { readFile } from "node:fs/promises"; + +import { createSuite } from "../harness.mjs"; + +const suite = createSuite("release"); +const root = new URL("../../../", import.meta.url); + +await suite.proof( + "REL-001", + "public proof artifact has required release evidence keys", + "benchmarks/public-proof.json exposes commands, fixtures, guides, and runtime matrix", + async () => { + const artifact = JSON.parse(await readFile(new URL("benchmarks/public-proof.json", root), "utf8")); + const required = [ + "author", + "spdxLicense", + "artifact", + "evidenceCommands", + "benchmarkFixtures", + "migrationGuides", + "crossRuntimeMatrix", + ]; + const missing = required.filter((key) => artifact[key] === undefined); + + return { + ok: missing.length === 0 + && artifact.author === "Admilson B. F. Cossa" + && artifact.spdxLicense === "Apache-2.0", + missing, + fixtureCount: artifact.benchmarkFixtures?.length ?? 
0, + }; + }, +); + +await suite.proof( + "REL-002", + "release policy documents signed tags and worker boundary", + "SECURITY.md documents signed tags and structured-clone worker input", + async () => { + const text = await readFile(new URL("SECURITY.md", root), "utf8"); + const hasSignedTag = /release tags must be signed|signed release tags|git tag -s/i.test(text); + return { + ok: hasSignedTag + && /structured.clone/i.test(text) + && /worker/i.test(text), + hasSignedTag, + hasStructuredClone: /structured.clone/i.test(text), + hasWorker: /worker/i.test(text), + }; + }, +); + +await suite.proof( + "REL-003", + "release provenance gate verifies tag policy", + "release-provenance script contains signed-tag verification logic", + async () => { + const text = await readFile(new URL("scripts/check-release-provenance.mjs", root), "utf8"); + return { + ok: /tag\s+-v|tag\s+--verify|signed/i.test(text), + hasVerifyHook: /tag\s+-v|tag\s+--verify|signed/i.test(text), + }; + }, +); + +const summary = suite.summary(); +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(summary.failed > 0 ? 1 : 0); diff --git a/tests/evidence/run-all.mjs b/tests/evidence/run-all.mjs new file mode 100644 index 0000000..94cd2c9 --- /dev/null +++ b/tests/evidence/run-all.mjs @@ -0,0 +1,69 @@ +/** + * Runs all tracked publication evidence proofs. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { spawn } from "node:child_process"; +import path from "node:path"; +import { fileURLToPath } from "node:url"; + +const here = path.dirname(fileURLToPath(import.meta.url)); +const files = [ + "lifecycle/owned-work.mjs", + "correctness/runtime-contracts.mjs", + "security/worker-boundary.mjs", + "release/release-integrity.mjs", + "performance/benchmark-contracts.mjs", +]; + +const summary = { + author: "Admilson B. F. 
Cossa", + spdxLicense: "Apache-2.0", + artifact: "workit-publication-evidence", + proofs: [], +}; + +for (const file of files) { + const startedAt = Date.now(); + const childResult = await new Promise((resolve, reject) => { + const child = spawn(process.execPath, [path.join(here, file)], { + stdio: ["ignore", "pipe", "pipe"], + }); + let stdout = ""; + let stderr = ""; + child.stdout.on("data", (chunk) => { stdout += chunk.toString(); }); + child.stderr.on("data", (chunk) => { stderr += chunk.toString(); }); + child.on("error", reject); + child.on("exit", (code) => resolve({ code, stdout, stderr })); + }); + + const jsonStart = childResult.stdout.lastIndexOf('{\n "area":'); + let report = null; + if (jsonStart >= 0) { + try { + report = JSON.parse(childResult.stdout.slice(jsonStart)); + } catch { + report = null; + } + } + + summary.proofs.push({ + file, + exitCode: childResult.code, + wallMs: Date.now() - startedAt, + stderr: childResult.stderr.trim() || null, + report, + }); + + process.stderr.write(childResult.stderr); + process.stdout.write(childResult.stdout); +} + +const failures = summary.proofs.filter((proof) => proof.exitCode !== 0).length; +summary.passed = summary.proofs.length - failures; +summary.failed = failures; + +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(failures > 0 ? 1 : 0); diff --git a/tests/evidence/security/cpu-spinner.mjs b/tests/evidence/security/cpu-spinner.mjs new file mode 100644 index 0000000..7b5ce7c --- /dev/null +++ b/tests/evidence/security/cpu-spinner.mjs @@ -0,0 +1,17 @@ +/** + * Worker fixture that ignores cooperative cancellation and writes a late marker. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { writeFileSync } from "node:fs"; + +export function spinForever({ durationMs, markerPath }) { + const start = Date.now(); + while (Date.now() - start < durationMs) { + // Intentionally non-cooperative. 
+ } + writeFileSync(markerPath, "completed"); + return { completed: true }; +} diff --git a/tests/evidence/security/worker-boundary.mjs b/tests/evidence/security/worker-boundary.mjs new file mode 100644 index 0000000..7457bde --- /dev/null +++ b/tests/evidence/security/worker-boundary.mjs @@ -0,0 +1,80 @@ +/** + * Security evidence: worker URL guard and hard timeout boundary. + * + * @author Admilson B. F. Cossa + * SPDX-License-Identifier: Apache-2.0 + */ + +import { existsSync, mkdtempSync, rmSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +import { TimeoutError, run } from "../../../dist/index.js"; +import { offload } from "../../../dist/worker/index.js"; +import { createSuite } from "../harness.mjs"; + +const suite = createSuite("security"); +const spinnerURL = new URL("./cpu-spinner.mjs", import.meta.url); + +await suite.proof( + "SEC-001", + "worker offload rejects remote and executable URL schemes", + "data, http, https, and javascript worker inputs are rejected before import", + async () => { + const rejected = []; + for (const candidate of [ + "data:text/javascript,export const x = 1", + "http://example.test/worker.js", + "https://example.test/worker.js", + "javascript:globalThis.x=1", + ]) { + try { + offload(candidate, "x", undefined); + } catch { + rejected.push(candidate.split(":")[0]); + } + } + + return { + ok: rejected.length === 4, + rejected, + }; + }, +); + +await suite.proof( + "SEC-002", + "worker timeout terminates non-cooperative CPU work", + "late marker is not written after offload timeout", + async () => { + const dir = mkdtempSync(join(tmpdir(), "workit-evidence-worker-")); + const markerPath = join(dir, "late-marker.txt"); + let error; + try { + await run.scope(async (scope) => { + await scope.spawn(offload( + spinnerURL, + "spinForever", + { durationMs: 5_000, markerPath }, + { timeout: "200ms" }, + )); + }); + } catch (caught) { + error = caught; + } + + await new Promise((resolve) => 
setTimeout(resolve, 350)); + const markerExists = existsSync(markerPath); + rmSync(dir, { recursive: true, force: true }); + + return { + ok: error instanceof TimeoutError && markerExists === false, + errorClass: error?.constructor?.name, + markerExists, + }; + }, +); + +const summary = suite.summary(); +process.stdout.write(JSON.stringify(summary, null, 2) + "\n"); +process.exit(summary.failed > 0 ? 1 : 0); From 3882ce8d889b0d2ef831b267359503ab856a7799 Mon Sep 17 00:00:00 2001 From: acossa Date: Fri, 8 May 2026 19:34:46 +0200 Subject: [PATCH 5/5] docs(articles): add WorkIt publication series --- articles/01-owned-async-work.md | 228 +++++++++ articles/02-concurrency-retry-timeout.md | 341 +++++++++++++ .../03-cancellation-and-worker-boundaries.md | 256 ++++++++++ ...04-backpressure-for-streaming-pipelines.md | 305 ++++++++++++ .../05-resource-safety-and-budgeted-work.md | 231 +++++++++ .../06-observability-without-core-bloat.md | 459 ++++++++++++++++++ .../07-agent-scope-and-tool-lifecycles.md | 230 +++++++++ articles/README.md | 81 ++++ 8 files changed, 2131 insertions(+) create mode 100644 articles/01-owned-async-work.md create mode 100644 articles/02-concurrency-retry-timeout.md create mode 100644 articles/03-cancellation-and-worker-boundaries.md create mode 100644 articles/04-backpressure-for-streaming-pipelines.md create mode 100644 articles/05-resource-safety-and-budgeted-work.md create mode 100644 articles/06-observability-without-core-bloat.md create mode 100644 articles/07-agent-scope-and-tool-lifecycles.md create mode 100644 articles/README.md diff --git a/articles/01-owned-async-work.md b/articles/01-owned-async-work.md new file mode 100644 index 0000000..aa29c3a --- /dev/null +++ b/articles/01-owned-async-work.md @@ -0,0 +1,228 @@ + + +# Owned Async Work In TypeScript + +Three providers. One winner. Three invoices. 
+ +```ts +const winner = await Promise.race([ + fetch(OPENAI, { body }), + fetch(ANTHROPIC, { body }), + fetch(GEMINI, { body }), +]); +``` + +It is easy to assume a race cancels the losers. Native `Promise.race` resolves on the first settlement and leaves every other promise running unless the caller wires cancellation into each branch. TCP completes. Tokens bill. `.then` callbacks fire. Cleanup remains a manual convention. `Promise.any` and `Promise.all` have the same ownership gap. + +Native promises model values, not ownership. There is no ownership parent, no built-in cancellation path, and no scope cleanup contract. + +There is still work running after the value settles. + +--- + +## The one line that fixes the 80% case + +Before we fix racing, fix the thing every codebase does first: process a list of items with concurrency, retries, and a timeout. + +```ts +import { work } from "@workit/core"; + +await work(items) + .inParallel(8) + .withRetry(3) + .withTimeout("5s") + .do(async (item, ctx) => { + ctx.report({ message: `processing ${item.id}` }); + return apiCall(item, { signal: ctx.signal }); + }); +``` + +That isn't a custom Promise chain. It isn't a standalone concurrency helper where cancellation, retry, timeout, and cleanup have to be wired separately. It's a runtime contract: + +- At most **8 inflight** at any instant. The cap is a hard property test, not a hint. +- Each item retries up to **3 times**, exponential backoff with jitter, signal-aware sleep. +- Any item exceeding **5 seconds** cancels its own operation with `TimeoutError`. +- First uncaught failure cancels the queued and the in-flight, by default. 
Switch policy on a single line and the **return type changes** so you can't ignore failures: + +```ts +const out = await work(items).inParallel(8).onError("collect").do(fn); +// ^^^ WorkOutput -- discriminated union: "fail" | "continue" | "collect" +if (out.mode === "collect") { + for (const r of out.results) if (r.status === "rejected") log(r.reason); +} +``` + +- Every body receives a `ctx.signal` linked to the scope, so the `fetch`, the database query, or the LLM call **actually aborts** at the I/O boundary. +- Progress events flow to your logger, metrics, or UI through `ctx.report(...)` -- zero allocation when nobody listens. + +That is the surface. Five chained methods. No new vocabulary. Now stack it. + +--- + +## `run.race`: same shape, different contract + +```ts +import { run } from "@workit/core"; + +const winner = await run.race([callOpenAI, callAnthropic, callGemini]); +``` + +Same six tokens you wrote with `Promise.race`. Different runtime contract: + +- Each body receives a `ctx.signal` linked to the race. +- First settlement cancels the rest at the `AbortSignal` boundary, **before** TCP completes. +- Each loser sees `CancelReason { kind: "race_lost", winnerId }` -- typed, exhaustively narrowed, not a string. +- Each loser's `ctx.defer(...)` cleanup runs LIFO before `run.race` resolves. +- `await run.race(...)` returns only after losers have finished cleaning up. + +That is the first ownership boundary. Now stack it again. + +--- + +### Cancel a 200-tool agent on client disconnect + +```ts +import { run } from "@workit/core"; + +await run.scope(async (scope) => { + request.signal.addEventListener("abort", () => + scope.cancel({ kind: "manual", tag: "client_disconnect" })); + + for (const step of plan.steps) { + scope.spawn(async (ctx) => callTool(step, ctx.signal), + { name: step.name, kind: "tool" }); + } + scope.spawn.background(async () => auditLog(plan)); +}, { deadline: "30s" }); +``` + +Every in-flight tool aborts. Every `ctx.defer` runs LIFO. 
The audit task flushes. The reason -- `{ kind: "manual", tag: "client_disconnect" }` -- carries down the tree so your dashboard distinguishes a stop from a `deadline` from a `budget` overrun. + +### A socket close cancels an STT stream and closes the microphone + +```ts +import { transcribeStream } from "@workit/core/ai"; + +for await (const text of transcribeStream(microphone, { + async transcribe(chunk, ctx) { + return provider.transcribe(chunk, { signal: ctx.signal }); + }, +}, { signal: socket.signal })) { + socket.send(text); +} +``` + +Socket disconnects. `ctx.signal` aborts. The provider's HTTP request aborts. The async generator's `finally` runs and closes the microphone. The sample asserts that the source closes and no provider call remains active after disconnect. + +### 100,000 documents under a hard token cap + +```ts +import { group, run } from "@workit/core"; +import { OpenAITokens, embedAll } from "@workit/core/ai"; + +await run.context.with( + OpenAITokens, { spent: 0, limit: 1_000_000, unit: "tokens" }, + () => group(() => embedAll(documents, { concurrency: 32 })), +); +``` + +Bounded concurrency. Per-item retry. Token budget enforced atomically across all 32 inflight workers. Blow the cap mid-pipeline and the scope cancels with `CancelReason { kind: "budget", limit, spent }`. Partial results stay. The rest abort. + +--- + +## No Orphans Means No Unowned Background Work + +A `background` child is still scoped. The parent operation does **not** finish while owned background work keeps running. The receipt is one of the smallest samples in the repo: + +```ts +// samples/no-orphan.sample.js +const result = await group(async (task) => { + task.background(async (ctx) => { + await sleep(20, ctx.signal); + backgroundCompleted = true; + }); + return "body-returned"; +}); + +// Asserted by the sample: +// result === "body-returned" +// backgroundCompleted === true +// elapsedMs >= 15 +``` + +The body returns its value at t=0. 
The owned background task takes 20 ms. The `await group(...)` does **not** resolve until both finish. If you want to escape the scope, you call `run.detached(...)` and accept the orphan trade-off explicitly. There is no third option. + +```sh +npm run sample:no-orphan +``` + +--- + +## Why not just use X + +The right tool depends on what part of the lifecycle you actually own. + +| Tool | Bounded concurrency | Scope-owned loser / sibling cancellation | Typed cancel reason | Scope cleanup | +|---|---|---|---|---| +| **WorkIt** | yes `work().inParallel(N)` / `run.pool(N, ...)` | yes at the `AbortSignal` boundary | yes `CancelReason` discriminated union | yes `ctx.defer` LIFO | +| `Promise.all` / `race` / `any` | no | no | no | no | +| `p-limit` | yes | manual; queue ownership is separate from task cancellation | no | no | +| `p-map` | yes | partial/manual; queue and in-flight work have separate policies | no | no | +| `RxJS.mergeMap` | yes | yes on unsubscribe | partial | per-subscription, not per-scope | +| Effection | yes via generator ops | yes (structured) | partial | yes | +| Effect-TS | yes via fibers | yes | yes (typed `Cause`) | yes | + +If your problem is "process this array with N concurrency" and nothing else ever fails, `p-limit` is fine. If your problem is "this list is part of a request that can time out, the user can disconnect, and one bad item must cancel the rest with cleanup", you want a runtime contract. Effection and Effect-TS provide one -- through generators and a fiber DSL respectively. WorkIt provides one **without leaving `async` / `await`**. + +--- + +## Receipts + +The release-readiness claim above is a CI gate, not a tagline. Each row maps to a command in `npm run verify`. + +| Measurement | Value | What it includes | +|---|---|---| +| `core-group-import` bundle | **14,175 B min * 4,835 B gzip** | The full `group` + `run` + `work` + retry/timeout/race/all/any/pool surface, tree-shaken | +| Runtime dependencies | **0** | Zero. 
The compiled core does not import `node:http`, `node:https`, or `fetch`. Static check enforced. | +| Tests / coverage | **214 tests * 100% statements / branches / functions / lines** | Cancellation invariants, channel semantics, AI-subpath mocks, exporter stress, scope tree, budget atomicity | +| Hot-path heap, 100k tasks, no signal read | **0.9 MB post-GC** | Was 298 MB before lazy `AbortController` allocation -- ~330x reduction | +| Tracked soak gate, 100k tasks @ concurrency 128 | **126,136 B** max heap growth | The `npm run check:soak` gate fails the build if this regresses | +| Stream backpressure, 1,000,000 logical items, slow consumer | **maxActive <= inParallel(N)**, producer paused, heap bounded | The `npm run check:stream-memory` gate | +| `offload({ timeout: "200ms" })` against an infinite CPU spin loop | rejects at the worker timeout boundary, **late-marker file does not exist** | AbortController cannot preempt a CPU loop; the worker is terminated at the host boundary. CI `stat()`s for the marker. | +| Claim evidence suite | `npm run test:evidence` | Curated lifecycle, correctness, security, release, and performance proofs mapped in `evidence/claims.json` | + +--- + +## The series + +1. **You are here** -- *Promise.race does not own the losing work. The fluent surface and why ownership matters.* +2. *Nine composables. One ownership contract.* +3. *AbortController cannot preempt a CPU loop. WorkIt uses a worker boundary.* +4. *A 1,000,000,000-row pipeline. 25 consumed. The producer noticed.* +5. *A 0.50 USD agent. A connection that closes on ctrl-C. A receipt the user never sees.* +6. *100K agent runs a day. Bounded observability cost without core bloat.* +7. *An agent loop in 12 lines. A typed tool contract. A 50-cent ceiling.* + + +--- + +## Try it + +```sh +npm install @workit/core +``` + +The API is stable. The tests pass. The bundle is tiny. 
+
+*Next: `run.all`, `run.race`, `run.any`, `run.pool`, `run.series` side-by-side with `Promise.all`, `Promise.race`, `Promise.any`, `p-limit`, and `p-map`. We measure which contracts still hold when one sibling throws mid-flight.*
+
+---
+
+## Source, Benchmarks, And Evidence
+
+- Source: https://github.com/WorkRuntime/workit
+- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/01-owned-async-work.md
+- Reproduce: `npm run bench:articles` and `npm run test:evidence`
diff --git a/articles/02-concurrency-retry-timeout.md b/articles/02-concurrency-retry-timeout.md
new file mode 100644
index 0000000..ad817b5
--- /dev/null
+++ b/articles/02-concurrency-retry-timeout.md
@@ -0,0 +1,341 @@
+
+
+# Concurrency, Retry, And Timeout Under One Owner
+
+Last time we showed `work(items).inParallel(8).withRetry(3).withTimeout("5s").do(fn)` -- the one-line fluent surface for processing a list. That handles the 80% case.
+
+This article is about the other 20%: orchestrating heterogeneous tasks that race, fall back, hedge, and retry -- **with ownership**.
+
+Open `package.json` in many AI codebases and you'll find some subset of:
+
+```json
+"p-limit": "^5.0.0",
+"p-map": "^7.0.0",
+"p-retry": "^6.2.0",
+"p-timeout": "^6.1.2",
+"p-queue": "^8.0.1",
+"bottleneck": "^2.19.5",
+"async-retry": "^1.3.3"
+```
+
+Seven libraries. Seven lifecycles. Some expose cancellation hooks. None of them gives the whole tree one ownership contract by default. When a sibling throws, when a timeout fires, when the user hits stop, you have to stitch together queue state, retry delay, timeout wrapper, underlying I/O, cleanup, and error shape yourself.
+
+That is the comparison in this article: not "those tools are useless", but "they are separate primitives." WorkIt's claim is ownership and composition. The runnable benches at the end of each section verify the WorkIt invariants on your machine.
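To make the stitching cost concrete, here is a minimal hand-rolled version of just two of those layers -- a timeout wrapper and a retry loop -- with the signal threading done by hand. All names here are illustrative; none of this is any library's real API:

```ts
type Task<T> = (signal: AbortSignal) => Promise<T>;

function withTimeout<T>(task: Task<T>, ms: number): Task<T> {
  return async (outer) => {
    const ctrl = new AbortController();
    // Manual signal threading -- and note this glue already forgot the
    // outer-already-aborted case.
    const onAbort = () => ctrl.abort(outer.reason);
    outer.addEventListener("abort", onAbort);
    const timer = setTimeout(() => ctrl.abort(new Error("timeout")), ms);
    try {
      return await task(ctrl.signal);
    } finally {
      clearTimeout(timer);
      outer.removeEventListener("abort", onAbort);
    }
  };
}

function withRetry<T>(task: Task<T>, times: number): Task<T> {
  return async (signal) => {
    let lastErr: unknown;
    for (let i = 0; i < times; i++) {
      if (signal.aborted) throw signal.reason; // easy to forget
      try {
        return await task(signal);
      } catch (err) {
        lastErr = err;
      }
    }
    throw lastErr;
  };
}

// A flaky body: fails twice, then succeeds.
let attempts = 0;
const flaky: Task<string> = async () => {
  attempts++;
  if (attempts < 3) throw new Error("transient");
  return "ok";
};

const result = await withTimeout(withRetry(flaky, 5), 1_000)(
  new AbortController().signal,
);
```

The composition works, but only because every layer independently remembered to forward and check the signal. Forget one check and cancellation silently stops propagating; that glue is what a single ownership contract removes.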
+ +WorkIt has five core composables, all sharing one runtime contract: + +```ts +run.all // Promise.all that actually cancels losers on first failure. +run.race // Promise.race that actually cancels losers. +run.any // Promise.any that actually cancels remaining tasks. +run.pool // p-limit + p-map, but children belong to the scope. +run.series // sequential, with shared cancellation. +``` + +Plus four more that compose with them: + +```ts +run.retry // backoff with signal-aware sleep. +run.timeout // deadline that returns a TaskFn. +run.fallback // primary -> secondary, type-safe. +run.hedge // bounded speculative execution for tail-latency control. +``` + +Same familiar names. Different runtime contract: **everything below the call belongs to a scope, and the scope owns the cancel.** + +The composition property under all nine: every WorkIt resilience helper takes a `TaskFn` and returns a `TaskFn`. That makes the algebra closed -- `run.timeout(run.retry(callProvider, 3), "5s")` is just function composition. Promise helpers usually return promises or independent wrapper functions, so crossing from timeout to retry to race means you own the glue and the signal threading. + +--- + +## `run.all` -- the safer Promise.all + +```ts +import { run } from "@workit/core"; + +const [profile, plan, sources] = await run.all([ + (ctx) => fetchProfile({ signal: ctx.signal }), + (ctx) => planLLM(question, { signal: ctx.signal }), + (ctx) => retrieveContext(question, { signal: ctx.signal }), +]); +``` + +`Promise.all` rejects on first failure and **leaves the other two requests running** unless each branch has its own cancellation wiring. Their `.then` handlers can fire after your error handler already returned a 500, producing completion events that are no longer attached to the owning request. + +`run.all` rejects on first failure and **cancels the other two**. `ctx.signal` aborts. `defer` cleanups run. 
The reason is typed: `CancelReason { kind: "sibling_failed", siblingId, error }`. + +You can pivot a dashboard on that. You cannot pivot on `Error: AggregateError`. + +> **Bench [`01-run-all-vs-promise-all.mjs`](../benchmarks/articles/01-run-all-vs-promise-all.mjs).** A succeeds at 50 ms. **B fails at 30 ms.** C succeeds at 100 ms. +> +> | Implementation | Outer rejected | A still ran past reject | C still ran past reject | Defer ran for losers | +> |---|---|---|---|---| +> | `Promise.all` | t=35 ms | **+16 ms** | **+79 ms** | n/a | +> | `run.all` | t=32 ms | **0 ms** (cancelled at +1 ms) | **0 ms** (cancelled at +1 ms) | yes before outer reject | + +--- + +## `run.race` -- the race that actually races + +```ts +const winner = await run.race([callOpenAI, callAnthropic, callGemini]); +``` + +Six tokens you wrote with `Promise.race`. Different runtime contract: + +- Each body receives a `ctx.signal` linked to the race. +- First settlement cancels the rest at the `AbortSignal` boundary, **before TCP completes**. +- Each loser sees `CancelReason { kind: "race_lost", winnerId }` -- typed, exhaustively narrowed. +- `await run.race(...)` returns only after losers have finished cleaning up. + +> **Bench [`02-run-race-vs-promise-race.mjs`](../benchmarks/articles/02-run-race-vs-promise-race.mjs).** Anthropic at 10 ms, OpenAI at 50 ms, Gemini at 80 ms. +> +> | Implementation | Winner at | OpenAI loser still ran | Gemini loser still ran | Loser reason | +> |---|---|---|---|---| +> | `Promise.race` | t=14 ms | **+47 ms** (61 ms total) | **+77 ms** (91 ms total) | none | +> | `run.race` | t=17 ms | **0 ms** (cancelled at t=16 ms) | **0 ms** (cancelled at t=16 ms) | `race_lost` | + +That loser runtime x N parallel agents x P requests per second is the line on your invoice that nobody wrote. 
+ +--- + +## `run.any` -- first success, rest cancelled + +```ts +const cheapest = await run.any([callExpensive, callCheap, callCheaper]); +``` + +`Promise.any` resolves with the first **success** and ignores the rest. The slower siblings keep running. The faster failing ones got logged and forgotten. `run.any` does the same -- except the slower siblings actually stop. + +> **Bench [`03-run-any-vs-promise-any.mjs`](../benchmarks/articles/03-run-any-vs-promise-any.mjs).** A fails at 30 ms. B succeeds at 50 ms. C succeeds at 100 ms. +> +> | Implementation | Resolved at | C kept running | Defer ran for C | +> |---|---|---|---| +> | `Promise.any` | t=61 ms | **+47 ms** (108 ms total) | n/a | +> | `run.any` | t=65 ms | **0 ms** (cancelled at t=65 ms) | yes | + +--- + +## `run.pool` -- bounded concurrency that cancels + +```ts +const results = await run.pool(8, files.map((file) => async (ctx) => { + return uploadOne(file, { signal: ctx.signal }); +})); +``` + +`p-limit(8)` is a semaphore. That's useful, and current versions can clear pending queue items when you ask them to. What it is not is a structured scope: it does not automatically turn a sibling failure into in-flight cancellation, typed cancel reasons, cleanup, and a partial-result contract. + +`run.pool(8, tasks)` is a semaphore + a scope. Default policy is `Promise.all`-style fail-fast: first throw cancels queued and in-flight. Results are positionally indexed regardless of completion order. Switch policy with one line and the **return type changes** so you can't ignore failures: + +```ts +const out = await work(files).inParallel(8).onError("collect").do(uploadOne); + +if (out.mode === "collect") { + for (const r of out.results) { + if (r.status === "rejected") logFailure(r.reason); + } +} +``` + +`WorkOutput` is a discriminated union -- `mode: "fail" | "continue" | "collect"`. Change `.onError("continue")` and the return type forces you to handle `errors[]`. The compiler is your audit log. 
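The policy-changes-the-type idea can be sketched without the library. The `WorkOutput` shape below is paraphrased from this article's description, not copied from WorkIt's actual type definitions:

```ts
// A hand-rolled stand-in for the discriminated union described above.
type WorkOutput<T> =
  | { mode: "fail"; results: T[] }
  | { mode: "continue"; results: T[]; errors: unknown[] }
  | { mode: "collect"; results: PromiseSettledResult<T>[] };

// The compiler forces each mode to be handled before results are used.
function countFailures<T>(out: WorkOutput<T>): number {
  switch (out.mode) {
    case "fail":
      return 0; // fail-fast: a failure would have thrown, not returned
    case "continue":
      return out.errors.length;
    case "collect":
      return out.results.filter((r) => r.status === "rejected").length;
  }
}
```

Accessing `out.errors` outside the `"continue"` branch, or `r.reason` without checking `r.status`, is a compile error -- that is the audit-log property in practice.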
+ +> **Bench [`04-pool-vs-semaphore.mjs`](../benchmarks/articles/04-pool-vs-semaphore.mjs).** 10 items, concurrency 4. Item 3 throws at 20 ms; the rest take 100 ms each. +> +> | Implementation | Outer rejected | Started | Fulfilled AFTER rejection | Cancelled | Never started | Longest post-rejection run | +> |---|---|---|---|---|---|---| +> | local `pLimitLike(4)` semaphore baseline | t=31 ms | 10 | **9** | 0 | 0 | **+295 ms** | +> | `run.pool(4, ...)` | t=33 ms | 4 | 0 | 3 | 6 | **0 ms** | + +295 ms of post-rejection work, multiplied across a fleet, becomes avoidable runtime and provider cost. + +--- + +## `run.retry` -- composable, cancel-aware backoff + +```ts +const callWithRetry = run.retry(callProvider, { + times: 4, + backoff: "exponential", + initialDelay: "200ms", + maxDelay: "5s", + jitter: true, + retryIf: (err) => isTransient(err), +}); + +const answer = await callWithRetry(ctx); +``` + +Three things WorkIt makes part of the retry contract: + +1. **Stop retrying on scope cancellation.** When the parent scope cancels mid-attempt, `run.retry` does not enqueue another attempt. The task settles as `cancelled`, not `failed`. +2. **Validate input at the boundary.** `run.retry({ times: 1e9 })` would create an unbounded retry policy. `run.retry` rejects it: `RangeError: retry attempts must be an integer between 1 and 1000`. Bound is `MAX_RETRY_ATTEMPTS`. +3. **Sleep with the scope signal.** Backoff sleep is interruptible -- abort the signal, the sleep rejects, the loop exits. The benchmark below compares against a signal-unaware retry loop; current retry libraries may expose their own abort hooks, but they still do not own WorkIt's scope tree, cleanup, and cancel-reason contract. + +> **Bench [`05-retry-on-cancel.mjs`](../benchmarks/articles/05-retry-on-cancel.mjs).** Body throws on every attempt. External cancel fires around t=50 ms. Up to 8 retries with 50 ms backoff. 
+> +> | Implementation | Cancel observed | Outer settled | Cancel latency | Extra attempts after cancel | Settled as | +> |---|---|---|---|---|---| +> | signal-unaware retry loop | t=63 ms | t=701 ms | **638 ms** | **7** | `rejected` | +> | `run.retry` | t=61 ms | t=61 ms | **0 ms** | **0** | `cancelled` (kind: `manual`) | + +638 ms of wasted retry work after the user already cancelled. Per request. Multiply by the agent fan-out. + +--- + +## `run.timeout` -- composes with retry, race, and pool + +```ts +const fastest = await run.race([ + run.timeout(callPrimary, "800ms"), + run.timeout(callSecondary, "800ms"), +]); +``` + +`run.timeout(task, "800ms")` returns a `TaskFn`. It composes. You can wrap it in `run.retry`. You can put it inside `run.race`. You can hand it to `run.pool`. The signature is closed under composition. + +Promise timeout helpers return promises or decorated promises. Some expose `AbortSignal` support. They still do not return a WorkIt `TaskFn`, so crossing timeout, retry, race, pool, and cleanup means you own the composition boundary. + +--- + +## `run.fallback` -- primary, secondary, type-safe + +```ts +const callWithFallback = run.fallback( + run.retry(callProvider, 3), + callBackupProvider, +); +``` + +Primary fails (after retries) -> secondary runs. Same `ctx.signal`. Same scope. Same cancel reason if the parent stops. No nested `try/catch`. No "did I forget to await the fallback" Slack message at 2 a.m. + +--- + +## `run.supervise` -- restart policy for long-lived work + +`run.retry` is for one operation that may fail transiently and then succeed. `run.supervise` is for a **long-lived task** -- a heartbeat, a queue consumer, a connection watcher, an agent keep-alive -- that may need restart semantics with bounded backoff. 
+ +```ts +import { run } from "@workit/core"; + +const result = await run.supervise(async () => { + attempts++; + if (attempts < 3) throw new Error("transient worker failure"); + return "stable"; +}, { + restartOn: "error", + maxRestarts: 3, + backoff: () => 1, +}); +``` + +```ts +// samples/supervision.sample.js -- asserted in CI: +// result === "stable" +// attempts === 3 +``` + +The supervised body fails twice, restarts each time under the policy, and stabilises on the third attempt. The parent scope can still cancel everything at once, and the cancel reason carries down through the supervision wrapper. Restart policies cap at `maxRestarts` per `resetWindow` so a permanently broken body doesn't infinite-loop. + +```sh +npm run sample:supervise +``` + +The decision rule: use `run.retry` for a single call that can hiccup; use `run.supervise` for a process that should keep running. + +--- + +## `run.hedge` -- bounded speculative requests + +```ts +const ranked = await run.hedge( + (ctx) => reranker.rank(question, sources, { signal: ctx.signal }), + { after: "2s", max: 2 }, +); +``` + +If the first call hasn't returned in 2 seconds, fire a second one. First success wins; the rest cancel. Bounded by `max`, this is a measured way to reduce tail latency without paying for every speculative fan-out. + +> **Bench [`06-hedge-tied-requests.mjs`](../benchmarks/articles/06-hedge-tied-requests.mjs).** Two scenarios, opts `{ after: "50ms", max: 3 }`. +> +> | Scenario | Body latency | Attempts fired (timestamps) | Winner | Losers cancelled | Cancel reason | +> |---|---|---|---|---|---| +> | slow | 200 ms | 3 (t=2 ms, 62 ms, 107 ms) | id=1 at 217 ms | 2 | `race_lost` | +> | fast | 30 ms | **1** (no hedge fired) | id=1 at 31 ms | 0 | n/a | + +The fast path doesn't pay for hedging at all. The slow path bounded by `max`. Every loser tagged with `race_lost`. + +--- + +## Side-by-side -- who actually cancels + +```ts +// 3 tasks. B fails at 30 ms. A succeeds at 50 ms. 
C succeeds at 100 ms. + +await Promise.all([A, B, C]); // rejects at 30 ms. A and C keep running for ~16/63 ms. +await Promise.race([A, B, C]); // rejects at 30 ms. A and C keep running. +await Promise.any([A, B, C]); // resolves at 50 ms. C keeps running for ~44 ms. + +await run.all([A, B, C]); // rejects at 30 ms. A and C cancelled in 1 ms, defer ran. +await run.race([A, B, C]); // rejects at 30 ms. A and C cancelled in 0-1 ms. +await run.any([A, B, C]); // resolves at 50 ms. C cancelled in 0 ms, defer ran. +``` + +Same shape. Different contract. Native primitives return a value. WorkIt primitives own the tree underneath the value. + +--- + +## How do other libraries compare + +| Tool | Cancels on sibling failure | Signal-aware retry | Composable timeout (returns `TaskFn`) | Hedged requests | Bundle | +|---|---|---|---|---|---| +| **WorkIt** | yes | yes | yes | yes built-in | **14,175 B / 4,835 B gz** for all nine composables | +| `Promise.all` / `race` / `any` | no | n/a | n/a | n/a | 0 | +| `p-limit` + `p-retry` + `p-timeout` | partial/manual wiring | partial/manual wiring | separate abstractions | no | three deps | +| `RxJS` | yes on `unsubscribe` | partial via operators | yes via operators | no | large | +| Effection | yes structured (generator ops) | yes | yes | no | medium | +| Effect-TS | yes structured (fibers + typed `Cause`) | yes | yes | no | large | + +For simple array processing where nothing else can fail, `p-limit` is fine. For full-stack apps where you want a broader effect or operation model, Effection and Effect-TS are solid. WorkIt's distinction is narrower: structured-concurrency composition without leaving `async`/`await`, with nine composables in one ownership tree and the bundle size shown above. + +--- + +## Receipts + +The WorkIt runtime claims above are verified by either `npm run verify` (the production gate) or `npm run bench:articles` (the side-by-side suite that produced the representative timing tables in this article). 
+ +```sh +npm run bench:articles +# full article suite: 19 passed, 0 failed +``` + +This article cites benches 01-06 from the full suite. Each bench script is **~100 lines, zero external dependencies**, and asserts the WorkIt invariant in-line. Timings are representative captured runs; the assertions guard semantic invariants, not exact milliseconds. The folder has its own `package.json` so the published package's dependency graph stays empty. Read the README at [`benchmarks/articles/`](../benchmarks/articles/README.md) for how the promise-helper baselines stay honest. + +Production-side gates that back the same composables: + +| Claim | Evidence | +|---|---| +| Cancellation safety, all composables | Benches 01-06 plus [`tests/evidence/lifecycle/owned-work.mjs`](../tests/evidence/lifecycle/owned-work.mjs) verify parent cancellation, sibling failure, retry cancellation, race loser cleanup, and owned background work. | +| `run.retry` validation | `RangeError` for `times` <= 0, > 1000, NaN, Infinity, fractional. Identical rejection on numeric and object form. | +| `run.race` / `run.any` loser cleanup | LIFO `defer` blocks observed in test for every loser; outer promise does not resolve until cleanup completes | +| Bundle of all nine composables | Included in 14,175 B min / 4,835 B gzip core-group-import. Tree-shaken if unused. | + +--- + +## What's next + +Nine composables share one engine. Tomorrow we open the engine. + +We'll look at what cooperative cancellation can do, what it cannot do, and where the hard boundary starts. Then we put a CPU spin loop that ignores every signal in front of `offload({ timeout: "200ms" })` and verify that worker termination prevents a late marker file from appearing. The CI gate runs `stat()` on it. + +AbortController cannot preempt a CPU loop. WorkIt cannot change that language boundary, but a worker thread can be terminated by its host. 
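That boundary is easy to demonstrate with no library at all. The sketch below schedules an abort on a 10 ms timer, then holds the event loop with a synchronous loop for 50 ms:

```ts
const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 10); // due at ~10 ms

const start = Date.now();
while (Date.now() - start < 50) {
  // Tight sync loop: the timer callback cannot be delivered in here.
}
const abortedDuringSpin = ctrl.signal.aborted; // still false at t=50 ms

await new Promise((resolve) => setTimeout(resolve, 0)); // yield once
const abortedAfterYield = ctrl.signal.aborted;          // now true
```

The abort is recorded only after the loop finally yields. Moving the loop behind a thread boundary that the host can terminate is the only way to bound it.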
+ +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/02-concurrency-retry-timeout.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/03-cancellation-and-worker-boundaries.md b/articles/03-cancellation-and-worker-boundaries.md new file mode 100644 index 0000000..8425221 --- /dev/null +++ b/articles/03-cancellation-and-worker-boundaries.md @@ -0,0 +1,256 @@ + + +# Cancellation, Cooperation, And Worker Boundaries + +*Last time we showed nine composables that cancel siblings, retry with signal-aware backoff, and hedge tied requests. That's cooperative cancellation -- it works when the body checks the signal and the I/O it makes is signal-aware. This article answers the hard question: what happens when code does not cooperate?* + +Drop this in a worker module: + +```js +// benchmarks/articles/lib/spinner.mjs -- ignores every signal you throw at it +import { writeFileSync } from "node:fs"; + +export function spin({ durationMs, markerPath }) { + const start = Date.now(); + while (Date.now() - start < durationMs) { + Math.sqrt(Math.random() * 1e6); + } + writeFileSync(markerPath, "late-marker-written-by-worker"); + return { completed: true, elapsedMs: Date.now() - start }; +} +``` + +This is the canonical "non-cooperative" body. Tight loop, no `await`, no `signal.aborted` check. In a single-threaded JS runtime, cooperative cancellation cannot stop it from the outside. `AbortController` can record cancellation, but the callback cannot run until the event loop yields. + +```ts +import { offload } from "@workit/core/worker"; +import { run } from "@workit/core"; + +await run.scope(async (scope) => scope.spawn( + offload(spinnerURL, "spin", { durationMs: 5_000, markerPath }, { timeout: "200ms" }), +)); +``` + +200 ms later: `TimeoutError`. The worker thread is terminated by the host. 
The late-marker file **does not exist** on disk. We `stat()` for it in CI and fail the gate if it does. + +That's the only honest answer to "can JS forcibly stop work". The answer is *no on the main thread* -- but you can move the work to a worker and have the host kill the thread. + +WorkIt has both layers. They are labeled honestly. + +> **Bench [`07-worker-hard-kill.mjs`](../benchmarks/articles/07-worker-hard-kill.mjs).** 5,000 ms spin loop. 200 ms timeout. Late-marker file written *after* the loop completes. Timings are representative; the invariant is the marker-file result. +> +> | Implementation | Settled at | Late-marker file on disk | +> |---|---|---| +> | Main-thread `AbortController.abort()` | t=5,001 ms (full duration) | **yes exists** -- the abort callback never even fired; the event loop was starved | +> | `offload({ timeout: "200ms" })` | t=206 ms with `TimeoutError` | **no does not exist** -- and stays absent through an 800 ms grace window | + +The native baseline is the receipt for "AbortController cannot preempt a CPU loop." The body completed all 5 seconds and wrote the marker, and the `setTimeout` that was supposed to fire `controller.abort()` at 200 ms could not be delivered because the event loop never returned. The abort callback never ran. + +--- + +## Layer 1 -- Cooperative cancellation, in-process + +```ts +async function callTool(input, ctx) { + ctx.signal.throwIfAborted(); // explicit checkpoint + const res = await fetch(url, { signal: ctx.signal }); // signal-aware + ctx.signal.throwIfAborted(); // checkpoint after I/O + return res.json(); +} +``` + +Cooperative cancellation works when the body checks the signal at safe points and the I/O it makes is signal-aware. WorkIt handles the checkpoints automatically through `await` boundaries inside `run.retry`, `run.timeout`, `run.race`, and the `work()` builder. You just have to thread `ctx.signal` into the I/O calls. 
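A minimal sketch of that cooperative shape, with illustrative names only: a chunked CPU job that treats each chunk boundary as a safe point and yields so abort callbacks can actually run:

```ts
async function checkpointedSum(n: number, signal: AbortSignal): Promise<number> {
  let total = 0;
  const chunk = 1_000;
  for (let i = 0; i < n; i += chunk) {
    signal.throwIfAborted(); // safe point between chunks
    for (let j = i; j < Math.min(i + chunk, n); j++) total += j;
    // Yield a macrotask so pending abort callbacks can be delivered.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return total;
}

const total = await checkpointedSum(10_000, new AbortController().signal);

// An already-cancelled signal stops the job at the first checkpoint.
const cancelled = new AbortController();
cancelled.abort();
let threw = false;
try {
  await checkpointedSum(10_000, cancelled.signal);
} catch {
  threw = true;
}
```

The chunk size is the latency knob: cancellation can only land at a checkpoint, so smaller chunks mean faster response to abort at the cost of more yields.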
+ +This works for 95% of code: HTTP, database, filesystem, streams, child processes, sleeps, channel sends. They all take an `AbortSignal`. + +It does not work for the 5% that does CPU loops, sync `crypto`, sync `JSON.parse` of a 200 MB string, or a fitness test in a genetic algorithm. + +For that 5%, you need Layer 2. + +--- + +## Layer 2 -- Hard kill at the worker boundary + +```ts +import { offload } from "@workit/core/worker"; + +const transcoded = offload( + new URL("./ffmpeg-transcode.js", import.meta.url), + "transcode", + { input, format: "webm" }, + { timeout: "30s" }, +); + +await run.scope(async (scope) => scope.spawn(transcoded)); +``` + +`offload(...)` returns a `TaskFn`. Spawn it into a scope. The named export runs in a Worker thread. When the timeout fires the worker is **terminated** -- not signalled, not asked nicely. The host process keeps running. The promise rejects with `TimeoutError`. If the parent scope cancels first, the worker is terminated with a `CancellationError` carrying the parent's reason. + +What `offload` accepts: + +- Local file URLs (`new URL("./mod.js", import.meta.url)`). +- Named export only. +- Structured-cloneable input: primitives, arrays, plain objects, `Map`, `Set`, `Date`, `RegExp`, `ArrayBuffer`, `SharedArrayBuffer`, typed array views. + +What `offload` rejects, before the worker spins up: + +- Remote and inline URL schemes (`https:`, `data:`, `blob:`). +- Path traversal segments. +- Functions, symbols, class instances, custom-prototype objects -- including buried inside `Map` values, `Set` members, or cycles. + +The worker boundary is covered by unit tests and by [`tests/evidence/security/worker-boundary.mjs`](../tests/evidence/security/worker-boundary.mjs). The two interesting subtleties: `Object.create(null)` is accepted (a null-prototype object is "plain enough"), and a class with a clean-looking shape is rejected at deep walk because the prototype check runs on the cloneable graph, not just the top level. 
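You can probe part of that boundary with the platform's own clone algorithm. This probe is illustrative, not WorkIt's validator -- and its last case shows why a probe alone is not enough:

```ts
// Ask the host whether a value survives structured clone.
function isCloneable(value: unknown): boolean {
  try {
    structuredClone(value);
    return true;
  } catch {
    return false;
  }
}

class Job {
  run() { return 1; }
}

const plainData = isCloneable(new Map([["k", new Set([1, 2])]])); // true
const fn = isCloneable(() => 1);                                  // false: DataCloneError
const instance = isCloneable(new Job());                          // true -- but lossy
```

`structuredClone` quietly accepts the class instance and drops its prototype. That silent corruption is exactly what a deep prototype walk refuses up front, before the worker ever spins up.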
+ +### Worker offload -- the happy path + +The hard-kill is the headline, but the everyday use of `offload` is mundane CPU work. The repo ships a sample that runs two Fibonacci computations on real worker threads through `run.pool`: + +```ts +// samples/worker-offload.sample.js +const moduleURL = new URL("./cpu-worker.sample-worker.js", import.meta.url); + +const results = await run.pool(2, [ + offload(moduleURL, "fibonacci", 20), + offload(moduleURL, "fibonacci", 21), +]); + +// Asserted by the sample: +// results.map(r => r.value) === [6_765, 10_946] +// results.every(r => r.threadId > 0) +``` + +Different OS thread per task. Both results returned. No `try/catch` around `Worker`. No `parentPort` plumbing. Just `offload(modURL, "fnName", input)` composed through the same `run.pool` you saw in article 02. The same primitive that terminates a CPU spinner at the worker boundary is also the one you use to take a heavy sync transform off the event loop. + +```sh +npm run sample:worker +``` + +--- + +## The shield: `run.uncancellable` + +Some code must run to completion even when the parent scope is being cancelled. Database commit. Stripe webhook receipt. Distributed lock release. Audit log flush. + +```ts +import { run } from "@workit/core"; + +const commit = run.uncancellable(async (ctx) => { + await db.commit({ signal: ctx.signal }); + await flushReceipt({ signal: ctx.signal }); +}, { timeout: "2s" }); + +await run.scope(async (scope) => scope.spawn(commit)); +``` + +Inside the shielded body, `ctx.signal` is a fresh signal local to the shield -- the parent's cancel does not propagate in. The shield has its own bounded lifetime (`timeout: "2s"`). When the shield finishes, if the parent had cancelled during the shield, the original `CancellationError` rethrows after the body completes. **Cancellation is delayed, not hidden.** + +What this is not: `run.uncancellable` is **cooperative**. It cannot stop a non-cooperative CPU loop inside the shielded body. 
For that, use `offload`.
+
+> **Bench [`08-uncancellable-shield.mjs`](../benchmarks/articles/08-uncancellable-shield.mjs).** Three scenarios -- measured.
+>
+> | Scenario | What we measure | Result |
+> |---|---|---|
+> | A. Parent cancel mid-body | Body started t=1 ms, parent cancelled at t=41 ms, body sleeping 120 ms | Body completed naturally at t=136 ms (**outlived cancel by 95 ms**), `bodyObservedAbort: false`, outer settled `cancelled` with `reason.kind === "manual"` |
+> | B. Shield timeout while body runs | Shield `{ timeout: "100ms" }`, body sleeping 2,000 ms | Body **observed abort** at ~100 ms, `bodyAbortReasonClass === "TimeoutError"`, outer settled `TimeoutError` |
+> | C. Nested shields, outer scope cancels | Inner sleep 80 ms, outer scope cancels at t=20 ms | Inner completed at t=92 ms, outer-shield body completed at t=92 ms, outer settled `cancelled` at t=93 ms with `reason.kind === "manual"` -- preserved through both shields |
+
+A is the "delayed cancel" contract. B is "the shield is bounded by its own timeout, which the body sees as a `TimeoutError` on its local signal". C is "nested shields don't lose the outer cancel reason."
+
+---
+
+## Cancellation reasons are typed, not strings
+
+```ts
+type CancelReason =
+  | { kind: "user"; message: string }
+  | { kind: "deadline"; deadlineAt: number; elapsedMs: number }
+  | { kind: "timeout"; timeoutMs: number }
+  | { kind: "parent_failed"; error: unknown }
+  | { kind: "sibling_failed"; siblingId: TaskId; error: unknown }
+  | { kind: "race_lost"; winnerId: TaskId }
+  | { kind: "budget"; budgetKey: string; limit: number; spent: number }
+  | { kind: "scope_ended" }
+  | { kind: "manual"; tag: string; data: unknown };
+```
+
+Every cancellation in WorkIt carries one of these. You can pivot a metric on `cancelReason.kind`. You can route a runbook on `tag`. You can build a "why did my agent stop" dashboard with nine buckets and an exhaustive `switch`. TypeScript will tell you when you forgot a case.
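The exhaustive `switch` looks like this. `routeCancel` is a hypothetical consumer, not library code; the union is copied from the article's definition above:

```ts
type TaskId = string;

type CancelReason =
  | { kind: "user"; message: string }
  | { kind: "deadline"; deadlineAt: number; elapsedMs: number }
  | { kind: "timeout"; timeoutMs: number }
  | { kind: "parent_failed"; error: unknown }
  | { kind: "sibling_failed"; siblingId: TaskId; error: unknown }
  | { kind: "race_lost"; winnerId: TaskId }
  | { kind: "budget"; budgetKey: string; limit: number; spent: number }
  | { kind: "scope_ended" }
  | { kind: "manual"; tag: string; data: unknown };

// Map every cancellation to a metric bucket. Deleting a case breaks the
// `never` assignment below at compile time.
function routeCancel(reason: CancelReason): string {
  switch (reason.kind) {
    case "user": return "stopped_by_user";
    case "deadline": return "deadline";
    case "timeout": return "timeout";
    case "parent_failed": return "parent_failed";
    case "sibling_failed": return "sibling_failed";
    case "race_lost": return `lost_to_${reason.winnerId}`;
    case "budget": return `budget_${reason.budgetKey}`;
    case "scope_ended": return "scope_ended";
    case "manual": return `manual_${reason.tag}`;
    default: {
      const unreachable: never = reason;
      throw new Error(`unhandled reason: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Each branch also narrows the payload: `winnerId` is only reachable under `"race_lost"`, `budgetKey` only under `"budget"`, so the dashboard code cannot read a field the reason does not carry.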
+ +Compare: + +```ts +controller.abort("user_clicked_stop"); // string. lossy. arbitrary. +controller.abort(new DOMException("...", "AbortError")); // class with "Abort" name. that's it. +``` + +`AbortSignal.reason` was a stringly-typed escape hatch that won. WorkIt closes it with a discriminated union and tests that every `kind` is exercised in the suite. + +--- + +## How do other libraries handle non-cooperative work + +| Library | Cooperative cancellation | Hard kill (CPU loops) | Mechanism | +|---|---|---|---| +| **WorkIt** | yes signal-aware | yes built-in | `offload({ timeout })` terminates the worker thread | +| Effection | yes generator ops | no | bring your own worker | +| Effect-TS | yes fibers | no | bring your own worker | +| Native `AbortController` | yes | no | the event loop is single-threaded | + +If you need to kill a sync CPU loop today and you're not on WorkIt, you're hand-rolling worker management -- module URL validation, structured-clone classification, timeout-driven termination, parent-cancel propagation, error propagation back to the host. WorkIt's `offload` is ~50 lines of public surface and the runtime contract is in CI. + +--- + +## Receipts + +Two layers. Two benches. One evidence path per claim. + +```sh +node benchmarks/articles/07-worker-hard-kill.mjs # main-thread vs offload +node benchmarks/articles/08-uncancellable-shield.mjs # 3 shield contracts +node benchmarks/articles/run-all.mjs # full article suite +``` + +Production-side gates that back the same contracts: + +| Claim | Evidence | +|---|---| +| Worker hard-kill on CPU spinner | [`07-worker-hard-kill.mjs`](../benchmarks/articles/07-worker-hard-kill.mjs) runs `offload({ timeout: "200ms" })` against the spinner module, asserts bounded rejection, and verifies the late-marker file does not exist. 
| +| Worker hard-kill on parent cancel | [`tests/evidence/security/worker-boundary.mjs`](../tests/evidence/security/worker-boundary.mjs) verifies parent cancellation terminates worker-owned CPU work. | +| 5 concurrent offloads | Worker unit coverage exercises mixed fast and spinning workers without cross-talk between results. | +| Input validation | [`tests/evidence/security/worker-boundary.mjs`](../tests/evidence/security/worker-boundary.mjs) verifies remote and executable worker URLs are rejected; unit coverage exercises structured-clone classification. | +| `run.uncancellable` semantics | [`08-uncancellable-shield.mjs`](../benchmarks/articles/08-uncancellable-shield.mjs) covers parent cancel during body, shield timeout, nested shields, signal isolation, and reason preservation. | +| `CancelReason.kind` coverage | Every kind in the discriminated union has at least one tracked test that produces it. | + +The ergonomic version of cooperative cancellation: + +```ts +await sleep(ms, ctx.signal); // signal-aware sleep +await fetch(url, { signal: ctx.signal }); // signal-aware fetch +``` + +The ergonomic version of hard cancellation: + +```ts +await scope.spawn(offload(modUrl, "fn", input, { timeout: "Xs" })); +``` + +That's the API. Two layers. Honest labels. + +--- + +## What's coming + +Tomorrow: backpressure. + +You're consuming a billion-row source you'll never materialize. You want to read 25 from the front, run them through a 16-wide map, and have the producer pause when the consumer can't keep up. You want a transcription stream that exits cleanly when the user closes the tab. You want CSP-style channels for the part of your pipeline that's actually a pipeline. + +The slow-consumer memory gate runs **a million items** through a paused consumer in CI and asserts the heap doesn't move. That's the next bench. 
+ +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/03-cancellation-and-worker-boundaries.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/04-backpressure-for-streaming-pipelines.md b/articles/04-backpressure-for-streaming-pipelines.md new file mode 100644 index 0000000..8054575 --- /dev/null +++ b/articles/04-backpressure-for-streaming-pipelines.md @@ -0,0 +1,305 @@ + + +# Backpressure For Streaming Pipelines + +*Last time we showed how to terminate non-cooperative CPU work at the worker boundary. This article stays cooperative but adds the missing piece: backpressure, the runtime contract that lets a producer pause the moment the consumer can't keep up.* + +A RAG ingest pipeline has a billion candidate documents. You only need the 25 that match a downstream filter. A naive promise collection can materialize far more work than the consumer needs; a hand-rolled async iterator can still fill a prefetch buffer before the first result arrives. With WorkIt: + +```ts +import { work } from "@workit/core"; + +async function* billionDocuments() { + for (let i = 0; i < 1_000_000_000; i++) yield { id: i, text: `doc ${i}` }; +} + +const results = []; +for await (const processed of work(billionDocuments()) + .inParallel(16) + .map(async (doc, ctx) => enrich(doc, { signal: ctx.signal })) + .stream()) { + results.push(processed); + if (results.length === 25) break; +} +``` + +Two things to notice: + +- **`work()` accepts an async iterable directly.** No `.from()`, no `Readable.from(...)` shim. The signature is `Iterable | AsyncIterable -> WorkBuilder`. +- **`.map().stream()` is the streaming pipeline form.** `.do(fn)` returns a `Promise>` (full batch result). `.map(fn)` returns a new builder; `.stream()` on a builder returns an `AsyncIterable` that respects backpressure. 
Both terminals exist; you pick by what the consumer is doing.
+
+What the producer actually does:
+
+> **Bench [`09-stream-1b-lazy.mjs`](../benchmarks/articles/09-stream-1b-lazy.mjs).** 1,000,000,000-row generator. `inParallel(16)`. Consumer takes 25, breaks.
+>
+> | Implementation | Consumed | **Items pulled from the generator** | maxActive | In-flight after break |
+> |---|---|---|---|---|
+> | Naïve eager prefetch buffer (256-deep) | 25 | **281** | 1 | 0 (all left to settle) |
+> | `work().inParallel(16).map().stream()` | 25 | **40** | 1 | 0 (cancelled at break) |
+
+These are representative captured values. The bench `assert`s the invariant: produced items stay bounded by `TAKE + CONCURRENCY`. The naïve baseline pulled 281 items because once the prefetch buffer is full it doesn't pause the producer -- it pauses the worker pool, which is a different question.
+
+That's **backpressure**: the producer pauses when the consumer slows down or stops, not when the worker pool fills.
+
+---
+
+## `work().stream()` -- bounded, lazy, cancellable
+
+```ts
+for await (const summary of work(documents)
+  .inParallel(8)
+  .withRetry(2)
+  .withTimeout("15s")
+  .map(async (doc, ctx) => summarize(doc, { signal: ctx.signal }))
+  .stream()) {
+  ui.append(summary);
+}
+```
+
+Properties the runtime guarantees:
+
+- **`inParallel(N)` is a hard cap.** `maxActive` never exceeds `N`. Property test runs 1..20 wide x 1..100 items, asserts the cap holds across every shape.
+- **`stream()` is lazy.** The producer iterator pulls only when an inflight slot is free.
+- **`break` is cancellation.** The remaining inflight tasks abort with `CancelReason { kind: "manual", tag: "stream_consumer_closed" }`. Their `ctx.defer` runs. The producer iterator's `return()` runs.
+- **A throw inside the body** triggers `CancelReason { kind: "manual", tag: "stream_failed" }` for siblings -- typed, distinguishable from the consumer-break path on a dashboard.
+- **Slow consumer pauses producer.** Tracked under `check:stream-memory`: 1,000,000 logical items, slow consumer, bounded heap growth, and no unbounded producer advance. + +> **Bench [`10-stream-slow-consumer.mjs`](../benchmarks/articles/10-stream-slow-consumer.mjs).** 5,000-item source, `inParallel(16)`, consumer ~5 ms per item, take 200. +> +> | Metric | Value | +> |---|---| +> | Consumed | 200 | +> | Produced | **215** | +> | Producer overshoot | **15** (bound: `CONCURRENCY + 1` = 17) | +> | maxActive | 1 | +> | In-flight after break | 0 | +> | Wall time | ~3,108 ms | + +The interesting detail: even with `inParallel(16)`, `maxActive` stayed at 1 because the consumer was the bottleneck. The runtime didn't speculatively saturate the worker pool -- it paced the producer to consumer demand. That is what "backpressure" actually means. A pool that always runs at capacity isn't backpressure; it's a pool. + +### Streaming map: stop after 12, produce only what demand requires + +The most practical reader-facing form of the same property -- a real summarizer pipeline, the size of a real prompt: + +```ts +// samples/streaming-summarizer.sample.js +const TAKE = 12; +const CONCURRENCY = 5; + +for await (const summary of work(documents()) + .inParallel(CONCURRENCY) + .withRetry(2) + .withTimeout("500ms") + .map(async (doc, ctx) => `summary:${doc.id}`) + .stream()) { + summaries.push(summary); + if (summaries.length === TAKE) break; +} + +// Asserted by the sample: +// summaries.length === TAKE +// produced <= TAKE + CONCURRENCY - 1 +// maxActive === CONCURRENCY +// active === 0 // all in-flight cancelled cleanly on break +``` + +50-doc generator. Consume 12. Producer never advances past 16. Concurrency cap exact. Active count zero after `break`. Retry and timeout policy attached without breaking the pull cadence. 
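That pull cadence can be sketched without the library. The following is a minimal demand-driven, concurrency-capped streaming map -- `mapStream` is a hypothetical standalone helper, not WorkIt's implementation, and it drops the typed cancel reasons and retry policy the real builder adds:

```typescript
// Sketch only: a concurrency-capped, pull-driven streaming map.
// The source is advanced only while an in-flight slot is free, so the
// producer never runs past consumer demand plus the concurrency window.
async function* mapStream<T, R>(
  source: Iterable<T> | AsyncIterable<T>,
  concurrency: number,
  fn: (item: T) => Promise<R>,
): AsyncGenerator<R> {
  const it =
    Symbol.asyncIterator in Object(source)
      ? (source as AsyncIterable<T>)[Symbol.asyncIterator]()
      : (async function* () { yield* source as Iterable<T>; })();
  const inflight: Promise<R>[] = [];
  let done = false;
  try {
    while (true) {
      // Pull from the producer only while a slot is free.
      while (!done && inflight.length < concurrency) {
        const next = await it.next();
        if (next.done) { done = true; break; }
        inflight.push(fn(next.value));
      }
      if (inflight.length === 0) return;
      yield await inflight.shift()!; // FIFO: results come back in order
    }
  } finally {
    await it.return?.(); // close the producer on consumer break or throw
  }
}
```

On `break`, the generator's `finally` closes the producer; pending promises are simply abandoned here, whereas WorkIt additionally cancels them with a typed reason and runs their deferred cleanup.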
+ +```sh +npm run sample:stream +``` + +--- + +## Defaults that don't surprise + +| Setting | Default | Why | +|---|---|---| +| `inParallel` | `1` (sequential) | Auto-concurrency surprises rate-limited APIs. Sequential is correct. | +| `withRetry` | none | Retrying non-idempotent ops silently is a footgun. | +| `withTimeout` | none | Cancelling work the user didn't ask to cancel is worse than no timeout. | +| `onError` | `"fail"` | Matches `Promise.all` intuition. The discriminated `WorkOutput` return type forces explicit handling on the others. | + +You opt **into** resilience. Nothing is implicit. + +--- + +## CSP-style channels -- `@workit/core/channel` + +`work().stream()` is the right shape when the producer-consumer relationship is one fluent pipeline. When the producer and consumer are independent tasks running side by side -- fan-in, fan-out, work-queue -- you want a channel. + +```ts +import { createChannel } from "@workit/core/channel"; +import { group } from "@workit/core"; + +const orders = createChannel({ capacity: 100 }); + +await group(async (task) => { + task(async (ctx) => { + for await (const o of orderSource()) { + await orders.send(o, { signal: ctx.signal }); + } + orders.close(); + }); + + task(async (ctx) => { + for await (const o of orders) { + await processOrder(o, { signal: ctx.signal }); + } + }); +}); +``` + +Channel contract, all five rows verified by [`11-channel-contract.mjs`](../benchmarks/articles/11-channel-contract.mjs): + +| # | Scenario | Bench observation | +|---|---|---| +| A | `send` blocks when the channel is full | On a `capacity: 2` channel, the third `send` is still pending after a microtask turn and completes only after a `receive` frees a slot | +| B | `close()` drains buffered values | `[1, 2, 3]` delivered, then iteration ended cleanly | +| C | Pending `send` after `close(reason)` rejects | `ChannelClosedError` with `reason: { tag: "shutdown" }` | +| D | A `signal` cancels a pending `receive` | Pending receive rejects 
when the controller aborts | +| E | Capacity validation | `0`, `-1`, `0.5`, `NaN`, `Infinity` all rejected with `RangeError` at `createChannel(...)` | + +**Cancellation composes with the parent scope.** If the consumer task throws inside `group`, sibling cancellation aborts the producer's pending `send`. The producer's `for await` exits cleanly through the rejection. No orphaned sends, no leaked consumers, no half-drained buffer. + +This is Go's `chan` with structured-concurrency parents. Kotlin's `Channel` without coroutines. It fills the gap between "raw async iterator" and "RxJS observable" for owned producer-consumer work. + +--- + +## Bad-batch bisection -- one rotten document doesn't poison the embedding + +A real RAG pipeline failure mode: the provider returns 400 for a mixed batch because **one** of the documents is malformed. With `Promise.all`, the whole batch fails, the budget is spent on nothing, and the next 99 documents get re-embedded on retry. + +WorkIt ships `embedAllBisection` that splits the failed batch and recovers the good vectors: + +```ts +// samples/embed-bisection.sample.js +const result = await group( + async () => embedAllBisection(["alpha", "bad-doc", "gamma"], { + async embedBatch(inputs) { + if (inputs.includes("bad-doc")) throw new BadBatchError("provider rejected mixed batch"); + return inputs.map((input) => [input.length]); + }, + }, { + batchSize: 3, + onError: "continue", + countTokens: (input) => input.length, + }), + { context } +); + +// Asserted by the sample: +// result.results contains the vectors for "alpha" and "gamma" +// result.errors contains exactly one entry pointing at "bad-doc" +// tokensSpent reflects only the successful work +``` + +`BadBatchError` is the contract. Throw it from `embedBatch` and the helper bisects: split the batch in halves, retry each half, isolate the rotten document, keep the good vectors. 
Token budget accounting follows the actual successful work -- you don't pay for the failed mixed batch twice. + +```sh +npm run sample:bisection +``` + +This is the difference between "batch job dies at 2 a.m. and the on-call resyncs the warehouse" and "batch job logs the bad ID and keeps going." + +--- + +## Streaming STT with disconnect cleanup (revisited) + +Article 1 showed this. Now you can read the backpressure underneath it: + +```ts +import { transcribeStream } from "@workit/core/ai"; + +for await (const text of transcribeStream(microphone, { + async transcribe(chunk, ctx) { + return provider.transcribe(chunk, { signal: ctx.signal }); + }, +}, { signal: socket.signal })) { + socket.send(text); +} +``` + +When the user closes their laptop: + +1. `socket.signal` aborts. +2. `transcribeStream` propagates the abort to the inflight `transcribe()` body. +3. The provider's HTTP request aborts at the `AbortSignal` boundary. +4. The async generator's `finally` runs, closing the microphone source. +5. The `for await` loop exits. + +Tracked sample: **`sample:stt-disconnect`** -- disconnects mid-second-chunk, asserts the provider was cancelled, the source was closed, and the cancel reason kind is `manual`. 
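Back to the bad-batch helper for a moment: the bisection itself is a short recursion. A standalone sketch of the strategy -- hypothetical `embedWithBisection`, no budgets or concurrency, not the shipped `embedAllBisection`:

```typescript
class BadBatchError extends Error {}

// Sketch of batch bisection: try the whole batch; on a batch-level
// rejection, split in half and recurse until single bad items are isolated.
async function embedWithBisection<T, V>(
  inputs: T[],
  embedBatch: (batch: T[]) => Promise<V[]>,
): Promise<{ results: Map<T, V>; errors: T[] }> {
  const results = new Map<T, V>();
  const errors: T[] = [];
  async function go(batch: T[]): Promise<void> {
    if (batch.length === 0) return;
    try {
      const vectors = await embedBatch(batch);
      batch.forEach((input, i) => results.set(input, vectors[i]));
    } catch (err) {
      if (!(err instanceof BadBatchError)) throw err; // real failures propagate
      if (batch.length === 1) { errors.push(batch[0]); return; } // isolated
      const mid = Math.ceil(batch.length / 2);
      await go(batch.slice(0, mid)); // retry each half independently
      await go(batch.slice(mid));
    }
  }
  await go(inputs);
  return { results, errors };
}
```

For one bad document in a batch of n, isolation costs on the order of log2(n) extra batch calls instead of re-embedding everything on every retry.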
+ +--- + +## How WorkIt's streaming primitives compare + +| Library | Backpressure | Cancellation | Structured concurrency | Note | +|---|---|---|---|---| +| **WorkIt `work().stream()`** | yes producer pauses on consumer | yes via `ctx.signal` and `break` | yes scope-owned | Backpressure between producer and consumer in one pipeline | +| **WorkIt `createChannel`** | yes blocking `send`/`receive` | yes via signal + scope cancel | yes scope-owned | Backpressure between independent tasks | +| Node.js `Readable` stream | yes via `highWaterMark` | partial via `destroy()` | no no scope | No structured cancel propagation | +| RxJS observable | no by default; pressure operators are opt-in | yes on `unsubscribe` | per-subscription, not per-scope | Different model: events, not owned tasks | +| `p-queue` | partial (concurrency limit) | no | no | Bounds in-flight, not producer pull | +| Async generator (raw) | yes pull-based | partial via `return()` | no | No bounded concurrency without manual scaffolding | + +WorkIt's streaming and channel primitives are the only ones in the table that tie backpressure **to ownership** -- cancel the scope, the channel closes, the in-flight work aborts, and cleanup runs. + +--- + +## Receipts + +```sh +node benchmarks/articles/09-stream-1b-lazy.mjs # naive 281 vs WorkIt 40 +node benchmarks/articles/10-stream-slow-consumer.mjs # producer overshoot 15 vs bound 17 +node benchmarks/articles/11-channel-contract.mjs # 5 channel scenarios +node benchmarks/articles/run-all.mjs # full article suite +``` + +Production-side gates that back the same primitives: + +| Claim | Evidence | +|---|---| +| 1 B virtual stream consumed = 25 | `sample:1b` produces <= TAKE+CONCURRENCY items, asserted in CI. Reproduced by [`09-stream-1b-lazy.mjs`](../benchmarks/articles/09-stream-1b-lazy.mjs). | +| 1 M item slow-consumer gate | `check:stream-memory` -- heap growth bounded, max active capped, and producer pull remains demand-limited. 
| +| Channel backpressure on capacity 2 | [`11-channel-contract.mjs`](../benchmarks/articles/11-channel-contract.mjs) verifies the third send blocks until the first receive. | +| Channel close + drain | [`tests/evidence/correctness/runtime-contracts.mjs`](../tests/evidence/correctness/runtime-contracts.mjs) verifies buffered values drain before `done: true`. | +| Channel cancel via signal | Channel contract coverage verifies pending receives reject with the cancel reason. | +| Channel composes with `group()` | Channel contract coverage verifies producer/consumer pipelines deliver values in order. | +| `work().inParallel(N)` cap | Property test (`fast-check`): for any (N, total), `maxActive <= N`. | +| STT disconnect | `sample:stt-disconnect`: provider cancelled, source closed, reason kind = `manual`. | + +Run them: + +```sh +npm run sample:1b +npm run sample:stream +npm run sample:embed100k +npm run sample:bisection +npm run sample:stt-disconnect +``` + +--- + +## What's coming + +Now you have a producer that paces itself to the consumer, a channel that closes when its scope cancels, and a stream that exits cleanly when the user closes the tab. + +Tomorrow we add the next ownership primitive on top: **the budget**. + +A `$0.50` `CostBudget`. A `100,000`-token `OpenAITokens`. A `5`-tool-call `AgentToolCalls`. Atomic across all parallel children. Inheritable through scope context. Shadowed by inner scopes for sub-budgets. Overrun cancels with `CancelReason { kind: "budget" }` and partial results stay. + +The runtime change underneath this is context overlay lookup: 100 `.with()` calls over a 5,000-key context bag moved from tens of milliseconds in the inline clone baseline to well under the 10 ms gate, without changing a line of public API. The bench in the next article shows the representative timing. + +The point is not simply "we have budgets." Many frameworks expose budgets. 
The stronger claim is **budgets that compose with cancellation, race, retry, hedge, fallback, channels, and streams** under one ownership tree. + +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/04-backpressure-for-streaming-pipelines.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/05-resource-safety-and-budgeted-work.md b/articles/05-resource-safety-and-budgeted-work.md new file mode 100644 index 0000000..9565ed9 --- /dev/null +++ b/articles/05-resource-safety-and-budgeted-work.md @@ -0,0 +1,231 @@ + + +# Resource Safety And Budgeted Work + +*Last time we showed backpressure with channels and `work().stream()` -- pausing the producer the moment the consumer slows down. This article puts hard boundaries on cost and guarantees that cleanup always runs, even when the user hits Ctrl-C, the deadline fires, or a sibling throws.* + +Three primitives. One ownership tree. + +- `run.bracket` -- open, use, release. Always release. +- `run.uncancellable` -- short critical sections that survive cancellation. +- Budgets -- hard caps on cost, tokens, or any metric, enforced atomically across parallel work. + +No `try/finally` you'll forget to write. No "did the connection close" post-mortem. + +--- + +## `run.bracket` -- acquire, use, release. Always release. + +```ts +import { run } from "@workit/core"; + +const rows = await run.scope(async (scope) => scope.spawn(run.bracket( + async () => db.connect(), // acquire + async (conn, ctx) => conn.query("select 1", { signal: ctx.signal }), // use + async (conn) => conn.close(), // release + { timeout: "5s" }, // bounded cleanup +))); +``` + +`release` runs **once** on every exit path: success, throw, parent cancel, timeout, sibling failure. The release receives the resource. The release also receives `cleanupCtx.signal` so it can give up if the cleanup itself hangs. 
Nested brackets release LIFO. + +> **Bench [`12-bracket-vs-try-finally.mjs`](../benchmarks/articles/12-bracket-vs-try-finally.mjs).** Five scenarios -- measured. +> +> | # | Scenario | Result | +> |---|---|---| +> | A | Success path | order: `[acquire, use, release:RES-A]`, release ran exactly once | +> | B | `use` throws | order: `[acquire, use, release:RES-B]`, release runs with the resource, error propagates | +> | C | `acquire` throws | order: `[acquire]`, **release does NOT run**, error propagates | +> | D | Parent cancel during `use` | order: `[acquire, use, release:RES-D]`, outer settled `CancellationError` with `kind: "manual"` | +> | E | **Hanging release** | Native `try/finally` with a non-resolving cleanup is **still pending after 250 ms** (would deadlock forever). `run.bracket(..., { timeout: "150ms" })` settles at **t=157 ms** and emits `task:cleanup_timeout`. | + +Bundle cost: **+58 B min, +15 B gzip** on `public-api`. Effectively free. + +**When to use `bracket`** -- anything with a single resource that must be closed exactly once: database connections, file handles, distributed locks, HTTP client sessions, ML model contexts. + +--- + +## `run.uncancellable` -- receipts that always commit + +```ts +const receipt = run.uncancellable(async (ctx) => { + const intent = await stripe.confirmIntent({ signal: ctx.signal }); + await db.recordReceipt({ id: intent.id }, { signal: ctx.signal }); + return intent.receipt_url; +}, { timeout: "2s" }); + +const url = await run.scope(async (scope) => scope.spawn(receipt)); +``` + +The user hits Ctrl-C. The parent scope cancels. The deadline fires. Inside the shielded body, **none of those are visible** -- `ctx.signal` is a fresh signal local to the shield. The body runs to completion (or to its own `timeout`). When the shield finishes, if the parent had cancelled during the shield, the original `CancellationError` rethrows after the body completes. + +Cancellation is **delayed**, not hidden. 
+ +This is the line that lets you write a Stripe webhook handler, a distributed-lock release, or a database commit without relying on ordinary task cancellation to preserve the critical section. **Bench [`08-uncancellable-shield.mjs`](../benchmarks/articles/08-uncancellable-shield.mjs)** (article 03) measured the body running 95 ms past a parent cancel before the original reason was rethrown. + +**When to use `uncancellable`** -- short, critical sections that must finish atomically: Stripe charges, audit log flushes, idempotency-key writes, distributed-lock release. + +> **`bracket` vs `uncancellable` decision rule:** +> +> - Use `run.bracket` when there is a **resource you opened and must close**. Cleanup is the contract; the body is just what runs in between. +> - Use `run.uncancellable` when there is **a critical section that must run to the end** even if the parent cancels. There may be no resource. +> - Use both together when a critical section needs a resource: spawn a `run.bracket` whose `use` body is a `run.uncancellable`. + +`run.uncancellable` is **cooperative**. It cannot stop a non-cooperative CPU loop inside the body. For that, see article 03 -- `offload` with worker termination. + +--- + +## Budgeted Agent Work: A 50-Cent Ceiling + +```ts +import { CostBudget, group, run } from "@workit/core"; + +const answer = await run.context.with( + CostBudget, { spent: 0, limit: 0.50, unit: "USD" }, + () => group(async (task) => reactLoop(task, goal)), +); +``` + +Inside any task body, charge with `ctx.consumeCost`: + +```ts +async function callLLM(ctx, prompt) { + const res = await openai.chat({ messages: [{ role: "user", content: prompt }] }, { signal: ctx.signal }); + ctx.consumeCost(res.usage.total_cost); // throws + cancels owning scope on overrun + return res; +} +``` + +`ctx.consumeCost(amount)` is atomic. Concurrent charges across siblings serialize through the budget cell. 
Overrun throws `BudgetExceededError` and cancels the **owning scope** -- the scope that set the budget, even if the charge happened five levels deeper. The cancel reason is typed: `CancelReason { kind: "budget", budgetKey, limit, spent }`. + +Built-in budgets: + +```ts +import { CostBudget, TokenBudget, OpenAITokens, AgentToolCalls } from "@workit/core"; +``` + +Custom budgets: + +```ts +import { createBudget } from "@workit/core"; +const Anthropic = createBudget("anthropic-tokens", { unit: "tokens" }); +``` + +> **Bench [`13-budget-atomicity-and-cancel.mjs`](../benchmarks/articles/13-budget-atomicity-and-cancel.mjs).** Three rules, measured. +> +> | Rule | Bench observation | +> |---|---| +> | Atomic concurrent charges | 100 sibling tasks each consume 0.01 from a 1.00 cap -> final spent = **1.0000...** exactly | +> | Owning scope cancellation | Budget set at depth 0; overrun attempted at depth 5; outer scope cancelled with `kind: "budget"` | +> | Caller-object immutability | After 0.5 of charges, the caller's input object stays `{ spent: 0, limit: 1, unit: "USD" }` (engine clones); live snapshot reflects the actual spend | + +Three more rules complete the contract (each tracked in the production suite): + +| Rule | Where it's enforced | +|---|---| +| Inner scope can shadow parent budget | Evidence coverage verifies an inner budget cell can charge independently while the outer budget remains unchanged. | +| Live read via `run.context.budget(key)` | Returns a fresh snapshot. Mutating the snapshot does not affect future reads. 
| +| Snapshots are `Readonly` at the type | Consumer cannot mutate; engine routes mutation through `ctx.consume()` only | + +--- + +## 100,000 documents under a token cap + +```ts +import { run, group } from "@workit/core"; +import { OpenAITokens, embedAll } from "@workit/core/ai"; + +await run.context.with( + OpenAITokens, { spent: 0, limit: 1_000_000, unit: "tokens" }, + () => group(() => embedAll(documents, { + concurrency: 32, + countTokens: (doc) => doc.tokens, + async embed(doc, ctx) { + return openai.embed(doc.text, { signal: ctx.signal }); + }, + })), +); +``` + +`embedAll` is a thin helper built on `work().inParallel()` from article 02 plus `ctx.consume(OpenAITokens, count)` per item. Hit the cap mid-stream -> scope cancels with `CancelReason { kind: "budget", limit: 1_000_000, spent: 1_000_000 }`. The 32 inflight embeddings see the abort on their `ctx.signal`. Provider calls that honor the signal cancel at the transport boundary. Partial results return and no additional budget is consumed after the cap. + +Tracked: `sample:embed100k` runs the full 100,000-document pipeline against a deterministic provider fixture in CI. Asserts `maxActive <= concurrency`, `finalBudget.spent === total`, `output.results.length === total`. + +--- + +## The Context Overlay Speedup + +Budgets, cancellation reasons, request scopes, idempotency keys, agent identity, deadlines -- every cross-cutting concern lives in `ContextBag`. The first-pass implementation cloned the underlying `Map` on every `.with()` call. That's quadratic when you have a deep agent stack. + +The fix: an **overlay-based** context. Think of it as a linked list of single-key deltas pointing at the parent bag. `.with(key, value)` returns a child that stores one entry and points at its parent. Lookup walks up the chain. Memory and cost per `.with()` are O(1). + +> **Bench [`14-context-overlay-perf.mjs`](../benchmarks/articles/14-context-overlay-perf.mjs).** 100 `.with()` calls over a 5,000-key bag. 
+> +> | Implementation | Wall time | Per call | +> |---|---|---| +> | Naïve clone-on-`with` (inline baseline) | **32.6 ms** | ~0.33 ms | +> | WorkIt overlay context | **0.011 ms** | ~0.0001 ms | +> +> The representative run shows a large constant-factor improvement over the inline clone baseline. Same public API. Same lookup result. The CI gate `npm run check:context-performance` asserts the overlay completes the workload in **< 10 ms** and the bench additionally asserts the inline baseline is at least 10x slower. + +Evidence coverage verifies a deep shadow chain still resolves correctly and child shadows do not leak into the parent. + +--- + +## How WorkIt compares on resource safety + +| Pattern | Cancel-aware | Cleanup runs on every exit | Bounded cleanup time | Notes | +|---|---|---|---|---| +| **WorkIt `run.bracket`** | yes | yes | yes via `CleanupOpts.timeout` + `task:cleanup_timeout` event | release receives the resource and a cleanup signal | +| **WorkIt `run.uncancellable`** | yes (delayed rethrow) | n/a (it's the body, not a release) | yes via shield `timeout` | for short critical sections, not resource cleanup | +| `try { } finally { }` | no -- finally cannot run after a hard cancel; cannot be bounded | partial -- runs only if the awaiter completes settlement | no -- a hanging cleanup deadlocks | bench 12-E: still pending after 250 ms | +| ES2024 `using` / `await using` | no -- disposal hooks have no signal awareness | yes on scope exit | no -- no timeout | best when the resource has an `[Symbol.dispose]` and no cleanup timeout is required | +| Effect-TS `acquireRelease` | yes | yes | partial (no built-in timeout on release in the public surface) | richer, but inside the Effect DSL | + +WorkIt is the only row that gives you cancel-aware cleanup, guaranteed-to-run release, and a **bounded timeout for the cleanup itself**, surfaced as a typed event. 
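For reference, the overlay structure behind the context speedup fits in a few lines. A sketch of the idea only -- not WorkIt's `ContextBag`, which also handles budget cells and typed keys:

```typescript
// Sketch of an overlay context: every with() allocates one node that points
// at its parent; lookup walks the chain. O(1) per with(), no Map cloning,
// and inner entries shadow outer ones without mutating the parent.
interface OverlayNode {
  readonly key: string;
  readonly value: unknown;
  readonly parent: OverlayNode | null;
}

function withEntry(parent: OverlayNode | null, key: string, value: unknown): OverlayNode {
  return { key, value, parent };
}

function lookup(node: OverlayNode | null, key: string): unknown {
  for (let n = node; n !== null; n = n.parent) {
    if (n.key === key) return n.value; // nearest overlay wins
  }
  return undefined;
}
```

The trade is classic: writes become O(1) and lookups become O(depth), which is the right side of the trade when scopes are deep but each scope touches few keys.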
+ +--- + +## Receipts + +```sh +node benchmarks/articles/12-bracket-vs-try-finally.mjs # 5 bracket scenarios +node benchmarks/articles/13-budget-atomicity-and-cancel.mjs # atomic + owning + immutable +node benchmarks/articles/14-context-overlay-perf.mjs # 32.6 ms vs 0.011 ms +node benchmarks/articles/run-all.mjs # full article suite +``` + +Production-side gates that back the same primitives: + +| Claim | Evidence | +|---|---| +| `run.bracket` scenarios | [`12-bracket-vs-try-finally.mjs`](../benchmarks/articles/12-bracket-vs-try-finally.mjs) covers success, throw, cancel, timeout, hanging cleanup, and bounded release. | +| `run.uncancellable` scenarios | [`08-uncancellable-shield.mjs`](../benchmarks/articles/08-uncancellable-shield.mjs) covers parent cancel during body, shield timeout, nested shields, and signal isolation. | +| Budget atomicity | Property test: 100 concurrent charges of 0.01 -> spent = 1.00 exactly. Reproduced by [`13-budget-atomicity-and-cancel.mjs`](../benchmarks/articles/13-budget-atomicity-and-cancel.mjs). | +| Budget snapshot immutability | [`tests/evidence/correctness/runtime-contracts.mjs`](../tests/evidence/correctness/runtime-contracts.mjs) verifies caller objects remain unchanged and snapshots are read-only views of budget state. | +| Budget owning-scope cancellation | Charge attempted at depth 5 cancels the owning scope at depth 0 with `kind: "budget"`. | +| Context overlay perf | `npm run check:context-performance` asserts < 10 ms; bench records a representative ~0.01 ms run with a large speedup over the inline baseline. | +| 100K embeddings sample | `sample:embed100k`: 100,000 docs, concurrency 32, token budget enforced, in-CI assertion. | + +--- + +## What's coming + +Now you can build an agent that costs 50 cents max, holds a database connection that always closes, and confirms a Stripe charge through a user disconnect. + +Tomorrow: **observability with bounded telemetry cost.** + +`scope.tree()` as a print statement for agents. 
The four-layer cost-control architecture -- sampling, batching, summarization, budgeting -- that takes a 100K-runs/day workload from $9,125/year of CloudWatch ingestion down to **$456/year** while preserving slow/error traces. 20x less data. One config object. + +The headline: a structured-concurrency runtime where observability is **sampled, batched, summarized, and budgeted by default** -- and you opt out of cost protection, not in. + +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/05-resource-safety-and-budgeted-work.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/06-observability-without-core-bloat.md b/articles/06-observability-without-core-bloat.md new file mode 100644 index 0000000..9ce4ce8 --- /dev/null +++ b/articles/06-observability-without-core-bloat.md @@ -0,0 +1,459 @@ + + +# Observability Without Core Bloat + +*Last time we put hard budgets on cost and guaranteed cleanup. This time we make sure you can inspect an agent run without making telemetry volume the default cost center.* + +Take a real number. An agent does 200 tool calls per run. Each tool emits 5 events. That's 1,000 events per run. 100K runs/day = **100 million events/day**. + +At CloudWatch Logs ingestion ($0.50/GB) with 500-byte structured JSON, that is roughly **$9,125/year of telemetry that may not be needed on successful runs.** Datadog APM, Application Insights, and Honeycomb have different pricing models, but the engineering issue is the same: unbounded event volume turns observability into a cost surface. + +WorkIt's core has zero networking imports. + +```sh +$ grep -E "node:http|node:https|fetch" dist/index.js +$ +``` + +That's not a feature claim. That's a gate. 
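In spirit, a gate like that is a recursive grep over the published build output. A sketch -- hypothetical `scanForNetworkImports`, not the actual check script, and the regex is deliberately narrow so URLs in comments or strings don't trip it:

```typescript
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

// Match import/require of http(s) modules, or a fetch(...) call.
const FORBIDDEN =
  /require\(\s*["'](?:node:)?https?["']\s*\)|from\s+["'](?:node:)?https?["']|\bfetch\s*\(/;

// Walk a build tree and collect JS artifacts with a forbidden import.
function scanForNetworkImports(dir: string, excluded: Set<string> = new Set()): string[] {
  const hits: string[] = [];
  for (const name of readdirSync(dir)) {
    if (excluded.has(name)) continue; // e.g. the opt-in otel/worker subpaths
    const path = join(dir, name);
    if (statSync(path).isDirectory()) {
      hits.push(...scanForNetworkImports(path, excluded));
    } else if (/\.(?:js|cjs|mjs)$/.test(name) && FORBIDDEN.test(readFileSync(path, "utf8"))) {
      hits.push(path);
    }
  }
  return hits;
}
```

A CI step then fails the build when the returned list is non-empty.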
+ +> **Bench [`15-core-zero-network.mjs`](../benchmarks/articles/15-core-zero-network.mjs).** Walks the published `dist/` tree (excluding the explicit `observability`, `otel`, and `worker` subpaths), greps every `.js`/`.cjs`/`.mjs` for `node:http`, `node:https`, raw `http`/`https` imports, and `fetch(...)`. +> +> | Metric | Result | +> |---|---| +> | Files scanned | 14 | +> | Forbidden imports found | **0** | +> | Excluded subpaths | `observability`, `otel`, `worker` | +> | `assert.equal(hits.length, 0)` | passed | + +If a PR adds a networking import to core, the production gate `npm run check:no-network` fails. The bench above verifies the same property in the artifact a consumer installs from npm. + +--- + +## Layer 1 -- Local-first by default (cost: $0) + +```ts +import { run, renderTree } from "@workit/core"; + +const result = await run.scope(async (scope) => { + // your agent code + console.log(renderTree(scope.status())); + return await doWork(scope); +}); +``` + +Zero network calls. Zero log lines. Zero telemetry export cost. `scope.status()` returns a snapshot. `renderTree(...)` prints an ASCII tree. That is the built-in observability surface; exporters are opt-in. + +When you do want telemetry, you opt in: + +```ts +import { attachTelemetryExporter } from "@workit/core/observability"; + +const attachment = attachTelemetryExporter(scope, async (event) => { + await otlp.write(event); +}, { + sampling: { mode: "errors_and_slow", slowThresholdMs: 2_000 }, + circuitBreaker: { failureThreshold: 3, openForMs: 60_000 }, + sanitize: (event) => stripPII(event), +}); +``` + +Four words: **sampled, aggregated, budgeted, circuit-broken.** + +--- + +## Layer 2 -- Sampling: errors_and_slow is the production default + +Same workload, same agent. 
With `errors_and_slow` (slow threshold 2 s) and 95% of runs completing fast and successfully:

| Workload | Without sampling | With `errors_and_slow` (2 s) | Reduction |
|---|---|---|---|
| 100K runs/day, 5% slow/errored | 100K x 1,000 ev x 500 B = **50 GB/day** | 5K x 1,000 ev x 500 B = **2.5 GB/day** | **20x** |
| CloudWatch Logs ingestion | $25/day = **$9,125/year** | $1.25/day = **$456/year** | **$8,669/yr saved** |

The debugging signal is preserved where it matters: slow and failing runs. A passing run rarely needs full trace inspection -- you need traces when something breaks or hangs, which is exactly what this policy keeps.

> **Bench [`16-sampling-and-aggregation.mjs`](../benchmarks/articles/16-sampling-and-aggregation.mjs).** 100 root scopes x 5 child tasks each. 5% slow, 2% errored. Both modes attach `attachTelemetryExporter` to the same workload.
>
> | `sampling.mode` | Exported events | Reduction factor |
> |---|---|---|
> | `"all"` | **1,300** | baseline |
> | `"errors_and_slow"` (slowThresholdMs: 55) | **36** | **~36x** |
>
> The bench asserts >= 5x to stay tolerant of jitter. The measured ratio came out higher than the article's nominal 20x because the synthetic workload concentrates slow/errored scopes; the savings table above uses the conservative ratio.

**Sampling modes -- how to choose:**

| Mode | Use case |
|---|---|
| `"off"` | Local dev, high-volume tests -- same shape as Layer 1 ($0) |
| `"errors_and_slow"` | **Production default** -- keep failing and slow traces, drop the rest |
| `"head"` | Random sampling at scope start -- cheap, no buffering, good for high-throughput service tracing |
| `"all"` | Debugging session for one run -- opt-in firehose |

A child scope cannot upgrade itself to "kept" if its root was sampled out. This is the rule that keeps traces causally intact.
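The savings table above is plain arithmetic. A quick sketch to sanity-check the numbers, assuming decimal gigabytes (1e9 bytes), CloudWatch ingestion at $0.50/GB, and the nominal 5% slow/error rate as the kept fraction:

```typescript
// Back-of-envelope check of the savings table above.
const runsPerDay = 100_000;
const eventsPerRun = 1_000;
const bytesPerEvent = 500;
const dollarsPerGb = 0.5; // CloudWatch Logs ingestion

function yearlyIngestCost(keptFraction: number): number {
  const gbPerDay =
    (runsPerDay * keptFraction * eventsPerRun * bytesPerEvent) / 1e9;
  return gbPerDay * dollarsPerGb * 365;
}

console.log(yearlyIngestCost(1)); // prints: 9125 -- every event exported
console.log(yearlyIngestCost(0.05)); // ~456/year -- errors_and_slow keeps ~5%
```

The 20x ratio in the table is just the kept fraction inverted; the bench below measures the same reduction empirically.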
+ +--- + +## Layer 3 -- Aggregation, not enumeration + +By default, an aggregated exporter receives **one summary record per scope**, not per task: + +```ts +interface ScopeSummary { + scopeId: string; + parentId: string | null; + durationMs: number; + outcome: "completed" | "errored" | "cancelled"; + taskCounts: { + started: number; succeeded: number; failed: number; + cancelled: number; retried: number; cleanupFailed: number; + }; + droppedTelemetryEvents: number; +} +``` + +200 tasks succeed -> **1 summary record exported**. Not 200. Not 1,000. + +```ts +import { attachScopeSummaryExporter } from "@workit/core/observability"; + +attachScopeSummaryExporter(scope, async (summary) => { + await otlp.writeAggregate(summary); +}, { /* same circuit breaker / queue / sampling options */ }); +``` + +| Aggregation level | Records per scope | When to use | +|---|---|---| +| Scope summary (default) | 1 per closed scope | Production -- cost-efficient | +| Hybrid (summary + per-task for failures/slow) | summary + N | Investigating a specific failure pattern | +| Per-task firehose | one per event | Short opt-in debug session | + +The summary aggregator is exercised end-to-end by `npm run check:exporter-stress` -- 100,000 events with bounded queue and drop-front under back-pressure. The bench in this folder skips the summary path on purpose: `attachScopeSummaryExporter` needs the `scope:opened` event, which fires before user code can attach inside `run.scope`. Companion packages wire it earlier; the production gate covers it. + +--- + +## Layer 4 -- Telemetry budget (the safety net) + +```ts +import { TelemetryBudget, run } from "@workit/core"; + +await run.context.with( + TelemetryBudget, + { spent: 0, limit: 100_000, unit: "events" }, + () => agentRun(), +); +``` + +The exporter checks the budget before emitting each event. When a scope's event count would exceed the limit, the event is **dropped silently**. The first overrun emits one warning. 
Subsequent overruns are counted into the next scope summary's `droppedTelemetryEvents` field. Tasks **continue executing normally**. + +**Telemetry overrun must never affect application behaviour.** The companion `withOTel` wrap sets a default budget at the wrap-call boundary, so the floor is opt-out, not opt-in. + +--- + +## Cardinality Discipline And Cost Control + +Every exported field is classified bounded or unbounded. Unbounded fields are **never** emitted as metric labels. + +| Field | Bound | In metric labels | +|---|---|---| +| `scope.name` | dev-chosen | yes | +| `task.kind` | 5 fixed values (`io`/`llm`/`tool`/`cpu`/`custom`) | yes | +| `outcome` | 3 values | yes | +| `cancelReason.kind` | 9 values | yes | +| `error.name` | finite by class | yes | +| `attempt` | bounded by retry limit | yes (bucketed) | +| `task.id` | unbounded UUID | **no** | +| `error.message` | unbounded text | **no** | +| `meta.*` | user-controlled | explicit opt-in only | + +Wrap your metric exporter with `createCardinalitySafeMetricExporter` and pass an `allowedLabels` allow-list -- anything outside is rejected at runtime. + +> **Bench [`17-cardinality-safe-metrics.mjs`](../benchmarks/articles/17-cardinality-safe-metrics.mjs).** Five candidate metric points, allow-list `["task.kind", "outcome", "scope.name"]`. 
>
> | Point | Labels | Outcome |
> |---|---|---|
> | 1 | `{ "task.kind": "io", "outcome": "succeeded" }` | emitted |
> | 2 | `{ "task.kind": "llm", "outcome": "failed" }` | emitted |
> | 3 | `{ "task.kind": "tool", "task.id": "uuid-abc" }` | rejected -- `Metric label "task.id" is not in the allowed label set` |
> | 4 | `{ "task.kind": "io", "error.message": "EHOSTUNREACH at 10.0.0.42 retrying" }` | rejected -- `Metric label "error.message" is not in the allowed label set` |
> | 5 | `{ "task.kind": "evil" }` | emitted (label-key check only -- out-of-enum *value* validation is the OTel adapter's job) |

The wrapper rejects unbounded label **keys** at runtime. Out-of-enum value rejection (`taskKind: "evil"` failing in OTel) is enforced by the adapter contract, while the wrapper keeps cardinality control at the label-key boundary.

**Correct usage:**

```ts
task(() => fetch(url), { kind: "io" });
task(() => llm.call(prompt), { kind: "llm" });
task(() => runTool(input), { kind: "tool" });
task(() => heavyCalc(), { kind: "cpu" });
task(() => custom(), { kind: "custom" });
```

---

## Exporter circuit breaker -- the OOM defence

```ts
{
  circuitBreaker: { failureThreshold: 10, openForMs: 5 * 60_000 },
  queue: { maxItems: 10_000, maxBytes: 10 * 1024 * 1024 },
}
```

OTLP backend goes down. The exporter sees N consecutive failures -> opens for `openForMs` -> events drop (counted, not queued). The bounded queue uses **drop-front** when full, so you keep the most recent context. After `openForMs` elapses -> half-open -> trial export -> close on success.

Result: **process memory growth bounded under 50 MB** through 1,000 scopes against a backend that returns 503 for every request. Tracked under `tests/perf/exporter-failure.test.ts`.
This is the feature that prevents the most embarrassing observability incident -- the telemetry agent eating all process memory because the backend is unreachable. Datadog had this. New Relic had this. We designed it out.

---

## `scope.tree()` -- the print statement for agents

```
agent-run
|- ok planLLM (243ms)
|- retry fetchTool (attempt 2/3)
|  `- [running] retry-delay (120ms)
|- [running] summarize (running, 45% -- "embedding chunks")
`- failed auditLog (TimeoutError)

5 tasks · 1 ok · 1 failed · 2 running · 1 retrying · deadline in 12s
```

Print this when something breaks. Print this in your test runner. Print this from a SIGUSR1 dump.

| Marker | Meaning |
|---|---|
| `ok` | succeeded (durationMs) |
| `failed` | failed (error.name) |
| `cancelled` | cancelled (reason.kind) |
| `[running]` | running (elapsed -- message) |
| `retry` | retrying (attempt N/total) |
| `pending` | pending |

```ts
import { renderTree } from "@workit/core";
console.log(renderTree(scope.status()));
```

### Progress is a typed event, not a log line

Inside any task body, `ctx.report({ pct, message, data })` emits a typed `task:progress` event tagged with the task's stable id and name. Your exporter, your dashboard, your test assertion all pivot on the same shape. No `console.log`. No string parsing. No grep.

```ts
// samples/progress-parallel.sample.js
// Excerpt: the run/sleep imports and the taskNames, targetProgress, and
// maxActive bookkeeping are set up earlier in the sample (elided here).
const TARGET = "embed.batch.7";

await run.scope(async (scope) => {
  scope.onEvent((event) => {
    if (event.type === "task:progress" && taskNames.get(event.taskId) === TARGET) {
      targetProgress.push({ pct: event.pct, message: event.message });
    }
  });

  const handles = Array.from({ length: 16 }, (_, i) => scope.spawn(async (ctx) => {
    if (i === 7) {
      for (const step of [1, 2, 3, 4]) {
        ctx.report({ pct: step / 4, message: `chunk-${step}` });
        await sleep(1, ctx.signal);
      }
    } else { await sleep(8, ctx.signal); }
  }, { name: `embed.batch.${i}`, kind: "llm" }));

  return await Promise.all(handles);
});

// Asserted by the sample:
// targetProgress.map(e => e.pct) === [0.25, 0.5, 0.75, 1]
// maxActive === 16
```

16 parallel embeddings. One of them is `embed.batch.7`. We filter the event stream to that task's id and watch the progress sequence land -- `[0.25, 0.5, 0.75, 1]`, with the messages `chunk-1`..`chunk-4` attached. The other 15 siblings run concurrently and don't interleave their reports into our channel because every event carries the typed `taskId`.

```sh
npm run sample:progress
```

That's the difference between "tail the log file and hope" and "subscribe to a typed event stream and pivot on the shape that the type system already validated."

### Snapshots: the stable view of a live runtime

The hot runtime exposes **stable snapshots**, not live references. Pull one whenever you need to inspect -- the engine doesn't mutate it, you can hand it to anything (logger, diagnostics, test assertion, JSON wire).

```ts
import type { ScopeSnapshot, TaskSnapshot } from "@workit/core";

const snapshot: ScopeSnapshot = scope.status();

interface ScopeSnapshot {
  id: ScopeId;
  name: string;
  status: "running" | "cancelling" | "closed";
  startedAt: number;
  deadlineAt: number;
  pendingCount: number;
  completedCount: number;
  failedCount: number;
  cancelledCount: number;
  tasks: TaskSnapshot[]; // every task currently owned by this scope
  scopes: ScopeSnapshot[]; // every child scope, recursively
}

interface TaskSnapshot {
  id: TaskId;
  name: string;
  kind: TaskKind; // "io" | "llm" | "tool" | "cpu" | "custom"
  status: "pending" | "running" | "succeeded" | "failed" | "cancelled";
  attempt: number;
  startedAt: number;
  durationMs: number;
  progress: ProgressReport;
  meta: Record<string, unknown>;
}
```

Three properties matter:

- **Snapshot is immutable** -- taking one twice gives you two independent objects. The runtime never mutates one you already hold.
- **Snapshot is recursive** -- `scopes` is the same shape as the root, all the way down. Tools that walk it (the diagnoser, the renderer, your test assertions) share one shape.
- **Snapshot is the only public surface** -- `renderTree` consumes it, `diagnoseSnapshot` consumes it, you consume it. The hot runtime owns the live state; rich tooling lives outside core.

That's the architectural boundary that keeps core small. Anything that wants to inspect the runtime asks for a snapshot.

---

## `@workit/core/diagnostics` -- the stuck-task detector

```ts
import { diagnoseSnapshot } from "@workit/core/diagnostics";

const report = diagnoseSnapshot(scope.status(), {
  staleTaskMs: 30_000,
  events: recentEvents,
});

if (report.findings.length > 0) {
  console.warn("agent stalled:", report.findings);
}
```

Diagnoses live, on demand, against an existing snapshot. Subpath-only so the root runtime stays small.
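The diagnoser works because the snapshot is recursive plain data, and the same property makes ad-hoc inspectors small. A sketch that collects running task names across the whole tree -- types trimmed to the two fields this walker touches, not the full `ScopeSnapshot`/`TaskSnapshot` interfaces:

```typescript
// Ad-hoc inspector over the recursive snapshot shape (a sketch, not part of
// the WorkIt API). Only `status`, `name`, `tasks`, and `scopes` are used.
interface TaskLike { name: string; status: string; }
interface ScopeLike { tasks: TaskLike[]; scopes: ScopeLike[]; }

function runningTaskNames(scope: ScopeLike): string[] {
  return [
    ...scope.tasks.filter((t) => t.status === "running").map((t) => t.name),
    ...scope.scopes.flatMap(runningTaskNames), // recurse into child scopes
  ];
}

// Walking a hand-written snapshot-shaped literal:
const snapshotLike: ScopeLike = {
  tasks: [{ name: "planLLM", status: "succeeded" }],
  scopes: [{ tasks: [{ name: "summarize", status: "running" }], scopes: [] }],
};
console.log(runningTaskNames(snapshotLike).join(",")); // prints: summarize
```

Anything `diagnoseSnapshot` does not cover can be written this way against the same snapshot, without touching live runtime state.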
+ +> **Bench [`18-diagnostics-finding-codes.mjs`](../benchmarks/articles/18-diagnostics-finding-codes.mjs).** Five hand-crafted snapshots, one per finding code. +> +> | Scenario | `report.status` | Finding codes emitted | +> |---|---|---| +> | Healthy snapshot | `ok` | (none) | +> | Task running > 30 s | `needs_attention` | `old_pending_task` | +> | Scope status `cancelling` | `needs_attention` | `scope_cancelling` | +> | Pending child scope | `needs_attention` | `pending_child_scope` + recursive `old_pending_task` | +> | `task:cleanup_timeout` event in window | `needs_attention` | `cleanup_timeout` | + +Wire it to a SIGUSR1 handler in production: + +```ts +process.on("SIGUSR1", () => { + console.error(renderTree(rootScope.status())); + console.error(JSON.stringify(diagnoseSnapshot(rootScope.status(), { staleTaskMs: 30_000 }), null, 2)); +}); +``` + +Something hangs at 3 a.m. -> one signal gives you the live tree and a list of suspicious tasks. No APM. No external service. Stderr. + +--- + +## OpenTelemetry -- opt-in, with optional peer + +```ts +import { attachOpenTelemetry } from "@workit/core/otel"; + +const detach = attachOpenTelemetry(scope, { tracer, meter }); +``` + +`@opentelemetry/api` is an optional peer. The root WorkIt package stays at zero runtime dependencies. Install OTel only when you need it: + +```sh +npm install @opentelemetry/api +``` + +If the peer is missing, `attachOpenTelemetry` throws a message that names exactly the install command -- no cryptic "cannot resolve module" trace. + +--- + +## Three canonical configurations + +```ts +// Local development -- default. No setup needed. +await run.scope(async (scope) => { /* ... */ }); +// Inspect: console.log(renderTree(scope.status())) + +// Production -- sampled, aggregated, budgeted, circuit-broken. 
+await withOTel( + { + exporter: otlpExporter, + sampling: { mode: "errors_and_slow", slowThresholdMs: 2_000 }, + aggregation: "per_scope", + eventBudget: 50_000, + redact: ["context.user.email"], + circuitBreaker: { failureThreshold: 10, openForMs: 300_000 }, + }, + () => agentRun(), +); +// Typical bill, 100K runs/day: ~$456/year. + +// Full trace debugging: one-off, manual, full event stream. +await withOTel( + { exporter: otlpExporter, sampling: { mode: "all" }, aggregation: "per_task" }, + () => problemReproduction(), +); +``` + +--- + +## Receipts + +```sh +node benchmarks/articles/15-core-zero-network.mjs # 0 hits in 14 dist files +node benchmarks/articles/16-sampling-and-aggregation.mjs # 1,300 -> 36 events +node benchmarks/articles/17-cardinality-safe-metrics.mjs # 2 of 5 unbounded labels rejected +node benchmarks/articles/18-diagnostics-finding-codes.mjs # 4 finding codes proven +node benchmarks/articles/run-all.mjs # full article suite +``` + +Production-side gates that back the same surface: + +| Claim | Evidence | +|---|---| +| Core has zero networking imports | Static gate finds no `node:http`/`node:https`/`fetch` in `dist/index.js`. Reproduced by [`15-core-zero-network.mjs`](../benchmarks/articles/15-core-zero-network.mjs) over the full published `dist/` tree minus the explicit network-bridge subpaths. | +| Sampling reduction (`errors_and_slow` @ slowThreshold) | 100 root scopes / 5 children, >= 5x reduction asserted; measured ~36x. Production exporter stress test runs 100,000 events. | +| Aggregation collapses N tasks -> 1 record | `npm run check:exporter-stress` exercises the full summary path with bounded queue. | +| Telemetry budget never throws | Property test: any budget x any event volume -> tasks complete normally. 
| +| Cardinality enforcement at runtime | [`17-cardinality-safe-metrics.mjs`](../benchmarks/articles/17-cardinality-safe-metrics.mjs) verifies unbounded label keys are rejected at the wrapper boundary; adapter coverage owns enum-value validation. | +| Circuit breaker memory bound under 503 backend | `tests/perf/exporter-failure.test.ts`: < 50 MB heap growth across 1,000 scopes with backend down. | +| Diagnostics finding codes | [`18-diagnostics-finding-codes.mjs`](../benchmarks/articles/18-diagnostics-finding-codes.mjs) verifies healthy snapshots, old pending tasks, cancelling scopes, pending child scopes, and cleanup timeout findings. | +| OTel optional peer | Missing peer throws explicit install message; not a cryptic resolver error. | + +--- + +## What's coming + +You now have an agent that holds itself to a budget, releases its connections, and tells you what's happening inside without sending a byte to the cloud unless you ask. + +Tomorrow: the final article. **Agent scopes and tool lifecycles.** + +The same ownership tree that cancels streams and bounds observability also +governs agent tool calls, token budgets, progress events, and replayable +execution logs. The point is not a new agent framework; it is one lifecycle +contract for the agent loop and the rest of the application. + +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/06-observability-without-core-bloat.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/07-agent-scope-and-tool-lifecycles.md b/articles/07-agent-scope-and-tool-lifecycles.md new file mode 100644 index 0000000..ffe2d19 --- /dev/null +++ b/articles/07-agent-scope-and-tool-lifecycles.md @@ -0,0 +1,230 @@ + + +# Agent Scopes And Tool Lifecycles + +*Five articles built the runtime. The sixth made it observable. 
This one introduces the agent primitive: `runAgent` plus `AgentScope`, with budgets, replayable events, and structured cancellation in the box.*

The whole loop:

```ts
import { runAgent, AgentToolCalls, OpenAITokens } from "@workit/core/ai";
import { CostBudget, run } from "@workit/core";

const { result, events } = await runAgent(async (agent, ctx) => {
  const plan = await agent.tool("plan", goal, planLLM,
    { tokens: 600, cost: 0.001, retry: 2 });

  for (const step of plan.steps) {
    await agent.tool(step.tool, step.input, tools[step.tool],
      { tokens: 1_200, toolCalls: 1, timeout: "10s" });
  }

  return await agent.tool("synthesize", workspace, synthesizeLLM,
    { tokens: 2_000, cost: 0.004 });
});
```

That call returns two things -- `result` (whatever the body returned) and `events` (the **complete, ordered, type-discriminated trace** of the run). No external tracing setup. No DSL. The body is plain `async`/`await`, the tools are plain functions, and every `agent.tool(...)` call is a typed primitive whose budget, retry, and timeout policy live in the call site.

This is the practical lifecycle primitive between *"I wired up an LLM call and a tool router"* and *"I can explain, bound, cancel, and replay the run."*

---

## The contract -- `agent.tool(name, input, fn, opts)`

```ts
interface AgentScope {
  readonly id: string;
  readonly events: readonly AgentEvent[];
  tool<I, O>(
    name: string,
    input: I,
    fn: (input: I, ctx: TaskContext) => O | Promise<O>,
    opts: AgentToolOptions,
  ): Promise<O>;
}

interface AgentToolOptions {
  tokens: number;    // charged against OpenAITokens budget
  cost: number;      // charged against CostBudget budget
  toolCalls: number; // charged against AgentToolCalls budget
  retry: number | RetryOpts;
  timeout: Duration;
}
```

Five things to notice:

- **The tool function is a plain `(input, ctx) => Promise<O>`.** No generators. No effect type.
No "tool description JSON schema" to feed an LLM -- that's your application's job, not the runtime's.
- **Budgets are charged before the call returns.** Overrun rejects synchronously and cancels the owning scope with `CancelReason { kind: "budget", budgetKey, limit, spent }`. Runtime budget accounting stops at the cap.
- **`retry`/`timeout` are per-tool**, composing with the same engine described in articles 02 and 05.
- **`ctx.signal` inside the tool body is linked to the parent scope.** Client disconnects, deadline fires, sibling fails -- all aborts propagate into the tool body so its `fetch` / `db.query` / `provider.call` aborts at the I/O boundary.
- **`agent.events` is a readonly buffer** that mirrors the event stream. After the run, `events` is a replayable log of the whole loop.

---

## A 50-cent agent with a hard tool-call cap

```ts
import { runAgent, AgentToolCalls, OpenAITokens } from "@workit/core/ai";
import { CostBudget, run } from "@workit/core";

await run.context.with(CostBudget, { spent: 0, limit: 0.50, unit: "USD" },
() => run.context.with(OpenAITokens, { spent: 0, limit: 100_000, unit: "tokens" },
() => run.context.with(AgentToolCalls, { spent: 0, limit: 20, unit: "tool_calls" },
  () => runAgent(async (agent) => reactLoop(agent, goal)),
)));
```

Three caps, three reasons:

| Budget | What it bounds | What overrun does |
|---|---|---|
| `CostBudget` | Aggregate USD across the whole run | Rejects with `BudgetExceededError` and cancels the owning scope. In-flight LLM/tool calls see the abort on `ctx.signal`; provider-side billing depends on the provider honoring cancellation. |
| `OpenAITokens` | Total tokens across all LLM calls | Same shape. Use a dedicated key per provider when you want separate caps. |
| `AgentToolCalls` | Total tool calls -- fan-out limiter | Stops a runaway agent from invoking tools forever. Bench 19-B caps it at 1 and the second tool call fails closed. |

> **Bench [`19-agent-scope.mjs`](../benchmarks/articles/19-agent-scope.mjs).** Five scenarios -- measured.
>
> | # | Scenario | Result |
> |---|---|---|
> | A | Tool events bracket execution | Single `agent.tool("calc", 3, x => x*x)` call -> 4 events `[agent:started, agent:tool_started, agent:tool_succeeded, agent:completed]`, sequential `seq: [1,2,3,4]`, monotonic `at`, stable `agentId`. |
> | B | `AgentToolCalls` cap hit | `limit: 1`. Second call rejects with `BudgetExceededError`, `budgetKey: "AgentToolCalls"`, `limit: 1`. |
> | C | `OpenAITokens` charged via opts | `{ tokens: 50 }` then `{ tokens: 25 }` -> final `spent: 75` exactly. |
> | D | Parent scope cancel during tool | `ctx.scope.cancel({ kind: "manual", tag: "user-stop" })` mid-tool -> tool body's `ctx.signal` aborts, outer settles `CancellationError` with `reason.kind: "manual"`, `tag: "user-stop"`. |
> | E | Replayable log, 3-tool run | 8 events: `started -> (tool_started -> tool_succeeded) x 3 -> completed`. Seq `[1..8]`. Same agentId. Tool names captured in order. |

---

## Replayable events -- the typed trace

```ts
type AgentEvent =
  | { type: "agent:started"; seq: number; agentId: string; at: number }
  | { type: "agent:tool_started"; seq: number; agentId: string; tool: string; at: number }
  | { type: "agent:tool_succeeded"; seq: number; agentId: string; tool: string; at: number }
  | { type: "agent:tool_failed"; seq: number; agentId: string; tool: string; error: string; at: number }
  | { type: "agent:tool_cancelled"; seq: number; agentId: string; tool: string; reason: CancelReason; at: number }
  | { type: "agent:completed"; seq: number; agentId: string; at: number }
  | { type: "agent:failed"; seq: number; agentId: string; error: string; at: number };
```

Seven variants. Discriminated by `type`. Every variant carries `seq` and `at`. Cancelled events carry the typed `CancelReason`.
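Because the union is discriminated by `type`, consumers narrow it with ordinary TypeScript -- no schema library. A sketch over a trimmed event shape (only the fields this filter touches; the full union is shown above):

```typescript
// Narrowing a trimmed agent-event union on its `type` discriminant.
type AgentEventLike =
  | { type: "agent:tool_failed"; tool: string; error: string }
  | { type: "agent:tool_succeeded"; tool: string }
  | { type: "agent:completed" };

function failedTools(events: AgentEventLike[]): string[] {
  // Inside the true branch, TypeScript knows `e.error` and `e.tool` exist.
  return events.flatMap((e) => (e.type === "agent:tool_failed" ? [e.tool] : []));
}

const trace: AgentEventLike[] = [
  { type: "agent:tool_succeeded", tool: "plan" },
  { type: "agent:tool_failed", tool: "auditLog", error: "TimeoutError" },
  { type: "agent:completed" },
];
console.log(failedTools(trace).join(",")); // prints: auditLog
```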

What you can do with that:

- **Pivot a dashboard** on `tool` x `type` for failure heatmaps without parsing logs.
- **Replay a run** in a test by walking the events array -- you have the order, the names, the timing.
- **Audit a charge** by reconstructing the budget timeline from `tool_succeeded` events tagged with the tokens / cost charged at the call site.
- **Diff two runs** on the event sequence to see exactly which tool path diverged.

The events array on the `AgentRunResult` is `readonly` and mirrors the same event stream that flows through `scope.onEvent(...)` -- so live observers see the same shape the post-run audit log sees.

---

## How does this compare

| Stack | Tool primitive | Budget primitive | Replayable event log | Scope cancellation | Bundle |
|---|---|---|---|---|---|
| **WorkIt `runAgent`** | Typed `(input, ctx) => O` | `CostBudget` / `OpenAITokens` / `AgentToolCalls` / `createBudget(...)`, composable | `AgentRunResult.events` typed union | `ctx.signal` aborts each tool body | Included in `@workit/core/ai` (~8 KB gzip with the rest of `/ai`) |
| LangChain agents | Loosely typed; many tools as JSON | No first-class budget primitive | Partial, via callbacks | No scope tree | ~hundreds of KB |
| Vercel AI SDK | Tool schemas | No first-class budget | Events on the stream | Via `AbortSignal`; no scope tree | Medium |
| Mastra | Generator-based | Partial | Trace store | Yes | Medium |
| Roll-your-own with `for`-loop + `fetch` | Yes, by definition | DIY | DIY | DIY | Minimal, but you wrote the runtime |

The design point: **the agent primitive composes with the same `CancelReason`, `ctx.signal`, `defer`, budget, and `scope.tree()` machinery from articles 01-06**. There is no second runtime. You don't choose between "the agent loop's lifecycle" and "the rest of your app's lifecycle" -- they share one tree.
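The "diff two runs" idea from the event-log section fits in a few lines: reduce each run to its ordered `(type, tool)` path and find the first index where the paths diverge. A sketch with the event shape trimmed to the two fields the diff reads:

```typescript
// Diffing two agent runs on their event sequences (a sketch, not a WorkIt
// API). Only `type` and the optional `tool` field are consulted.
interface PathEvent { type: string; tool?: string; }

const toPath = (events: PathEvent[]): string[] =>
  events.map((e) => (e.tool ? `${e.type}(${e.tool})` : e.type));

function firstDivergence(a: PathEvent[], b: PathEvent[]): number {
  const pa = toPath(a);
  const pb = toPath(b);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    if (pa[i] !== pb[i]) return i; // index of the first differing step
  }
  return -1; // identical tool paths
}

const runA: PathEvent[] = [
  { type: "agent:started" },
  { type: "agent:tool_started", tool: "search" },
];
const runB: PathEvent[] = [
  { type: "agent:started" },
  { type: "agent:tool_started", tool: "fetchPage" },
];
console.log(firstDivergence(runA, runB)); // prints: 1
```

Because `seq` and `at` ride along on every real event, the same walk can also report where the timing, not just the path, diverged.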
+ +--- + +## A complete, runnable example + +```ts +import { runAgent, AgentToolCalls, OpenAITokens } from "@workit/core/ai"; +import { CostBudget, run, renderTree } from "@workit/core"; + +const tools = { + search: async ({ q }, ctx) => + fetch(`https://api.search.dev/q=${q}`, { signal: ctx.signal }).then(r => r.json()), + + fetchPage: async ({ url }, ctx) => + fetch(url, { signal: ctx.signal }).then(r => r.text()), + + summarize: async ({ text }, ctx) => + openai.chat({ messages: [{ role: "user", content: `tl;dr: ${text}` }] }, + { signal: ctx.signal }), +}; + +const { result, events } = await run.context.with( + CostBudget, { spent: 0, limit: 0.50, unit: "USD" }, + () => run.context.with( + AgentToolCalls, { spent: 0, limit: 12, unit: "tool_calls" }, + () => runAgent(async (agent) => { + const hits = await agent.tool("search", + { q: "structured concurrency typescript" }, tools.search, + { toolCalls: 1, timeout: "5s", retry: 2 }); + + const docs = await Promise.all(hits.slice(0, 3).map((hit, i) => + agent.tool(`fetchPage[${i}]`, + { url: hit.url }, tools.fetchPage, + { toolCalls: 1, timeout: "10s" }))); + + return await agent.tool("summarize", + { text: docs.join("\n\n") }, tools.summarize, + { tokens: 4_000, cost: 0.02, toolCalls: 1, timeout: "30s" }); + }), + ), +); + +console.log(result); +console.log(events.map(e => + `${e.seq.toString().padStart(2)} ${e.type}${"tool" in e ? ` (${e.tool})` : ""}`, +).join("\n")); +``` + +That's an agent that searches, fetches three pages, summarises, and stops at 50 cents or 12 tool calls -- whichever comes first. Cancel the parent scope and every in-flight `fetch` and LLM stream aborts at the TCP layer. No manual `AbortController` plumbing. No "did I forget to thread the signal." No `try/catch` around the agent loop. 
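The budget caps in this example fail closed because charging happens before a tool function runs. A hypothetical sketch of that semantics -- the `Budget` and `charge` names here are invented for illustration, not WorkIt internals:

```typescript
// Charge-before-call budget semantics (illustrative sketch): an overrun
// rejects without ever invoking the tool, and `spent` never passes `limit`.
interface Budget { spent: number; limit: number; }

function charge(budget: Budget, amount: number, budgetKey: string): void {
  if (budget.spent + amount > budget.limit) {
    // In WorkIt this surfaces as BudgetExceededError and cancels the owning
    // scope with CancelReason { kind: "budget", budgetKey, limit, spent }.
    throw new Error(`BudgetExceededError: ${budgetKey}`);
  }
  budget.spent += amount; // charged up front, before the tool body runs
}

const toolCalls: Budget = { spent: 0, limit: 1 };
charge(toolCalls, 1, "AgentToolCalls"); // first call: ok, spent = 1
try {
  charge(toolCalls, 1, "AgentToolCalls"); // second call: fails closed
} catch (err) {
  console.log((err as Error).message); // prints: BudgetExceededError: AgentToolCalls
}
```

This is the same shape bench 19-B measures with `limit: 1`: the second tool call rejects and the accounting stops at the cap.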
+ +--- + +## Receipts + +```sh +node benchmarks/articles/19-agent-scope.mjs # 5 contract scenarios +node benchmarks/articles/run-all.mjs # full 19-bench suite +``` + +Production-side gates that back the same surface: + +| Claim | Evidence | +|---|---| +| Tool events bracket execution with monotonic seq | [`19-agent-scope.mjs`](../benchmarks/articles/19-agent-scope.mjs) A verifies four ordered events, sequential `seq`, stable `agentId`, and monotonic `at`. | +| `AgentToolCalls` overflow rejects with `BudgetExceededError` | Bench 19 B sets `limit: 1`; the second tool call throws with `budgetKey: "AgentToolCalls"`. | +| `OpenAITokens` consumed via `{ tokens: N }` | Bench 19 C verifies the final token budget `spent` is exactly `75`. | +| Parent scope cancel propagates into tool body | Bench 19 D verifies the tool body observes abort and the outer scope settles with the original manual reason. | +| Replayable, ordered, typed event log | Bench 19 E verifies eight events, sequential `seq`, monotonic `at`, and tool names in call order. | +| Tool failure surfaces as `agent:tool_failed` | Unit coverage verifies tool errors propagate and are captured in the typed event log. | + +--- + +## Closing The Series + +The important part is not that WorkIt has an agent helper. The important part is +that the agent helper is not a second runtime. Tool calls, token budgets, +timeouts, retries, cancellation, progress events, and cleanup all use the same +ownership tree as the rest of the library. + +The public claims behind this series are tracked in +[`evidence/claims.json`](../evidence/claims.json), exercised by +`npm run test:evidence`, and benchmarked by `npm run bench:articles`. The prose +is intentionally not the evidence store; it is the readable path through the +engineering tradeoffs. 
+ +--- + +## Source, Benchmarks, And Evidence + +- Source: https://github.com/WorkRuntime/workit +- Article source: https://github.com/WorkRuntime/workit/blob/main/articles/07-agent-scope-and-tool-lifecycles.md +- Reproduce: `npm run bench:articles` and `npm run test:evidence` diff --git a/articles/README.md b/articles/README.md new file mode 100644 index 0000000..a57e3ea --- /dev/null +++ b/articles/README.md @@ -0,0 +1,81 @@ + + +# WorkIt Article Series + +Seven articles. Each opens with code and a concrete problem, then links the +claim to executable evidence. The argument builds from plain `async` / `await` +ownership to worker boundaries, streaming backpressure, budgeted cleanup, +observability, and finally agent tool lifecycles. + +The sequence focuses on practical high-pressure workloads: AI agents, provider +racing, streaming STT, 100K-document budget caps, 1-billion-row pipelines, +worker hard-kill against a CPU spinner, local-first observability, and the +`runAgent` / `AgentScope` primitive. + +## Reading Order + +1. [`01-owned-async-work.md`](01-owned-async-work.md) -- why promises model values, not ownership. +2. [`02-concurrency-retry-timeout.md`](02-concurrency-retry-timeout.md) -- composing pool, race, any, retry, timeout, fallback, and hedge policies. +3. [`03-cancellation-and-worker-boundaries.md`](03-cancellation-and-worker-boundaries.md) -- cooperative cancellation and hard worker termination. +4. [`04-backpressure-for-streaming-pipelines.md`](04-backpressure-for-streaming-pipelines.md) -- bounded producers for RAG, STT, and large streams. +5. [`05-resource-safety-and-budgeted-work.md`](05-resource-safety-and-budgeted-work.md) -- bracketed cleanup, uncancellable sections, and request budgets. +6. [`06-observability-without-core-bloat.md`](06-observability-without-core-bloat.md) -- diagnostics, telemetry, sampling, and exporter isolation. +7. 
[`07-agent-scope-and-tool-lifecycles.md`](07-agent-scope-and-tool-lifecycles.md) -- agent tools, budgets, events, and replayable execution logs. + +## Editorial Rules + +- **Code first.** Open with a runnable snippet that is the point. +- **Claims map to gates.** Every number cited maps to `npm run verify`, `npm run bench:articles`, `npm run test:evidence`, or [`evidence/claims.json`](../evidence/claims.json). +- **No theatrical comparisons.** Do not say "10x faster than X" without a benchmark. Say what the executable invariant verifies. +- **Ownership/composition framing.** External libraries may expose cancellation hooks; WorkIt's claim is that cancellation, cleanup, retry, timeout, budgets, backpressure, and diagnostics compose under one owner. +- **Agent and data-plane scenarios stay first-class.** Provider racing, agent cancellation, RAG ingest, streaming STT, embedding pipelines, token/cost/tool-call budgets are the core scenarios. +- **Honesty about layers.** Cooperative cancellation is labeled cooperative. Worker hard-kill is labeled worker-boundary hard kill. Browser/edge are labeled unsupported. + +## Headline Numbers + +These numbers are reproducible from the gates and captured benchmark result. +Use representative timing language unless a value is asserted by a gate. 
+ +```txt +214 unit tests, 100% line/branch/function coverage +0 production dependencies, 0 install scripts, 0 networking imports in core dist +14,175 B core-group-import minified / 4,835 B gzip +29,255 B public-api minified / 9,694 B gzip +126,136 B max heap growth in 100k task soak @ concurrency 128 +1,000,000 logical items in stream memory gate, bounded heap +1,000,000,000 logical items in 1B claim sample, <= TAKE+CONCURRENCY produced +well under the 10 ms gate for 100 .with() calls over 5,000 keys +200ms timeout vs CPU spinner: late-marker file does not exist +19 article-series benchmarks, all green +tracked claim evidence suite classified by lifecycle/correctness/security/release/performance +``` + +## Reproducing The Receipts + +```sh +npm run verify +npm run bench:articles +npm run test:evidence + +npm run sample:race +npm run sample:agent +npm run sample:embed100k +npm run sample:1b +npm run sample:worker +npm run sample:stt-disconnect +npm run check:public-proof +``` + +The captured article benchmark result for this publication revision is +[`benchmarks/results/articles.latest.json`](../benchmarks/results/articles.latest.json). + +## The Single Labeled Gap + +Browser and edge runtimes resolve to an explicit unsupported boundary today. +A dedicated edge-safe context runtime, semantic invariant tests, and +installed-tarball Worker fixtures are future work. Node 20+ ESM/CJS, supported +server-shaped fixtures, and the package-consumer matrix run from the installed +artifact under the repository gates.