From 67622f2a6f5d29c219fb1dc2c19b3469eb04cdc7 Mon Sep 17 00:00:00 2001
From: skobeltsyn
Date: Mon, 15 Jun 2026 01:32:16 +0300
Subject: [PATCH 1/2] =?UTF-8?q?release:=200.8.0=20=E2=80=94=20interoperabl?=
 =?UTF-8?q?e,=20multimodal=20agents=20(+=20capability=20grants)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cut 0.8.0 from main (75 commits since v0.7.24): A2A v1, full multimodal
(audio/vision/image-gen/TTS), RAG seam, composition operators (handoff/firstOf/
speculative/loopUntil/aggregators/forum captains), HITL + eval, history
compression, the Gemini provider (#1917), agent.json (#4516), capability
grants (#4545), and the agentic-web standards groundwork
(AGNTCY/AG-UI/x402/NLWeb).

- build.gradle.kts: 0.7.25-SNAPSHOT -> 0.8.0
- CHANGELOG: [Unreleased] -> [0.8.0] - 2026-06-15 (+ fresh [Unreleased])
- roadmap: 0.8.0 theme reframed (sandbox backends slipped); Docker/proxy/
  read-confinement -> 0.9.0; WasmSandbox closed won't-do; agent->WASM = #4547
- RELEASE_NOTES.md rewritten for 0.8.0

README dependency snippet intentionally NOT bumped yet (stays 0.7.24) — per
docs/RELEASE_RUNBOOK.md step 6 it advances only AFTER 0.8.0 resolves on Maven
Central. checkReadmeVersion is therefore expected-RED on this PR until then.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 CHANGELOG.md     |  16 +++++
 RELEASE_NOTES.md | 155 +++++++++++++++++------------------------------
 build.gradle.kts |   2 +-
 docs/roadmap.md  |  15 ++++-
 4 files changed, 86 insertions(+), 102 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 49e21512..98ca4aaf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,22 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a

 ## [Unreleased]

+## [0.8.0] — 2026-06-15
+
+**Interoperable, multimodal agents — with capability grants.** The largest minor since 0.5.0:
+agent-to-agent interop (**A2A v1**), full **multimodal** (audio STT/TTS, vision, image generation),
+a **RAG** seam, richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` /
+built-in aggregators / forum captains), **human-in-the-loop** gates, an **eval** harness, history
+compression, an **eighth model provider (Google Gemini)**, **agent.json** definition serialization,
+and the **capability-grants** DSL (`grants { allow / confirm }`). Plus the planning groundwork for
+the agentic-web standards (AGNTCY / AG-UI / x402 / NLWeb — PRD §12.6–§12.9). Additive: existing
+public API surfaces are preserved.
+
+*Deferred to 0.9.0:* the remaining Layer-2 **sandbox backends** — `DockerSandbox` (#2895), the
+network hostname-allowlist **proxy** (#2893), and **read confinement** (#4546). `WasmSandbox` (#2894)
+was closed won't-do; the rational WASM direction (agent → WASM export, #4547) is a separate
+forward-looking track.
+
 ### Added — Google Gemini provider adapter (#1917)

 Eighth built-in `ModelClient`: `model { gemini("gemini-2.5-flash"); apiKey = ... }` for Google's
diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md
index 392212d9..a9fd0178 100644
--- a/RELEASE_NOTES.md
+++ b/RELEASE_NOTES.md
@@ -1,103 +1,58 @@
-# Agents.KT v0.7.24 — Perplexity: web-grounded search with citations + the truth-surface pass
-
-**Release date:** 2026-06-12
-
-0.7.24's headline is the **Perplexity connector** and the **`perplexitySearch` tool**: agents can now reach the Sonar models directly via `model { perplexity("sonar") }`, or stay on their own model and fetch live, cited web facts through a typed search tool that records both the answer and its sources in the audit row. The release also lands the **truth-surface pass** — a docs/version-identity trust patch that brings SECURITY.md, production-hardening, skill-routing and HITL documentation in line with the shipped runtime, backed by a new `DocsConsistencyTest`. Plus a docs-accuracy pass that fixes false-negative roadmap signals, a small detekt-baseline reduction, and a Kotlin 2.4.0 upgrade.
-
-Drop-in on the 0.7.x line — additive only, no public-API change to existing surfaces.
-
-## Added — Perplexity connector + web-grounded search (epic #3674)
-
-### `PerplexityClient` — seventh model provider (#3675)
-
-A thin OpenAI-compatible `OpenAiClient` subclass for `api.perplexity.ai`, mirroring `DeepSeekClient` / `KimiClient` / `OpenRouterClient`. Selectable via `model { perplexity("sonar") }`; supports `sonar` / `sonar-pro` / `sonar-reasoning-pro` / `sonar-deep-research`. Unlike Kimi and DeepSeek, Perplexity accepts OpenAI's `response_format` json_schema, so its constrained-decoding gate stays **on** — typed `@Generable` outputs work end-to-end.
-
-`ModelProvider`, `ModelConfig.perplexityBaseUrl`, `ModelBuilder.perplexity(...)`, the factory dispatch, and the permission manifest are all wired. Key from `.secrets/perplexity-key` or `PERPLEXITY_API_KEY` env.
-
-```kotlin
-val analyst = agent("analyst") {
-    model { perplexity("sonar-deep-research"); apiKey = System.getenv("PERPLEXITY_API_KEY") }
-    skills {
-        skill("brief", "Produce a typed market brief from a query") {
-            tools(/* no tools — use Perplexity's own web access */)
-        }
-    }
-}
-```
-
-### `perplexitySearch` tool — web-grounded search with citations (#3676)
-
-Lets an agent on its **own** model (Claude, OpenAI, Ollama, anything) reach into Perplexity for live, cited facts. `tools { +perplexitySearchTool(key) }` registers it; `untrustedOutput = true` so results are wrapped in the `{"trusted":false}` envelope per #642 and flagged as data, not instructions.
-
-The result renders the answer plus a numbered source list parsed from `search_results[]` (falling back to `citations[]`); sources reach both the model context and the JSONL audit row, so the audit lane carries the provenance of every cited fact.
-
-```kotlin
-val researcher = agent("researcher") {
-    model { claude("claude-opus-4-7"); apiKey = anthropicKey }
-    tools { +perplexitySearchTool(perplexityKey) }
-    skills {
-        skill("research", "Research with citations") { tools("perplexitySearch") }
-    }
-}
-```
-
-### Search controls + structured output (#3677)
-
-`perplexitySearchOptions { }` maps directly to Perplexity's documented request params:
-
-- `search_mode` — `web` / `academic` / `sec`
-- `search_recency_filter`
-- `search_domain_filter` — allowlist + `-`-prefixed denylist
-- `web_search_options.search_context_size` — `low` / `medium` / `high`
-- `reasoning_effort`
-- Native `response_format` json_schema via `structuredOutput(MyType::class)` from a `@Generable` type
-
-### `chatCompletionsPath` seam on `OpenAiClient` (#3675)
-
-The chat-completions path is now overridable (default `/v1/chat/completions`); `PerplexityClient` overrides to `/chat/completions` — Perplexity serves no `/v1` segment and hitting `/v1` there 404s with an empty body. Behavior unchanged for OpenAI / DeepSeek / Kimi / OpenRouter.
-
-## Docs — accurate shipped signals
-
-External gap analysis surfaced false-negative signals in the roadmap — multimodal / reactive-UI streaming / the session model were marked unchecked while the README and shipped code said otherwise. The roadmap even contradicted itself: `AgentSession.events` shown as shipped on line 78 and not-shipped on lines 73 / 83. Cleaned up so docs (and future AI consumers reading the repo) get correct signals:
-
-- Multi-turn `AgentSession` (#1736) — marked shipped.
-- `AgentSession.events` `Flow` (#1736) + `agent.observe { }` (#965) — marked shipped.
-- Vision / document multimodal input across Anthropic / OpenAI / Ollama — marked shipped.
-- Remaining open items (automatic compaction, Pipeline-stage event types) left as `[ ]`.
-
-## Changed — truth-surface pass (docs + version identity catch up with the runtime)
-
-An external 0.7.23 review rated the runtime well ahead of its public truth surface. This release closes that gap:
-
-- **Version identity:** between releases, `main` now carries a `-SNAPSHOT` version (runbook step 8), so unreleased commits never masquerade as the published artifact. `checkReadmeVersion` understands the dev state: on a `-SNAPSHOT` it requires the README to advertise the last published release.
-- **SECURITY.md rewritten to 0.7.x reality:** seven providers over four wire shapes; tool sandboxing and `McpServer` authentication are no longer "out of scope" — Layer-1 in-JVM filesystem gating (#2890), the Layer-2 OS sandbox (#1916), and `McpServerAuth` are documented alongside an honest remaining-gaps list.
-- **production-hardening.md:** `ToolPolicy` is enforcement, not "audit evidence"; subprocess tools point at the fail-closed `processTool` (#2914).
-- **Skill-routing docs match the fail-loud runtime (#3087):** the pre-0.7.21 silent first-match fallback is gone from `model-and-tools.md` and the wiki, replaced by the real `SkillRoutingException` behavior with a migration note.
-- **permission-manifest.md** no longer advertises Maven coordinates for `agents-kt-manifest` (never published to Central — only `agents-kt` and `agents-kt-ksp` are); it's documented as an in-repo module.
-- **regulated-deployment.md HITL** documents the shipped `humanApproval { }` → `ApprovalRequest` → `resumeWith(HumanDecision)` path (#2489) and `onBefore*` decisions (#1907) instead of a never-shipped `Decision` variant.
-- **New `DocsConsistencyTest`** pins provider-count claims to `ModelProvider.entries`, doc `Decision.X` references to the real sealed variants, and the routing table to `SkillRoutingException` — this class of drift now fails `./gradlew test`.
-
-## Refactored — one type per file burndown complete (#3199)
-
-`PerplexitySearch.kt` was the last multi-type file. Split into one type per file (`Args` / `Source` / `Result` / `Options` / `OptionsBuilder` / `Backend` / `HttpBackend` / `Exception` + `SearchMode` / `Recency` / `ContextSize`), keeping only the pure wire helpers and the `perplexitySearchTool` factory in `PerplexitySearch.kt`. The `checkOneTypePerFile` guard's allowlist is now empty — future commits cannot reintroduce multi-type files without failing CI.
-
-## Maintainability — detekt baseline 415 → 410
-
-Five real cleanups, no mechanical wraps. Replaced inline `kotlinx.coroutines.flow.FlowCollector` FQNs with imports in `ClaudeClient` / `OpenAiClient` SSE parsers, wrapped two over-long lines that read better wrapped, converted `MockTcpMcpServer`'s unused `private val acceptThread` into an `init { }` block (same start-on-construction, no retained handle). The bulk of the remaining MaxLineLength baseline is intentional — table-aligned test fixtures and inline JSON wire-templates that read worse if wrapped — and was left alone.
-
-## Dependencies
-
-- **Kotlin 2.3.21 → 2.4.0** — compiler, stdlib, reflect across every module + KSP.
-- **`org.jline:jline` 3.27.1 → 4.1.3.**
-- **detekt 1.23.7 → 1.23.8.**
-- **KSP API 2.3.7 → 2.3.9** — matches Kotlin 2.4.0.
+# Agents.KT v0.8.0 — Interoperable, multimodal agents, with capability grants
+
+**Release date:** 2026-06-15
+
+0.8.0 is the largest minor since 0.5.0. The boundary-first runtime grows outward: it now **talks to
+other agents**, **sees and hears**, **composes in richer shapes**, and lets you **grant capabilities
+explicitly** — while keeping the typed `Agent` contract and the audit/manifest spine intact.
+Additive throughout: existing public API surfaces are preserved (drop-in on the 0.7.x line).
+
+## Headlines
+
+- **A2A v1 — agent-to-agent interop (#3864).** Agents.KT agents are A2A servers (typed skills exposed
+  via AgentCard) and typed A2A clients — cross-system discovery and invocation over the wire.
+- **Multimodal, end to end.** Vision input across Claude / OpenAI / Ollama (#2466–#2470); audio as
+  tools — `transcribe_audio` / `speak` with self-hosted Whisper / Qwen adapters (#4501), an in-process
+  `:agents-kt-whisper-jni` STT module (#4505), and image generation + TTS (#3867). Weights never ship
+  in the jar.
+- **Eighth model provider: Google Gemini (#1917).** A full from-scratch adapter (Gemini is not
+  OpenAI-compatible): `contents`/`parts`, `functionDeclarations` tool calling, native SSE streaming,
+  `responseJsonSchema` constrained decoding, thought-summary reasoning, `inlineData` vision. Joins
+  Ollama / Anthropic / OpenAI / DeepSeek / Kimi / OpenRouter / Perplexity.
+- **Capability grants (#4545).** `grants { allow(writeFile); confirm(deploy) }` — `allow` tools are
+  freely callable; `confirm` tools require the **granting agent's** authorization (fail-closed), not a
+  human gate. Build-validated; opt-in.
+- **Richer composition.** `handoff` (#3871), `firstOf` / `.speculative(n)` (#3869), `loopUntil` +
+  `evalGate` (#3870), built-in aggregators on `/` (#3872), and built-in forum captains (#3877).
+- **RAG seam (#3863)** — `EmbeddingStore` SPI + query-aware knowledge, with LangChain4j / Spring-AI
+  adapter modules.
+- **Human-in-the-loop + eval.** `HumanGateRegistry` (#3868); a typed eval harness with
+  LLM-as-judge and cross-model regression (#3876).
+- **agent.json (#4516)** — deterministic, byte-stable serialization of an agent's definition
+  (distinct from the permission manifest and the A2A AgentCard).
+- **Agentic-web standards groundwork** — PRD §12.6–§12.9 plan AGNTCY (OASF/DIR/Identity), AG-UI,
+  x402 payments, and NLWeb, positioned against the runtime.
+
+Plus history compression (#3865), pipeline stage events (#4491), compaction strategies (#4492),
+typed tool hooks (#4493), memory retention strategies (#4515), W3C trace propagation across MCP/A2A
+(#3873), and an antifragility hardening pass (#4495–#4500).
+
+See [`CHANGELOG.md`](CHANGELOG.md) for the full, itemized list.
+
+## Deferred to 0.9.0
+
+The Layer-2 **sandbox backends** originally pencilled for 0.8 slipped — they want a Linux-capable
+environment to build and verify:
+
+- `DockerSandbox` (#2895), the network hostname-allowlist **egress proxy** (#2893), and
+  **read confinement** (#4546) move to **0.9.0**.
+- `WasmSandbox` (#2894) was closed **won't-do** — embedding a WASM runtime to sandbox tools isn't
+  rational (`ProcessSandbox` already covers it). The rational WASM direction — compiling **agents** to
+  WASM for portable execution — is a separate forward-looking track (#4547, starting with a
+  feasibility spike).

 ## Compatibility

-Drop-in on the 0.7.x line. The Perplexity surface is additive; no public-API change to existing connectors, the agentic loop, or the audit boundary. The Kotlin 2.4.0 upgrade is binary-compatible for consumers on Kotlin 2.3.x or 2.4.x; existing tests pass byte-for-byte.
-
-## What's not in this release
-
-- Pipeline-stage event types in the streaming surface — still pending.
-- Automatic conversation compaction in `AgentSession` — still pending.
-- Closing #2791 (the turn-loop core of `executeAgentic`) — last open child of the #2790 maintainability epic, deliberately deferred as the highest-risk refactor.
+Additive, no breaking changes to existing public API. The capability-grants block, A2A surfaces,
+multimodal tools, and the Gemini provider are all opt-in. CodeQL's `java-kotlin` check is red on the
+Kotlin 2.4 toolchain (upstream codeql#21938) — the Gradle build is the gate.
diff --git a/build.gradle.kts b/build.gradle.kts
index bfff3643..8beaaadf 100644
--- a/build.gradle.kts
+++ b/build.gradle.kts
@@ -16,7 +16,7 @@ plugins {
 }

 group = "ai.deep-code"
-version = "0.7.25-SNAPSHOT"
+version = "0.8.0"

 repositories {
     mavenCentral()
diff --git a/docs/roadmap.md b/docs/roadmap.md
index f13b89c2..0892eccd 100644
--- a/docs/roadmap.md
+++ b/docs/roadmap.md
@@ -8,9 +8,22 @@
 0.5.0 Agents with boundaries — shipped
 0.6.0 Boundaries you can audit — shipped (epic [#1911](../../issues/1911))
 0.7.0 Boundaries you can enforce externally — shipped (epic [#2879](../../issues/2879))
-0.8.0 Sandbox backends + capability grants — next (Wasm/Docker/proxy/grants)
+0.8.0 Interoperable, multimodal agents (+ grants) — shipped (A2A v1, multimodal, RAG, composition, Gemini, capability grants)
+0.9.0 Layer-2 sandbox backends — next (Docker/proxy/read-confinement)
 ```

+**0.8.0 shipped:** agent-to-agent interop (**A2A v1** — server + typed client), full **multimodal**
+(audio STT/TTS, vision, image generation), the **RAG** seam, richer **composition** (`handoff` /
+`firstOf` / `.speculative` / `loopUntil` / built-in aggregators / forum captains), **human-in-the-loop**
+gates + the **eval** harness, history compression, an eighth model provider (**Google Gemini** #1917),
+**agent.json** serialization (#4516), and **capability grants** (`grants { allow / confirm }` #4545),
+plus the agentic-web standards groundwork (AGNTCY / AG-UI / x402 / NLWeb, PRD §12.6–§12.9). The
+"sandbox backends" originally pencilled for 0.8 slipped: `WasmSandbox` ([#2894](../../issues/2894)) was
+closed won't-do (embedded-WASM-for-tools isn't rational; agent → WASM export is the separate forward
+track [#4547](../../issues/4547)), and `DockerSandbox` ([#2895](../../issues/2895)), the egress
+hostname-allowlist proxy ([#2893](../../issues/2893)), and read confinement ([#4546](../../issues/4546))
+move to **0.9.0** (they want a Linux-capable environment to build + verify).
+
 **0.7.0 shipped (epic [#2879](../../issues/2879)):** runtime *enforcement* of declared tool policies — Layer 1 in-JVM filesystem gate ([#2890](../../issues/2890)) + Layer 2 OS sandbox ([#1916](../../issues/1916)): macOS Seatbelt, Linux bubblewrap, firejail setuid fallback, plain-`ProcessBuilder` fallback; write-root + env + cwd confinement; default-deny network — and the standalone **`agents-kt` CLI** ([#1923](../../issues/1923)) for manifest generate/inspect/verify outside Gradle. **Deferred to 0.8:** `WasmSandbox` ([#2894](../../issues/2894)), `DockerSandbox` ([#2895](../../issues/2895)), the network hostname-allowlist proxy ([#2893](../../issues/2893)), and the `grants { }` structure DSL.

 **0.6.0 hero feature:** the **permission manifest / capability graph** ([#1912](../../issues/1912)) — a deterministic YAML/JSON artifact showing every agent / skill / tool / memory access / MCP endpoint / provider / budget / policy boundary in a system. Build-time evidence for security review; the manifest hash ([#1913](../../issues/1913)) propagates into every runtime audit event so dynamic behaviour ties back to the signed-off capability graph.
From f720958a6540ba3c26be83df51699103ff08723c Mon Sep 17 00:00:00 2001
From: skobeltsyn
Date: Mon, 15 Jun 2026 02:01:20 +0300
Subject: [PATCH 2/2] docs: align release surfaces to 0.8.0 + bump main to 0.8.1-SNAPSHOT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

0.8.0 is published on Maven Central (agents-kt + agents-kt-ksp both resolvable);
this brings the in-repo release surfaces into agreement (release-integrity fix).

- build.gradle.kts: 0.8.0 -> 0.8.1-SNAPSHOT (runbook step 8 — main carries the
  next -SNAPSHOT; v0.8.0 is the tagged/published release point)
- README: dependency snippet 0.7.24 -> 0.8.0 (now resolvable on Central);
  "latest release" blurb rewritten for the 0.8.0 story; streaming line 7->8
  providers (+ Gemini native SSE)
- docs/comparison.md: provider count 7->8 (+ Gemini); status note 0.7.23 -> 0.8.0
  with the real 0.8.0 feature list
- docs/permission-manifest.md: dep snippet -> 0.8.0
- docs/applications.md: "seven providers" -> eight
- docs/streaming.md: 7->8 providers (four native incl. Gemini); "as of" -> 0.8.0
- docs/threat-model.md: "as of 0.7.24" -> 0.8.0
- docs/prd.md: drop the "CONFIDENTIAL" label (public repo) -> "Public design
  document" + a shipped/in-progress/planned disclaimer pointing at
  CHANGELOG/roadmap

CHANGELOG [0.8.0], roadmap reframe, and RELEASE_NOTES were already on this branch.
Gates: checkReadmeVersion + checkSnapshotPolicy + DocsConsistencyTest green.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 README.md                   | 6 +++---
 build.gradle.kts            | 2 +-
 docs/applications.md        | 2 +-
 docs/comparison.md          | 4 ++--
 docs/permission-manifest.md | 2 +-
 docs/prd.md                 | 7 ++++++-
 docs/streaming.md           | 4 ++--
 docs/threat-model.md        | 2 +-
 8 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 5122d9fc..2bfaafe7 100644
--- a/README.md
+++ b/README.md
@@ -37,7 +37,7 @@ The 0.6–0.7 line turns those boundaries into reviewable evidence: deterministi
 ```kotlin
 // build.gradle.kts
 dependencies {
-    implementation("ai.deep-code:agents-kt:0.7.24")
+    implementation("ai.deep-code:agents-kt:0.8.0")
 }
 ```

@@ -262,7 +262,7 @@ What the framework does **not** enforce — your responsibility:
 - **Seven LLM providers shipped** — Ollama, Anthropic, OpenAI, DeepSeek, Kimi (Moonshot AI, #2697), OpenRouter (#2701), and Perplexity (Sonar, #3675) — the last with a `perplexitySearch` web-grounded search tool (#3676 / #3677). Google (Gemini) is the main adapter still on the roadmap (Phase 2); the injectable `ModelClient` covers test stubs and your own adapters in the meantime.
 - **Synchronous agentic loop** — `runBlocking` inside the loop until the suspend refactor lands (#638). Calling agents from existing coroutine scopes works but doesn't propagate cancellation cleanly.
 - **No built-in MCP rate limiter** — use `McpServer` auth/policy plus a gateway for throttling. Agent/runtime audit events have a first-party JSONL exporter in `:agents-kt-observability`.
-- **Streaming runtime** *(shipped — v0.5.0)*. `agent.session(input): AgentSession` exposes `events: Flow>` — bracket events (`SkillStarted` / `SkillCompleted` / `Completed` / `Failed`) plus mid-loop `Token` / `Reasoning` / `ToolCallStarted` / `ToolCallArgumentsDelta` / `ToolCallFinished` events as the agentic loop runs. All events carry `requestId`, `sessionId`, and `manifestHash` for audit correlation (#1913). All seven providers stream at the wire — Ollama (NDJSON), Anthropic and OpenAI (native SSE), with DeepSeek / Kimi / OpenRouter / Perplexity inheriting the OpenAI-compatible SSE path; live integration tests measure 19 / 2 / 19 chunks for the original three native adapters. `SkillCompleted.tokensUsed` and `Completed.tokensUsed` carry cumulative `TokenUsage` across all turns. The underlying `LlmChunk` sealed type + `ModelClient.chatStream(messages): Flow` foundation (#1722) is what custom adapters plug into. See [docs/streaming.md](docs/streaming.md) for the full API + the [v0.5.0 streaming premortem](docs/premortem-0.5.0-streaming.md) for design rationale.
+- **Streaming runtime** *(shipped — v0.5.0)*. `agent.session(input): AgentSession` exposes `events: Flow>` — bracket events (`SkillStarted` / `SkillCompleted` / `Completed` / `Failed`) plus mid-loop `Token` / `Reasoning` / `ToolCallStarted` / `ToolCallArgumentsDelta` / `ToolCallFinished` events as the agentic loop runs. All events carry `requestId`, `sessionId`, and `manifestHash` for audit correlation (#1913). All eight providers stream at the wire — Ollama (NDJSON), Anthropic, OpenAI, and Gemini (native SSE), with DeepSeek / Kimi / OpenRouter / Perplexity inheriting the OpenAI-compatible SSE path; live integration tests measure 19 / 2 / 19 chunks for the original three native adapters. `SkillCompleted.tokensUsed` and `Completed.tokensUsed` carry cumulative `TokenUsage` across all turns. The underlying `LlmChunk` sealed type + `ModelClient.chatStream(messages): Flow` foundation (#1722) is what custom adapters plug into. See [docs/streaming.md](docs/streaming.md) for the full API + the [v0.5.0 streaming premortem](docs/premortem-0.5.0-streaming.md) for design rationale.
   - *Partial cancellation today.* `Flow` collection cancels promptly, and `perToolTimeout` now applies to both regular and session-aware tool calls. Synchronous skill bodies and blocking HTTP reads still are not fully coroutine-cancellable mid-call; the remaining adapter migration is the `sendAsync`/suspend-refactor track.
   - *Composition flow-through shipped (#3866).* Every composition operator exposes `session(...)`, and every `then` overload chains streaming — pipelines mixing `Parallel` / `Forum` / `Loop` / `Branch` mid-chain stream all nested agents' events through the parent session, demultiplexable by `agentId`. Remaining: pipeline-stage event types (`StageStarted` / `PipelineCompleted`).
 - **No native binary** — JVM-only (≥ JDK 21). GraalVM and `jlink` bundles are Phase 2 priorities.
@@ -318,7 +318,7 @@ Topical guides:

 ## Current Release

-The latest published release is `0.7.24` — **Perplexity + truth surface.** (Between releases, `main` carries unreleased work under a `-SNAPSHOT` version; see the [CHANGELOG](CHANGELOG.md) *Unreleased* section for what's landed since.) Perplexity becomes the **seventh model provider** (`model { perplexity("sonar") }`, #3675) and any agent on any model can register the **`perplexitySearch`** tool for live, web-grounded answers with citations that reach both the model context and the JSONL audit row (#3676/#3677) — `untrustedOutput = true`, so results arrive flagged as data, not instructions. The release also lands the **truth-surface pass** an external 0.7.23 review called for: SECURITY.md / production-hardening / skill-routing / HITL docs now match the shipped runtime, `main` adopts a `-SNAPSHOT` between-releases version policy, and a new `DocsConsistencyTest` pins provider counts, `Decision` variants, and routing claims to the code. Dependency line: Kotlin 2.4.0, jline 4, detekt 1.23.8, ksp 2.3.9. Additive only — drop-in on the 0.7.x line.
+The latest published release is `0.8.0` — **interoperable, multimodal agents, with capability grants** (the largest minor since 0.5.0). (Between releases, `main` carries unreleased work under a `-SNAPSHOT` version; see the [CHANGELOG](CHANGELOG.md) for what's landed since.) Highlights: **A2A v1** (agents are A2A servers + typed clients), full **multimodal** (vision across Claude/OpenAI/Ollama, audio STT/TTS tools with self-hosted Whisper/Qwen, image generation), an **eighth model provider — Google Gemini** (`model { gemini("gemini-2.5-flash") }`, a full from-scratch adapter with native SSE, function calling, `responseJsonSchema` decoding, and thought-summary reasoning), **capability grants** (`grants { allow(...); confirm(...) }` — `confirm` tools need the granting agent's authorization, fail-closed), **agent.json** serialization, the **RAG** seam, and richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains) plus HITL gates and an eval harness. The Layer-2 **sandbox backends** (Docker / egress proxy / read confinement) move to **0.9.0**. Additive only — drop-in on the 0.x line. Dependency line: Kotlin 2.4.0.

 **0.7.23 — maintainability + a model-error policy.** A behavior-preserving, drop-in release that closes the bulk of the code-smell remediation epic (#2790) and finishes the **AgenticLoop decomposition** begun in 0.7.21: the `Agent` god class splits into `InterceptorChain` + `ListenerRegistry` (#2793); `McpServer` into HTTP intake + a transport-agnostic `McpDispatcher` (#2795); the five composition operators' duplicated streaming-session scaffold collapses into one `agentSessionScope` (#2797); `LiveShow`'s banner + thread-shadowing spinner become their own units (#2798); `Forum`/`Branch` lose their dual-path duplication (#2802); a typed `GenerableCodec` seam collapses the `@Generable` casts to one boundary (#2803); and `executeAgentic`'s last setup block extracts as `resolveAllowedTools` (#3423). The one **new public API** is **`onLLMError`** (#3508): when a model is configured, a failed model call in the agentic loop fails fast and loud by default, with `onLLMError { e -> RespondWith(fallback) | Rethrow }` as the opt-in recovery hook. The detekt-baseline ratchet fell 423 → 415 and `@Suppress("UNCHECKED_CAST")` 42 → 30 across the release. Only #2791 (the turn-loop core of `executeAgentic`) remains open in the epic.

diff --git a/build.gradle.kts b/build.gradle.kts
index 8beaaadf..90b2bc1d 100644
--- a/build.gradle.kts
+++ b/build.gradle.kts
@@ -16,7 +16,7 @@ plugins {
 }

 group = "ai.deep-code"
-version = "0.8.0"
+version = "0.8.1-SNAPSHOT"

 repositories {
     mavenCentral()
diff --git a/docs/applications.md b/docs/applications.md
index aad94b13..a7572a12 100644
--- a/docs/applications.md
+++ b/docs/applications.md
@@ -41,7 +41,7 @@ Coding agents, CI/CD agents, DevSecOps. JVM-native, lives inside the build / rep
 | **IDE coding agents** — code review, in-editor refactoring, doc-generation assistants. | MCP-as-skills (`McpClient.toolSkills()`) fronts many tool servers from one agent, `Tool` typed handles compile-checked at agent construction, streaming via `agent.session(input)` with `AgentEvent.Token` / `ToolCall*` chunks. |
 | **CI/CD agents** — flaky-test diagnosis, dependency-bump validation, release-notes drafting. | `BudgetConfig` caps for runaway loops (`maxTurns` / `maxToolCalls` / `maxDuration` / `maxTokens` / `maxAgentDepth` / **`maxToolArgsBytes`** that rejects oversized injected args before the executor runs), `onBudgetExceeded` extension hook for the recoverable reasons, `Pipeline` composition for read → build → test → summarise chains, reproducible CI eval via `DeterministicModelClient`. |
 | **DevSecOps** — vulnerability triage, secret-leak drafting, IaC-policy review. | Declarative `ToolPolicy` records (capability evidence for compliance review), **`ToolCapabilityExtractor`** statically classifies what each executor body does (fs / net / process / env reads), `Forum` operator for multi-tool consensus, manifest-hash correlation in audit. |
-| **Code-review bots** — typed structured outputs that flow into existing JVM tooling. | `@Generable` outputs deserialise into your domain types directly — no JSON-to-class glue layer; constrained decoding wired across all seven providers. |
+| **Code-review bots** — typed structured outputs that flow into existing JVM tooling. | `@Generable` outputs deserialise into your domain types directly — no JSON-to-class glue layer; constrained decoding wired across all eight providers. |
 | **Build pipelines** — automated dependency-bump assessment, supply-chain summarisation. | **`onLLMError`** recovery hook: when the model is reachable but flakey, fall back to a typed canned response (`RespondWith(fallback)`) rather than failing the CI build; with no model, `implementedBy` skills run deterministically and no model error can arise. |

 **The positioning:** developer tooling lives where the JVM is already installed. The framework's correctness story (typed tools, single-placement rule, ambiguous-skill loud-fail, audit log) maps directly onto "we don't want a wrong commit / wrong merge / wrong release".
diff --git a/docs/comparison.md b/docs/comparison.md
index b39b9f0d..919c9770 100644
--- a/docs/comparison.md
+++ b/docs/comparison.md
@@ -30,7 +30,7 @@ A side-by-side for teams choosing a framework. Written with the constraint of be

 ## Where Agents.KT loses

-**Ecosystem.** LangChain has 700+ integrations (vector stores, retrievers, embedders, agents, callbacks). Agents.KT has 7 LLM providers (Ollama, Anthropic, OpenAI, DeepSeek, Kimi, OpenRouter, Perplexity — see [providers.md](providers.md)) and you write the rest. If your job is "wire up 12 SaaS APIs into a prompt pipeline by Friday," LangChain is the right tool, not this one.
+**Ecosystem.** LangChain has 700+ integrations (vector stores, retrievers, embedders, agents, callbacks). Agents.KT has 8 LLM providers (Ollama, Anthropic, OpenAI, DeepSeek, Kimi, OpenRouter, Perplexity, Gemini — see [providers.md](providers.md)) and you write the rest. If your job is "wire up 12 SaaS APIs into a prompt pipeline by Friday," LangChain is the right tool, not this one.

 **Python AI/ML interop.** If your team already has Python notebooks for embedding generation, fine-tuning, eval harnesses — running an Agents.KT layer next to them is a context switch. SK's Python flavor or LangChain stay in the same language.

@@ -139,7 +139,7 @@ A few shortcuts that point at one framework over the others:

 ## Status notes (2026-06)

-- **Agents.KT 0.7.23 (latest release)** — runtime ToolPolicy enforcement (in-JVM gate + OS sandbox), the standalone `agents-kt` CLI, tamper-evident audit ledger, fail-loud skill routing, and the `onLLMError` policy — on top of 0.6.0's permission manifests, JSONL audit export, OTel / LangSmith / Langfuse bridges, and constrained decoding. Unreleased `main` adds Perplexity as the seventh provider plus the `perplexitySearch` grounded tool (see CHANGELOG *Unreleased*).
+- **Agents.KT 0.8.0 (latest release)** — interoperable, multimodal agents with capability grants: **A2A v1** (server + typed client), full **multimodal** (audio STT/TTS, vision, image generation), the **RAG** seam, richer **composition** (`handoff` / `firstOf` / `.speculative` / `loopUntil` / aggregators / forum captains), **HITL** gates + an **eval** harness, an eighth model provider (**Google Gemini**), **agent.json** serialization, and the **capability-grants** DSL (`grants { allow / confirm }`) — on top of 0.7.x's runtime ToolPolicy enforcement (in-JVM gate + OS sandbox), the standalone `agents-kt` CLI, tamper-evident audit ledger, fail-loud skill routing, the `onLLMError` policy, the Perplexity connector + `perplexitySearch` grounded tool, and 0.6.0's permission manifests / JSONL audit export / OTel-LangSmith-Langfuse bridges / constrained decoding. *Deferred to 0.9.0:* the remaining Layer-2 sandbox backends (Docker / egress proxy / read confinement).
 - **LangChain 0.3.x** — stable, ecosystem mature. LCEL is the recommended composition surface.
 - **Semantic Kernel 1.x** — stable, MCP integration in preview.
 - **AutoGen 0.4.x** — major architectural rewrite landed; the new core/agentchat split is recent.
diff --git a/docs/permission-manifest.md b/docs/permission-manifest.md
index 369a97df..b924c5c8 100644
--- a/docs/permission-manifest.md
+++ b/docs/permission-manifest.md
@@ -30,7 +30,7 @@ Add the manifest module:
 ```kotlin
 dependencies {
     // published on Maven Central — use the latest released version (see the README quickstart)
-    implementation("ai.deep-code:agents-kt:0.7.24")
+    implementation("ai.deep-code:agents-kt:0.8.0")
     // in-repo module — build from this repository (not yet published to Central;
     // only agents-kt and agents-kt-ksp are)
     implementation(project(":agents-kt-manifest"))
diff --git a/docs/prd.md b/docs/prd.md
index b0e5dde6..0d9ec669 100644
--- a/docs/prd.md
+++ b/docs/prd.md
@@ -7,7 +7,12 @@
 ---
 
 **Product Requirements Document — Version 1.4**
-**March 2026 · CONFIDENTIAL**
+**March 2026 · Public design document**
+
+> This PRD is the living design vision for Agents.KT. It mixes **shipped**, **in-progress**, and
+> **planned/exploratory** capabilities — treat forward-looking sections as direction, not a shipped-
+> feature list. For what is actually released, see [CHANGELOG.md](../CHANGELOG.md), the
+> [roadmap](roadmap.md), and the [README](../README.md).
 
 K.Skobeltsyn Studio
 Konstantin Skobeltsyn, CEO
diff --git a/docs/streaming.md b/docs/streaming.md
index cb3208cb..dc0d854c 100644
--- a/docs/streaming.md
+++ b/docs/streaming.md
@@ -54,7 +54,7 @@ All subtypes carry `agentId`, `requestId`, `sessionId`, and `manifestHash`. `age
 
 ## Provider streaming status
 
-All seven providers stream at the wire: three adapters implement `ModelClient.chatStream` natively, and the four OpenAI-compatible providers inherit `OpenAiClient`'s SSE implementation. Numbers below are from the live integration tests under `./gradlew integrationTest` against real APIs.
+All eight providers stream at the wire: four adapters implement `ModelClient.chatStream` natively (Ollama NDJSON; Anthropic, OpenAI, and Gemini SSE), and the four OpenAI-compatible providers (DeepSeek / Kimi / OpenRouter / Perplexity) inherit `OpenAiClient`'s SSE implementation. Numbers below are from the live integration tests under `./gradlew integrationTest` against real APIs.
 
 | Provider | Protocol | File | Live measurement (count 1–10 prompt) |
 |---|---|---|---|
@@ -138,7 +138,7 @@ Concurrent legs (`Parallel` via `/`, `Forum` via `*`) demultiplex purely by `age
 
 The one fallback: an operator instance constructed **outside** its factory functions (`then` / `/` / `*` / `.loop` / `.branch`) has no recorded session exec — it executes non-streaming and only its boundary events appear. **Stage boundaries are first-class (#4491):** Pipeline sessions emit `StageStarted`/`StageCompleted` pairs around each direct component (agent stages by name, operator legs labeled `parallel`/`forum`/`loop`/`branch`; nested pipelines mark their own stages exactly once) — consumers no longer infer stage transitions from `agentId` flips.
 
-## Known gaps (current as of 0.7.24)
+## Known gaps (current as of 0.8.0)
 
 - *(closed by #4491: stage-boundary markers shipped — see Composition above.)*
 - *(closed by #4499: cancelling collection / `await()` now cancels the underlying suspending invocation — see Cancellation above.)*
diff --git a/docs/threat-model.md b/docs/threat-model.md
index 4133dc9e..18ee1fdf 100644
--- a/docs/threat-model.md
+++ b/docs/threat-model.md
@@ -197,7 +197,7 @@ Swarm.discover().forEach { sibling ->
 | Logging tool args / outputs to a file that gets shipped to a vendor log aggregator | Tool args / outputs often contain user PII or secrets. Redact at the `onToolUse` listener level before logging. The framework gives you the hook; it doesn't redact for you. |
 | Agent that calls itself recursively as a tool (via Swarm or otherwise) without a loop budget | `maxToolCalls` and `maxTurns` bound it, but the cost can spiral before the cap fires. Use `Loop` with explicit `maxIterations` for any self-feedback shape. |
 
-## What's enforced where (security-relevant, as of 0.7.24)
+## What's enforced where (security-relevant, as of 0.8.0)
 
 This is the canonical status table — README, `SECURITY.md`, and `production-hardening.md` summarize it; when they disagree, this page wins (and that disagreement is a doc bug worth filing).