Merged
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,21 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a

## [Unreleased]

### Added — `nlwebSearch` tool: query an NLWeb endpoint (#4541, PRD §12.9)

`tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent on its own model query an
[NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its
**schema.org**-structured content — and fold the ranked, typed results into context. Mirrors
`perplexitySearch`: marked `untrustedOutput = true` (fetched web content is wrapped in the
`{trusted:false}` envelope and the model is warned to treat it as data, #642), with pure
`buildNlWebAskBody` / `parseNlWebResponse` wire helpers and an injectable `NlWebSearchBackend` seam.
Posts to `<baseUrl>/ask` (no API key — NLWeb endpoints are public); `NlWebSearchOptions(site, mode =
LIST/SUMMARIZE/GENERATE)` selects the namespace and query mode; results render as a numbered list of
schema.org matches (name, `@type`, description, url) plus any summarize/generate answer. The first slice
of the agent↔web-content layer (epic #4539). 8 tests. (Every NLWeb endpoint is also an MCP server, so an
NLWeb `/mcp` URL is equally consumable through the existing MCP client; this tool is the zero-wiring
`/ask`-over-HTTP path.)
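
The numbered-list rendering described in this entry can be sketched roughly as below. This is a minimal illustration, not the renderer shipped in the PR: `Match` and `renderMatches` are hypothetical names, and the exact line layout (answer first, then `N. name [@type] - description (url)`) is an assumption.

```kotlin
// Hypothetical stand-in for the PR's NlWebResult; only the rendered fields are kept.
data class Match(
    val url: String,
    val name: String? = null,
    val schemaType: String? = null,
    val description: String? = null,
)

// Assumed layout: optional summarize/generate answer first, then one numbered line per match.
fun renderMatches(matches: List<Match>, answer: String? = null): String = buildString {
    if (answer != null) appendLine(answer).appendLine()
    matches.forEachIndexed { i, m ->
        append("${i + 1}. ${m.name ?: m.url}")   // fall back to the URL when no name is given
        m.schemaType?.let { append(" [$it]") }    // schema.org @type, when present
        m.description?.let { append(" - $it") }
        appendLine(" (${m.url})")
    }
}.trimEnd()
```

Optional fields are simply skipped, so a result that carries only a `url` still produces a usable line.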

## [0.8.0] — 2026-06-14

**Interoperable, multimodal agents — with capability grants.** The largest minor since 0.5.0:
3 changes: 2 additions & 1 deletion README.md
@@ -207,6 +207,7 @@ These APIs work in `main`, are unit-tested, and are exercised by integration tes
- **Vision input to models** — `LlmMessage(role = "user", content = "...", images = listOf(ImagePart(base64, ImagePart.WireMime.Png)))` (#2470 slice a) reaches all four built-in adapters: Ollama emits `images: [<b64>...]`, Claude emits `{type:"image", source:{type:"base64",...}}` content blocks, OpenAI emits `{type:"image_url", image_url:{url:"data:..."}}` content blocks, DeepSeek inherits OpenAI (silently ignored on non-vision models). Closed `ImagePart.WireMime { Png, Jpeg, Gif, Webp }` — no `String` mime. Programmatic `VisionFixtures.threeSquaresPng()` / `housePng()` (256×256, `BufferedImage`-rendered, ~5KB) + per-provider live tests (qwen3-vl:8b / Haiku 4.5 / gpt-4o-mini) with cost discipline. See [docs/multimodal.md](docs/multimodal.md#vision-input--talking-to-the-model-2470-slice-a).
- **Typed `Content.Image` at the agent surface** — `agent.invokeWithAttachments("describe", attachments = listOf(Content.Image(ref, ImageMime.Png)))` (#2470 slice b). Inject a `BlobStore` via `blobStore(store)` in the agent DSL; the runtime dereferences each `Content.Image` against the store, base64-encodes once, and attaches `ImagePart` to the first user message. Closed `ImageMime → ImagePart.WireMime` mapping covers all four variants. Misconfiguration errors fast (no `blobStore` configured, missing blob for a ref). Composes with snapshot/resume — refs travel in the snapshot; the same store dereferences on resume. Suspending sibling `invokeSuspendWithAttachments`. Live tests across all three vision providers via the agent surface. See [docs/multimodal.md](docs/multimodal.md#agent-attachments--typed-contentimage-at-the-invoke-surface-2470-slice-b).
- **Web-grounded search tool (`perplexitySearch`)** — `tools { +perplexitySearchTool(perplexityKey) }` lets an agent reasoning on its *own* model (Claude/OpenAI/Ollama/…) fetch live, cited facts from Perplexity's Sonar API. The tool is `untrustedOutput = true`, so results are auto-wrapped in the `{"trusted":false}` envelope and the model is warned to treat them as data, not instructions (#642) — web search is the canonical prompt-injection vector. The result renders the answer plus a numbered source list parsed from `search_results[]` (citations land in both the model context and the JSONL audit row). Controls via `perplexitySearchOptions { mode = SearchMode.ACADEMIC; recency = SearchRecency.WEEK; allowDomains("arxiv.org"); contextSize = SearchContextSize.HIGH; structuredOutput(MyType::class) }` map to `search_mode` / `search_recency_filter` / `search_domain_filter` / `web_search_options` / `response_format` json_schema (#3674). Key from `.secrets/perplexity-key`. See [docs/providers.md](docs/providers.md#web-grounded-search-tool-perplexitysearch-3676--3677).
- **NLWeb endpoint tool (`nlwebSearch`)** — `tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent query an [NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its **schema.org**-structured content — and fold the ranked, typed results into context (#4541, PRD §12.9). Like `perplexitySearch` it is `untrustedOutput = true` (fetched web content is treated as data, not instructions). Pass default query options with `NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)`. NLWeb endpoints need no API key. (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally consumable through the existing MCP client — this tool is the zero-wiring `/ask`-over-HTTP path.)
- **Prompt caching across providers** — `agent { caching { enabled = true; cacheSystemPrompt = true; cacheToolDefs = true; cacheConversation = Rolling; ttl = 1.hours; cacheable("doc-id") { ... } } }`. Vendor-neutral DSL drives Anthropic's explicit `cache_control` breakpoints (#2658), OpenAI / DeepSeek automatic prefix caching with a stable `prompt_cache_key` routing hint (#2659 / #2661), Ollama / vLLM / SGLang engine-level KV-cache reuse (no-op hints, #2662), and surfaces cache reads + writes + hit-rate on `TokenUsage` (#2663). A prefix-stability guard (#2657) detects silent cache-busters — timestamps, UUIDs, non-deterministic ordering inside cacheable segments — and warns before you pay for a single non-cached run. Off by default; non-breaking. See [docs/caching.md](docs/caching.md).
- **JSONL audit exporter** — `:agents-kt-observability` writes append-only, one-line-per-event audit rows with `requestId`, `sessionId`, `manifestHash`, agent/skill/tool ids, event type, provider, and model; raw arguments/results are omitted by default (#1914). See [docs/observability.md](docs/observability.md).
- **ObservabilityBridge adapters** — `.observe(OtelBridge(tracer))` maps runtime events to OTel spans (#1908), `.observe(LangSmithBridge(apiKey, project))` maps the same events to LangSmith run trees (#1909), and `.observe(LangfuseBridge(publicKey, secretKey))` maps them to Langfuse traces, generations, spans, and events (#1910), while keeping core vendor-free. See [docs/observability.md](docs/observability.md).
@@ -259,7 +260,7 @@ What the framework does **not** enforce — your responsibility:

### Known Limitations

- **Seven LLM providers shipped** — Ollama, Anthropic, OpenAI, DeepSeek, Kimi (Moonshot AI, #2697), OpenRouter (#2701), and Perplexity (Sonar, #3675) — the last with a `perplexitySearch` web-grounded search tool (#3676 / #3677). Google (Gemini) is the main adapter still on the roadmap (Phase 2); the injectable `ModelClient` covers test stubs and your own adapters in the meantime.
- **Eight LLM providers shipped** — Ollama, Anthropic, OpenAI, DeepSeek, Kimi (Moonshot AI, #2697), OpenRouter (#2701), Perplexity (Sonar, #3675; ships with a `perplexitySearch` web-grounded search tool, #3676 / #3677), and Google Gemini (#1917, a full from-scratch adapter with native SSE, function calling, and `responseJsonSchema` decoding). The injectable `ModelClient` covers test stubs and your own adapters.
- **Synchronous agentic loop** — `runBlocking` inside the loop until the suspend refactor lands (#638). Calling agents from existing coroutine scopes works but doesn't propagate cancellation cleanly.
- **No built-in MCP rate limiter** — use `McpServer` auth/policy plus a gateway for throttling. Agent/runtime audit events have a first-party JSONL exporter in `:agents-kt-observability`.
- **Streaming runtime** *(shipped — v0.5.0)*. `agent.session(input): AgentSession<OUT>` exposes `events: Flow<AgentEvent<OUT>>` — bracket events (`SkillStarted` / `SkillCompleted` / `Completed<OUT>` / `Failed`) plus mid-loop `Token` / `Reasoning` / `ToolCallStarted` / `ToolCallArgumentsDelta` / `ToolCallFinished` events as the agentic loop runs. All events carry `requestId`, `sessionId`, and `manifestHash` for audit correlation (#1913). All eight providers stream at the wire — Ollama (NDJSON), Anthropic, OpenAI, and Gemini (native SSE), with DeepSeek / Kimi / OpenRouter / Perplexity inheriting the OpenAI-compatible SSE path; live integration tests measure 19 / 2 / 19 chunks for the original three native adapters. `SkillCompleted.tokensUsed` and `Completed.tokensUsed` carry cumulative `TokenUsage` across all turns. The underlying `LlmChunk` sealed type + `ModelClient.chatStream(messages): Flow<LlmChunk>` foundation (#1722) is what custom adapters plug into. See [docs/streaming.md](docs/streaming.md) for the full API + the [v0.5.0 streaming premortem](docs/premortem-0.5.0-streaming.md) for design rationale.
2 changes: 1 addition & 1 deletion docs/prd.md
@@ -2949,7 +2949,7 @@ Tracking: epic `[interop] x402 agent payments`, deferred — seller-side experim

**Query shape** (the `/ask` and `/mcp` endpoints, same args): `query` (required), `site`, `prev` (conversation history — server is stateless), `mode` (`list` = ranked results, `summarize` = list + LLM summary, `generate` = full RAG answer), `streaming`. Response: `{query_id, results[]}` where each result is `{url, name, site, score, description, schema_object}` (`schema_object` = the schema.org JSON). Build tolerant of two divergent schemas — the implemented `schema_object` shape and the newer nlweb.ai v0.55 `query/context/prefer/meta` envelope.
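
As a rough illustration of the query shape above, the sketch below assembles a non-streaming `/ask` body from `query`, an optional `site`, and a `mode`. The names `jsonString` and `askBody` are hypothetical; `jsonString` is a minimal stand-in for a JSON string encoder, and `prev` is left out for brevity.

```kotlin
// Minimal JSON string escaper (an assumption; a real client would use its JSON library).
fun jsonString(s: String): String = buildString {
    append('"')
    for (c in s) when (c) {
        '"' -> append("\\\"")
        '\\' -> append("\\\\")
        '\n' -> append("\\n")
        '\r' -> append("\\r")
        '\t' -> append("\\t")
        else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
    }
    append('"')
}

// Builds the request body; `site` is omitted when null and streaming is always off.
fun askBody(query: String, site: String? = null, mode: String = "list"): String {
    val fields = buildList {
        add("\"query\":${jsonString(query)}")
        if (site != null) add("\"site\":${jsonString(site)}")
        add("\"mode\":${jsonString(mode)}")
        add("\"streaming\":false")
    }
    return fields.joinToString(",", prefix = "{", postfix = "}")
}
```

Disabling streaming keeps the response a single JSON document, which is the simplest thing to parse into `{query_id, results[]}`.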

**Client-side (consume NLWeb as knowledge) — do opportunistically, ~free.** A thin helper over the MCP client: point it at an NLWeb `/mcp` URL, `tools/call` the `ask` tool, surface each `schema_object` into a `KnowledgeProvider`/retrieval source. Mode mapping: `list`→retrieval source, `generate`→delegate-the-answer. The honest, shippable claim is *"agents.kt MCP clients can consume NLWeb endpoints today."*
**Client-side (consume NLWeb as knowledge) — SHIPPED (#4541).** `tools { +nlwebSearchTool(baseUrl) }` — a tool (mirroring `perplexitySearch`) that posts to an NLWeb `/ask` endpoint and folds the ranked schema.org results into the agent's context, `untrustedOutput = true`. `NlWebSearchOptions(site, mode = LIST/SUMMARIZE/GENERATE)` selects namespace + mode. No API key (NLWeb endpoints are public). This is the zero-wiring `/ask`-over-HTTP path; because every NLWeb endpoint is also an MCP server, an NLWeb `/mcp` URL is *equally* consumable through the existing MCP client (`tools/call` the `ask` tool) — so *"agents.kt agents can consume NLWeb endpoints today"* holds via both transports.

**Server-side (expose agent data as an NLWeb endpoint) — deferred, niche.** That means standing up schema.org-shaped data + a vector store + an LLM-in-the-loop retrieval pipeline behind `/ask` + `/mcp` — effectively building/operating a RAG service. An independent benchmark (Univ. Mannheim, [arXiv 2511.23281](https://arxiv.org/abs/2511.23281)) finds NLWeb *ties* RAG/MCP on effectiveness but plain RAG is more cost-effective — so NLWeb's value is standardization, not performance. This is an **application** concern, not a runtime primitive; defer unless a concrete consumer needs to discover our content over the open web.

48 changes: 48 additions & 0 deletions src/main/kotlin/agents_engine/model/HttpNlWebSearchBackend.kt
@@ -0,0 +1,48 @@
package agents_engine.model

import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import kotlin.time.Duration
import kotlin.time.toJavaDuration

/**
 * Default [NlWebSearchBackend] (#4541) — POSTs to `<baseUrl>/ask` and parses the
 * schema.org result list. NLWeb endpoints are public, so there is no auth header.
 * Reuses the same JDK HttpClient shape as [HttpPerplexitySearchBackend].
 */
class HttpNlWebSearchBackend(
    private val baseUrl: String,
    private val requestTimeout: Duration = OpenAiClient.DEFAULT_REQUEST_TIMEOUT,
    connectTimeout: Duration = OpenAiClient.DEFAULT_CONNECT_TIMEOUT,
    httpClient: HttpClient? = null,
) : NlWebSearchBackend {

    private val http: HttpClient = httpClient ?: HttpClient.newBuilder()
        .connectTimeout(connectTimeout.toJavaDuration())
        .build()

    override fun search(query: String, options: NlWebSearchOptions): NlWebSearchResult {
        val body = buildNlWebAskBody(query, options)
        val request = HttpRequest.newBuilder()
            .uri(URI.create("${baseUrl.trimEnd('/')}/ask"))
            .timeout(requestTimeout.toJavaDuration())
            .header("content-type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build()
        val response = http.send(request, HttpResponse.BodyHandlers.ofString())
        if (response.statusCode() >= HTTP_BAD_REQUEST) {
            // Surface the endpoint's own `error` payload when the body carries one.
            // Any other parse failure (e.g. an HTML error page) falls back to the status line.
            val failure = runCatching { parseNlWebResponse(response.body()) }.exceptionOrNull()
            if (failure is NlWebSearchException && failure.message.orEmpty().startsWith(ENDPOINT_ERROR_PREFIX)) {
                throw failure
            }
            throw NlWebSearchException("NLWeb HTTP ${response.statusCode()}: ${response.body().take(ERROR_BODY_CAP)}")
        }
        return parseNlWebResponse(response.body())
    }

    private companion object {
        const val HTTP_BAD_REQUEST = 400
        const val ERROR_BODY_CAP = 500

        // Must match the message prefix raised by parseNlWebResponse for a top-level `error`.
        const val ENDPOINT_ERROR_PREFIX = "NLWeb error:"
    }
}
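
The request construction in this backend can be exercised without a network call. The sketch below, which uses only the JDK `java.net.http` API, shows the trailing-slash normalization and the headers set on the `/ask` POST. `askRequest` is a hypothetical helper, and the 30-second default timeout is an assumption, not the project's `OpenAiClient` default.

```kotlin
import java.net.URI
import java.net.http.HttpRequest
import java.time.Duration

// Builds (but does not send) the POST; a trailing slash on baseUrl is tolerated.
fun askRequest(
    baseUrl: String,
    jsonBody: String,
    timeout: Duration = Duration.ofSeconds(30), // assumed value for illustration
): HttpRequest =
    HttpRequest.newBuilder()
        .uri(URI.create("${baseUrl.trimEnd('/')}/ask"))
        .timeout(timeout)
        .header("content-type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
        .build()
```

Because `HttpRequest` is an immutable value, a test can assert on `uri()`, `method()`, and `headers()` directly instead of standing up a server.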
9 changes: 9 additions & 0 deletions src/main/kotlin/agents_engine/model/NlWebMode.kt
@@ -0,0 +1,9 @@
package agents_engine.model

/**
 * NLWeb `/ask` query mode (#4541). `LIST` returns the ranked schema.org matches;
 * `SUMMARIZE` adds an LLM summary of the list; `GENERATE` is full RAG — the
 * endpoint composes a direct answer from the retrieved items. Sent lowercase on
 * the wire. Defaults to `LIST`.
 */
enum class NlWebMode { LIST, SUMMARIZE, GENERATE }
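
A small sketch of the lowercase wire mapping, using a local `Mode` enum as a stand-in for `NlWebMode`. The reverse lookup `modeFromWire` is hypothetical (the PR only sends the mode, it does not parse one) and is included to show one way the `LIST` default could be applied.

```kotlin
// Stand-in for NlWebMode so the snippet is self-contained.
enum class Mode { LIST, SUMMARIZE, GENERATE }

// Enum constant name, lowercased, is the value sent on the wire.
fun Mode.wire(): String = name.lowercase()

// Hypothetical reverse mapping: unknown or missing values fall back to LIST.
fun modeFromWire(value: String?): Mode =
    Mode.values().firstOrNull { it.name.equals(value, ignoreCase = true) } ?: Mode.LIST
```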
15 changes: 15 additions & 0 deletions src/main/kotlin/agents_engine/model/NlWebResult.kt
@@ -0,0 +1,15 @@
package agents_engine.model

/**
 * One result from an NLWeb `/ask` response (#4541): a ranked match backed by the
 * site's schema.org-structured content. [schemaType] is the `@type` lifted from
 * the result's `schema_object` (e.g. `Recipe`, `PodcastEpisode`) when present.
 */
data class NlWebResult(
    val url: String,
    val name: String? = null,
    val site: String? = null,
    val score: Double? = null,
    val description: String? = null,
    val schemaType: String? = null,
)
115 changes: 115 additions & 0 deletions src/main/kotlin/agents_engine/model/NlWebSearch.kt
@@ -0,0 +1,115 @@
package agents_engine.model

import agents_engine.generation.LenientJsonParser
import agents_engine.internal.toJsonString

/**
 * `agents_engine/model/NlWebSearch.kt` — #4541 (PRD §12.9), the `nlwebSearch`
 * tool factory plus its pure request/response wire helpers. Supporting types
 * live one-per-file alongside (`NlWebSearchArgs`, `NlWebMode`,
 * `NlWebSearchOptions`, `NlWebResult`, `NlWebSearchResult`, `NlWebSearchBackend`
 * + `HttpNlWebSearchBackend`, `NlWebSearchException`).
 *
 * [NLWeb](https://github.com/nlweb-ai/NLWeb) gives a website a natural-language
 * interface over its **schema.org-structured content**. This tool lets an agent
 * on its OWN model ask an NLWeb endpoint and fold the ranked, schema.org-typed
 * results into its context — the inbound, external-knowledge counterpart to
 * MCP-tools. It is marked [ToolDef.untrustedOutput] so the agentic loop wraps the
 * result in the `{trusted:false}` envelope and warns the model to treat fetched
 * web content as data, not instructions (#642).
 *
 * (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally
 * consumable through the existing MCP client; this tool is the zero-wiring
 * `/ask`-over-HTTP path for an agent on any model.)
 *
 * Register on an agent via the `tools { }` DSL:
 * ```
 * tools { +nlwebSearchTool(baseUrl = "https://example.com") }
 * ```
 */

/**
 * Build the NLWeb `/ask` request body. Pure + internal so it is unit-testable
 * without a live call. Streaming is disabled so the response is a single JSON
 * blob; [NlWebSearchOptions.site] is omitted when null.
 */
internal fun buildNlWebAskBody(query: String, options: NlWebSearchOptions): String {
    val fields = buildList {
        add(""""query":${query.toJsonString()}""")
        options.site?.let { add(""""site":${it.toJsonString()}""") }
        add(""""mode":${options.mode.name.lowercase().toJsonString()}""")
        add(""""streaming":false""")
    }
    return "{${fields.joinToString(",")}}"
}

/**
 * Parse an NLWeb `/ask` response body into an [NlWebSearchResult]. Pure +
 * internal so it is unit-testable without a live call.
 *
 * - `results[]` ← each `{url, name, site, score, description, schema_object}`;
 *   `schemaType` is `schema_object.@type` when present.
 * - `answer` ← a top-level `summary` / `answer` (present in `SUMMARIZE` /
 *   `GENERATE` mode), else null.
 * - a top-level `error` raises [NlWebSearchException].
 */
internal fun parseNlWebResponse(rawJson: String): NlWebSearchResult {
    val root = LenientJsonParser.parse(rawJson) as? Map<*, *>
        ?: throw NlWebSearchException("NLWeb response was not a JSON object")

    root["error"]?.let { err ->
        val message = (err as? Map<*, *>)?.get("message") ?: err
        throw NlWebSearchException("NLWeb error: $message")
    }

    val queryId = root["query_id"] as? String
    val answer = (root["summary"] as? String) ?: (root["answer"] as? String)
    val results = (root["results"] as? List<*>).orEmpty().mapNotNull { parseNlWebResult(it) }
    return NlWebSearchResult(results = results, answer = answer, queryId = queryId)
}

private fun parseNlWebResult(item: Any?): NlWebResult? {
    val obj = item as? Map<*, *> ?: return null
    val url = obj["url"] as? String ?: return null
    val schemaType = (obj["schema_object"] as? Map<*, *>)?.get("@type") as? String
    return NlWebResult(
        url = url,
        name = obj["name"] as? String,
        site = obj["site"] as? String,
        score = (obj["score"] as? Number)?.toDouble(),
        description = obj["description"] as? String,
        schemaType = schemaType,
    )
}

/**
 * Build the `nlweb_search` tool. Register via `tools { +nlwebSearchTool(baseUrl) }`.
 *
 * - `untrustedOutput = true` — results are auto-wrapped in the `{trusted:false}`
 *   envelope and the model is warned to treat them as data (#642).
 * - On a blank query or a backend failure, returns an `"ERROR: …"` string
 *   (the agentic loop's standard tool-error convention) rather than throwing.
 *
 * @param baseUrl the NLWeb endpoint base URL (e.g. `http://localhost:8000`); `/ask` is appended.
 * @param options default query options (`site` namespace + list/summarize/generate `mode`).
 * @param backend override the network backend — injected in tests.
 */
fun nlwebSearchTool(
    baseUrl: String,
    options: NlWebSearchOptions = NlWebSearchOptions(),
    backend: NlWebSearchBackend = HttpNlWebSearchBackend(baseUrl),
): ToolDef = ToolDef(
    name = "nlweb_search",
    description = "Query an NLWeb endpoint — a website's natural-language interface — for schema.org-" +
        "structured answers from its content (its catalog, articles, recipes, etc.). Arguments: {query: string}.",
    argsType = NlWebSearchArgs::class,
    untrustedOutput = true,
) { args ->
    val query = args["query"]?.toString().orEmpty()
    if (query.isBlank()) {
        "ERROR: missing 'query'"
    } else {
        runCatching { backend.search(query, options) }
            .getOrElse { e -> "ERROR: nlweb_search failed: ${e.message}" }
    }
}
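
The tool body's error convention (return an `"ERROR: …"` string instead of throwing) and the injectable backend seam can be illustrated in isolation. In this sketch `Backend` and `runSearch` are hypothetical, simplified stand-ins: the real `NlWebSearchBackend` also takes `NlWebSearchOptions` and returns an `NlWebSearchResult`, whereas here the result is reduced to a string.

```kotlin
// Hypothetical, simplified stand-in for NlWebSearchBackend.
fun interface Backend {
    fun search(query: String): String
}

// Mirrors the tool body: blank queries and backend failures become "ERROR: ..." strings.
fun runSearch(query: String, backend: Backend): String =
    if (query.isBlank()) {
        "ERROR: missing 'query'"
    } else {
        runCatching { backend.search(query) }
            .getOrElse { e -> "ERROR: nlweb_search failed: ${e.message}" }
    }
```

A test can then pass a lambda as the backend, for example `runSearch("ramen", Backend { error("boom") })`, and assert on the returned string without any network access.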
15 changes: 15 additions & 0 deletions src/main/kotlin/agents_engine/model/NlWebSearchArgs.kt
@@ -0,0 +1,15 @@
package agents_engine.model

import agents_engine.generation.Generable
import agents_engine.generation.Guide

/**
 * The single `@Generable` argument of the `nlwebSearch` tool (#4541): the
 * natural-language query to ask an [NLWeb](https://github.com/nlweb-ai/NLWeb)
 * endpoint, which answers from a website's schema.org-structured content.
 */
@Generable("Arguments for a natural-language query against an NLWeb endpoint")
data class NlWebSearchArgs(
    @Guide("The natural-language query to ask the NLWeb site")
    val query: String,
)