diff --git a/CHANGELOG.md b/CHANGELOG.md index 6fac69b..c80b531 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,21 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a ## [Unreleased] +### Added — `nlwebSearch` tool: query an NLWeb endpoint (#4541, PRD §12.9) + +`tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent on its own model query an +[NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its +**schema.org**-structured content — and fold the ranked, typed results into context. Mirrors +`perplexitySearch`: marked `untrustedOutput = true` (fetched web content is wrapped in the +`{trusted:false}` envelope and the model is warned to treat it as data, #642), with pure +`buildNlWebAskBody` / `parseNlWebResponse` wire helpers and an injectable `NlWebSearchBackend` seam. +Posts to `/ask` (no API key — NLWeb endpoints are public); `NlWebSearchOptions(site, mode = +LIST/SUMMARIZE/GENERATE)` selects the namespace and query mode; results render as a numbered list of +schema.org matches (name, `@type`, description, url) plus any summarize/generate answer. The first slice +of the agent↔web-content layer (epic #4539). 8 tests. (Every NLWeb endpoint is also an MCP server, so an +NLWeb `/mcp` URL is equally consumable through the existing MCP client; this tool is the zero-wiring +`/ask`-over-HTTP path.) 
+ ## [0.8.0] — 2026-06-14 **Interoperable, multimodal agents — with capability grants.** The largest minor since 0.5.0: diff --git a/README.md b/README.md index 2bfaafe..09f80d3 100644 --- a/README.md +++ b/README.md @@ -207,6 +207,7 @@ These APIs work in `main`, are unit-tested, and are exercised by integration tes - **Vision input to models** — `LlmMessage(role = "user", content = "...", images = listOf(ImagePart(base64, ImagePart.WireMime.Png)))` (#2470 slice a) reaches all four built-in adapters: Ollama emits `images: [...]`, Claude emits `{type:"image", source:{type:"base64",...}}` content blocks, OpenAI emits `{type:"image_url", image_url:{url:"data:..."}}` content blocks, DeepSeek inherits OpenAI (silently ignored on non-vision models). Closed `ImagePart.WireMime { Png, Jpeg, Gif, Webp }` — no `String` mime. Programmatic `VisionFixtures.threeSquaresPng()` / `housePng()` (256×256, `BufferedImage`-rendered, ~5KB) + per-provider live tests (qwen3-vl:8b / Haiku 4.5 / gpt-4o-mini) with cost discipline. See [docs/multimodal.md](docs/multimodal.md#vision-input--talking-to-the-model-2470-slice-a). - **Typed `Content.Image` at the agent surface** — `agent.invokeWithAttachments("describe", attachments = listOf(Content.Image(ref, ImageMime.Png)))` (#2470 slice b). Inject a `BlobStore` via `blobStore(store)` in the agent DSL; the runtime dereferences each `Content.Image` against the store, base64-encodes once, and attaches `ImagePart` to the first user message. Closed `ImageMime → ImagePart.WireMime` mapping covers all four variants. Misconfiguration errors fast (no `blobStore` configured, missing blob for a ref). Composes with snapshot/resume — refs travel in the snapshot; the same store dereferences on resume. Suspending sibling `invokeSuspendWithAttachments`. Live tests across all three vision providers via the agent surface. See [docs/multimodal.md](docs/multimodal.md#agent-attachments--typed-contentimage-at-the-invoke-surface-2470-slice-b). 
- **Web-grounded search tool (`perplexitySearch`)** — `tools { +perplexitySearchTool(perplexityKey) }` lets an agent reasoning on its *own* model (Claude/OpenAI/Ollama/…) fetch live, cited facts from Perplexity's Sonar API. The tool is `untrustedOutput = true`, so results are auto-wrapped in the `{"trusted":false}` envelope and the model is warned to treat them as data, not instructions (#642) — web search is the canonical prompt-injection vector. The result renders the answer plus a numbered source list parsed from `search_results[]` (citations land in both the model context and the JSONL audit row). Controls via `perplexitySearchOptions { mode = SearchMode.ACADEMIC; recency = SearchRecency.WEEK; allowDomains("arxiv.org"); contextSize = SearchContextSize.HIGH; structuredOutput(MyType::class) }` map to `search_mode` / `search_recency_filter` / `search_domain_filter` / `web_search_options` / `response_format` json_schema (#3674). Key from `.secrets/perplexity-key`. See [docs/providers.md](docs/providers.md#web-grounded-search-tool-perplexitysearch-3676--3677). +- **NLWeb endpoint tool (`nlwebSearch`)** — `tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent query an [NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its **schema.org**-structured content — and fold the ranked, typed results into context (#4541, PRD §12.9). Like `perplexitySearch` it is `untrustedOutput = true` (fetched web content is treated as data, not instructions). Pass the namespace and query mode via `NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)`. NLWeb endpoints need no API key. (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally consumable through the existing MCP client — this tool is the zero-wiring `/ask`-over-HTTP path.) 
- **Prompt caching across providers** — `agent { caching { enabled = true; cacheSystemPrompt = true; cacheToolDefs = true; cacheConversation = Rolling; ttl = 1.hours; cacheable("doc-id") { ... } } }`. Vendor-neutral DSL drives Anthropic's explicit `cache_control` breakpoints (#2658), OpenAI / DeepSeek automatic prefix caching with a stable `prompt_cache_key` routing hint (#2659 / #2661), Ollama / vLLM / SGLang engine-level KV-cache reuse (no-op hints, #2662), and surfaces cache reads + writes + hit-rate on `TokenUsage` (#2663). A prefix-stability guard (#2657) detects silent cache-busters — timestamps, UUIDs, non-deterministic ordering inside cacheable segments — and warns before you pay for a single non-cached run. Off by default; non-breaking. See [docs/caching.md](docs/caching.md). - **JSONL audit exporter** — `:agents-kt-observability` writes append-only, one-line-per-event audit rows with `requestId`, `sessionId`, `manifestHash`, agent/skill/tool ids, event type, provider, and model; raw arguments/results are omitted by default (#1914). See [docs/observability.md](docs/observability.md). - **ObservabilityBridge adapters** — `.observe(OtelBridge(tracer))` maps runtime events to OTel spans (#1908), `.observe(LangSmithBridge(apiKey, project))` maps the same events to LangSmith run trees (#1909), and `.observe(LangfuseBridge(publicKey, secretKey))` maps them to Langfuse traces, generations, spans, and events (#1910), while keeping core vendor-free. See [docs/observability.md](docs/observability.md). @@ -259,7 +260,7 @@ What the framework does **not** enforce — your responsibility: ### Known Limitations -- **Seven LLM providers shipped** — Ollama, Anthropic, OpenAI, DeepSeek, Kimi (Moonshot AI, #2697), OpenRouter (#2701), and Perplexity (Sonar, #3675) — the last with a `perplexitySearch` web-grounded search tool (#3676 / #3677). 
Google (Gemini) is the main adapter still on the roadmap (Phase 2); the injectable `ModelClient` covers test stubs and your own adapters in the meantime. +- **Eight LLM providers shipped** — Ollama, Anthropic, OpenAI, DeepSeek, Kimi (Moonshot AI, #2697), OpenRouter (#2701), Perplexity (Sonar, #3675) — the last with a `perplexitySearch` web-grounded search tool (#3676 / #3677) — and Google Gemini (#1917, a full from-scratch adapter with native SSE, function calling, and `responseJsonSchema` decoding). The injectable `ModelClient` covers test stubs and your own adapters. - **Synchronous agentic loop** — `runBlocking` inside the loop until the suspend refactor lands (#638). Calling agents from existing coroutine scopes works but doesn't propagate cancellation cleanly. - **No built-in MCP rate limiter** — use `McpServer` auth/policy plus a gateway for throttling. Agent/runtime audit events have a first-party JSONL exporter in `:agents-kt-observability`. - **Streaming runtime** *(shipped — v0.5.0)*. `agent.session(input): AgentSession` exposes `events: Flow>` — bracket events (`SkillStarted` / `SkillCompleted` / `Completed` / `Failed`) plus mid-loop `Token` / `Reasoning` / `ToolCallStarted` / `ToolCallArgumentsDelta` / `ToolCallFinished` events as the agentic loop runs. All events carry `requestId`, `sessionId`, and `manifestHash` for audit correlation (#1913). All eight providers stream at the wire — Ollama (NDJSON), Anthropic, OpenAI, and Gemini (native SSE), with DeepSeek / Kimi / OpenRouter / Perplexity inheriting the OpenAI-compatible SSE path; live integration tests measure 19 / 2 / 19 chunks for the original three native adapters. `SkillCompleted.tokensUsed` and `Completed.tokensUsed` carry cumulative `TokenUsage` across all turns. The underlying `LlmChunk` sealed type + `ModelClient.chatStream(messages): Flow` foundation (#1722) is what custom adapters plug into. 
See [docs/streaming.md](docs/streaming.md) for the full API + the [v0.5.0 streaming premortem](docs/premortem-0.5.0-streaming.md) for design rationale. diff --git a/docs/prd.md b/docs/prd.md index 0d9ec66..7d8bb97 100644 --- a/docs/prd.md +++ b/docs/prd.md @@ -2949,7 +2949,7 @@ Tracking: epic `[interop] x402 agent payments`, deferred — seller-side experim **Query shape** (the `/ask` and `/mcp` endpoints, same args): `query` (required), `site`, `prev` (conversation history — server is stateless), `mode` (`list` = ranked results, `summarize` = list + LLM summary, `generate` = full RAG answer), `streaming`. Response: `{query_id, results[]}` where each result is `{url, name, site, score, description, schema_object}` (`schema_object` = the schema.org JSON). Build tolerant of two divergent schemas — the implemented `schema_object` shape and the newer nlweb.ai v0.55 `query/context/prefer/meta` envelope. -**Client-side (consume NLWeb as knowledge) — do opportunistically, ~free.** A thin helper over the MCP client: point it at an NLWeb `/mcp` URL, `tools/call` the `ask` tool, surface each `schema_object` into a `KnowledgeProvider`/retrieval source. Mode mapping: `list`→retrieval source, `generate`→delegate-the-answer. The honest, shippable claim is *"agents.kt MCP clients can consume NLWeb endpoints today."* +**Client-side (consume NLWeb as knowledge) — SHIPPED (#4541).** `tools { +nlwebSearchTool(baseUrl) }` — a tool (mirroring `perplexitySearch`) that posts to an NLWeb `/ask` endpoint and folds the ranked schema.org results into the agent's context, `untrustedOutput = true`. `NlWebSearchOptions(site, mode = LIST/SUMMARIZE/GENERATE)` selects namespace + mode. No API key (NLWeb endpoints are public). 
This is the zero-wiring `/ask`-over-HTTP path; because every NLWeb endpoint is also an MCP server, an NLWeb `/mcp` URL is *equally* consumable through the existing MCP client (`tools/call` the `ask` tool) — so *"agents.kt agents can consume NLWeb endpoints today"* holds via both transports. **Server-side (expose agent data as an NLWeb endpoint) — deferred, niche.** That means standing up schema.org-shaped data + a vector store + an LLM-in-the-loop retrieval pipeline behind `/ask` + `/mcp` — effectively building/operating a RAG service. An independent benchmark (Univ. Mannheim, [arXiv 2511.23281](https://arxiv.org/abs/2511.23281)) finds NLWeb *ties* RAG/MCP on effectiveness but plain RAG is more cost-effective — so NLWeb's value is standardization, not performance. This is an **application** concern, not a runtime primitive; defer unless a concrete consumer needs to discover our content over the open web. diff --git a/src/main/kotlin/agents_engine/model/HttpNlWebSearchBackend.kt b/src/main/kotlin/agents_engine/model/HttpNlWebSearchBackend.kt new file mode 100644 index 0000000..872bc97 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/HttpNlWebSearchBackend.kt @@ -0,0 +1,48 @@ +package agents_engine.model + +import java.net.URI +import java.net.http.HttpClient +import java.net.http.HttpRequest +import java.net.http.HttpResponse +import kotlin.time.Duration +import kotlin.time.toJavaDuration + +/** + * Default [NlWebSearchBackend] (#4541) — POSTs to `/ask` and parses the + * schema.org result list. NLWeb endpoints are public, so there is no auth header. + * Reuses the same JDK HttpClient shape as [HttpPerplexitySearchBackend]. + */ +class HttpNlWebSearchBackend( + private val baseUrl: String, + private val requestTimeout: Duration = OpenAiClient.DEFAULT_REQUEST_TIMEOUT, + connectTimeout: Duration = OpenAiClient.DEFAULT_CONNECT_TIMEOUT, + httpClient: HttpClient? 
= null, +) : NlWebSearchBackend { + + private val http: HttpClient = httpClient ?: HttpClient.newBuilder() + .connectTimeout(connectTimeout.toJavaDuration()) + .build() + + override fun search(query: String, options: NlWebSearchOptions): NlWebSearchResult { + val body = buildNlWebAskBody(query, options) + val request = HttpRequest.newBuilder() + .uri(URI.create("${baseUrl.trimEnd('/')}/ask")) + .timeout(requestTimeout.toJavaDuration()) + .header("content-type", "application/json") + .POST(HttpRequest.BodyPublishers.ofString(body)) + .build() + val response = http.send(request, HttpResponse.BodyHandlers.ofString()) + if (response.statusCode() >= HTTP_BAD_REQUEST) { + // Try to surface the endpoint's error message; fall back to the status line. + val parsed = runCatching { parseNlWebResponse(response.body()) } + parsed.exceptionOrNull()?.let { throw it } + throw NlWebSearchException("NLWeb HTTP ${response.statusCode()}: ${response.body().take(ERROR_BODY_CAP)}") + } + return parseNlWebResponse(response.body()) + } + + private companion object { + const val HTTP_BAD_REQUEST = 400 + const val ERROR_BODY_CAP = 500 + } +} diff --git a/src/main/kotlin/agents_engine/model/NlWebMode.kt b/src/main/kotlin/agents_engine/model/NlWebMode.kt new file mode 100644 index 0000000..b47d16f --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebMode.kt @@ -0,0 +1,9 @@ +package agents_engine.model + +/** + * NLWeb `/ask` query mode (#4541). `LIST` returns the ranked schema.org matches; + * `SUMMARIZE` adds an LLM summary of the list; `GENERATE` is full RAG — the + * endpoint composes a direct answer from the retrieved items. Sent lowercase on + * the wire. Defaults to `LIST`. 
+ */ +enum class NlWebMode { LIST, SUMMARIZE, GENERATE } diff --git a/src/main/kotlin/agents_engine/model/NlWebResult.kt b/src/main/kotlin/agents_engine/model/NlWebResult.kt new file mode 100644 index 0000000..1045752 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebResult.kt @@ -0,0 +1,15 @@ +package agents_engine.model + +/** + * One result from an NLWeb `/ask` response (#4541): a ranked match backed by the + * site's schema.org-structured content. [schemaType] is the `@type` lifted from + * the result's `schema_object` (e.g. `Recipe`, `PodcastEpisode`) when present. + */ +data class NlWebResult( + val url: String, + val name: String? = null, + val site: String? = null, + val score: Double? = null, + val description: String? = null, + val schemaType: String? = null, +) diff --git a/src/main/kotlin/agents_engine/model/NlWebSearch.kt b/src/main/kotlin/agents_engine/model/NlWebSearch.kt new file mode 100644 index 0000000..7701de5 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearch.kt @@ -0,0 +1,115 @@ +package agents_engine.model + +import agents_engine.generation.LenientJsonParser +import agents_engine.internal.toJsonString + +/** + * `agents_engine/model/NlWebSearch.kt` — #4541 (PRD §12.9), the `nlwebSearch` + * tool factory plus its pure request/response wire helpers. Supporting types + * live one-per-file alongside (`NlWebSearchArgs`, `NlWebMode`, + * `NlWebSearchOptions`, `NlWebResult`, `NlWebSearchResult`, `NlWebSearchBackend` + * + `HttpNlWebSearchBackend`, `NlWebSearchException`). + * + * [NLWeb](https://github.com/nlweb-ai/NLWeb) gives a website a natural-language + * interface over its **schema.org-structured content**. This tool lets an agent + * on its OWN model ask an NLWeb endpoint and fold the ranked, schema.org-typed + * results into its context — the inbound, external-knowledge counterpart to + * MCP-tools. 
It is marked [ToolDef.untrustedOutput] so the agentic loop wraps the + * result in the `{trusted:false}` envelope and warns the model to treat fetched + * web content as data, not instructions (#642). + * + * (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally + * consumable through the existing MCP client; this tool is the zero-wiring + * `/ask`-over-HTTP path for an agent on any model.) + * + * Register on an agent via the `tools { }` DSL: + * ``` + * tools { +nlwebSearchTool(baseUrl = "https://example.com") } + * ``` + */ + +/** + * Build the NLWeb `/ask` request body. Pure + internal so it is unit-testable + * without a live call. Streaming is disabled so the response is a single JSON + * blob; [NlWebSearchOptions.site] is omitted when null. + */ +internal fun buildNlWebAskBody(query: String, options: NlWebSearchOptions): String { + val fields = buildList { + add(""""query":${query.toJsonString()}""") + options.site?.let { add(""""site":${it.toJsonString()}""") } + add(""""mode":${options.mode.name.lowercase().toJsonString()}""") + add(""""streaming":false""") + } + return "{${fields.joinToString(",")}}" +} + +/** + * Parse an NLWeb `/ask` response body into an [NlWebSearchResult]. Pure + + * internal so it is unit-testable without a live call. + * + * - `results[]` ← each `{url, name, site, score, description, schema_object}`; + * `schemaType` is `schema_object.@type` when present. + * - `answer` ← a top-level `summary` / `answer` (present in `SUMMARIZE` / + * `GENERATE` mode), else null. + * - a top-level `error` raises [NlWebSearchException]. + */ +internal fun parseNlWebResponse(rawJson: String): NlWebSearchResult { + val root = LenientJsonParser.parse(rawJson) as? Map<*, *> + ?: throw NlWebSearchException("NLWeb response was not a JSON object") + + root["error"]?.let { err -> + val message = (err as? Map<*, *>)?.get("message") ?: err + throw NlWebSearchException("NLWeb error: $message") + } + + val queryId = root["query_id"] as? 
String + val answer = (root["summary"] as? String) ?: (root["answer"] as? String) + val results = (root["results"] as? List<*>).orEmpty().mapNotNull { parseNlWebResult(it) } + return NlWebSearchResult(results = results, answer = answer, queryId = queryId) +} + +private fun parseNlWebResult(item: Any?): NlWebResult? { + val obj = item as? Map<*, *> ?: return null + val url = obj["url"] as? String ?: return null + val schemaType = (obj["schema_object"] as? Map<*, *>)?.get("@type") as? String + return NlWebResult( + url = url, + name = obj["name"] as? String, + site = obj["site"] as? String, + score = (obj["score"] as? Number)?.toDouble(), + description = obj["description"] as? String, + schemaType = schemaType, + ) +} + +/** + * Build the `nlweb_search` tool. Register via `tools { +nlwebSearchTool(baseUrl) }`. + * + * - `untrustedOutput = true` — results are auto-wrapped in the `{trusted:false}` + * envelope and the model is warned to treat them as data (#642). + * - On a blank query or a backend failure, returns an `"ERROR: …"` string + * (the agentic loop's standard tool-error convention) rather than throwing. + * + * @param baseUrl the NLWeb endpoint base URL (e.g. `http://localhost:8000`); `/ask` is appended. + * @param options default query options (`site` namespace + list/summarize/generate `mode`). + * @param backend override the network backend — injected in tests. + */ +fun nlwebSearchTool( + baseUrl: String, + options: NlWebSearchOptions = NlWebSearchOptions(), + backend: NlWebSearchBackend = HttpNlWebSearchBackend(baseUrl), +): ToolDef = ToolDef( + name = "nlweb_search", + description = "Query an NLWeb endpoint — a website's natural-language interface — for schema.org-" + + "structured answers from its content (its catalog, articles, recipes, etc.). 
Arguments: {query: string}.", + argsType = NlWebSearchArgs::class, + untrustedOutput = true, +) { args -> + val query = args["query"]?.toString().orEmpty() + if (query.isBlank()) { + "ERROR: missing 'query'" + } else { + runCatching { backend.search(query, options) } + .getOrElse { e -> "ERROR: nlweb_search failed: ${e.message}" } + } +} diff --git a/src/main/kotlin/agents_engine/model/NlWebSearchArgs.kt b/src/main/kotlin/agents_engine/model/NlWebSearchArgs.kt new file mode 100644 index 0000000..5af33d2 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearchArgs.kt @@ -0,0 +1,15 @@ +package agents_engine.model + +import agents_engine.generation.Generable +import agents_engine.generation.Guide + +/** + * The single `@Generable` argument of the `nlwebSearch` tool (#4541): the + * natural-language query to ask an [NLWeb](https://github.com/nlweb-ai/NLWeb) + * endpoint, which answers from a website's schema.org-structured content. + */ +@Generable("Arguments for a natural-language query against an NLWeb endpoint") +data class NlWebSearchArgs( + @Guide("The natural-language query to ask the NLWeb site") + val query: String, +) diff --git a/src/main/kotlin/agents_engine/model/NlWebSearchBackend.kt b/src/main/kotlin/agents_engine/model/NlWebSearchBackend.kt new file mode 100644 index 0000000..1823de1 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearchBackend.kt @@ -0,0 +1,9 @@ +package agents_engine.model + +/** + * The seam the `nlwebSearch` tool calls (#4541) — injectable so tests can return + * a canned result without network. The default is [HttpNlWebSearchBackend]. 
+ */ +fun interface NlWebSearchBackend { + fun search(query: String, options: NlWebSearchOptions): NlWebSearchResult +} diff --git a/src/main/kotlin/agents_engine/model/NlWebSearchException.kt b/src/main/kotlin/agents_engine/model/NlWebSearchException.kt new file mode 100644 index 0000000..f524049 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearchException.kt @@ -0,0 +1,4 @@ +package agents_engine.model + +/** Raised when an NLWeb `/ask` call returns an error envelope or an HTTP status of 400 or above (#4541). */ +class NlWebSearchException(message: String) : RuntimeException(message) diff --git a/src/main/kotlin/agents_engine/model/NlWebSearchOptions.kt b/src/main/kotlin/agents_engine/model/NlWebSearchOptions.kt new file mode 100644 index 0000000..f7216f1 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearchOptions.kt @@ -0,0 +1,12 @@ +package agents_engine.model + +/** + * Options for the `nlwebSearch` tool (#4541). [site] restricts the query to a + * configured site/namespace on the endpoint (NLWeb's `site` token); [mode] + * selects list / summarize / generate. Both are optional — a bare instance asks + * the whole endpoint in `LIST` mode. + */ +data class NlWebSearchOptions( + val site: String? = null, + val mode: NlWebMode = NlWebMode.LIST, +) diff --git a/src/main/kotlin/agents_engine/model/NlWebSearchResult.kt b/src/main/kotlin/agents_engine/model/NlWebSearchResult.kt new file mode 100644 index 0000000..252f609 --- /dev/null +++ b/src/main/kotlin/agents_engine/model/NlWebSearchResult.kt @@ -0,0 +1,33 @@ +package agents_engine.model + +/** + * Parsed result of an NLWeb `/ask` query (#4541). [render] formats the ranked + * matches into the text the agentic loop feeds back to the model (then wraps in + * the untrusted envelope) and records in the audit row — schema.org results from + * a website are external content, so they are treated as data, not instructions. 
+ * [answer] holds the LLM-composed reply when the endpoint ran in + * `SUMMARIZE` / `GENERATE` mode (null in `LIST` mode). + */ +data class NlWebSearchResult( + val results: List<NlWebResult>, + val answer: String? = null, + val queryId: String? = null, +) { + fun render(): String = buildString { + answer?.takeIf { it.isNotBlank() }?.let { append(it.trim()).append("\n\n") } + if (results.isEmpty()) { + if (answer.isNullOrBlank()) append("No results.") + } else { + append("Results:") + results.forEachIndexed { i, r -> + append("\n[").append(i + 1).append("] ") + r.name?.takeIf { it.isNotBlank() }?.let { append(it) } + r.schemaType?.takeIf { it.isNotBlank() }?.let { append(" (").append(it).append(")") } + r.description?.takeIf { it.isNotBlank() }?.let { append(" — ").append(it.trim()) } + append("\n ").append(r.url) + } + } + } + + override fun toString(): String = render() +} diff --git a/src/main/resources/internals-agent/model/NlWebSearch.md b/src/main/resources/internals-agent/model/NlWebSearch.md new file mode 100644 index 0000000..f34102c --- /dev/null +++ b/src/main/resources/internals-agent/model/NlWebSearch.md @@ -0,0 +1,44 @@ +--- +description: Source-file knowledge for agents_engine/model/NlWebSearch.kt — the nlwebSearch tool factory (#4541, PRD §12.9) that queries an NLWeb endpoint's /ask API over HTTP and folds schema.org results into an agent's context. untrustedOutput=true; pure buildNlWebAskBody/parseNlWebResponse helpers; injectable NlWebSearchBackend (default HttpNlWebSearchBackend, no API key). Call when the IDE LLM needs to reason about consuming NLWeb / agent↔web-content retrieval. +--- + +# `agents_engine/model/NlWebSearch.kt` — the `nlwebSearch` tool (#4541) + +[NLWeb](https://github.com/nlweb-ai/NLWeb) gives a website a natural-language interface over its **schema.org**-structured content. 
`nlwebSearchTool(baseUrl)` is a `ToolDef` (mirroring `perplexitySearchTool`) that lets an agent on its OWN model query an NLWeb endpoint and fold the ranked, schema.org-typed results into context — the inbound, external-knowledge counterpart to MCP-tools. + +```kotlin +agent("researcher") { + model { claude("claude-opus-4-7"); apiKey = anthropicKey } // your own model + tools { +nlwebSearchTool(baseUrl = "https://example.com", + options = NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)) } + skills { /* … */ } +} +``` + +## Wire shape + +- **Request:** POST `/ask` with `{query, site?, mode, streaming:false}` (`buildNlWebAskBody`). `mode` ∈ `list` / `summarize` / `generate` (lowercased). `site` omitted when null. +- **Response (`parseNlWebResponse`):** `{query_id, results:[{url, name, site, score, description, schema_object}], summary?}`. Each result → `NlWebResult` (`schemaType` = `schema_object.@type`). `answer` ← top-level `summary` / `answer` (present in summarize/generate). A top-level `error` (string or `{message}`) raises `NlWebSearchException`. +- **No API key** — NLWeb endpoints are public. `baseUrl.trimEnd('/')` avoids a double slash before `/ask`. + +## Security + +- `untrustedOutput = true` (#642): fetched web content is wrapped in the `{trusted:false}` envelope and the model is warned to treat it as data, not instructions — NLWeb returns external website content, an injection vector. Same contract as `perplexitySearch`. +- On blank query or backend failure the executor returns an `"ERROR: …"` string (the agentic-loop tool-error convention), never throws. + +## Result rendering + +`NlWebSearchResult.render()` emits any `answer` first, then a numbered list of matches (`name (@type) — description` + url). This text is what feeds back to the model and lands in the JSONL audit row. 
+ +## Seams & types (one-per-file, #3199) + +`NlWebSearchArgs` (`@Generable {query}`), `NlWebMode` (enum), `NlWebSearchOptions`, `NlWebResult`, `NlWebSearchResult`, `NlWebSearchBackend` (`fun interface` — inject in tests), `HttpNlWebSearchBackend` (default), `NlWebSearchException`. `buildNlWebAskBody` / `parseNlWebResponse` are pure + `internal` for hermetic unit tests. + +## Two transports + +This tool is the zero-wiring `/ask`-over-HTTP path. Because **every NLWeb endpoint is also an MCP server**, an NLWeb `/mcp` URL is equally consumable through the existing MCP client (`tools/call` the `ask` tool) — so an agents.kt agent can consume NLWeb either way. + +## Related files + +- `PerplexitySearch.kt` — the structural sibling (untrusted web-search tool with build/parse helpers + injectable backend). +- `ToolDef.kt` — `untrustedOutput`, `argsType`, `executor`. diff --git a/src/test/kotlin/agents_engine/model/NlWebSearchTest.kt b/src/test/kotlin/agents_engine/model/NlWebSearchTest.kt new file mode 100644 index 0000000..f7ba1c6 --- /dev/null +++ b/src/test/kotlin/agents_engine/model/NlWebSearchTest.kt @@ -0,0 +1,102 @@ +package agents_engine.model + +import org.junit.jupiter.api.assertThrows +import kotlin.test.Test +import kotlin.test.assertEquals +import kotlin.test.assertNull +import kotlin.test.assertTrue + +// #4541 (PRD §12.9) — the nlwebSearch tool. Hermetic: pure build/parse wire helpers + the tool +// exercised through an injected backend (no network). Mirrors the perplexitySearch test shape. 
+ +class NlWebSearchTest { + + @Test + fun `buildNlWebAskBody includes query, mode, streaming and omits a null site`() { + val body = buildNlWebAskBody("podcasts about \"AI\"", NlWebSearchOptions()) + assertTrue("\"query\":\"podcasts about \\\"AI\\\"\"" in body, body) + assertTrue("\"mode\":\"list\"" in body, body) + assertTrue("\"streaming\":false" in body, body) + assertTrue("\"site\"" !in body, "null site omitted: $body") + } + + @Test + fun `buildNlWebAskBody emits site and lowercased mode when set`() { + val body = buildNlWebAskBody("q", NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)) + assertTrue("\"site\":\"podcasts\"" in body, body) + assertTrue("\"mode\":\"generate\"" in body, body) + } + + @Test + fun `parseNlWebResponse parses results, query_id, and schema type`() { + val json = """ + {"query_id":"abc123","results":[ + {"url":"https://x/ep/42","name":"AI Safety","site":"podcasts","score":85, + "description":"alignment talk","schema_object":{"@type":"PodcastEpisode","name":"AI Safety"}} + ]} + """.trimIndent() + val r = parseNlWebResponse(json) + assertEquals("abc123", r.queryId) + assertNull(r.answer) + val item = r.results.single() + assertEquals("https://x/ep/42", item.url) + assertEquals("AI Safety", item.name) + assertEquals("podcasts", item.site) + assertEquals(85.0, item.score) + assertEquals("alignment talk", item.description) + assertEquals("PodcastEpisode", item.schemaType) + } + + @Test + fun `parseNlWebResponse picks up a summarize answer and skips a result with no url`() { + val json = """{"summary":"Two AI podcasts.","results":[{"name":"no-url"},{"url":"https://y"}]}""" + val r = parseNlWebResponse(json) + assertEquals("Two AI podcasts.", r.answer) + assertEquals(listOf("https://y"), r.results.map { it.url }) + } + + @Test + fun `parseNlWebResponse raises on an error envelope`() { + assertThrows<NlWebSearchException> { parseNlWebResponse("""{"error":"site not configured"}""") } + assertThrows<NlWebSearchException> { parseNlWebResponse("""{"error":{"message":"bad 
query"}}""") } + } + + @Test + fun `render formats answer then numbered results`() { + val out = NlWebSearchResult( + results = listOf( + NlWebResult(url = "https://a", name = "Alpha", description = "first", schemaType = "Recipe"), + NlWebResult(url = "https://b", name = "Beta"), + ), + answer = "Here are two.", + ).render() + assertTrue(out.startsWith("Here are two."), out) + assertTrue("[1] Alpha (Recipe) — first" in out, out) + assertTrue("https://a" in out && "[2] Beta" in out && "https://b" in out, out) + } + + @Test + fun `tool is untrusted, returns the rendered result via the backend, and errors on blank query`() { + val canned = NlWebSearchResult(results = listOf(NlWebResult(url = "https://x", name = "X"))) + val tool = nlwebSearchTool(baseUrl = "https://example.com", backend = { _, _ -> canned }) + + assertTrue(tool.untrustedOutput, "nlweb_search must be untrustedOutput (web content = injection vector)") + assertEquals("nlweb_search", tool.name) + + val ok = tool.executor(mapOf("query" to "anything")) + assertTrue(ok is NlWebSearchResult && ok.results.single().url == "https://x", "got: $ok") + + assertEquals("ERROR: missing 'query'", tool.executor(mapOf("query" to " "))) + } + + @Test + fun `tool returns an ERROR string when the backend fails`() { + val tool = nlwebSearchTool( + baseUrl = "https://example.com", + backend = { _, _ -> throw NlWebSearchException("connection refused") }, + ) + val out = tool.executor(mapOf("query" to "q")) + assertTrue(out is String, "expected ERROR string, got: $out") + assertTrue(out.startsWith("ERROR: nlweb_search failed:") && "connection refused" in out, out) + } +}
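
Reviewer note: a minimal, self-contained sketch of the `/ask` request body this patch describes (`{query, site?, mode, streaming:false}`). It is not the project's code. `jsonQuote` is a stand-in for the internal `toJsonString()`, and `Mode` / `askBody` are hypothetical names that only mirror `NlWebMode` / `buildNlWebAskBody`, so treat it as an illustration of the wire shape under those assumptions.

```kotlin
// Hypothetical stand-in for the project's internal String.toJsonString():
// quotes a string and escapes the characters JSON requires.
fun jsonQuote(s: String): String = buildString {
    append('"')
    for (c in s) when (c) {
        '"' -> append("\\\"")
        '\\' -> append("\\\\")
        '\n' -> append("\\n")
        '\r' -> append("\\r")
        '\t' -> append("\\t")
        else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
    }
    append('"')
}

// Hypothetical mirror of NlWebMode; sent lowercase on the wire.
enum class Mode { LIST, SUMMARIZE, GENERATE }

// Hypothetical mirror of buildNlWebAskBody: `site` is omitted when null,
// and streaming is disabled so the reply is a single JSON object.
fun askBody(query: String, site: String? = null, mode: Mode = Mode.LIST): String {
    val fields = buildList {
        add("\"query\":${jsonQuote(query)}")
        site?.let { add("\"site\":${jsonQuote(it)}") }
        add("\"mode\":${jsonQuote(mode.name.lowercase())}")
        add("\"streaming\":false")
    }
    return fields.joinToString(",", prefix = "{", postfix = "}")
}

fun main() {
    // prints {"query":"podcasts about \"AI\"","mode":"list","streaming":false}
    println(askBody("podcasts about \"AI\""))
    // prints {"query":"q","site":"podcasts","mode":"generate","streaming":false}
    println(askBody("q", site = "podcasts", mode = Mode.GENERATE))
}
```

The two printed bodies correspond to the cases covered by the first two tests in `NlWebSearchTest`: a default `LIST` query with no `site`, and a namespaced `GENERATE` query.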