Skip to content

feat(sdk): typed agent tool resolution contracts#4763

Closed
mmabrouk wants to merge 1 commit into
feat/agent-harness-portfrom
feat/sdk-local-tools-contracts
Closed

feat(sdk): typed agent tool resolution contracts#4763
mmabrouk wants to merge 1 commit into
feat/agent-harness-portfrom
feat/sdk-local-tools-contracts

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 19, 2026

Copy link
Copy Markdown
Member

This PR is part of a stack. Review bottom-up.

Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4761 (merge that first).

Context

The agent service used to pass tools around as loose List[Any] and loose dicts. Nothing typed a tool, nothing validated it, and the resolution logic that turns a stored tool into something a runner can execute lived only in the service. This PR moves the canonical tool and MCP contracts into the SDK and adds an offline resolver that runs with no network access. It sits on feat/agent-harness-port and lands slice #9 (tool runtime) from docs/design/agent-workflows/pr-stack.md.

What this changes

Two new SDK subsystems appear under sdks/python/agenta/sdk/agents/: tools/ and mcp/. Each one ships strict Pydantic models, a strict parser, a resolver, typed errors, and a wire serializer.

A tool now has two shapes. A ToolConfig is what a user stores: a discriminated union over builtin, gateway, code, and client. A ToolSpec is what a runner receives after resolution: a discriminated union over callback, code, and client. The ToolResolver walks the configs, pulls declared secrets through an injected provider, and returns a ResolvedToolSet. The default EnvironmentToolSecretProvider reads os.environ, so a standalone resolve needs no platform. A gateway config has no offline resolution, so the resolver raises UnsupportedToolProviderError unless the caller injects a GatewayToolResolver. That injection point is where the server-only Composio adapter attaches later.

The DTOs change shape. Before, SessionConfig and the harness configs carried builtin_tools: List[str] and custom_tools: List[Dict[str, Any]]. After, they carry builtin_names: List[str] and tool_specs: List[ToolSpec]. The old names survive as read aliases (AliasChoices) and as @property shims, so existing callers and stored payloads keep working. AgentConfig.tools is now typed List[ToolConfig] and coerces legacy shapes on the way in. ToolCallback moves out of dtos.py into tools/models.py.

The wire gains one field. Every resolved tool spec now serializes a kind discriminator, so the runner reads the executor type directly instead of sniffing fields. The two golden /run request fixtures pick up "kind": "callback" on their callback tool, and that is the whole behavioral wire delta.

MCP rides alongside as a sibling. AgentConfig gains mcp_servers: List[MCPServerConfig], harness configs emit a mcpServers wire field through wire_mcp(), and the payload omits that field when no servers are declared, so a tool-free run's wire stays byte-identical.

Key architectural decision to review

The durable tool contract is the split between ToolConfig and ToolSpec in tools/models.py. A stored config names a tool by reference (a builtin name, a gateway slug, a code body, a client declaration). A resolved spec carries a URI-like identity plus a JSON Schema plus either an execution body (code) or a delivery reference (callRef for callback). Resolution is the only bridge between them, and it stays offline in the SDK. The tradeoff: the SDK owns the shape and the secret plumbing, but it deliberately owns no gateway or vault logic. Those are server concerns that attach through the GatewayToolResolver and ToolSecretProvider ports. Scrutinize whether those two ports are the right seam, because every server-only piece lands behind them.

The second thing to weigh is the alias-and-property compatibility layer on the DTOs (dtos.py, around the SessionConfig and PiAgentConfig validators). It lets the rename ride in without breaking callers, but it means two names point at one field for a while. Decide whether that bridge should have a removal milestone.

How to review this PR

Start with sdks/python/agenta/sdk/agents/tools/models.py. Read the two unions and confirm the discriminators (type for config, kind for spec) and the alias handling on ToolSpecBase. Then read tools/resolver.py to see how a config becomes a spec and where secrets and the gateway port enter. Then read tools/compat.py to see how legacy shapes (a bare string, a composio type, a gateway slug, an OpenAI-style function) fold into the canonical union. The mcp/ package mirrors the tools/ package one-to-one, so skim it once you trust tools/.

Skip ui_messages.py; it is a compatibility re-export from adapters/vercel and carries no new logic. Check dtos.py for the alias correctness, since a wrong AliasChoices is the likely regression: a stored payload using the old key must still load. The golden fixtures and test_wire_contract.py guard the wire, so a diff there beyond the added kind is a red flag.

Tests / notes

New unit tests cover the tool models, parsing, resolver, and MCP resolver. A transport round-trip integration test exercises the resolved set end to end. The wire-contract test adds a case proving code, client, and MCP specs reach the runner intact and that mcpServers is omitted when empty.

@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e2694fc4-bc47-4d47-bfab-2f6222dd0ea7

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/sdk-local-tools-contracts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@dosubot dosubot Bot added feature python Pull requests that update Python code SDK tests labels Jun 19, 2026
@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: interesting code

  • sdks/python/agenta/sdk/agents/tools/models.py:95ToolSpecBase and its three kind subtypes are the runner-facing contract; the alias handling here is what keeps the old wire keys working.
  • sdks/python/agenta/sdk/agents/tools/resolver.py:102ToolResolver.resolve is the only bridge from stored config to runnable spec; note where secrets and the gateway port enter and where it raises on a missing gateway.
  • sdks/python/agenta/sdk/agents/tools/compat.py:60coerce_tool_config folds every legacy shape (bare string, composio type, gateway slug, OpenAI function) into the canonical union; this is the back-compat surface.
  • sdks/python/agenta/sdk/agents/dtos.py:449 — the builtin_tools/custom_tools to builtin_names/tool_specs rename rides in behind AliasChoices and @property shims; a wrong alias is the likely regression.
  • sdks/python/agenta/sdk/agents/mcp/models.py:23MCPServerConfig._validate_transport is the only MCP rule with teeth (stdio needs a command, http needs a url); the rest of mcp/ mirrors tools/.
  • sdks/python/oss/tests/pytest/unit/agents/golden/run_request.pi.json:25 — the added "kind": "callback" is the entire behavioral wire delta; confirm nothing else in the goldens moved.

tool_callback = None
if gateway_configs:
if self._gateway_resolver is None:
raise UnsupportedToolProviderError(gateway_configs[0].provider)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the seam the architecture decision hangs on: a gateway config has no offline resolution, so an absent injected resolver raises here. The server-only Composio adapter attaches at this port.

TOOL_SPEC_ADAPTER: TypeAdapter[ToolSpec] = TypeAdapter(ToolSpec)


def coerce_tool_spec(value: Any) -> ToolSpec:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coerce_tool_spec infers the kind discriminator when a caller omits it (callRef to callback, code to code, else client). This is what lets old runner dicts without a kind still validate into the new union.

class EnvironmentToolSecretProvider:
"""Read declared tool secrets from the current process environment."""

async def get_many(self, names: Sequence[str]) -> Mapping[str, str]:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default secret provider reads os.environ, which is why a standalone resolve needs no platform. Swapping this Protocol for a vault-backed provider is how the server injects real secrets without changing the resolver.

tools: List[str] = Field(default_factory=list)

@model_validator(mode="after")
def _validate_transport(self) -> "MCPServerConfig":

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only MCP rule with teeth: stdio requires a command, http requires a url. Worth confirming this matches what the runner bridge can actually launch.

@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Railway Preview Environment

Status Destroyed (PR closed)

Updated at 2026-06-19T16:30:11.537Z

@mmabrouk

Copy link
Copy Markdown
Member Author

Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set.

@mmabrouk mmabrouk closed this Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature python Pull requests that update Python code SDK size:XXL This PR changes 1000+ lines, ignoring generated files. tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant