doc: design spec for azd ai agent toolbox direct commands by hund030 · Pull Request #8160 · Azure/azure-dev

hund030 · 2026-05-13T07:50:24Z

Summary

This PR adds the design spec for the azd ai agent toolbox direct commands. This is for design review only and contains no implementation code.

What's in this spec

Five top-level verbs for managing versioned toolboxes against a Foundry project:
- toolbox create <name>: register a toolbox (pending tools until first connection add).
- toolbox update <name> --default-version <n>: retarget the active version.
- toolbox delete <name>: delete the whole toolbox or a single version via --version.
- toolbox show <name>: display metadata, version body, and the toolbox's MCP endpoint URL for runtime consumption.
- toolbox list: list toolboxes plus any local pending records for the resolved endpoint.
Connection subgroup (connection add | remove | list) that wraps the toolbox versions API. Every tool mutation is a full-tools[] POST to /toolboxes/{name}/versions followed by a PATCH default_version; tool-entry shape is inferred from the project connection's ARM category (RemoteTool → mcp, CognitiveSearch → azure_ai_search).
Tag subgroup (tag set | remove | list) shipped as a Compatibility stub: toolboxes are not yet exposed as ARM resources, so the standard ARM resource-tag surface is unavailable. The CLI surface is final and activates without surface changes once the underlying API exists.
Pending-toolbox config store: create cannot POST because the service requires a non-empty tools[], so it records a local entry under extensions.ai-agents.pending-toolboxes.<endpointHash>.items.<name>. The first connection add promotes the record to a real v1.
Live-verified API surface (§ 4): every endpoint the spec relies on was probed against a live Foundry project, including the per-version delete guard, the MCP runtime endpoint, and the ARM connection lookup used by connection add.

Key design decisions for review

Extension placement (§ 3): toolbox commands live inside the existing azure.ai.agents extension at azd ai agent toolbox …. Files are isolated under internal/cmd/toolbox*.go and internal/pkg/toolbox/ so a future move to a standalone azure.ai.toolbox extension is a registration-only change.
create does not POST (§ 5.1, Open Question 1): the service requires a non-empty tools[]. Two options were considered — auto-seed with a sentinel built-in, or defer the first POST until the first connection add. Current proposal: defer (pending-record model). Keeps create non-network and matches the "one command, one action" principle.
update is --default-version only (§ 5.2): the service only accepts default_version on PATCH. Description and tool edits must go through connection add / connection remove, which publish a new version.
Per-version delete is exposed (§ 5.3): the service refuses to delete the default_version when sibling versions exist (400). When the default is the only remaining version the service deletes it and cascades to remove the parent toolbox; the CLI rejects this without --force to avoid surprise destruction.
Tools support both built-ins and connection-backed entries (§ 4.2, § 2 Non-Goals): the service accepts code_interpreter, web_search, file_search, mcp, and azure_ai_search side-by-side. Per issue Add azd ai toolbox create/update/show/list/delete (plus connection and tag subcommands) to manage Foundry Toolboxes from any directory #8143, built-ins are wired on the agent rather than via toolboxes; the CLI does not author built-ins but carries them through unchanged during fetch-merge-POST.
No -p short form (§ 5.1): -p is already taken on agent run / agent invoke. The canonical name is --project-endpoint, matching the project-context spec.

Feature request: #8143 Feature: Add ai.toolbox commands for CRUD on Toolboxes in Foundry Project
Sibling design specs:
- #8138 docs: design spec for azd ai connection direct commands
- #8152 doc: design spec for azd ai project context commands and endpoint resolution — the 5-level endpoint resolution cascade referenced in § 6 is owned by this spec.
Feature spec: azd ai Direct Commands (CLI Surface for Foundry)

github-actions · 2026-05-13T07:52:31Z

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Review may take a bit longer — reach out to @rajeshkamal5050 or @kristenwomack if you'd like to discuss prioritization.

Copilot

Pull request overview

Adds a new design specification document describing the proposed azd ai agent toolbox direct-command surface in the azure.ai.agents extension (design review only; no implementation).

Changes:

Documents the planned toolbox CRUD command surface (create|update|delete|show|list) plus connection and tag subgroups.
Specifies data-plane API endpoints, request/response shapes, and CLI behaviors (including a local pending-toolbox record model).
Outlines endpoint resolution, config storage shape, telemetry events, error codes, and a test plan.

Comments suppressed due to low confidence (1)

cli/azd/docs/design/azure-ai-toolbox-direct-commands.md:166

The spec’s computed MCP consumption URL omits the toolbox version ({projectEndpoint}/toolboxes/{name}/mcp?...). In the existing extension code, the MCP endpoint is version-scoped: /toolboxes/{name}/versions/{version}/mcp?api-version=v1 (see extensions/azure.ai.agents/internal/cmd/listen.go). If the service contract is versioned, this spec should reflect that so show returns a runnable URL.

1. `GET /toolboxes/{name}` for `default_version`.
2. `GET /toolboxes/{name}/versions/{version}` (or `default_version` when `--version` is absent) for the body.
3. Compute the toolbox's MCP consumption URL as `{projectEndpoint}/toolboxes/{name}/mcp?api-version=v1`.
4. Output:

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

jongio

Solid spec with clear API mapping, good test plan, and well-structured command surface. A few design gaps that'll bite during implementation:

Error code overloading: CodeInvalidToolbox is used for 6+ distinct failure modes (duplicate connection, unsupported category, empty tools, missing --index, missing connection, invalid version target). This makes telemetry useless for distinguishing root causes. Consider at least: CodeDuplicateConnection, CodeUnsupportedConnectionCategory, CodeEmptyToolsArray, CodeMissingIndex.

TOCTOU in shared write flow: Steps 2-5 (fetch default version, mutate, POST, PATCH) have a race window. Two concurrent connection add calls both fetch the same version, both POST, and the second creates a version missing the first's tool. Worth acknowledging as a known limitation or describing retry/conflict detection (e.g., conditional on the fetched version number).

Stale pending-record accumulation: Records are cleared by connection add or delete, but if a user runs create and never follows up (or the endpoint changes), pending records accumulate forever. Consider a TTL (e.g., 30 days) or a toolbox list --cleanup-stale hint.

Built-in tool interaction with last-tool guard: connection remove rejects when tools[] would be empty. But if removing the last connection-backed tool leaves only built-ins (which are "carried through unchanged"), is the result considered empty? The guard should clarify whether empty means zero total entries or zero connection-backed entries.

show on pending toolbox: create records locally, but show does GET /toolboxes/{name} which will 404 for a pending-only toolbox. Should show fall back to displaying the pending record metadata, or explicitly error with "toolbox is pending, run connection add first"?

hund030 · 2026-05-14T03:50:29Z

@jongio Thanks for the review - all five are valid and the spec is updated.

Error codes — CodeInvalidToolbox replaced with 10 specific codes (§ 11).
TOCTOU — documented as known v1 limitation; no client-side mitigation since service lacks If-Match/CAS and any client-side detect has its own TOCTOU (§ 5.6 Concurrency). Workaround: serialize calls.
Stale pending records — list shows CREATED; delete clears pending-only records (§§ 5.3, 5.5, 7).
Empty tools[] — CodeLastToolRemoval defined as "zero entries including built-ins" (§ 5.6, § 11).
show on pending — pre-checks the pending store on 404 and emits a pending view (§ 5.4.1); --version rejected.

jongio

Addresses all feedback from my prior review:

Error codes split from the overloaded CodeInvalidToolbox into distinct codes (CodeDuplicateConnection, CodeDefaultVersionDelete, CodeOnlyVersionDelete, etc.) - telemetry can now distinguish root causes.
TOCTOU race documented as a known limitation with clear user guidance (serialize calls, wait for service conditional-write support).
Pending-record cleanup is user-driven via delete; list now surfaces CREATED so stale entries are visible.
Built-in tool guard clarified: "zero entries (counting any built-ins carried through)."
show on pending-only toolbox specified in new section 5.4.1 with both table and JSON output contracts.

Architecture simplification (extending FoundryToolboxClient directly rather than wrapping via composition) is a good call - less indirection, same one-way import graph.

…in example

…state, version list)

…unknown fields

wbreza

Supplemental review — two prior approvals already in (therealjohn, jongio). Prior rounds resolved the file-path, tag-subgroup, output-default, file-based create, and versioned MCP endpoint findings. The items below are new and additive.

🟡 Should-fix before merge

1. PR description is significantly out-of-sync with the current spec. The body still describes the old design and will land in the squash-merge commit message / release notes as-is:

Says command surface is azd ai agent toolbox … → spec now uses azd ai toolbox … shipped as a sibling extension azure.ai.toolboxes (§ 1, § 3).
Says "Pending-toolbox config store" / "create does not POST" → spec now does the opposite: create --from-file publishes v1 immediately (§ 5.1) and § 7 explicitly states there is no client-side pending state.
Says "Tag subgroup shipped as a Compatibility stub" → tag was removed entirely.
Says extension placement is inside azure.ai.agents → now a new top-level azure.ai.toolboxes (§ 3).

Suggest rewriting the PR description to match the current spec before merging.

🟡 Consider before implementation

2. § 11 defines CodeConnectionMissingTarget but § 5.6 never describes when it fires. The shared write flow only mentions retrieving target from the ARM connection; no behavior step covers the empty-target case. Either drop the code from § 11 or add a pre-flight check to § 5.6 step 1 so implementers know the trigger (and so the error registry stays a faithful inventory of actual command behavior).

3. § 10 Telemetry omits error.code on failure events. Other azd commands typically emit the exterrors code as a telemetry property so failure-mode distribution is queryable. Adding an errorCode property (when an event terminates in error) to each event row would close a parity gap and make the per-code routing in § 11 observable in practice.

🟢 Nits

4. § 5.7 version list sort rule is silent on mixed numeric / non-numeric versions (e.g. v1, 2, 1.2.3-rc). Worth tightening: either "numeric-first then lexical" or "all-or-nothing" so implementers don't each pick a different total order.

5. § 5.4.1 runtime-consumption example uses jq, which isn't on the default path on Windows dev machines. Either note the example is illustrative or include a PowerShell alternative so the snippet is copy-pastable on both platforms.

6. § 8 Test Plan has no test pinning the documented concurrency race in § 5.6 ("last PATCH wins; losing call ends up on an orphan version"). A documenting test would lock in expectations so the eventual If-Match fix can flip the assertion intentionally rather than discovering the prior behavior was undocumented.

doc: add design spec for azd ai agent toolbox direct commands

a309945

Copilot AI review requested due to automatic review settings May 13, 2026 07:50

hund030 requested review from JeffreyCA, hemarina, jongio, rajeshkamal5050, tg-msft, vhvb1989, wbreza and weikanglim as code owners May 13, 2026 07:50

microsoft-github-policy-service Bot assigned hund030 May 13, 2026

hund030 linked an issue May 13, 2026 that may be closed by this pull request

Add azd ai toolbox create/update/show/list/delete (plus connection and tag subcommands) to manage Foundry Toolboxes from any directory #8143

Closed

Copilot started reviewing on behalf of hund030 May 13, 2026 07:51 View session

hund030 removed a link to an issue May 13, 2026

Add azd ai toolbox create/update/show/list/delete (plus connection and tag subcommands) to manage Foundry Toolboxes from any directory #8143

Closed

hund030 linked an issue May 13, 2026 that may be closed by this pull request

Add azd ai toolbox create/update/show/list/delete (plus connection and tag subcommands) to manage Foundry Toolboxes from any directory #8143

Closed

Copilot AI reviewed May 13, 2026

View reviewed changes

hund030 added 2 commits May 13, 2026 15:57

style: update cspell ignore list in toolbox design spec

959560b

docs: address review comments on toolbox design spec

9b2b8a2

hund030 requested a review from Copilot May 13, 2026 08:33

Copilot started reviewing on behalf of hund030 May 13, 2026 08:34 View session

Copilot AI reviewed May 13, 2026

View reviewed changes

hund030 added 2 commits May 13, 2026 17:07

docs: wrap FoundryToolboxClient in self-contained toolbox package

a8c1274

docs: correct file paths and clarify create non-mutating semantics

a106316

therealjohn approved these changes May 13, 2026

View reviewed changes

jongio reviewed May 13, 2026

View reviewed changes

docs: address jongio review feedback and verify file references

00899cf

therealjohn approved these changes May 14, 2026

View reviewed changes

jongio approved these changes May 15, 2026

View reviewed changes

docs: remove tag subgroup, reinforce modular layout, use azd env set …

32cc8b8

…in example

hund030 mentioned this pull request May 15, 2026

feat(toolboxes): add azd ai toolbox direct commands #8203

Merged

therealjohn approved these changes May 15, 2026

View reviewed changes

hund030 added 2 commits May 20, 2026 18:00

docs: align toolbox spec with implementation (file-input, no pending …

cdc6a0e

…state, version list)

docs: clarify description is set at create time only; parser rejects …

3e4f8d6

…unknown fields

wbreza reviewed May 20, 2026

View reviewed changes

JeffreyCA added the ext-agents azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes} extensions label May 20, 2026

Conversation

hund030 commented May 13, 2026

Summary

What's in this spec

Key design decisions for review

Related

Uh oh!

github-actions Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📋 Prioritization Note

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jongio left a comment

Choose a reason for hiding this comment

Uh oh!

hund030 commented May 14, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jongio left a comment

Choose a reason for hiding this comment

Uh oh!

wbreza left a comment

Choose a reason for hiding this comment

🟡 Should-fix before merge

🟡 Consider before implementation

🟢 Nits

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

github-actions Bot commented May 13, 2026 •

edited

Loading