Python: Fail closed when remote MCP tool schemas drift or tools disappear #4723

@davidahmann

Description

Problem
A workflow can keep invoking a previously loaded remote MCP tool after the server has removed that tool or changed its expected schema. The current error handling collapses both cases into opaque MCP failures instead of reporting an explicit schema-drift or tool-missing classification.

Why now
Remote MCP is a supported extensibility surface, and schema drift is a normal operational event in multi-vendor MCP deployments. Workflows need a deterministic failure classification here so they stop early rather than surfacing transport-specific text.

Evidence packet

  • Commit under test: 22951dddc4ef833a41871f1a5208c353757a1186
  • Runtime: macOS 15.3 / Darwin 25.3.0 arm64, Python 3.14.0, .NET 9.0.109
  • Relevant codepaths:
    • python/packages/core/agent_framework/_mcp.py
    • dotnet/src/Microsoft.Agents.AI.Workflows.Declarative.Mcp/DefaultMcpToolHandler.cs
  • Minimal repro:
    1. Load a remote MCP tool into a workflow/tool list.
    2. Change the server so the tool is removed or its required schema changes.
    3. Invoke the previously loaded tool again.
  • Expected: a stable fail-closed classification such as tool missing / schema mismatch, with no ambiguity about why execution stopped.
  • Actual: Python rethrows raw McpError text from call_tool, and .NET forwards CallToolAsync failures directly without schema-drift-specific wrapping.
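To make the "Expected" behaviour concrete, here is a minimal sketch of one possible classification step. It is not code from `_mcp.py`. `DriftKind`, `RemoteToolDriftError`, and `check_tool_contract` are hypothetical names. The sketch assumes the schema captured at load time and a fresh tool listing are both available as plain JSON-Schema-style dicts.

```python
from enum import Enum
from typing import Any, Mapping


class DriftKind(str, Enum):
    """Stable, workflow-visible classification codes (hypothetical)."""

    TOOL_MISSING = "tool_missing"
    SCHEMA_MISMATCH = "schema_mismatch"


class RemoteToolDriftError(RuntimeError):
    """Hypothetical exception; not part of agent_framework or the MCP SDK."""

    def __init__(self, kind: DriftKind, tool_name: str, detail: str) -> None:
        super().__init__(f"{kind.value}: {tool_name}: {detail}")
        self.kind = kind
        self.tool_name = tool_name


def check_tool_contract(
    tool_name: str,
    loaded_schema: Mapping[str, Any],
    live_schemas: Mapping[str, Mapping[str, Any]],
) -> None:
    """Raise RemoteToolDriftError unless the live listing matches the loaded copy.

    live_schemas maps tool name -> input schema as currently served
    (an assumed, simplified shape for this sketch).
    """
    live = live_schemas.get(tool_name)
    if live is None:
        raise RemoteToolDriftError(
            DriftKind.TOOL_MISSING, tool_name, "not in current tool listing"
        )

    # Report changes to required fields explicitly, since they break callers first.
    loaded_required = set(loaded_schema.get("required", []))
    live_required = set(live.get("required", []))
    if loaded_required != live_required:
        changed = sorted(loaded_required ^ live_required)
        raise RemoteToolDriftError(
            DriftKind.SCHEMA_MISMATCH, tool_name, f"required fields changed: {changed}"
        )

    # Fail closed: any other difference also counts as drift.
    if dict(live) != dict(loaded_schema):
        raise RemoteToolDriftError(
            DriftKind.SCHEMA_MISMATCH, tool_name, "input schema differs from the loaded copy"
        )
```

A caller would run this check before invoking the tool and surface `error.kind` to the workflow instead of the transport's error text. Treating every schema difference as a mismatch is deliberately strict. A real implementation might allow additive, optional changes.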

Scope
This is a platform contract issue in remote MCP invocation, not a docs-only request.

Validation target
Regression coverage should prove that a missing tool and a schema mismatch each fail deterministically with stable, workflow-visible classification.
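As a rough illustration of what such regression coverage could look like, the sketch below uses a fake in-memory session rather than a real MCP server. `FakeSession`, `ToolContractError`, and `guarded_call` are hypothetical stand-ins. The fake's `list_tools` returns a plain dict, which simplifies whatever shape the real client session returns.

```python
import asyncio
import unittest


class ToolContractError(Exception):
    """Hypothetical error type carrying a stable classification code."""

    def __init__(self, code: str, tool: str) -> None:
        super().__init__(f"{code}: {tool}")
        self.code = code


class FakeSession:
    """In-memory stand-in for a remote MCP session (assumption for this sketch)."""

    def __init__(self, tools: dict) -> None:
        self.tools = tools  # tool name -> input schema currently served
        self.calls = 0      # number of invocations that reached the "server"

    async def list_tools(self) -> dict:
        return dict(self.tools)

    async def call_tool(self, name: str, arguments: dict) -> str:
        self.calls += 1
        return f"{name} ok"


async def guarded_call(session, name, loaded_schema, arguments):
    """Re-list tools and fail closed before the remote call is attempted."""
    live = await session.list_tools()
    if name not in live:
        raise ToolContractError("tool_missing", name)
    if live[name] != loaded_schema:
        raise ToolContractError("schema_mismatch", name)
    return await session.call_tool(name, arguments)


SCHEMA = {"type": "object", "properties": {"q": {"type": "string"}}, "required": ["q"]}


class DriftRegressionTests(unittest.TestCase):
    def _failure_code(self, session):
        with self.assertRaises(ToolContractError) as ctx:
            asyncio.run(guarded_call(session, "search", SCHEMA, {"q": "x"}))
        return ctx.exception.code

    def test_missing_tool(self):
        session = FakeSession({})  # server removed the tool
        self.assertEqual(self._failure_code(session), "tool_missing")
        self.assertEqual(session.calls, 0)  # nothing reached the server

    def test_schema_mismatch(self):
        changed = {**SCHEMA, "required": ["q", "limit"]}  # new required field
        session = FakeSession({"search": changed})
        self.assertEqual(self._failure_code(session), "schema_mismatch")
        self.assertEqual(session.calls, 0)

    def test_unchanged_tool_still_runs(self):
        session = FakeSession({"search": SCHEMA})
        result = asyncio.run(guarded_call(session, "search", SCHEMA, {"q": "x"}))
        self.assertEqual(result, "search ok")
```

Saved as a module, this can be run with `python -m unittest`. Re-listing before every call costs an extra round trip. An alternative design would classify only after a failed call, at the price of letting one doomed invocation reach the server.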
