47 changes: 47 additions & 0 deletions docs/advanced/caching.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Caching hints

Every result a server returns for `tools/list`, `prompts/list`, `resources/list`, `resources/templates/list`, `resources/read` and `server/discover` carries two fields on the 2026-07-28 protocol: `ttlMs`, how many milliseconds a client may treat the result as fresh, and `cacheScope`, whether a cached result may be shared across users (`"public"`) or belongs to one authorization context (`"private"`).

The server doesn't cache anything. The fields are a *declaration*: "this tool list is the same for everyone and won't change for a minute." A client (or a gateway in front of you) may then skip the round trip. Honoring the hints is the client's choice; emitting them is the server's job, and the SDK does it for you.
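
To make the client side of that declaration concrete, here is a minimal sketch of a freshness check. `CachedResult` and `is_fresh` are hypothetical names, not SDK API; the only thing taken from the protocol is the meaning of `ttlMs`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedResult:
    """A stored result plus the TTL it arrived with (hypothetical client-side type)."""

    payload: dict
    ttl_ms: int
    stored_at: float  # seconds on some monotonic clock


def is_fresh(entry: CachedResult, now: float) -> bool:
    """True while fewer than `ttl_ms` milliseconds have passed since storage."""
    age_ms = (now - entry.stored_at) * 1000
    return age_ms < entry.ttl_ms


entry = CachedResult(payload={"tools": []}, ttl_ms=60_000, stored_at=100.0)
```

With the strict `<` comparison, a `ttl_ms` of `0` is never fresh, which matches "immediately stale". A real client may choose a different policy, or ignore the hints entirely.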

Out of the box every result says `ttlMs: 0, cacheScope: "private"` — immediately stale, never shared. That is always safe and always conformant. If your lists really are stable and identical for all callers, say so at construction:

```python title="server.py" hl_lines="5-8"
--8<-- "docs_src/caching/tutorial001.py"
```

* The map is keyed by **method name**, and the six cacheable methods are the only legal keys. The parameter is typed `Mapping[CacheableMethod, CacheHint]`, so your editor autocompletes the keys and flags a typo before you run. Anything that slips past the type checker fails at construction: an unknown key raises `ValueError`, and a value that is not a `CacheHint` raises `TypeError`.
* A method you don't mention keeps the defaults. The map is a set of overrides, not a manifest.
* `CacheHint(ttl_ms=5_000)` left `scope` unset, so it stays `"private"`: five seconds of freshness, per caller. Scope and TTL are independent decisions.
* `"server/discover"` is a legal key too — the handshake result is cacheable like any list.
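
The "overrides, not a manifest" behavior in the list above can be pictured as a lookup with a fallback. This is an illustrative sketch only: `HintSketch` is a stand-in that copies the two fields and defaults of `CacheHint`, and `effective_hint` is a hypothetical helper, not how the SDK is implemented.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class HintSketch:
    """Stand-in with the same two fields and defaults as `CacheHint`."""

    ttl_ms: int = 0
    scope: Literal["public", "private"] = "private"


DEFAULT = HintSketch()  # immediately stale, never shared

# Same shape as the constructor map in the tutorial above.
overrides = {
    "tools/list": HintSketch(ttl_ms=60_000, scope="public"),
    "resources/read": HintSketch(ttl_ms=5_000),
}


def effective_hint(method: str) -> HintSketch:
    """A method missing from the map keeps the defaults."""
    return overrides.get(method, DEFAULT)
```

Here `effective_hint("resources/read")` keeps `scope="private"` because only `ttl_ms` was given, and `effective_hint("resources/list")` falls back to `DEFAULT`.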

!!! warning
    `cacheScope: "public"` means *anyone* may be served your cached response: a shared
    gateway will happily hand one user's result to another, even when the request was
    authenticated. Mark a result `"public"` only when it is identical for every caller, and
    never use `cacheScope` as access control: it is a label, not a lock.
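
One way to see why the label matters is to look at how a shared cache might build its keys. The sketch below is an assumption about a gateway's behavior, not something the SDK does; `cache_key` and the opaque `auth_context` identifier are hypothetical.

```python
def cache_key(method: str, params_json: str, cache_scope: str, auth_context: str) -> tuple[str, ...]:
    """Build a cache key that is shared for public results and per-caller for private ones."""
    if cache_scope == "public":
        # Every caller maps to the same entry, so one user's result serves all.
        return (method, params_json)
    # Private results are partitioned by the authorization context that produced them.
    return (method, params_json, auth_context)
```

Under this keying, two different callers collide on a `"public"` entry by design, which is exactly why a per-user result must never be labeled `"public"`.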

## Per-handler override

On the low-level `Server`, handlers build their results by hand — and `ttl_ms` / `cache_scope` are just fields on the result models. A handler that sets them explicitly always wins over the constructor map, field by field:

```python title="server.py" hl_lines="11 17"
--8<-- "docs_src/caching/tutorial002.py"
```

The handler said `ttl_ms=1_000` and nothing about scope. On the wire: `ttlMs: 1000` (the handler's, not the map's `60_000`) and `cacheScope: "public"` (the map's — the handler left it unset). Explicit beats configured, configured beats default — per field, so a handler can pin one field and leave the other to the server-wide policy.
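
The three layers can be sketched with plain dictionaries. This is a simplified model under stated assumptions, not the SDK's code: `resolve` is a hypothetical function, `explicit` holds only the fields the handler set, and `configured` is the constructor-map entry for the method (always carrying both fields) or `None`.

```python
from typing import Optional

DEFAULTS = {"ttl_ms": 0, "cache_scope": "private"}


def resolve(explicit: dict, configured: Optional[dict]) -> dict:
    """Layer the values: defaults first, then the map entry, then the handler's fields."""
    resolved = dict(DEFAULTS)
    if configured is not None:
        resolved.update(configured)
    resolved.update(explicit)  # only the fields the handler actually set
    return resolved


map_entry = {"ttl_ms": 60_000, "cache_scope": "public"}
on_the_wire = resolve({"ttl_ms": 1_000}, map_entry)
```

`on_the_wire` reproduces the case above: the handler's `1_000` and the map's `"public"`. A field the handler sets explicitly to its default value, such as `ttl_ms=0`, still counts as set and still wins over the map.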

This is also the escape hatch for dynamics the constructor can't know: a handler that filters `resources/read` per user can return `cache_scope="private"` for one URI from an otherwise-public server.

One caveat on paginated lists: the protocol requires the **same `cacheScope` on every page** of one list. The constructor map satisfies that by construction — it's keyed by method, not by page. But a handler that overrides the scope itself owns that consistency: override it on *every* page, never only when a cursor is present, or page one and page two will disagree.
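
A simple way to stay consistent is to make the scope a single per-list decision and stamp it on every page unconditionally. The sketch below uses plain dictionaries rather than the SDK's result models; `list_page`, `walk`, `ITEMS`, and the integer-offset cursor are all hypothetical.

```python
from typing import Optional

ITEMS = [f"tool-{i}" for i in range(5)]
PAGE_SIZE = 2
LIST_SCOPE = "private"  # decided once for the whole list, not per page


def list_page(cursor: Optional[str]) -> dict:
    """Return one page; the scope is set whether or not a cursor is present."""
    start = int(cursor) if cursor is not None else 0
    end = start + PAGE_SIZE
    return {
        "items": ITEMS[start:end],
        "next_cursor": str(end) if end < len(ITEMS) else None,
        "cache_scope": LIST_SCOPE,
    }


def walk() -> list[dict]:
    """Follow cursors from the first page to the last."""
    pages, cursor = [], None
    while True:
        page = list_page(cursor)
        pages.append(page)
        cursor = page["next_cursor"]
        if cursor is None:
            return pages
```

Because `LIST_SCOPE` never depends on `cursor`, every page returned by `walk()` carries the same scope.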

## Older clients

Clients that negotiate a protocol version older than 2026-07-28 never see either field: the SDK strips them at serialization for those connections. Configure your hints once; there is nothing version-specific to write.
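
Conceptually, the stripping is a version gate applied to the outgoing payload. The following is a rough sketch of that idea, assuming date-stamped version strings; `strip_for_version` is a hypothetical function and does not reflect the SDK's actual serializer.

```python
HINT_FIELDS = ("ttlMs", "cacheScope")
FIRST_VERSION_WITH_HINTS = "2026-07-28"


def strip_for_version(wire_result: dict, negotiated_version: str) -> dict:
    """Drop the caching fields for connections that predate them."""
    # Versions in YYYY-MM-DD form sort correctly as plain strings.
    if negotiated_version >= FIRST_VERSION_WITH_HINTS:
        return wire_result
    return {key: value for key, value in wire_result.items() if key not in HINT_FIELDS}
```

A 2025-11-25 client would receive the tool list without `ttlMs` or `cacheScope`, while a 2026-07-28 client would receive the payload unchanged.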

## Recap

* Six methods carry `ttlMs`/`cacheScope`; the SDK defaults them to `0`/`"private"` — stale and unshared, always safe.
* `cache_hints={method: CacheHint(...)}` at construction (both `MCPServer` and `Server`) sets server-wide values per method.
* A handler that sets the fields on its result overrides the map, per field.
* `"public"` is a promise that the result is identical for every caller. It is not access control.
2 changes: 1 addition & 1 deletion docs/migration.md
Original file line number Diff line number Diff line change
Expand Up @@ -1554,7 +1554,7 @@ The implementation is responsible for validating the assertion per RFC 7523 §3

### 2025-11-25 and 2026-07-28 protocol fields modeled

`mcp_types` models the 2025-11-25 and 2026-07-28 protocol fields (e.g. `resultType`, `ttlMs`/`cacheScope` on cacheable results, `inputResponses`/`requestState` on retried requests), so inbound payloads carrying these keys parse into typed fields and round-trip. `ttlMs`/`cacheScope` default to `0`/`"private"` (immediately stale, not shared-cacheable); `resultType` defaults to `"complete"` on concrete results (`None` on `EmptyResult`); the server strips all of them from the wire at pre-2026 versions.
`mcp_types` models the 2025-11-25 and 2026-07-28 protocol fields (e.g. `resultType`, `ttlMs`/`cacheScope` on cacheable results, `inputResponses`/`requestState` on retried requests), so inbound payloads carrying these keys parse into typed fields and round-trip. `ttlMs`/`cacheScope` default to `0`/`"private"` (immediately stale, not shared-cacheable); `resultType` defaults to `"complete"` on concrete results (`None` on `EmptyResult`); the server strips all of them from the wire at pre-2026 versions. Servers set per-method values with `cache_hints={method: CacheHint(...)}` on the `Server`/`MCPServer` constructor — see [Caching hints](advanced/caching.md).

### `streamable_http_app()` available on lowlevel Server

Empty file added docs_src/caching/__init__.py
Empty file.
19 changes: 19 additions & 0 deletions docs_src/caching/tutorial001.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
from mcp.server import CacheHint, MCPServer

mcp = MCPServer(
    "Weather",
    cache_hints={
        "tools/list": CacheHint(ttl_ms=60_000, scope="public"),
        "resources/read": CacheHint(ttl_ms=5_000),
    },
)


@mcp.tool()
def forecast(city: str) -> str:
    return f"Sunny in {city}"


@mcp.resource("config://units")
def units() -> str:
    return "metric"
18 changes: 18 additions & 0 deletions docs_src/caching/tutorial002.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
from typing import Any

from mcp_types import ListToolsResult, PaginatedRequestParams, Tool

from mcp.server import CacheHint, Server, ServerRequestContext

TOOLS = [Tool(name="forecast", input_schema={"type": "object"})]


async def list_tools(ctx: ServerRequestContext[Any], params: PaginatedRequestParams | None) -> ListToolsResult:
    return ListToolsResult(tools=TOOLS, ttl_ms=1_000)


server = Server(
    "Weather",
    on_list_tools=list_tools,
    cache_hints={"tools/list": CacheHint(ttl_ms=60_000, scope="public")},
)
1 change: 1 addition & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ nav:
- The low-level Server: advanced/low-level-server.md
- URI templates: advanced/uri-templates.md
- Pagination: advanced/pagination.md
- Caching hints: advanced/caching.md
- Middleware: advanced/middleware.md
- OpenTelemetry: advanced/opentelemetry.md
- Authorization: advanced/authorization.md
3 changes: 2 additions & 1 deletion src/mcp/server/__init__.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from .caching import CacheHint
from .context import ServerRequestContext
from .lowlevel import NotificationOptions, Server
from .mcpserver import MCPServer
from .models import InitializationOptions

__all__ = ["Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"]
__all__ = ["CacheHint", "Server", "ServerRequestContext", "MCPServer", "NotificationOptions", "InitializationOptions"]
98 changes: 98 additions & 0 deletions src/mcp/server/caching.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
"""Server-side caching hints (SEP-2549, protocol revision 2026-07-28).

Results for the cacheable methods carry `ttlMs`/`cacheScope` freshness hints.
A handler sets them by returning a result with explicit `ttl_ms`/`cache_scope`
values; `Server(cache_hints={method: CacheHint(...)})` fills them for handlers
that don't. Fields the handler set win, per field, so a server-wide hint never
overrides a handler's explicit choice.
"""

from __future__ import annotations

from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Final, Literal, TypeVar, get_args

import mcp_types as types

__all__ = ["CACHEABLE_METHODS", "CacheHint", "CacheableMethod", "apply_cache_hint", "validate_cache_hints"]

CacheableMethod = Literal[
    "prompts/list",
    "resources/list",
    "resources/read",
    "resources/templates/list",
    "server/discover",
    "tools/list",
]
"""The methods whose results carry `ttlMs`/`cacheScope`. Closed set: the spec
defines caching hints on exactly these six (tests pin it to which result models
mix in `CacheableResult`)."""

CACHEABLE_METHODS: Final[frozenset[str]] = frozenset(get_args(CacheableMethod))
"""Runtime mirror of `CacheableMethod`, for callers the type checker can't see."""


@dataclass(frozen=True, slots=True)
class CacheHint:
    """Freshness hint for one cacheable method's results.

    `ttl_ms` is how long, in milliseconds, a client may consider the result
    fresh (`0` means immediately stale). `scope` is whether a cached result may
    be shared across authorization contexts (`"public"`) or only reused within
    the one that produced it (`"private"`).
    """

    ttl_ms: int = 0
    scope: Literal["public", "private"] = "private"

    def __post_init__(self) -> None:
        if self.ttl_ms < 0:
            raise ValueError(f"ttl_ms must be >= 0, got {self.ttl_ms}")
        if self.scope not in ("public", "private"):
            raise ValueError(f"scope must be 'public' or 'private', got {self.scope!r}")


CacheableResultT = TypeVar("CacheableResultT", bound=types.CacheableResult)


def apply_cache_hint(result: CacheableResultT, hint: CacheHint) -> CacheableResultT:
    """Fill `ttl_ms`/`cache_scope` on `result` from `hint`.

    Per-field: a field the handler set explicitly - even to its default value,
    tracked via `model_fields_set` - is left alone; only unset fields take the
    hint. A handler constructing results with `model_construct` bypasses that
    tracking and is treated as having set nothing.
    """
    update: dict[str, int | str] = {}
    if "ttl_ms" not in result.model_fields_set:
        update["ttl_ms"] = hint.ttl_ms
    if "cache_scope" not in result.model_fields_set:
        update["cache_scope"] = hint.scope
    return result.model_copy(update=update) if update else result


def validate_cache_hints(cache_hints: Mapping[Any, Any] | None) -> dict[str, CacheHint]:
    """Validate a `cache_hints` constructor argument into a plain dict.

    The `Server`/`MCPServer` signatures already close the key set and value
    type for type-checked callers; this runtime gate is deliberately loose in
    its parameter so it covers everyone else (e.g. a map deserialized from
    config) - a bad entry fails at construction, not on the first request to
    that method.

    Raises:
        ValueError: If a key is not a cacheable method.
        TypeError: If a value is not a `CacheHint`.
    """
    if cache_hints is None:
        return {}
    # str() keeps the documented ValueError for non-string keys, which would
    # otherwise make sorted() or str.join() raise TypeError.
    unknown = sorted(str(method) for method in cache_hints if method not in CACHEABLE_METHODS)
    if unknown:
        raise ValueError(f"cache_hints keys must be cacheable methods (see CacheableMethod); got: {', '.join(unknown)}")
    validated: dict[str, CacheHint] = {}
    for method, hint in cache_hints.items():
        if not isinstance(hint, CacheHint):
            raise TypeError(f"cache_hints[{method!r}] must be a CacheHint, got {type(hint).__name__}")
        validated[method] = hint
    return validated
9 changes: 8 additions & 1 deletion src/mcp/server/lowlevel/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ async def main():

import logging
import warnings
from collections.abc import AsyncIterator, Awaitable, Callable
from collections.abc import AsyncIterator, Awaitable, Callable, Mapping
from contextlib import AbstractAsyncContextManager, asynccontextmanager
from dataclasses import dataclass
from importlib.metadata import version as importlib_version
Expand All @@ -59,6 +59,7 @@ async def main():
from mcp.server.auth.provider import OAuthAuthorizationServerProvider, TokenVerifier
from mcp.server.auth.routes import build_resource_metadata_url, create_auth_routes, create_protected_resource_routes
from mcp.server.auth.settings import AuthSettings
from mcp.server.caching import CacheableMethod, CacheHint, validate_cache_hints
from mcp.server.context import HandlerResult, ServerMiddleware, ServerRequestContext
from mcp.server.models import InitializationOptions
from mcp.server.runner import serve_loop
Expand Down Expand Up @@ -140,6 +141,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
Expand Down Expand Up @@ -222,6 +224,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
Expand Down Expand Up @@ -313,6 +316,7 @@ def __init__(
instructions: str | None = None,
website_url: str | None = None,
icons: list[types.Icon] | None = None,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
lifespan: Callable[
[Server[LifespanResultT]],
AbstractAsyncContextManager[LifespanResultT],
Expand Down Expand Up @@ -420,6 +424,9 @@ def __init__(
self.instructions = instructions
self.website_url = website_url
self.icons = icons
# Per-method `ttl_ms`/`cache_scope` fills, applied by `ServerRunner`
# after the handler returns; fields the handler set explicitly win.
self.cache_hints: dict[str, CacheHint] = validate_cache_hints(cache_hints)
self.lifespan = lifespan
self._request_handlers: dict[str, HandlerEntry[LifespanResultT]] = {}
self._notification_handlers: dict[str, HandlerEntry[LifespanResultT]] = {}
5 changes: 4 additions & 1 deletion src/mcp/server/mcpserver/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

import base64
import inspect
from collections.abc import AsyncIterator, Awaitable, Callable, Iterable
from collections.abc import AsyncIterator, Awaitable, Callable, Iterable, Mapping
from contextlib import AbstractAsyncContextManager, asynccontextmanager
from typing import Any, Generic, Literal, TypeVar, overload

Expand Down Expand Up @@ -54,6 +54,7 @@
from mcp.server.auth.middleware.bearer_auth import BearerAuthBackend, RequireAuthMiddleware
from mcp.server.auth.provider import OAuthAuthorizationServerProvider, ProviderTokenVerifier, TokenVerifier
from mcp.server.auth.settings import AuthSettings
from mcp.server.caching import CacheableMethod, CacheHint
from mcp.server.context import ServerRequestContext
from mcp.server.lowlevel.helper_types import ReadResourceContents
from mcp.server.lowlevel.server import LifespanResultT, Server
Expand Down Expand Up @@ -157,6 +158,7 @@ def __init__(
lifespan: Callable[[MCPServer[LifespanResultT]], AbstractAsyncContextManager[LifespanResultT]] | None = None,
auth: AuthSettings | None = None,
resource_security: ResourceSecurity = DEFAULT_RESOURCE_SECURITY,
cache_hints: Mapping[CacheableMethod, CacheHint] | None = None,
):
self._resource_security = resource_security
self.settings = Settings(
Expand Down Expand Up @@ -184,6 +186,7 @@ def __init__(
website_url=website_url,
icons=icons,
version=version,
cache_hints=cache_hints,
on_list_tools=self._handle_list_tools,
on_call_tool=self._handle_call_tool,
on_list_resources=self._handle_list_resources,
8 changes: 8 additions & 0 deletions src/mcp/server/runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
INVALID_PARAMS,
METHOD_NOT_FOUND,
PROTOCOL_VERSION_META_KEY,
CacheableResult,
ErrorData,
Implementation,
InitializeRequestParams,
Expand All @@ -40,6 +41,7 @@
from pydantic import BaseModel, ValidationError
from typing_extensions import TypeVar

from mcp.server.caching import apply_cache_hint
from mcp.server.connection import Connection
from mcp.server.context import CallNext, HandlerResult, ServerMiddleware, ServerRequestContext
from mcp.server.models import InitializationOptions
Expand Down Expand Up @@ -196,6 +198,12 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
if isinstance(result, ErrorData):
# Raise inside the chain so middleware observes the failure.
raise MCPError.from_error_data(result)
# Fill cache hints on the typed result, before the serialize sieve
# decides whether the negotiated version carries the fields at all.
# `input_required` interim results are not `CacheableResult` models,
# so the MRTR carve-out (no hints on them) holds by shape.
if isinstance(result, CacheableResult) and (hint := self.server.cache_hints.get(method)) is not None:
result = apply_cache_hint(result, hint)
# Dump and serialize inside the chain so the OpenTelemetry span (the
# outermost middleware) records a failing handler return shape too.
return self._serialize(method, version, result)
55 changes: 55 additions & 0 deletions tests/docs_src/test_caching.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
"""`docs/advanced/caching.md`: every claim the page makes, proved against the real SDK."""

from typing import Any, cast

import pytest
from inline_snapshot import snapshot

from docs_src.caching import tutorial001, tutorial002
from mcp import Client
from mcp.server import CacheHint, MCPServer

# See test_index.py for why this is a per-module mark and not a conftest hook.
pytestmark = [pytest.mark.anyio, pytest.mark.filterwarnings("error::mcp.MCPDeprecationWarning")]


async def test_a_mapped_method_carries_the_configured_hint() -> None:
    """tutorial001: `tools/list` is in the map, so clients see one minute, public."""
    async with Client(tutorial001.mcp) as client:
        tools = await client.list_tools()
        assert tools.ttl_ms == 60_000
        assert tools.cache_scope == "public"


async def test_a_hint_without_a_scope_stays_private() -> None:
    """tutorial001: `resources/read` set only `ttl_ms`; scope keeps the conservative default."""
    async with Client(tutorial001.mcp) as client:
        result = await client.read_resource("config://units")
        assert result.ttl_ms == 5_000
        assert result.cache_scope == "private"


async def test_an_unmapped_method_stays_immediately_stale_and_private() -> None:
    """tutorial001: `resources/list` is not in the map - the defaults hold."""
    async with Client(tutorial001.mcp) as client:
        resources = await client.list_resources()
        assert resources.ttl_ms == 0
        assert resources.cache_scope == "private"


async def test_a_non_cacheable_method_is_rejected_at_construction() -> None:
    """The page's claim: anything but the six cacheable methods raises at construction."""
    with pytest.raises(ValueError) as exc:
        MCPServer("Weather", cache_hints=cast(Any, {"tools/call": CacheHint(ttl_ms=1_000)}))
    assert str(exc.value) == snapshot(
        "cache_hints keys must be cacheable methods (see CacheableMethod); got: tools/call"
    )


async def test_the_handler_value_wins_over_the_map_per_field() -> None:
    """tutorial002: the handler's `ttl_ms=1_000` beats the map's `60_000`; the scope
    the handler left unset takes the map's `"public"`."""
    async with Client(tutorial002.server) as client:
        tools = await client.list_tools()
        assert tools.ttl_ms == 1_000
        assert tools.cache_scope == "public"