AnthropicLlm usage_metadata is missing thinking token count #5397

@sebastienc

Description

AnthropicLlm only maps basic input/output token counts into usage_metadata. When extended thinking is enabled, thinking block tokens are included in output_tokens but never broken out separately.

Current behaviour

message_to_generate_content_response and the streaming final response both produce:

usage_metadata=types.GenerateContentResponseUsageMetadata(
    prompt_token_count=message.usage.input_tokens,
    candidates_token_count=message.usage.output_tokens,
    total_token_count=(message.usage.input_tokens + message.usage.output_tokens),
)

Cache token counts (cache_creation_input_tokens, cache_read_input_tokens) are also missing but are tracked separately in #5395.

Expected behaviour

When extended thinking is enabled, populate usage_metadata.thoughts_token_count with the token count of the thinking blocks. This could be derived from the thinking block content (via a supplemental token-counting API call) or read from a future dedicated API field (ref: anthropic-python-sdk).
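A minimal sketch of the first option, assuming the thinking text is available on the response blocks and that some token counter can be called on it. The dataclasses below are simplified stand-ins so the example runs on its own; they are not the real Anthropic SDK or google.genai types. `build_usage_metadata` and `count_tokens` are hypothetical names. Whether candidates_token_count should then exclude thinking tokens is left open here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Usage:
    """Stand-in for message.usage (only the two fields named in this issue)."""
    input_tokens: int
    output_tokens: int


@dataclass
class ContentBlock:
    """Simplified stand-in for a response content block."""
    type: str  # "thinking" or "text"
    text: str


@dataclass
class UsageMetadata:
    """Stand-in for types.GenerateContentResponseUsageMetadata."""
    prompt_token_count: int
    candidates_token_count: int
    total_token_count: int
    thoughts_token_count: Optional[int] = None


def build_usage_metadata(
    usage: Usage,
    blocks: Sequence[ContentBlock],
    count_tokens: Callable[[str], int],  # hypothetical tokenizer hook
) -> UsageMetadata:
    """Map usage counts and, if thinking blocks exist, estimate their tokens."""
    thinking_texts = [b.text for b in blocks if b.type == "thinking"]
    thoughts: Optional[int] = None
    if thinking_texts:
        estimate = sum(count_tokens(t) for t in thinking_texts)
        # The estimate comes from a separate count, so clamp it: thinking
        # tokens are part of output_tokens and cannot exceed it.
        thoughts = min(estimate, usage.output_tokens)
    return UsageMetadata(
        prompt_token_count=usage.input_tokens,
        candidates_token_count=usage.output_tokens,
        total_token_count=usage.input_tokens + usage.output_tokens,
        thoughts_token_count=thoughts,
    )
```

With no thinking blocks the field stays unset, so existing consumers of usage_metadata would see the same values as today.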

Reference

This is particularly relevant now that extended thinking is supported via PR #5392.

Labels

models: [Component] Issues related to model support
request clarification: [Status] The maintainer need clarification or more information from the author