fix: ensure balanced tool lifecycle callbacks for hallucinated tools #4847

Open
weiguangli-io wants to merge 1 commit into google:main from weiguangli-io:fix/hallucinated-tool-trace-4775

@weiguangli-io

Summary

Fixes #4775

When the LLM hallucinates a non-existent tool name, the ValueError handler in _execute_single_function_call_async (and its _live counterpart) was calling on_tool_error_callback directly without first invoking before_tool_callback or entering the tracer.start_as_current_span() block. This broke the push/pop invariant that plugins (e.g., BigQueryAgentAnalyticsPlugin) rely on for TraceManager span stack management, causing stack corruption.

Root cause: The hallucinated-tool error path short-circuited before entering _run_with_trace(), skipping before_tool_callback and the tracer span context.
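
To illustrate why the missing before_tool_callback matters, here is a minimal sketch of a plugin that pushes on "before" and pops on "error". SpanStackPlugin, unbalanced_path, and balanced_path are hypothetical names for illustration only; this is not the BigQueryAgentAnalyticsPlugin or TraceManager code.

```python
class SpanStackPlugin:
    """Hypothetical plugin that keeps one stack entry per open span."""

    def __init__(self):
        # Span assumed to be opened earlier by the enclosing agent run.
        self.stack = ["agent:root"]

    def before_tool(self, tool_name):
        self.stack.append(f"tool:{tool_name}")  # push

    def on_tool_error(self, tool_name):
        # Pop, assuming before_tool pushed a matching entry first.
        return self.stack.pop()


def unbalanced_path(plugin, tool_name):
    # Old behavior as described above: error callback with no prior push.
    return plugin.on_tool_error(tool_name)


def balanced_path(plugin, tool_name):
    # Behavior after the fix: push first, then pop on error.
    plugin.before_tool(tool_name)
    return plugin.on_tool_error(tool_name)
```

In the unbalanced path the pop removes the parent's entry instead of a tool entry, which is the kind of stack corruption the issue describes.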

Fix: Move the hallucinated-tool error handling inside the traced lifecycle path so that:

  1. The tracer span context (start_as_current_span) is entered first
  2. before_tool_callback runs before on_tool_error_callback
  3. after_tool_callback / trace_tool_call run in the finally block

Applied to both _execute_single_function_call_async and _execute_single_function_call_live.
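
The ordering above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in functions.py: fake_span stands in for tracer.start_as_current_span(), and lookup_tool, execute_function_call, and the events list are hypothetical. Only the tool_not_found_error name and the callback names come from this PR.

```python
import asyncio
from contextlib import contextmanager

events = []  # records lifecycle order for illustration


@contextmanager
def fake_span(name):
    # Hypothetical stand-in for tracer.start_as_current_span().
    events.append(f"span_enter:{name}")
    try:
        yield
    finally:
        events.append(f"span_exit:{name}")


async def before_tool_callback(name):
    events.append("before_tool")


async def on_tool_error_callback(name, error):
    events.append("on_tool_error")
    return {"error": str(error)}


async def after_tool_callback(name, result):
    events.append("after_tool")


def lookup_tool(name, tools):
    # Assumed lookup helper; raises ValueError for a hallucinated tool name.
    if name not in tools:
        raise ValueError(f"Tool '{name}' not found")
    return tools[name]


async def execute_function_call(name, tools):
    tool = None
    tool_not_found_error = None
    try:
        tool = lookup_tool(name, tools)
    except ValueError as e:
        # Defer handling so it happens inside the traced lifecycle.
        tool_not_found_error = e

    result = None
    with fake_span(f"execute_tool {name}"):
        try:
            await before_tool_callback(name)
            if tool_not_found_error is not None:
                result = await on_tool_error_callback(name, tool_not_found_error)
            else:
                result = await tool()
        finally:
            await after_tool_callback(name, result)
    return result
```

Calling asyncio.run(execute_function_call("no_such_tool", {})) records span entry, before_tool, on_tool_error, after_tool, and span exit in that order, so every push has a matching pop.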

Test plan

  • Added test_hallucinated_tool_calls_before_tool_callback that verifies before_tool_callback is called before on_tool_error_callback for hallucinated tools
  • All 11 existing tool callback tests continue to pass
  • Ran autoformat.sh with no changes needed
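
For reviewers unfamiliar with order assertions, one common way to check callback ordering is a shared unittest.mock parent whose mock_calls list preserves call order. The sketch below is not the test added in this PR; run_hallucinated_tool_path is a hypothetical stand-in for the flow under test.

```python
from unittest import mock


def run_hallucinated_tool_path(before_cb, error_cb):
    # Hypothetical stand-in for the flow: before callback, then error callback.
    before_cb("no_such_tool")
    error_cb("no_such_tool", ValueError("tool not found"))


def callback_order():
    # Child mocks of one parent share a single ordered mock_calls list.
    manager = mock.Mock()
    run_hallucinated_tool_path(
        manager.before_tool_callback, manager.on_tool_error_callback
    )
    # Each recorded call is a (name, args, kwargs) tuple; keep the names.
    return [call[0] for call in manager.mock_calls]
```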

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where the tool lifecycle callbacks were not consistently balanced when an LLM attempted to call a non-existent tool. By relocating the error handling for hallucinated tools within the established tracing context, the change ensures that all necessary callbacks are invoked in the correct sequence, preventing potential corruption of trace spans and maintaining the integrity of plugin operations.

Highlights

  • Balanced Tool Lifecycle Callbacks: The handling of hallucinated (non-existent) tool names has been refactored to ensure that tool lifecycle callbacks (before_tool_callback, on_tool_error_callback, after_tool_callback) are always invoked in a balanced order, even when a ValueError occurs for a missing tool.
  • Trace Span Invariant Preservation: The fix addresses an issue where on_tool_error_callback was called directly for hallucinated tools, bypassing before_tool_callback and the tracer.start_as_current_span() block. This ensures the push/pop invariant for TraceManager span stack management is maintained, preventing stack corruption.
  • Error Handling Relocation: Error handling for non-existent tools has been moved inside the traced lifecycle path (_run_with_trace) in both _execute_single_function_call_async and _execute_single_function_call_live functions. This guarantees that the tracing span context is entered and before_tool_callback runs before on_tool_error_callback.
  • New Test Case Added: A new test, test_hallucinated_tool_calls_before_tool_callback, was added to specifically verify that before_tool_callback is correctly invoked before on_tool_error_callback for hallucinated tools, confirming the fix's behavior.

Changelog

  • src/google/adk/flows/llm_flows/functions.py
    • Refactored error handling for ValueError when a tool is not found, deferring the on_tool_error_callback invocation until after the tracing context and before_tool_callback are established.
    • Introduced a tool_not_found_error variable to temporarily store the error and process it within the _run_with_trace block.
    • Applied the updated error handling logic to both _execute_single_function_call_async and _execute_single_function_call_live.
  • tests/unittests/flows/llm_flows/test_tool_callbacks.py
    • Added test_hallucinated_tool_calls_before_tool_callback to validate the correct callback order for hallucinated tools.

Activity

  • Added test_hallucinated_tool_calls_before_tool_callback to verify the new behavior.
  • Confirmed that all 11 existing tool callback tests continue to pass.
  • Ran autoformat.sh with no changes needed, indicating adherence to code style.

@adk-bot added the "tracing [Component] This issue is related to OpenTelemetry tracing" label on Mar 16, 2026

@gemini-code-assist (bot) left a comment

Code Review

This pull request effectively addresses the issue of unbalanced tool lifecycle callbacks when the LLM hallucinates a non-existent tool. By refactoring the error handling to occur within the traced lifecycle path, the before_tool_callback is now correctly invoked before on_tool_error_callback, ensuring proper span stack management and preventing corruption. The addition of test_hallucinated_tool_calls_before_tool_callback provides excellent validation for this fix, confirming the intended callback order. The changes are well-implemented and directly resolve the reported bug.

Development

Successfully merging this pull request may close these issues.

Unbalanced tool lifecycle callbacks for hallucinated tools cause TraceManager stack corruption in plugins
