🔴 Required Information
Is your feature request related to a specific problem?
When building conversational agents or voice-enabled applications with Google's Agent Development Kit (ADK), developers need low-latency, high-performance semantic search runtime components for retrieval-augmented generation (RAG). Standard remote cloud-based embedding or vector retrieval APIs often introduce significant network latency (100ms+), which degrades the agent's responsiveness.
Describe the Solution You'd Like
Add a cookbook integration demonstrating how to use Moss (a sub-10ms, on-device/local semantic search runtime, available at https://moss.dev) as a fast retrieval tool for Google ADK agents.
This contribution includes:
- A new cookbook directory containing a runnable demo and setup guides.
- create_moss_search_tool: a utility factory that takes a MossClient and returns an ADK-compatible async function tool.
Impact on your work
This cookbook makes it straightforward for developers using Google ADK to integrate highly responsive, sub-10ms semantic search capabilities directly into their agent pipelines, resolving latency bottlenecks in RAG setups.
Willingness to contribute
Yes, this PR contains the complete implementation.
🟡 Recommended Information
Describe Alternatives You've Considered
Developers could write custom tool bindings from scratch for every vector database, but providing a standardized cookbook factory simplifies adoption and demonstrates best practices for integrating low-latency runtimes with ADK.
Proposed API / Implementation
The integration utilizes ADK's automatic schema generation from async Python functions:
from google.adk.agents import Agent
from moss import MossClient
from moss_adk import create_moss_search_tool

# Initialize low-latency Moss client
client = MossClient("your-project-id", "your-project-key")

# Build the ADK-compatible search tool
search = create_moss_search_tool(client=client, index_name="knowledge-base", top_k=3)

# Pass the tool directly to the ADK Agent
agent = Agent(
    model="gemini-flash-latest",
    name="retrieval_agent",
    instruction="Use the moss_search tool to answer questions from the knowledge base.",
    tools=[search],
)
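
For reviewers who want a feel for the factory pattern itself, here is a minimal, self-contained sketch of how a function like create_moss_search_tool could be structured. It is not the code in this PR: the SearchClient protocol, its query method, the FakeClient, and the shape of the returned dict are all assumptions made for illustration, and the real MossClient API may differ. The sketch only relies on the fact that ADK builds a tool schema from a function's name, type hints, and docstring.

```python
import asyncio
from typing import Any, Awaitable, Callable, Protocol


class SearchClient(Protocol):
    """Hypothetical stand-in for MossClient; the real method names may differ."""

    async def query(self, index_name: str, query: str, top_k: int) -> list[dict[str, Any]]: ...


def create_search_tool(
    client: SearchClient, index_name: str, top_k: int = 3
) -> Callable[[str], Awaitable[dict[str, Any]]]:
    """Bind a client, index, and top_k into a single-argument async tool."""

    async def moss_search(query: str) -> dict[str, Any]:
        """Search the knowledge base for passages relevant to the query.

        Args:
            query: Natural-language search query.

        Returns:
            A dict with a status field and the matching documents.
        """
        try:
            results = await client.query(index_name, query, top_k)
        except Exception as exc:
            # Report failures as data so the agent can react to them.
            return {"status": "error", "message": str(exc)}
        return {"status": "success", "results": results}

    return moss_search


class FakeClient:
    """In-memory client used only to exercise the sketch without a network."""

    async def query(self, index_name: str, query: str, top_k: int) -> list[dict[str, Any]]:
        docs = [{"id": i, "text": f"{query} hit {i}"} for i in range(10)]
        return docs[:top_k]


search = create_search_tool(FakeClient(), index_name="knowledge-base", top_k=3)
result = asyncio.run(search("refund policy"))
print(result["status"], len(result["results"]))
```

Keeping index_name and top_k in the closure means the model only has to supply the query string, which keeps the generated tool schema small.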