AI-Generated Placeholder Documentation
This documentation page has been automatically generated by a Large Language Model (LLM) and serves as placeholder content. The information provided here may be incomplete, inaccurate, or subject to change.
For accurate and complete information, please refer to the Vanna source code on GitHub.
Tool Memory
Tool Memory is one of Vanna 2.0's most powerful features: it allows your agent to learn from successful interactions and improve over time.
How Tool Memory Works
Every time a tool is successfully used, the question, tool name, and arguments are stored in a vector database. When a similar question is asked later:
- The agent searches for similar past tool usage
- Successful examples are retrieved based on semantic similarity
- The LLM uses these examples to inform its tool selection and argument choices
This creates a self-improving system that learns your data patterns and business logic automatically.
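The retrieval step above can be sketched as follows. This is a hedged illustration of how retrieved memories might be rendered as few-shot examples for the LLM, not Vanna's actual prompt format; the `ToolMemory` shape mirrors the interface below, and `format_examples` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToolMemory:
    """Simplified stand-in for Vanna's stored tool-usage record."""
    question: str
    tool_name: str
    args: Dict[str, Any]


def format_examples(memories: List[ToolMemory]) -> str:
    """Render past successful tool usages as prompt context (illustrative)."""
    lines = []
    for m in memories:
        lines.append(f"Q: {m.question}\n-> {m.tool_name}({m.args})")
    return "\n".join(lines)


examples = format_examples([
    ToolMemory("total sales last month", "run_sql",
               {"sql": "SELECT SUM(amount) FROM sales WHERE ..."}),
])
print(examples)
```

The key idea is that the agent does not just recall answers: it recalls which tool was called and with which arguments, so similar questions can reuse proven patterns.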
AgentMemory Interface
All memory backends implement the AgentMemory interface:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

from vanna.capabilities.agent_memory import AgentMemory

# The interface, simplified:
class AgentMemory(ABC):
    @abstractmethod
    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        """Save a tool usage pattern"""

    @abstractmethod
    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        """Search for similar tool usage"""
```
Built-in Memory Backends
ChromaDB (Local Development)
ChromaDB provides persistent local vector storage, ideal for development and testing:
```python
from vanna.integrations.chromadb import ChromaAgentMemory

memory = ChromaAgentMemory(
    collection_name="vanna_tool_memory",
    persist_directory="./chroma_db"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- Persistent local storage
- Fast vector search
- No external dependencies
- Ideal for development and testing
DemoAgentMemory (In-Memory)
Ephemeral in-memory storage for demos:
```python
from vanna.integrations.local.agent_memory import DemoAgentMemory

memory = DemoAgentMemory()

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- No persistence (data lost on restart)
- Fast for prototyping
- No setup required
Cloud Agent Memory (Production)
Vanna Cloud provides managed vector storage for production:
```python
from vanna.integrations.premium.agent_memory import CloudAgentMemory

memory = CloudAgentMemory(
    api_key="your-vanna-api-key",
    workspace_id="your-workspace-id"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- Fully managed
- Scales automatically
- Cross-instance sharing
- Built-in analytics
Custom Memory Backends
Implement your own memory backend for databases like Pinecone, Weaviate, or Milvus:
```python
import json
import uuid
from datetime import datetime
from typing import Any, Dict, List

from vanna.capabilities.agent_memory import AgentMemory, ToolMemory, ToolMemorySearchResult

class PineconeAgentMemory(AgentMemory):
    def __init__(self, api_key: str, index_name: str):
        import pinecone
        pinecone.init(api_key=api_key)
        self.index = pinecone.Index(index_name)

    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        # Generate an embedding for the question (self.embed is your own
        # embedding helper, e.g. a call to an embedding model)
        embedding = await self.embed(question)

        # Store in Pinecone
        self.index.upsert(
            vectors=[{
                'id': str(uuid.uuid4()),
                'values': embedding,
                'metadata': {
                    'question': question,
                    'tool_name': tool_name,
                    'args': json.dumps(args),
                    'user_id': context.user.id,
                    'timestamp': datetime.now().isoformat()
                }
            }]
        )

    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        # Generate the query embedding
        embedding = await self.embed(question)

        # Search Pinecone, scoped to the current user's memories
        results = self.index.query(
            vector=embedding,
            top_k=limit,
            include_metadata=True,
            filter={'user_id': context.user.id}
        )

        # Convert matches above the threshold to ToolMemorySearchResult
        memories = []
        for i, match in enumerate(results.matches):
            if match.score >= similarity_threshold:
                metadata = match.metadata
                memories.append(ToolMemorySearchResult(
                    memory=ToolMemory(
                        question=metadata['question'],
                        tool_name=metadata['tool_name'],
                        args=json.loads(metadata['args']),
                        timestamp=metadata['timestamp']
                    ),
                    similarity_score=match.score,
                    rank=i
                ))
        return memories
```
Memory Management
Clear Old Memories
```python
# Clear all memories
await agent.agent_memory.clear_memories(context)

# Clear memories for a specific tool
await agent.agent_memory.clear_memories(context, tool_name="run_sql")

# Clear memories before a certain date
await agent.agent_memory.clear_memories(
    context,
    before_date="2024-01-01"
)
```
Memory Statistics
```python
stats = await agent.agent_memory.get_tool_usage_stats(context)

print(f"Total memories: {stats['total_count']}")
print(f"Tools with memories: {stats['tools']}")
```
List Tools with Memories
```python
tools = await agent.agent_memory.list_tools_with_memories(context)
print(f"Tools with saved patterns: {tools}")
```
User-Scoped Memories
By default, memories are scoped to individual users for privacy and relevance:
```python
# Each user builds their own memory
user_alice = User(id="alice", ...)
user_bob = User(id="bob", ...)

# Alice's searches only find Alice's past tool usage
# Bob's searches only find Bob's past tool usage
```
For shared team memories, implement a custom backend that ignores user scoping.
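A shared-team backend can be sketched as a thin wrapper that replaces the per-user scope with a fixed team id. This is an illustrative stand-in, not Vanna's API: `InMemoryStore` is a toy storage layer, and the method names merely mirror the `AgentMemory` interface.

```python
from typing import Any, Dict, List


class InMemoryStore:
    """Toy storage layer standing in for a real vector backend."""
    def __init__(self):
        self.records: List[Dict[str, Any]] = []

    def save(self, scope: str, question: str, tool_name: str, args: Dict[str, Any]):
        self.records.append({"scope": scope, "question": question,
                             "tool_name": tool_name, "args": args})

    def search(self, scope: str) -> List[Dict[str, Any]]:
        return [r for r in self.records if r["scope"] == scope]


class TeamScopedMemory:
    """Scopes all memories to one team instead of individual users."""
    def __init__(self, store: InMemoryStore, team_id: str):
        self.store = store
        self.team_id = team_id

    def save_tool_usage(self, user_id: str, question: str, tool_name: str,
                        args: Dict[str, Any]) -> None:
        # Deliberately ignore user_id: all teammates write to one shared scope
        self.store.save(self.team_id, question, tool_name, args)

    def search_similar_usage(self, user_id: str) -> List[Dict[str, Any]]:
        # Any teammate's search reads from the shared scope
        return self.store.search(self.team_id)


memory = TeamScopedMemory(InMemoryStore(), team_id="analytics")
memory.save_tool_usage("alice", "show sales", "run_sql", {"sql": "SELECT 1"})
shared = memory.search_similar_usage("bob")  # Bob sees Alice's pattern
```

The trade-off is privacy: with a shared scope, every teammate can see the questions and queries others have run, so this pattern only fits teams that already share data access.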
Memory Search Configuration
Control how memories are retrieved:
```python
# More permissive search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=20,                  # Return more results
    similarity_threshold=0.5   # Lower threshold = more matches
)

# Stricter search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=5,                   # Fewer results
    similarity_threshold=0.9   # Higher threshold = only very similar
)
```
Best Practices
- Use persistent storage in production - Don't lose learned patterns
- Monitor memory growth - Implement cleanup policies for old data
- Consider user privacy - User-scoped memories protect sensitive queries
- Tune similarity threshold - Balance precision vs recall
- Log memory hits - Track when memories are being used
- Seed with examples - Pre-populate common patterns for faster learning
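Seeding can be as simple as replaying known-good question/tool-call pairs through `save_tool_usage` at startup. The sketch below uses a stub backend so it runs standalone; the `save_tool_usage` signature mirrors the interface above, and the seed questions and SQL are illustrative.

```python
import asyncio
from typing import Any, Dict, List


class StubMemory:
    """Stand-in backend recording what would be saved to a vector store."""
    def __init__(self):
        self.saved: List[Dict[str, Any]] = []

    async def save_tool_usage(self, question: str, tool_name: str,
                              args: Dict[str, Any], context=None,
                              success: bool = True) -> None:
        self.saved.append({"question": question, "tool_name": tool_name,
                           "args": args})


# Known-good patterns to pre-populate (illustrative examples)
SEED_EXAMPLES = [
    ("total revenue by month", "run_sql",
     {"sql": "SELECT month, SUM(amount) FROM sales GROUP BY month"}),
    ("top 10 customers", "run_sql",
     {"sql": "SELECT customer, SUM(amount) AS total FROM sales "
             "GROUP BY customer ORDER BY total DESC LIMIT 10"}),
]


async def seed(memory) -> None:
    """Replay curated examples so the agent has patterns from day one."""
    for question, tool_name, args in SEED_EXAMPLES:
        await memory.save_tool_usage(question, tool_name, args)


memory = StubMemory()
asyncio.run(seed(memory))
```

Against a real backend you would pass your `AgentMemory` instance and a valid `ToolContext` instead of the stub.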
Minimal Memory Overhead
If you don't need persistent learning, use DemoAgentMemory for minimal overhead:
```python
from vanna.integrations.local.agent_memory import DemoAgentMemory

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=DemoAgentMemory()  # In-memory, no persistence
)
```
This provides the memory interface without the overhead of persistent storage.