AI-Generated Placeholder Documentation
This documentation page has been automatically generated by a Large Language Model (LLM) and serves as placeholder content. The information provided here may be incomplete, inaccurate, or subject to change.
For accurate and complete information, please refer to the Vanna source code on GitHub.
Tool Memory
Tool Memory is one of Vanna 2.0's most powerful features: it allows your agent to learn from successful interactions and improve over time.
How Tool Memory Works
Every time a tool is successfully used, the question, tool name, and arguments are stored in a vector database. When a similar question is asked later:
- The agent searches for similar past tool usage
- Successful examples are retrieved based on semantic similarity
- The LLM uses these examples to inform its tool selection and argument choices
This creates a self-improving system that learns your data patterns and business logic automatically.
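The retrieval step above can be sketched as follows. This is a hedged illustration of how retrieved memories might be rendered as few-shot examples for the LLM, not Vanna's actual prompt format; the `ToolMemory` shape mirrors the interface below, and `format_examples` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToolMemory:
    """Simplified stand-in for Vanna's stored tool-usage record."""
    question: str
    tool_name: str
    args: Dict[str, Any]


def format_examples(memories: List[ToolMemory]) -> str:
    """Render past successful tool usages as prompt context (illustrative)."""
    lines = []
    for m in memories:
        lines.append(f"Q: {m.question}\n-> {m.tool_name}({m.args})")
    return "\n".join(lines)


examples = format_examples([
    ToolMemory("total sales last month", "run_sql",
               {"sql": "SELECT SUM(amount) FROM sales WHERE ..."}),
])
print(examples)
```

The key idea is that the agent does not just recall answers: it recalls which tool was called and with which arguments, so similar questions can reuse proven patterns.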
AgentMemory Interface
All memory backends implement the AgentMemory interface:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

from vanna.capabilities.agent_memory import AgentMemory

# The interface, simplified:
class AgentMemory(ABC):
    @abstractmethod
    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        """Save a tool usage pattern"""

    @abstractmethod
    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        """Search for similar tool usage"""
```
Built-in Memory Backends
ChromaDB (Local Development)
ChromaDB provides persistent local vector storage, ideal for development and testing:
```python
from vanna.integrations.chromadb import ChromaAgentMemory

memory = ChromaAgentMemory(
    collection_name="vanna_tool_memory",
    persist_directory="./chroma_db"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- Persistent local storage
- Fast vector search
- No external dependencies
- Ideal for development and testing
DemoAgentMemory (In-Memory)
Ephemeral in-memory storage for demos:
```python
from vanna.integrations.local.agent_memory import DemoAgentMemory

memory = DemoAgentMemory()

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- No persistence (data lost on restart)
- Fast for prototyping
- No setup required
Cloud Agent Memory (Production)
Vanna Cloud provides managed vector storage for production:
```python
from vanna.integrations.premium.agent_memory import CloudAgentMemory

memory = CloudAgentMemory(
    api_key="your-vanna-api-key",
    workspace_id="your-workspace-id"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)
```
Features:
- Fully managed
- Scales automatically
- Cross-instance sharing
- Built-in analytics
Custom Memory Backends
Implement your own memory backend for databases like Pinecone, Weaviate, or Milvus:
```python
import json
import uuid
from datetime import datetime
from typing import Any, Dict, List

from vanna.capabilities.agent_memory import AgentMemory, ToolMemory, ToolMemorySearchResult

class PineconeAgentMemory(AgentMemory):
    def __init__(self, api_key: str, index_name: str):
        import pinecone
        pinecone.init(api_key=api_key)
        self.index = pinecone.Index(index_name)

    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        # Generate an embedding for the question (self.embed is your own
        # embedding helper, e.g. a call to an embedding model)
        embedding = await self.embed(question)

        # Store in Pinecone
        self.index.upsert(
            vectors=[{
                'id': str(uuid.uuid4()),
                'values': embedding,
                'metadata': {
                    'question': question,
                    'tool_name': tool_name,
                    'args': json.dumps(args),
                    'user_id': context.user.id,
                    'timestamp': datetime.now().isoformat()
                }
            }]
        )

    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        # Generate the query embedding
        embedding = await self.embed(question)

        # Search Pinecone, scoped to the current user's memories
        results = self.index.query(
            vector=embedding,
            top_k=limit,
            include_metadata=True,
            filter={'user_id': context.user.id}
        )

        # Convert matches above the threshold to ToolMemorySearchResult
        memories = []
        for i, match in enumerate(results.matches):
            if match.score >= similarity_threshold:
                metadata = match.metadata
                memories.append(ToolMemorySearchResult(
                    memory=ToolMemory(
                        question=metadata['question'],
                        tool_name=metadata['tool_name'],
                        args=json.loads(metadata['args']),
                        timestamp=metadata['timestamp']
                    ),
                    similarity_score=match.score,
                    rank=i
                ))
        return memories
```
Memory Management
Clear Old Memories
```python
# Clear all memories
await agent.agent_memory.clear_memories(context)

# Clear memories for a specific tool
await agent.agent_memory.clear_memories(context, tool_name="run_sql")

# Clear memories before a certain date
await agent.agent_memory.clear_memories(
    context,
    before_date="2024-01-01"
)
```
Memory Statistics
```python
stats = await agent.agent_memory.get_tool_usage_stats(context)

print(f"Total memories: {stats['total_count']}")
print(f"Tools with memories: {stats['tools']}")
```
List Tools with Memories
```python
tools = await agent.agent_memory.list_tools_with_memories(context)
print(f"Tools with saved patterns: {tools}")
```
User-Scoped Memories
By default, memories are scoped to individual users for privacy and relevance:
```python
# Each user builds their own memory
user_alice = User(id="alice", ...)
user_bob = User(id="bob", ...)

# Alice's searches only find Alice's past tool usage
# Bob's searches only find Bob's past tool usage
```
For shared team memories, implement a custom backend that ignores user scoping.
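A shared-team backend can be sketched as a thin wrapper that replaces the per-user scope with a fixed team id. This is an illustrative stand-in, not Vanna's API: `InMemoryStore` is a toy storage layer, and the method names merely mirror the `AgentMemory` interface.

```python
from typing import Any, Dict, List


class InMemoryStore:
    """Toy storage layer standing in for a real vector backend."""
    def __init__(self):
        self.records: List[Dict[str, Any]] = []

    def save(self, scope: str, question: str, tool_name: str, args: Dict[str, Any]):
        self.records.append({"scope": scope, "question": question,
                             "tool_name": tool_name, "args": args})

    def search(self, scope: str) -> List[Dict[str, Any]]:
        return [r for r in self.records if r["scope"] == scope]


class TeamScopedMemory:
    """Scopes all memories to one team instead of individual users."""
    def __init__(self, store: InMemoryStore, team_id: str):
        self.store = store
        self.team_id = team_id

    def save_tool_usage(self, user_id: str, question: str, tool_name: str,
                        args: Dict[str, Any]) -> None:
        # Deliberately ignore user_id: all teammates write to one shared scope
        self.store.save(self.team_id, question, tool_name, args)

    def search_similar_usage(self, user_id: str) -> List[Dict[str, Any]]:
        # Any teammate's search reads from the shared scope
        return self.store.search(self.team_id)


memory = TeamScopedMemory(InMemoryStore(), team_id="analytics")
memory.save_tool_usage("alice", "show sales", "run_sql", {"sql": "SELECT 1"})
shared = memory.search_similar_usage("bob")  # Bob sees Alice's pattern
```

The trade-off is privacy: with a shared scope, every teammate can see the questions and queries others have run, so this pattern only fits teams that already share data access.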
Memory Search Configuration
Control how memories are retrieved:
```python
# More permissive search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=20,                  # Return more results
    similarity_threshold=0.5   # Lower threshold = more matches
)

# Stricter search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=5,                   # Fewer results
    similarity_threshold=0.9   # Higher threshold = only very similar
)
```
Best Practices
- Use persistent storage in production - Don't lose learned patterns
- Monitor memory growth - Implement cleanup policies for old data
- Consider user privacy - User-scoped memories protect sensitive queries
- Tune similarity threshold - Balance precision vs recall
- Log memory hits - Track when memories are being used
- Seed with examples - Pre-populate common patterns for faster learning
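Seeding can be as simple as replaying known-good question/tool-call pairs through `save_tool_usage` at startup. The sketch below uses a stub backend so it runs standalone; the `save_tool_usage` signature mirrors the interface above, and the seed questions and SQL are illustrative.

```python
import asyncio
from typing import Any, Dict, List


class StubMemory:
    """Stand-in backend recording what would be saved to a vector store."""
    def __init__(self):
        self.saved: List[Dict[str, Any]] = []

    async def save_tool_usage(self, question: str, tool_name: str,
                              args: Dict[str, Any], context=None,
                              success: bool = True) -> None:
        self.saved.append({"question": question, "tool_name": tool_name,
                           "args": args})


# Known-good patterns to pre-populate (illustrative examples)
SEED_EXAMPLES = [
    ("total revenue by month", "run_sql",
     {"sql": "SELECT month, SUM(amount) FROM sales GROUP BY month"}),
    ("top 10 customers", "run_sql",
     {"sql": "SELECT customer, SUM(amount) AS total FROM sales "
             "GROUP BY customer ORDER BY total DESC LIMIT 10"}),
]


async def seed(memory) -> None:
    """Replay curated examples so the agent has patterns from day one."""
    for question, tool_name, args in SEED_EXAMPLES:
        await memory.save_tool_usage(question, tool_name, args)


memory = StubMemory()
asyncio.run(seed(memory))
```

Against a real backend you would pass your `AgentMemory` instance and a valid `ToolContext` instead of the stub.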
Minimal Memory Overhead
If you don't need persistent learning, use DemoAgentMemory for minimal overhead:
```python
from vanna.integrations.local.agent_memory import DemoAgentMemory

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=DemoAgentMemory()  # In-memory, no persistence
)
```
This provides the memory interface without the overhead of persistent storage.