AI-Generated Placeholder Documentation

This documentation page has been automatically generated by a Large Language Model (LLM) and serves as placeholder content. The information provided here may be incomplete, inaccurate, or subject to change.

For accurate and complete information, please refer to the Vanna source code on GitHub.

Tool Memory

Tool Memory is one of Vanna 2.0’s most powerful features: it allows your agent to learn from successful interactions and improve over time.

How Tool Memory Works

Every time a tool is successfully used, the question, tool name, and arguments are stored in a vector database. When a similar question is asked later:

  1. The agent searches for similar past tool usage
  2. Successful examples are retrieved based on semantic similarity
  3. The LLM uses these examples to inform its tool selection and argument choices

This creates a self-improving system that learns your data patterns and business logic automatically.
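The loop above can be illustrated with a toy, self-contained sketch in which word overlap (Jaccard similarity) stands in for real embeddings; the production backends described below do the same thing against a vector database:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

@dataclass
class ToyToolMemory:
    """Toy illustration of the save/search loop -- not Vanna's implementation."""
    records: List[Tuple[str, str, Dict[str, Any]]] = field(default_factory=list)

    def save_tool_usage(self, question: str, tool_name: str, args: Dict[str, Any]) -> None:
        self.records.append((question, tool_name, args))

    def search_similar_usage(self, question: str, threshold: float = 0.3) -> list:
        q = set(question.lower().split())
        hits = []
        for past_q, tool_name, args in self.records:
            p = set(past_q.lower().split())
            score = len(q & p) / len(q | p)  # Jaccard similarity over words
            if score >= threshold:
                hits.append((score, past_q, tool_name, args))
        return sorted(hits, reverse=True)  # best match first

memory = ToyToolMemory()
memory.save_tool_usage("show total sales by region", "run_sql",
                       {"sql": "SELECT region, SUM(amount) FROM sales GROUP BY region"})
# A later, similar question retrieves the stored run_sql example
hits = memory.search_similar_usage("show sales by region for 2024")
```

The retrieved `(score, question, tool_name, args)` tuples are what the LLM would see as worked examples when choosing a tool and its arguments.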

AgentMemory Interface

All memory backends implement the AgentMemory interface, defined in vanna.capabilities.agent_memory:

from abc import ABC, abstractmethod
from typing import Any, Dict, List

class AgentMemory(ABC):
    @abstractmethod
    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        """Save a tool usage pattern."""

    @abstractmethod
    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        """Search for similar tool usage."""

Built-in Memory Backends

ChromaDB (Local Development)

ChromaDB provides persistent, file-based vector storage, well suited to local development and testing:

from vanna.integrations.chromadb import ChromaAgentMemory

memory = ChromaAgentMemory(
    collection_name="vanna_tool_memory",
    persist_directory="./chroma_db"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)

Features:

  • Persistent local storage
  • Fast vector search
  • No external dependencies
  • Ideal for development and testing

DemoAgentMemory (In-Memory)

Ephemeral in-memory storage for demos:

from vanna.integrations.local.agent_memory import DemoAgentMemory

memory = DemoAgentMemory()

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)

Features:

  • No persistence (data lost on restart)
  • Fast for prototyping
  • No setup required

Cloud Agent Memory (Production)

Vanna Cloud provides managed vector storage for production:

from vanna.integrations.premium.agent_memory import CloudAgentMemory

memory = CloudAgentMemory(
    api_key="your-vanna-api-key",
    workspace_id="your-workspace-id"
)

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=memory
)

Features:

  • Fully managed
  • Scales automatically
  • Cross-instance sharing
  • Built-in analytics

Custom Memory Backends

Implement your own memory backend for databases like Pinecone, Weaviate, or Milvus:

import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List

from vanna.capabilities.agent_memory import AgentMemory, ToolMemory, ToolMemorySearchResult

class PineconeAgentMemory(AgentMemory):
    def __init__(self, api_key: str, index_name: str):
        import pinecone
        pinecone.init(api_key=api_key)
        self.index = pinecone.Index(index_name)

    async def embed(self, text: str) -> List[float]:
        """Generate an embedding with your embedding model of choice."""
        raise NotImplementedError

    async def save_tool_usage(
        self,
        question: str,
        tool_name: str,
        args: Dict[str, Any],
        context: ToolContext,
        success: bool = True
    ) -> None:
        # Generate embedding for the question
        embedding = await self.embed(question)

        # Store in Pinecone
        self.index.upsert(
            vectors=[{
                'id': str(uuid.uuid4()),
                'values': embedding,
                'metadata': {
                    'question': question,
                    'tool_name': tool_name,
                    'args': json.dumps(args),
                    'user_id': context.user.id,
                    'timestamp': datetime.now(timezone.utc).isoformat()
                }
            }]
        )

    async def search_similar_usage(
        self,
        question: str,
        context: ToolContext,
        limit: int = 10,
        similarity_threshold: float = 0.7
    ) -> List[ToolMemorySearchResult]:
        # Generate the query embedding
        embedding = await self.embed(question)

        # Search Pinecone, scoped to the current user's memories
        results = self.index.query(
            vector=embedding,
            top_k=limit,
            filter={'user_id': context.user.id},
            include_metadata=True
        )

        # Convert matches above the threshold to ToolMemorySearchResult
        memories = []
        for i, match in enumerate(results.matches):
            if match.score >= similarity_threshold:
                metadata = match.metadata
                memories.append(ToolMemorySearchResult(
                    memory=ToolMemory(
                        question=metadata['question'],
                        tool_name=metadata['tool_name'],
                        args=json.loads(metadata['args']),
                        timestamp=metadata['timestamp']
                    ),
                    similarity_score=match.score,
                    rank=i
                ))

        return memories

Memory Management

Clear Old Memories

# Clear all memories
await agent.agent_memory.clear_memories(context)

# Clear memories for specific tool
await agent.agent_memory.clear_memories(context, tool_name="run_sql")

# Clear memories before a certain date
await agent.agent_memory.clear_memories(
    context,
    before_date="2024-01-01"
)
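A rolling retention window is a simple cleanup policy; the before_date cutoff can be computed like this (90 days is an arbitrary choice):

```python
from datetime import date, timedelta

def retention_cutoff(days: int = 90) -> str:
    """Return the ISO date before which memories should be cleared."""
    return (date.today() - timedelta(days=days)).isoformat()

# e.g. await agent.agent_memory.clear_memories(context, before_date=retention_cutoff())
```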

Memory Statistics

stats = await agent.agent_memory.get_tool_usage_stats(context)

print(f"Total memories: {stats['total_count']}")
print(f"Tools with memories: {stats['tools']}")

List Tools with Memories

tools = await agent.agent_memory.list_tools_with_memories(context)
print(f"Tools with saved patterns: {tools}")

User-Scoped Memories

By default, memories are scoped to individual users for privacy and relevance:

# Each user builds their own memory
user_alice = User(id="alice", ...)
user_bob = User(id="bob", ...)

# Alice's searches only find Alice's past tool usage
# Bob's searches only find Bob's past tool usage

For shared team memories, implement a custom backend that ignores user scoping.
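One way to do that, sketched with hypothetical names (memory_scope_key is not part of Vanna's API), is to derive the metadata key used for filtering from the team rather than the individual user:

```python
from typing import Optional

def memory_scope_key(user_id: str, team_id: Optional[str] = None,
                     share_with_team: bool = False) -> str:
    """Return the id used to filter memories: the team's when sharing, else the user's."""
    if share_with_team and team_id:
        return f"team:{team_id}"
    return f"user:{user_id}"
```

A custom backend would store this key in place of user_id in the vector metadata when saving, and filter on it in search_similar_usage, so everyone on the team retrieves the same pool of examples.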

Memory Search Configuration

Control how memories are retrieved:

# More permissive search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=20,  # Return more results
    similarity_threshold=0.5  # Lower threshold = more matches
)

# Stricter search
results = await memory.search_similar_usage(
    question="Show me sales",
    context=context,
    limit=5,  # Fewer results
    similarity_threshold=0.9  # Higher threshold = only very similar
)
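For intuition about what similarity_threshold is comparing: vector backends typically score matches with cosine similarity between embedding vectors, which looks like this (a generic sketch, not Vanna's internal code):

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 for same direction, ~0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Parallel vectors score ~1.0; orthogonal vectors score ~0.0
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
unrelated = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

A threshold of 0.9 therefore admits only near-duplicates of past questions, while 0.5 also admits loosely related ones.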

Best Practices

  1. Use persistent storage in production - Don’t lose learned patterns
  2. Monitor memory growth - Implement cleanup policies for old data
  3. Consider user privacy - User-scoped memories protect sensitive queries
  4. Tune similarity threshold - Balance precision vs recall
  5. Log memory hits - Track when memories are being used
  6. Seed with examples - Pre-populate common patterns for faster learning
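The last practice, seeding, can be done by replaying a curated list of (question, tool, args) triples through save_tool_usage at startup. A minimal sketch (the example questions and SQL are illustrative, and seed_memory is not part of Vanna's API):

```python
from typing import Any, Dict, List, Tuple

# Curated (question, tool_name, args) triples -- adapt to your own schema
SEED_EXAMPLES: List[Tuple[str, str, Dict[str, Any]]] = [
    ("What were total sales last month?", "run_sql",
     {"sql": "SELECT SUM(amount) FROM sales WHERE sale_date >= date_trunc('month', now()) - interval '1 month'"}),
    ("List our top ten customers", "run_sql",
     {"sql": "SELECT customer, SUM(amount) AS total FROM sales GROUP BY customer ORDER BY total DESC LIMIT 10"}),
]

async def seed_memory(memory, context) -> int:
    """Replay curated examples into an AgentMemory backend; returns the count saved."""
    for question, tool_name, args in SEED_EXAMPLES:
        await memory.save_tool_usage(question, tool_name, args, context)
    return len(SEED_EXAMPLES)
```

Run this once against a fresh backend and the agent starts with known-good tool patterns instead of learning them from scratch.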

Minimal Memory Overhead

If you don’t need persistent learning, use DemoAgentMemory for minimal overhead:

from vanna.integrations.local.agent_memory import DemoAgentMemory

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=resolver,
    agent_memory=DemoAgentMemory()  # In-memory, no persistence
)

This provides the memory interface without the overhead of persistent storage.

See Also