Architecture Overview

Understanding the core components and design principles of Vanna Agents.

[Diagram: Vanna Agent Architecture]

Core Principles

1. User-Aware from First Token

Every interaction in Vanna requires a User object. This isn’t just authentication—it’s the foundation for:

  • Permission enforcement - What tools can this user access?
  • Data scoping - Which data can this user see?
  • Audit trails - Who did what and when?
  • Rate limiting - Quota management per user
  • Context - User preferences and metadata

user = User(
    id="analyst-123",
    username="sarah",
    email="sarah@company.com",
    group_memberships=["analysts", "data-team"],
    permissions=["read_data", "export_reports"]
)

2. Tool-Based Execution

Agents don’t just chat—they execute tools to accomplish tasks. Tools are:

  • Strongly typed - Pydantic validation prevents bad inputs
  • Permission-aware - Automatic access control
  • Context-rich - Execute with full user and conversation context
  • Dual-output - Efficient for LLMs, rich for humans

3. Composable Providers

Every dependency is an interface you can swap:

# Development
Agent(
    llm_service=MockLlmService(),
    conversation_store=MemoryConversationStore()
)

# Production  
Agent(
    llm_service=AnthropicLlmService(),
    conversation_store=PostgresConversationStore(),
    observability_provider=DatadogObservability()
)

4. Async Everything

All I/O is asynchronous for efficient concurrency:

  • Tool execution
  • LLM requests
  • Database queries
  • External API calls

Core Components

Agent

The orchestrator that ties everything together.

Responsibilities:

  • Load/save conversations
  • Build LLM requests with tool schemas
  • Stream status updates
  • Execute tools via registry
  • Apply middleware and lifecycle hooks
  • Yield UI components

Interface:

async for component in agent.send_message(
    user=user,
    message="Query sales data",
    conversation_id="conv-123"
):
    # Components stream as they're generated
    print(component)

Learn more about Agents →

ToolRegistry

The gatekeeper between agents and tools.

Responsibilities:

  • Store tools by name
  • Generate JSON schemas for LLM
  • Validate permissions before execution
  • Parse and validate arguments
  • Execute tools with context
  • Normalize errors

Interface:

registry = ToolRegistry()

# Register tool with permissions
registry.register_local_tool(
    RunSqlTool(sql_runner),
    access_groups=['analysts', 'admins']
)

# Get schemas (filtered by user permissions)
schemas = registry.get_schemas(user)

# Execute tool
result = await registry.execute_tool(
    tool_name="run_sql",
    args={"query": "SELECT * FROM sales"},
    context=tool_context
)

Learn more about Tools →

Tool

Base class for all executable tools.

Contract:

class MyTool(Tool[MyArgs]):
    @property
    def name(self) -> str:
        return "my_tool"
    
    @property
    def description(self) -> str:
        return "What this tool does"
    
    @property
    def required_permissions(self) -> List[str]:
        return ["permission1", "permission2"]
    
    def get_args_schema(self) -> Type[MyArgs]:
        return MyArgs
    
    async def execute(self, context: ToolContext, args: MyArgs) -> ToolResult:
        # Do work
        return ToolResult(
            success=True,
            result_for_llm="Summary for LLM",
            ui_component=UiComponent(...)
        )
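
The MyArgs referenced above is an ordinary Pydantic model describing the tool's inputs. A minimal sketch (the field names are illustrative):

from pydantic import BaseModel, Field

class MyArgs(BaseModel):
    query: str = Field(description="SQL query to run")
    limit: int = Field(default=100, ge=1, le=10_000, description="Max rows to return")

Because the registry validates arguments against this schema before execute is called, a malformed limit (say, -5) is rejected before it ever reaches your tool code.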

Learn more about Tool Development →

LlmService

Abstraction over LLM providers (Claude, GPT, etc.).

Responsibilities:

  • Send requests to LLM
  • Stream responses token-by-token
  • Validate tool schemas
  • Handle provider-specific quirks

Interface:

class LlmService:
    async def send_request(self, request: LlmRequest) -> LlmResponse:
        ...
    
    async def stream_request(self, request: LlmRequest) -> AsyncIterator[LlmStreamChunk]:
        ...
    
    async def validate_tools(self, schemas: List[ToolSchema]) -> List[str]:
        ...

Built-in implementations:

  • AnthropicLlmService - Claude via Messages API
  • OpenAILlmService - GPT via Chat Completions
  • MockLlmService - Testing and development

Learn more about LLM Providers →

ConversationStore

Persistence layer for conversation history.

Responsibilities:

  • Create/retrieve/update/delete conversations
  • List user’s conversations
  • User-scoped access
  • Pagination support

Interface:

class ConversationStore:
    async def create_conversation(self, user_id: str, ...) -> Conversation:
        ...
    
    async def get_conversation(self, conversation_id: str, user_id: str) -> Conversation:
        ...
    
    async def update_conversation(self, conversation: Conversation) -> None:
        ...
    
    async def list_conversations(self, user_id: str, limit: int, offset: int) -> List[Conversation]:
        ...

Built-in implementations:

  • MemoryConversationStore - In-memory (dev/testing)
  • Custom: PostgreSQL, MongoDB, DynamoDB, Redis, etc.
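
Writing a custom store means implementing the four methods above against your database. A dict-backed sketch of two of them (it assumes Conversation exposes id and user_id fields, and the not-found error type is a placeholder):

class DictConversationStore(ConversationStore):
    def __init__(self):
        self._conversations: dict[str, Conversation] = {}

    async def get_conversation(self, conversation_id: str, user_id: str) -> Conversation:
        conversation = self._conversations.get(conversation_id)
        # Enforce user scoping: never hand back another user's conversation
        if conversation is None or conversation.user_id != user_id:
            raise KeyError(conversation_id)
        return conversation

    async def update_conversation(self, conversation: Conversation) -> None:
        self._conversations[conversation.id] = conversation

A production store follows the same shape, with a database client behind each method.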

Learn more about Storage →

User & ToolContext

User carries identity and permissions:

@dataclass
class User:
    id: str
    username: str
    email: str
    group_memberships: List[str] = field(default_factory=list)
    permissions: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)

ToolContext provides execution context:

@dataclass
class ToolContext:
    user: User
    conversation_id: str
    request_id: str
    metadata: Dict[str, Any]

Tools receive full context for:

  • Permission checks
  • User-scoped queries
  • Audit logging
  • Custom routing
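
For example, a tool that should only ever read the calling user's rows can take its scope straight from the context (the table, db handle, and SQL placeholder style below are illustrative):

async def execute(self, context: ToolContext, args: MyArgs) -> ToolResult:
    # Scope the query to the calling user so the filter cannot be forgotten
    rows = await self.db.fetch(
        "SELECT * FROM reports WHERE owner_id = $1",
        context.user.id,
    )
    return ToolResult(
        success=True,
        result_for_llm=f"Found {len(rows)} reports for {context.user.username}",
    )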

Learn more about Permissions →

UiComponent System

Dual-output system for LLM efficiency + rich UX:

UiComponent(
    # Rich component (tables, charts, cards)
    rich_component=DataFrameComponent.from_records(
        records=rows,
        title="Sales Data",
        description="Q4 2024 results"
    ),
    # Simple component (text fallback)
    simple_component=SimpleTextComponent(
        text=f"Query returned {len(rows)} rows"
    )
)

Rich components:

  • DataFrameComponent - Interactive tables
  • StatusCardComponent - Status indicators
  • ProgressBarComponent - Progress tracking
  • ArtifactComponent - Files and documents
  • NotificationComponent - Alerts and messages
  • Chart components (Plotly, etc.)

Simple components:

  • SimpleTextComponent - Plain text
  • SimpleImageComponent - Images
  • SimpleLinkComponent - Hyperlinks

Learn more about UI Components →

Request Flow

Here’s what happens when you send a message:

1. User sends message
   ↓
2. Agent loads conversation state
   ↓
3. Agent builds LLM request
   - Conversation history
   - Tool schemas (filtered by user permissions)
   - System prompt
   ↓
4. LlmService processes request
   ↓
5a. LLM returns text
    - Agent emits UiComponent
    - Conversation updated
    - Done
    
5b. LLM calls tools
    - ToolRegistry validates permissions
    - Tool executes with ToolContext
    - ToolResult updates conversation
    - Agent emits UiComponents
    - Loop back to step 3 (until done or max iterations)
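
In pseudocode, the agent's core loop looks roughly like this (the private method and attribute names are illustrative, not the exact internal API):

async def _run(self, user, message, conversation):
    context = ToolContext(user=user, conversation_id=conversation.id,
                          request_id=new_request_id(), metadata={})
    for _ in range(self.config.max_tool_iterations):
        # Step 3: history + permission-filtered tool schemas + system prompt
        request = self._build_request(conversation, self.tools.get_schemas(user))
        # Step 4: send to the LLM
        response = await self.llm_service.send_request(request)
        if not response.tool_calls:
            yield self._text_component(response)    # 5a: plain text, done
            return
        for call in response.tool_calls:            # 5b: execute each tool call
            result = await self.tools.execute_tool(
                tool_name=call.name, args=call.args, context=context
            )
            conversation.add_tool_result(result)    # then loop back to step 3
            yield result.ui_component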

Design Patterns

Factory Pattern for Agents

def create_analytics_agent() -> Agent:
    """Factory for analytics agents"""
    tools = ToolRegistry()
    tools.register_local_tool(RunSqlTool(...), access_groups=['analysts'])
    tools.register_local_tool(VisualizeDataTool(), access_groups=['analysts'])
    
    return Agent(
        llm_service=get_llm_service(),  # From config
        tool_registry=tools,
        config=AgentConfig(max_tool_iterations=5)
    )

Dependency Injection

# Inject implementations
agent = Agent(
    llm_service=AnthropicLlmService(),           # Swappable
    conversation_store=PostgresConversationStore(),  # Swappable
    observability_provider=DatadogObservability(),   # Optional
    config=AgentConfig()
)

Middleware Pattern

class CachingMiddleware:
    async def before_llm_request(self, request: LlmRequest) -> LlmRequest | LlmResponse:
        # Returning a cached LlmResponse here short-circuits the LLM call
        # (assumes the middleware contract supports short-circuiting)
        if cached := get_cached_response(request):
            return cached
        return request

    async def after_llm_response(self, response: LlmResponse) -> LlmResponse:
        # Cache the fresh response for future identical requests
        cache_response(response)
        return response
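
How middleware gets attached depends on your wiring; one plausible arrangement (the middlewares parameter here is hypothetical, shown only for illustration) is to pass instances at construction time:

agent = Agent(
    llm_service=AnthropicLlmService(),
    middlewares=[CachingMiddleware()],  # hypothetical parameter
)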

Error Handling

Tool Errors

Tools return ToolResult with success=False:

return ToolResult(
    success=False,
    result_for_llm="Database connection failed",
    error="DB_CONNECTION_ERROR",
    ui_component=UiComponent(
        simple_component=SimpleTextComponent(
            text="❌ Could not connect to database"
        )
    )
)

LLM Errors

Caught and logged by the agent:

try:
    response = await llm_service.send_request(request)
except LlmServiceError as e:
    # Log error
    # Return error component to user
    yield UiComponent(simple_component=SimpleTextComponent(
        text="❌ AI service unavailable"
    ))

Permission Errors

Automatically enforced by the registry:

# User lacks required permission
try:
    result = await registry.execute_tool(...)
except PermissionError:
    # Caught by agent, error component returned
    yield UiComponent(...)

Performance Considerations

Async Execution

All I/O is async for concurrent operations:

# These can run concurrently
results = await asyncio.gather(
    sql_tool.execute(context, args1),
    api_tool.execute(context, args2),
    file_tool.execute(context, args3)
)

Streaming

Stream responses as they’re generated:

config = AgentConfig(stream_responses=True)

async for component in agent.send_message(user, message):
    # Components arrive as they're ready
    # User sees progress in real-time
    render(component)

Caching

Implement caching in tools or middleware:

from functools import lru_cache

@lru_cache(maxsize=100)
def get_schema(database: str):
    # Expensive lookup runs once per database, then comes from the cache
    return fetch_schema(database)
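
Note that lru_cache only works for synchronous functions: applied to an async def it would cache the coroutine object rather than its result. Inside async code, an explicit dict does the job (fetch_schema_async is a stand-in for your own async I/O call):

_schema_cache = {}

async def get_schema_async(database: str):
    # Compute once per database, then reuse across executions
    if database not in _schema_cache:
        _schema_cache[database] = await fetch_schema_async(database)
    return _schema_cache[database]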

Security Model

Permission Checks

Automatic at tool execution:

class SensitiveTool(Tool[Args]):
    @property
    def required_permissions(self) -> List[str]:
        return ["admin"]  # Only admins can use

User Scoping

All operations are scoped to the user:

# Conversation store enforces user_id
conversations = await store.list_conversations(
    user_id=user.id  # Can only see own conversations
)

Audit Trails

Every tool execution is logged:

{
    "user_id": "analyst-123",
    "tool_name": "run_sql",
    "args": {"query": "SELECT..."},
    "timestamp": "2024-...",
    "duration_ms": 45,
    "success": true
}
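
One way to emit such a record yourself, from a lifecycle hook or inside a tool, is plain structured logging (the helper below is illustrative, not a framework API):

import json
import logging

audit_log = logging.getLogger("vanna.audit")

def record_tool_execution(context: ToolContext, tool_name: str,
                          args: dict, duration_ms: int, success: bool) -> None:
    # One JSON line per execution, easy to ship to any log aggregator
    audit_log.info(json.dumps({
        "user_id": context.user.id,
        "tool_name": tool_name,
        "args": args,
        "duration_ms": duration_ms,
        "success": success,
    }))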

Learn more about Security →

Next Steps

Deep Dives:

How-To Guides:

Examples: