How Vanna 2.0 Works

Understanding the core components that power Vanna Agents.

💡

Are you non-technical?

This page dives into the technical internals of Vanna 2.0. For a business-oriented overview, head over to the guide for non-technical readers.

Read the Business User Guide →

Architecture Overview

Vanna Agent Request Flow
[Diagram: a User Request (with user + permissions) enters the Agent, which loops between the LlmService (Claude / GPT / local) and the ToolRegistry. The ToolRegistry validates permissions and safely executes tools (DB, files, APIs, …). Conversation state is loaded/saved through the user-scoped ConversationStore. Each tool produces dual outputs: result_for_llm (e.g. "Returned 1,247 rows") and ui_component (rich data frame + chart), streamed to the client as rich + simple UI updates.]

Core Components

1. ToolRegistry

The ToolRegistry is the gatekeeper between the agent and your tools. It:

  1. Stores tools in a dictionary keyed by name (with duplicate protection)
  2. Generates JSON schema for each tool via get_schemas(user) (honouring permissions)
  3. Validates user permissions before execution
  4. Parses and validates tool arguments with Pydantic
  5. Executes the tool and captures timing + metadata
  6. Normalizes errors into ToolResult objects for the LLM and UI
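
In practice that flow looks something like the sketch below, using the QueryDatabaseTool defined in the next section. Only get_schemas(user) is documented above, so register and execute_tool are illustrative names:

from vanna import ToolRegistry

registry = ToolRegistry()
registry.register(QueryDatabaseTool())  # assumed registration API; duplicate names are rejected

# Schemas honour permissions, so each user only sees tools they may call
schemas = registry.get_schemas(user)

# Inside an async context: permissions are checked, args are parsed with Pydantic,
# and any exception is normalized into a failed ToolResult instead of raising
result = await registry.execute_tool(
    name="query_database",
    raw_args={"sql": "SELECT 1"},
    context=context,
)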

2. Tool

Every tool subclasses Tool[TArgs] and implements a small contract:

from typing import Type
from pydantic import BaseModel, Field
from vanna import Tool, ToolContext, ToolResult, UiComponent, SimpleTextComponent, DataFrameComponent

class QueryArgs(BaseModel):
    sql: str = Field(description="SQL statement to execute")

class QueryDatabaseTool(Tool[QueryArgs]):
    @property
    def name(self) -> str:
        return "query_database"

    @property
    def description(self) -> str:
        return "Execute SQL against the analytics warehouse"

    def get_args_schema(self) -> Type[QueryArgs]:
        return QueryArgs

    async def execute(self, context: ToolContext, args: QueryArgs) -> ToolResult:
        rows = await run_sql(args.sql, user=context.user)  # Your implementation
        summary = f"Returned {len(rows)} rows"

        return ToolResult(
            success=True,
            result_for_llm=summary,
            ui_component=UiComponent(
                rich_component=DataFrameComponent.from_records(
                    records=rows,
                    title="Query Results",
                    description=args.sql,
                ),
                simple_component=SimpleTextComponent(text=summary),
            ),
        )

Key ideas:

  • Strong typing – arguments are validated through Pydantic before execution.
  • Context-first execution – the ToolContext includes user, conversation, request, and metadata for multi-tenant safety.
  • Dual outputs – result_for_llm keeps the language model efficient while ui_component powers rich client experiences.

3. Agent

The Agent orchestrates the entire loop:

  1. Loads or creates the conversation via ConversationStore
  2. Builds the system prompt and LLM request (messages + tool schemas)
  3. Streams status updates before and after key steps
  4. Invokes the LlmService
  5. Handles tool calls by delegating to ToolRegistry
  6. Persists state, applies middleware, lifecycle hooks, filters, and recovery policies
  7. Yields UiComponent instances as soon as they're available

You call await agent.send_message(user=user, message="…") and iterate; everything else (tool loops, retries, telemetry) is handled for you.
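
A minimal consumption sketch; the exact iteration interface may differ, so treat this as illustrative:

response = await agent.send_message(
    user=user,
    message="How many orders shipped last week?",
)

async for component in response:  # UiComponents arrive as soon as they're ready
    render(component)             # your channel-specific renderer (hypothetical)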

4. ConversationStore

A simple interface for conversation persistence:

  • Create, retrieve, update, delete conversations
  • Always scoped by user_id
  • Supports pagination via list_conversations

Start with MemoryConversationStore for prototypes, then swap in your own PostgreSQL/DynamoDB/Redis implementation without touching agent logic.
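
The contract is small enough to sketch. Only list_conversations is named above, so the remaining method names are assumptions:

from vanna import ConversationStore

class PostgresConversationStore(ConversationStore):
    """Every call is scoped by user_id, so tenants never see each other's data."""

    async def create_conversation(self, user_id: str, conversation) -> None:
        ...  # INSERT keyed by (user_id, conversation id)

    async def get_conversation(self, user_id: str, conversation_id: str):
        ...  # SELECT; return None when missing

    async def update_conversation(self, user_id: str, conversation) -> None:
        ...  # persist the updated conversation state

    async def delete_conversation(self, user_id: str, conversation_id: str) -> None:
        ...

    async def list_conversations(self, user_id: str, limit: int = 50, cursor: str | None = None):
        ...  # paginated listing, per the interface above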

5. User & ToolContext

The User model carries identity (id, username, email), permissions, and arbitrary metadata. That user is threaded through ToolContext, providing:

  • The executing user (critical for authorization)
  • Conversation & request identifiers
  • A metadata dict for custom routing (workspace, locale, plan tier, etc.)
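
Sketched out, with field names taken from the description above (the identifier attributes on ToolContext are assumptions):

from vanna import User

user = User(
    id="u_123",
    username="ada",
    email="ada@example.com",
    permissions=["query_database"],                 # checked by the ToolRegistry
    metadata={"workspace": "acme", "plan": "pro"},  # arbitrary routing data
)

# Inside Tool.execute(self, context: ToolContext, args):
#   context.user      -> the executing user; authorize against it
#   context.metadata  -> custom routing dict (workspace, locale, plan tier, ...)
#   conversation and request identifiers are also available for tracing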

6. LlmService

The LLM abstraction shields you from provider quirks:

  • send_request for non-streaming workflows
  • stream_request for token-by-token updates
  • validate_tools to sanity-check tool schemas

Built-in implementations include Anthropic, OpenAI, and MockLlmService for tests. Bring your own by subclassing LlmService and implementing the async methods.
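
A skeleton for a custom provider; the three method names come from the list above, but their signatures are assumptions:

from vanna import LlmService

class LocalLlmService(LlmService):
    async def send_request(self, request):
        """Return a complete response for non-streaming workflows."""
        ...

    async def stream_request(self, request):
        """Yield chunks as they arrive from the model."""
        yield ...  # async generator

    async def validate_tools(self, tools):
        """Sanity-check tool schemas before the agent uses them."""
        ...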

7. UI Component System

Tools return UiComponent objects with both simple and rich payloads:

UiComponent(
    rich_component=StatusCardComponent(
        title="Database Query",
        status="success",
        description="Fetched 5 rows in 120ms",
    ),
    simple_component=SimpleTextComponent(text="Query succeeded: 5 rows"),
)

Other rich components include DataFrameComponent.from_records(...), ProgressDisplayComponent, NotificationComponent, and chart helpers. Clients can choose the best representation for their channel (web, Slack, CLI, SMS, …) by preferring rich components and falling back to simple ones.
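
The client-side fallback is a one-liner (a sketch; display is a hypothetical channel-specific renderer, and the attribute names follow the UiComponent constructor above):

def render(ui: UiComponent, supports_rich: bool) -> None:
    # Web dashboards take the rich payload; CLI or SMS channels take the simple one
    component = ui.rich_component if supports_rich else ui.simple_component
    display(component)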

How It All Works Together

1. User sends a message
   ↓
2. Agent loads conversation state (ConversationStore)
   ↓
3. Agent builds an LLM request with history + tool schemas
   ↓
4. LlmService sends the request and streams responses
   ↓
5a. If the LLM returns text:
    - Agent emits UiComponent(s)
    - Conversation is updated and persisted

5b. If the LLM calls tools:
    - ToolRegistry validates permissions & args
    - Tool executes with ToolContext
    - ToolResult updates the conversation and emits components
    - Agent loops back to the LLM with the new context until a final answer arrives or guardrails trip
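
Wiring it together, a minimal setup might look like this (the Agent constructor parameters and register call are illustrative, not confirmed names):

from vanna import Agent, MemoryConversationStore, MockLlmService, ToolRegistry

registry = ToolRegistry()
registry.register(QueryDatabaseTool())  # assumed registration API

agent = Agent(
    llm_service=MockLlmService(),                 # swap for Anthropic/OpenAI in production
    tool_registry=registry,
    conversation_store=MemoryConversationStore(),
)

response = await agent.send_message(user=user, message="Top products this month?")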

Why This Architecture Matters

User-centric: Permissions, quotas, and data isolation are built in, not bolted on later.

Composable: Swap LLMs, storage, middleware, recovery strategies, or observability providers without touching business logic.

Production-ready: Pydantic validation, async execution, structured telemetry, and recoverable error handling keep agents healthy in real environments.

Rich by default: Dual-output components let you serve command-line operators and dashboard users with the same tool.

Next Steps