AI-Generated Placeholder Documentation

This documentation page has been automatically generated by a Large Language Model (LLM) and serves as placeholder content. The information provided here may be incomplete, inaccurate, or subject to change.

For accurate and complete information, please refer to the Vanna source code on GitHub.

Deploying with FastAPI

Vanna 2.0 provides a production-ready FastAPI server for deploying your agents.

Quick Start

from vanna import Agent, ToolRegistry
from vanna.servers.fastapi import VannaFastAPIServer
from vanna.integrations.anthropic import AnthropicLlmService
from vanna.integrations.postgres import PostgresRunner
from vanna.tools import RunSqlTool

# Create components
llm_service = AnthropicLlmService(api_key="...")
sql_runner = PostgresRunner(connection_string="...")

# Create tool registry and register tools
registry = ToolRegistry()
registry.register_local_tool(RunSqlTool(sql_runner), access_groups=[])

# Create your agent
agent = Agent(
    llm_service=llm_service,
    tool_registry=registry,
    user_resolver=MyUserResolver(),  # See Authentication docs
    agent_memory=MyAgentMemory()  # Required parameter
)

# Create FastAPI server
server = VannaFastAPIServer(agent)

# Run the server
server.run(host="0.0.0.0", port=8000)

Server Configuration

The VannaFastAPIServer accepts a configuration dictionary:

config = {
    "fastapi": {
        "title": "My Data Agent API",
        "version": "1.0.0",
        "description": "Custom agent for our analytics"
    },
    "cors": {
        "enabled": True,
        "allow_origins": ["https://myapp.com"],
        "allow_credentials": True,
        "allow_methods": ["GET", "POST"],
        "allow_headers": ["Authorization", "Content-Type"]
    },
    "dev_mode": False,
    "static_folder": "static"
}

server = VannaFastAPIServer(agent, config=config)

Adding to an Existing FastAPI App

If you already have a FastAPI application, you can add Vanna chat endpoints to it:

from fastapi import FastAPI
from vanna import Agent
from vanna.servers.base import ChatHandler
from vanna.servers.fastapi.routes import register_chat_routes

# Your existing FastAPI app with your own routes
app = FastAPI()

# ... your existing routes here ...
# @app.get("/api/users")
# @app.post("/api/orders")
# etc.

# Add Vanna chat endpoints to your existing app
agent = Agent(
    llm_service=llm,
    tool_registry=registry,
    user_resolver=user_resolver,
    agent_memory=agent_memory
)

chat_handler = ChatHandler(agent)
register_chat_routes(app, chat_handler, config={
    "dev_mode": False,
    "cdn_url": "https://img.vanna.ai/vanna-components.js"
})

# Now your app has BOTH:
# - Your existing routes: /api/users, /api/orders, etc.
# - Vanna routes: /, /api/vanna/v2/chat_sse, /api/vanna/v2/chat_websocket, /api/vanna/v2/chat_poll

This registers the following Vanna endpoints on your existing app:

  • GET / - Web UI (optional; skip registering it if you serve your own frontend)
  • POST /api/vanna/v2/chat_sse - Server-Sent Events streaming
  • WebSocket /api/vanna/v2/chat_websocket - WebSocket real-time chat
  • POST /api/vanna/v2/chat_poll - HTTP polling

Creating the ASGI Application

For deployment with Uvicorn, Gunicorn, or containerization:

# app.py
from vanna import Agent
from vanna.servers.fastapi import VannaFastAPIServer

agent = Agent(...)
server = VannaFastAPIServer(agent)
app = server.create_app()

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000

Async Environment Support

The FastAPI server automatically detects and handles async environments like Jupyter notebooks and Google Colab:

# In Jupyter/Colab
server.run(port=8000)  # Automatically uses nest_asyncio and displays URL

On Colab, the server also sets up port forwarding automatically and displays the correct public URL.

API Endpoints

The FastAPI server provides these endpoints:

Chat Endpoints

  • GET / - Web UI interface
  • POST /api/vanna/v2/chat_sse - Server-Sent Events streaming endpoint
  • WebSocket /api/vanna/v2/chat_websocket - WebSocket real-time chat
  • POST /api/vanna/v2/chat_poll - HTTP polling endpoint

Health Check

  • GET /health - Server health status
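
The health endpoint is useful for readiness checks in deployment scripts. Below is a stdlib-only sketch that polls GET /health until the server answers; the URL, timeout, and interval values are illustrative:

```python
import time
import urllib.error
import urllib.request


def wait_for_healthy(url: str = "http://localhost:8000/health",
                     timeout: float = 30.0,
                     interval: float = 1.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    return False
```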

Example Chat Request (SSE)

import requests
import json

response = requests.post(
    "http://localhost:8000/api/vanna/v2/chat_sse",
    json={
        "message": "Show me sales by region",
        "conversation_id": "conv_123",
        "metadata": {}
    },
    headers={
        "Authorization": "Bearer your-jwt-token"
    },
    stream=True
)

# Process SSE stream
for line in response.iter_lines():
    if line.startswith(b"data: "):
        data = line[6:].decode('utf-8')
        if data != "[DONE]":
            chunk = json.loads(data)
            print(chunk)
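
The inline parsing above can be factored into a small reusable helper. This is a sketch that assumes the server emits standard `data: ...` SSE lines and a final `[DONE]` sentinel, as in the example:

```python
import json
from typing import Optional


def parse_sse_line(line: bytes) -> Optional[dict]:
    """Decode one SSE line from iter_lines().

    Returns the JSON payload as a dict, or None for blank keep-alive
    lines, non-data fields, and the [DONE] sentinel.
    """
    if not line.startswith(b"data: "):
        return None  # blank lines, comments, and other SSE fields
    data = line[len(b"data: "):].decode("utf-8")
    if data == "[DONE]":
        return None  # end-of-stream marker
    return json.loads(data)
```

The streaming loop then becomes `for line in response.iter_lines(): chunk = parse_sse_line(line)`, skipping `None` results.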

Example Chat Request (Polling)

import requests

response = requests.post(
    "http://localhost:8000/api/vanna/v2/chat_poll",
    json={
        "message": "Show me sales by region",
        "conversation_id": "conv_123",
        "metadata": {}
    },
    headers={
        "Authorization": "Bearer your-jwt-token"
    }
)

print(response.json())

Production Deployment

Using Uvicorn

pip install "uvicorn[standard]"
uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4

Using Gunicorn with Uvicorn Workers

pip install gunicorn "uvicorn[standard]"
gunicorn app:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000

Docker Deployment

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app.py .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Build and run the image:

docker build -t vanna-agent .
docker run -p 8000:8000 vanna-agent

Docker Compose

services:
  vanna:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/mydb
      - ANTHROPIC_API_KEY=sk-...
    depends_on:
      - db
  
  db:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydb

Environment Variables

Recommended environment variables for production:

# LLM Configuration
ANTHROPIC_API_KEY=sk-...
OPENAI_API_KEY=sk-...

# Database
DATABASE_URL=postgresql://user:pass@host:5432/db

# Auth
JWT_SECRET=your-secret-key
AUTH_COOKIE_NAME=session_id

# Server
PORT=8000
HOST=0.0.0.0
LOG_LEVEL=info
WORKERS=4

# CORS
ALLOWED_ORIGINS=https://myapp.com,https://app.mycompany.com
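
These variables can be mapped onto the configuration dictionary shown earlier. A minimal sketch, using the variable names from the list above (the `DEV_MODE` variable and the fallback defaults are illustrative additions):

```python
import os


def config_from_env() -> dict:
    """Build a server config dict from environment variables,
    falling back to development-friendly defaults."""
    origins = os.environ.get("ALLOWED_ORIGINS", "")
    allow_origins = [o.strip() for o in origins.split(",") if o.strip()]
    return {
        "cors": {
            "enabled": bool(allow_origins),
            "allow_origins": allow_origins,
            "allow_credentials": True,
            "allow_methods": ["GET", "POST"],
            "allow_headers": ["Authorization", "Content-Type"],
        },
        # Hypothetical toggle; not part of the variable list above.
        "dev_mode": os.environ.get("DEV_MODE", "false").lower() == "true",
    }
```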

Load Balancing

For high availability, run multiple instances behind a load balancer:

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚Load Balancerβ”‚
                    β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
           β”‚               β”‚               β”‚
      β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”      β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”     β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”
      β”‚Vanna   β”‚      β”‚Vanna   β”‚     β”‚Vanna   β”‚
      β”‚Instanceβ”‚      β”‚Instanceβ”‚     β”‚Instanceβ”‚
      β”‚   1    β”‚      β”‚   2    β”‚     β”‚   3    β”‚
      β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜
           β”‚               β”‚               β”‚
           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                      β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”
                      β”‚Database β”‚
                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Monitoring

Add health check endpoint monitoring:

# app.py
from vanna.servers.fastapi import VannaFastAPIServer

server = VannaFastAPIServer(agent)
app = server.create_app()

# Add custom monitoring endpoint
@app.get("/metrics")
async def metrics():
    return {
        "active_conversations": get_active_count(),
        "total_requests": get_request_count(),
        "uptime_seconds": get_uptime()
    }
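
The helpers referenced above (get_active_count, get_request_count, get_uptime) are placeholders. A minimal in-process implementation, assuming a single worker (multi-worker deployments would need a shared store such as Prometheus or Redis), might look like:

```python
import threading
import time

_START = time.monotonic()
_lock = threading.Lock()
_counters = {"requests": 0, "active_conversations": 0}


def record_request() -> None:
    """Call once per handled request (e.g. from a FastAPI middleware)."""
    with _lock:
        _counters["requests"] += 1


def get_request_count() -> int:
    with _lock:
        return _counters["requests"]


def get_active_count() -> int:
    with _lock:
        return _counters["active_conversations"]


def get_uptime() -> float:
    """Seconds since this module was imported."""
    return time.monotonic() - _START
```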

See Also