Custom LLM Integration

📋 Planned

Documentation Under Construction

This guide will show you how to integrate a custom LLM provider by implementing the LlmService interface, allowing Vanna to work with any language model.

Planned Content

  • The LlmService interface explained
  • Complete example implementation (e.g., for local Ollama or HuggingFace)
  • Handling streaming responses
  • Tool/function calling implementation
  • Error handling and retries (a minimal retry sketch follows this list)
  • Testing your custom LLM service
  • Registering your custom service with the agent
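
As a taste of the error-handling topic above, here is a minimal sketch of retries layered around send_message as a standalone wrapper. It assumes that transient failures surface as ordinary exceptions and that re-sending the whole request on failure is acceptable; it is an illustration only, not part of Vanna's API. Note that restarting the stream re-sends the request from the beginning, so callers may see duplicated text if chunks were already yielded before the error.

import asyncio
from typing import AsyncIterator

from vanna.core.llm import LlmRequest, LlmResponse, LlmService


async def send_with_retries(
    service: LlmService,
    request: LlmRequest,
    max_attempts: int = 3,
) -> AsyncIterator[LlmResponse]:
    """Retry the streaming call with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Stream responses through to the caller
            async for response in service.send_message(request):
                yield response
            return
        except Exception:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # back off: 2s, 4s, ...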

Want to contribute or suggest improvements? Open an issue on GitHub.

The LlmService Interface

When complete, this will show:

from vanna.core.llm import LlmService, LlmRequest, LlmResponse
from typing import AsyncIterator

class MyCustomLlmService(LlmService):
    """Connect to your custom LLM"""

    def __init__(self, api_url: str, api_key: str):
        self.api_url = api_url
        self.api_key = api_key

    async def send_message(
        self,
        request: LlmRequest
    ) -> AsyncIterator[LlmResponse]:
        """Send message to LLM and stream responses"""

        # 1. Convert Vanna message format to your LLM's format
        formatted_request = self.format_request(request)

        # 2. Call your LLM API (streaming). format_request, call_llm_api,
        #    and parse_response are helpers you define yourself; a sketch
        #    of them follows this example.
        async for chunk in self.call_llm_api(formatted_request):
            # 3. Convert the response back to Vanna's format
            yield self.parse_response(chunk)

    async def supports_tool_calling(self) -> bool:
        """Whether this LLM supports function/tool calling"""
        return True
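
The format_request, call_llm_api, and parse_response calls above are placeholder helpers that you would write yourself. Below is a rough sketch of what they might look like against an OpenAI-compatible streaming endpoint (for example, Ollama's /v1/chat/completions route). The httpx dependency, the llama3 model name, the request.messages / role / content fields, and the LlmResponse(text=...) constructor are all assumptions for illustration, not Vanna's confirmed API; verify them against the real classes in vanna.core.llm before copying.

import json

import httpx  # assumed HTTP client; any async client works

from vanna.core.llm import LlmService, LlmRequest, LlmResponse


class MyCustomLlmService(LlmService):
    # __init__, send_message, and supports_tool_calling as shown above.

    def format_request(self, request: LlmRequest) -> dict:
        """Map the Vanna request onto an OpenAI-style chat payload.

        Assumption: LlmRequest exposes a messages list whose items have
        role and content attributes; check vanna.core.llm for the real
        field names.
        """
        return {
            "model": "llama3",  # hypothetical model name
            "stream": True,
            "messages": [
                {"role": m.role, "content": m.content}
                for m in request.messages
            ],
        }

    async def call_llm_api(self, payload: dict):
        """Stream parsed JSON chunks from an SSE response."""
        headers = {"Authorization": f"Bearer {self.api_key}"}
        async with httpx.AsyncClient(timeout=60.0) as client:
            async with client.stream(
                "POST",
                f"{self.api_url}/v1/chat/completions",
                json=payload,
                headers=headers,
            ) as response:
                response.raise_for_status()
                async for line in response.aiter_lines():
                    if line.startswith("data: ") and line != "data: [DONE]":
                        yield json.loads(line[len("data: "):])

    def parse_response(self, chunk: dict) -> LlmResponse:
        """Convert one streamed chunk into Vanna's response type.

        Assumption: LlmResponse accepts a text keyword argument.
        """
        delta = chunk["choices"][0]["delta"].get("content", "")
        return LlmResponse(text=delta)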

Use Cases

This guide will be useful for:

  • Local models: Ollama, llama.cpp, vLLM
  • Open source models: HuggingFace, Mistral, Llama
  • Enterprise LLMs: Azure OpenAI, AWS Bedrock, custom endpoints
  • Specialized models: Fine-tuned models for SQL, domain-specific LLMs