Custom LLM Integration
📋 Planned
Documentation Under Construction
This guide will show you how to integrate a custom LLM provider by implementing the LlmService interface, allowing Vanna to work with any language model.
Planned Content
- ✓ The LlmService interface explained
- ✓ Complete example implementation (e.g., for local Ollama or HuggingFace)
- ✓ Handling streaming responses
- ✓ Tool/function calling implementation
- ✓ Error handling and retries (see the retry sketch after this list)
- ✓ Testing your custom LLM service
- ✓ Registering your custom service with the agent
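As a preview of the error-handling item above, here is a minimal, framework-agnostic sketch of retrying a flaky LLM call with exponential backoff. The helper name `call_with_retries`, its parameters, and the choice of exceptions are illustrative assumptions, not part of Vanna's API:

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retries(
    call_once: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Retry a transient LLM failure with exponential backoff (1s, 2s, 4s, ...)."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return await call_once()
        except (TimeoutError, ConnectionError):
            # Only retry errors that are plausibly transient; give up after the last attempt.
            if attempt >= max_attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Note that a streaming call can generally only be retried safely before the first chunk has been yielded to the caller.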
Want to contribute or suggest improvements? Open an issue on GitHub.
The LlmService Interface
When complete, this section will show an implementation along these lines:

```python
from vanna.core.llm import LlmService, LlmRequest, LlmResponse
from typing import AsyncIterator


class MyCustomLlmService(LlmService):
    """Connect to your custom LLM."""

    def __init__(self, api_url: str, api_key: str):
        self.api_url = api_url
        self.api_key = api_key

    async def send_message(
        self,
        request: LlmRequest
    ) -> AsyncIterator[LlmResponse]:
        """Send a message to the LLM and stream responses."""
        # 1. Convert Vanna message format to your LLM's format
        formatted_request = self.format_request(request)

        # 2. Call your LLM API (streaming)
        async for chunk in self.call_llm_api(formatted_request):
            # 3. Convert each response chunk back to Vanna format
            yield self.parse_response(chunk)

    async def supports_tool_calling(self) -> bool:
        """Whether this LLM supports function/tool calling."""
        return True
```
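For the local-model case (e.g., Ollama), the `call_llm_api` step might stream newline-delimited JSON from Ollama's `/api/chat` endpoint. The sketch below illustrates that step only; the `httpx` dependency, the helper name `stream_ollama_chat`, and the default model and URL are assumptions for illustration, and converting to and from `LlmRequest`/`LlmResponse` is still left to `format_request`/`parse_response` as above:

```python
import json
from typing import AsyncIterator

import httpx


async def stream_ollama_chat(
    messages: list[dict],
    model: str = "llama3",
    base_url: str = "http://localhost:11434",
) -> AsyncIterator[str]:
    """Yield incremental text chunks from a local Ollama chat completion."""
    payload = {"model": model, "messages": messages, "stream": True}
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("POST", f"{base_url}/api/chat", json=payload) as response:
            response.raise_for_status()
            # Ollama streams one JSON object per line until "done" is true.
            async for line in response.aiter_lines():
                if not line.strip():
                    continue
                chunk = json.loads(line)
                if chunk.get("done"):
                    break
                yield chunk.get("message", {}).get("content", "")
```

Inside `send_message`, each string yielded here would then be wrapped in an `LlmResponse` by your `parse_response` helper.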
Use Cases
This guide will be useful for:
- Local models: Ollama, llama.cpp, vLLM
- Open source models: HuggingFace, Mistral, Llama
- Enterprise LLMs: Azure OpenAI, AWS Bedrock, custom endpoints
- Specialized models: Fine-tuned models for SQL, domain-specific LLMs