Custom LLM Integration

📋 Planned

Documentation Under Construction

This guide will show you how to integrate a custom LLM provider by implementing the LlmService interface, allowing Vanna to work with any language model.

Planned Content

  • The LlmService interface explained
  • Complete example implementation (e.g., for local Ollama or HuggingFace)
  • Handling streaming responses
  • Tool/function calling implementation
  • Error handling and retries (a minimal retry sketch follows this list)
  • Testing your custom LLM service
  • Registering your custom service with the agent
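
As a taste of the error-handling topic above, here is a minimal sketch of retries layered around send_message as a standalone wrapper. It assumes that transient failures surface as ordinary exceptions and that re-sending the whole request on failure is acceptable; it is an illustration only, not part of Vanna's API. Note that restarting the stream re-sends the request from the beginning, so callers may see duplicated text if chunks were already yielded before the error.

import asyncio
from typing import AsyncIterator

from vanna.core.llm import LlmRequest, LlmResponse, LlmService


async def send_with_retries(
    service: LlmService,
    request: LlmRequest,
    max_attempts: int = 3,
) -> AsyncIterator[LlmResponse]:
    """Retry the streaming call with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Stream responses through to the caller
            async for response in service.send_message(request):
                yield response
            return
        except Exception:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # back off: 2s, 4s, ...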

Want to contribute or suggest improvements? Open an issue on GitHub.

The LlmService Interface

When complete, this will show:

from vanna.core.llm import LlmService, LlmRequest, LlmResponse
from typing import AsyncIterator

class MyCustomLlmService(LlmService):
    """Connect to your custom LLM"""

    def __init__(self, api_url: str, api_key: str):
        self.api_url = api_url
        self.api_key = api_key

    async def send_message(
        self,
        request: LlmRequest
    ) -> AsyncIterator[LlmResponse]:
        """Send message to LLM and stream responses"""

        # 1. Convert Vanna message format to your LLM's format
        formatted_request = self.format_request(request)

        # 2. Call your LLM API (streaming). format_request, call_llm_api,
        #    and parse_response are helpers you define yourself; a sketch
        #    of them follows this example.
        async for chunk in self.call_llm_api(formatted_request):
            # 3. Convert the response back to Vanna's format
            yield self.parse_response(chunk)

    async def supports_tool_calling(self) -> bool:
        """Whether this LLM supports function/tool calling"""
        return True
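
The format_request, call_llm_api, and parse_response calls above are placeholder helpers that you would write yourself. Below is a rough sketch of what they might look like against an OpenAI-compatible streaming endpoint (for example, Ollama's /v1/chat/completions route). The httpx dependency, the llama3 model name, the request.messages / role / content fields, and the LlmResponse(text=...) constructor are all assumptions for illustration, not Vanna's confirmed API; verify them against the real classes in vanna.core.llm before copying.

import json

import httpx  # assumed HTTP client; any async client works

from vanna.core.llm import LlmService, LlmRequest, LlmResponse


class MyCustomLlmService(LlmService):
    # __init__, send_message, and supports_tool_calling as shown above.

    def format_request(self, request: LlmRequest) -> dict:
        """Map the Vanna request onto an OpenAI-style chat payload.

        Assumption: LlmRequest exposes a messages list whose items have
        role and content attributes; check vanna.core.llm for the real
        field names.
        """
        return {
            "model": "llama3",  # hypothetical model name
            "stream": True,
            "messages": [
                {"role": m.role, "content": m.content}
                for m in request.messages
            ],
        }

    async def call_llm_api(self, payload: dict):
        """Stream parsed JSON chunks from an SSE response."""
        headers = {"Authorization": f"Bearer {self.api_key}"}
        async with httpx.AsyncClient(timeout=60.0) as client:
            async with client.stream(
                "POST",
                f"{self.api_url}/v1/chat/completions",
                json=payload,
                headers=headers,
            ) as response:
                response.raise_for_status()
                async for line in response.aiter_lines():
                    if line.startswith("data: ") and line != "data: [DONE]":
                        yield json.loads(line[len("data: "):])

    def parse_response(self, chunk: dict) -> LlmResponse:
        """Convert one streamed chunk into Vanna's response type.

        Assumption: LlmResponse accepts a text keyword argument.
        """
        delta = chunk["choices"][0]["delta"].get("content", "")
        return LlmResponse(text=delta)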

Use Cases

This guide will be useful for:

  • Local models: Ollama, llama.cpp, vLLM
  • Open source models: HuggingFace, Mistral, Llama
  • Enterprise LLMs: Azure OpenAI, AWS Bedrock, custom endpoints
  • Specialized models: Fine-tuned models for SQL, domain-specific LLMs