AI-Generated Placeholder Documentation

This documentation page has been automatically generated by a Large Language Model (LLM) and serves as placeholder content. The information provided here may be incomplete, inaccurate, or subject to change.

For accurate and complete information, please refer to the Vanna source code on GitHub.

LLM Context Enhancers

LLM Context Enhancers add additional context to the system prompt and user messages before LLM calls, enabling RAG (Retrieval-Augmented Generation), documentation injection, and dynamic prompt enhancement.

LlmContextEnhancer Interface

from abc import ABC, abstractmethod

from vanna.core.enhancer import LlmContextEnhancer
from vanna.core.llm import LlmMessage
from vanna.core.user import User

class LlmContextEnhancer(ABC):
    @abstractmethod
    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        """Enhance system prompt with additional context based on user message"""
        return system_prompt

    @abstractmethod
    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        """Enhance user messages with additional context"""
        return messages

Key Difference from Context Enrichers

  • Context Enrichers (ToolContextEnricher): Enrich the tool execution context with metadata
  • LLM Context Enhancers (LlmContextEnhancer): Enhance LLM prompts and messages with additional context

Default Implementation

Vanna provides a DefaultLlmContextEnhancer that uses AgentMemory for RAG:

from vanna import Agent
from vanna.core.enhancer import DefaultLlmContextEnhancer

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=user_resolver,
    agent_memory=agent_memory,
    llm_context_enhancer=DefaultLlmContextEnhancer(agent_memory)  # Default
)

The default enhancer:

  • Searches agent memory for similar past interactions
  • Adds up to 5 relevant examples to the system prompt
  • Shows successful tool uses as examples
  • Includes failed attempts as counter-examples
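
The exact prompt format is internal to DefaultLlmContextEnhancer; the kind of section it builds can be sketched roughly like this (the structure and field names below are assumptions for illustration, not the library's literal output):

```python
def build_examples_section(interactions: list[dict], limit: int = 5) -> str:
    # Render past interactions as few-shot examples, successes first,
    # failures labeled as counter-examples
    lines = ["## Relevant Past Interactions\n"]
    ordered = sorted(interactions, key=lambda i: not i["success"])
    for item in ordered[:limit]:
        label = "Example" if item["success"] else "Counter-example (failed)"
        lines.append(f"### {label}\nQ: {item['question']}\nA: {item['answer']}\n")
    return "\n".join(lines)
```

A section like this would then be appended to the system prompt before the LLM call.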

Custom Implementation

Example 1: Documentation-Based RAG

from vanna.core.enhancer import LlmContextEnhancer

class DocumentationEnhancer(LlmContextEnhancer):
    def __init__(self, doc_search_service):
        self.docs = doc_search_service

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # Search documentation based on user question
        relevant_docs = await self.docs.search(user_message, limit=3)

        if not relevant_docs:
            return system_prompt

        # Add documentation to system prompt
        docs_section = "\n\n## Relevant Documentation\n\n"
        for doc in relevant_docs:
            docs_section += f"### {doc.title}\n{doc.content}\n\n"

        return system_prompt + docs_section

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        # Don't modify user messages in this example
        return messages

Example 2: Schema-Aware Enhancement

class SchemaEnhancer(LlmContextEnhancer):
    def __init__(self, sql_runner):
        self.sql_runner = sql_runner
        self.schema_cache = None

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # Cache schema information
        if not self.schema_cache:
            self.schema_cache = await self.sql_runner.get_schema_info()

        # Extract relevant tables mentioned in user message
        relevant_tables = self._find_relevant_tables(
            user_message,
            self.schema_cache
        )

        if not relevant_tables:
            return system_prompt

        # Add schema information to prompt
        schema_section = "\n\n## Relevant Database Schema\n\n"
        for table in relevant_tables:
            schema_section += f"**{table['name']}**\n"
            schema_section += f"Columns: {', '.join(table['columns'])}\n\n"

        return system_prompt + schema_section

    def _find_relevant_tables(self, message: str, schema: dict) -> list:
        # Simple keyword matching (could use embeddings)
        message_lower = message.lower()
        relevant = []

        for table in schema['tables']:
            if table['name'].lower() in message_lower:
                relevant.append(table)

        return relevant

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        return messages

Example 3: Vector Search RAG

from openai import AsyncOpenAI

class VectorSearchEnhancer(LlmContextEnhancer):
    def __init__(self, vector_db, embedding_model):
        self.vector_db = vector_db
        self.embedding_model = embedding_model
        # Async client so embeddings.create() can be awaited below
        self.openai = AsyncOpenAI()

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # Generate embedding for user message
        embedding = await self.openai.embeddings.create(
            model=self.embedding_model,
            input=user_message
        )

        # Search vector database
        results = await self.vector_db.search(
            embedding=embedding.data[0].embedding,
            limit=5,
            filter={'user_id': user.id}
        )

        if not results:
            return system_prompt

        # Add relevant context
        context_section = "\n\n## Relevant Context from Past Interactions\n\n"
        for result in results:
            context_section += f"- {result['text']}\n"

        return system_prompt + context_section

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        return messages

Example 4: Multi-Turn Context Enhancement

class MultiTurnEnhancer(LlmContextEnhancer):
    def __init__(self):
        self.conversation_summaries = {}

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # This example leaves the system prompt unchanged
        return system_prompt

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        # Compress long conversations: summarize older turns, keep recent ones
        if len(messages) > 10:
            # Split off the older portion of the conversation
            older, recent = messages[1:-5], messages[-5:]
            summary = self._summarize_conversation(older)

            # Replace the summarized turns with a single system-style message,
            # so the context is not re-added and re-grown on every iteration
            enhanced = [messages[0]]
            enhanced.append(LlmMessage(
                role="system",
                content=f"Conversation context: {summary}"
            ))
            enhanced.extend(recent)

            return enhanced

        return messages

    def _summarize_conversation(self, messages: list[LlmMessage]) -> str:
        # Simple summary (could use LLM for better results)
        topics = []
        for msg in messages:
            if msg.role == "user" and msg.content:
                topics.append(msg.content[:50])

        return f"Topics discussed: {', '.join(topics)}"

Example 5: User Permission-Based Enhancement

class PermissionBasedEnhancer(LlmContextEnhancer):
    def __init__(self, permission_service):
        self.permissions = permission_service

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # Get user's data access permissions
        accessible_tables = await self.permissions.get_accessible_tables(user.id)

        # Add access constraints to system prompt
        constraints = "\n\n## Data Access Constraints\n\n"
        constraints += f"You can only query these tables: {', '.join(accessible_tables)}\n"
        constraints += "If the user asks for data from other tables, explain they don't have access.\n"

        return system_prompt + constraints

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        return messages

Using Custom Enhancers

Register your custom enhancer:

from vanna import Agent

# Option 1: Use custom enhancer only
agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=user_resolver,
    llm_context_enhancer=DocumentationEnhancer(doc_service)
)

# Option 2: Combine multiple enhancers
class CombinedEnhancer(LlmContextEnhancer):
    def __init__(self, enhancers: list[LlmContextEnhancer]):
        self.enhancers = enhancers

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        enhanced = system_prompt
        for enhancer in self.enhancers:
            enhanced = await enhancer.enhance_system_prompt(
                enhanced, user_message, user
            )
        return enhanced

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        enhanced = messages
        for enhancer in self.enhancers:
            enhanced = await enhancer.enhance_user_messages(enhanced, user)
        return enhanced

agent = Agent(
    llm_service=llm,
    tool_registry=tools,
    user_resolver=user_resolver,
    llm_context_enhancer=CombinedEnhancer([
        DocumentationEnhancer(docs),
        SchemaEnhancer(sql_runner),
        DefaultLlmContextEnhancer(agent_memory)
    ])
)

When Methods Are Called

enhance_system_prompt()

  • Called once per conversation turn
  • Invoked after building the system prompt, before the first LLM request
  • Receives the initial user message
  • Use for: RAG, documentation injection, contextual examples

enhance_user_messages()

  • Called before each LLM request
  • Invoked for initial request and after tool calls
  • Receives all conversation messages
  • Use for: Dynamic message modification, conversation summarization
  • Caution: Avoid adding context repeatedly on each iteration
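
Because enhance_user_messages() runs before every LLM request, including after each tool call, an unguarded enhancer will stack the same context block onto the conversation on every iteration. A minimal guard, sketched here with plain dicts standing in for LlmMessage, checks for a marker before inserting:

```python
CONTEXT_MARKER = "[conversation-context]"

def add_context_once(messages: list[dict], summary: str) -> list[dict]:
    # Skip insertion if a previous iteration already added the context block
    if any(m["role"] == "system" and m["content"].startswith(CONTEXT_MARKER)
           for m in messages):
        return messages
    context_msg = {"role": "system",
                   "content": f"{CONTEXT_MARKER} {summary}"}
    # Insert after the first message so the original ordering is preserved
    return [messages[0], context_msg, *messages[1:]]
```

The same pattern applies inside a real enhancer: tag anything you inject so later invocations can detect it.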

Performance Considerations

  1. Caching: Cache expensive lookups (embeddings, documentation)
  2. Async operations: Use asyncio.gather() for parallel operations
  3. Rate limiting: Be mindful of external API calls (embeddings, vector search)
  4. Token limits: Keep enhanced prompts within model context windows
  5. Conditional enhancement: Only enhance when necessary

For example, caching fetched context keyed by user and message prefix:

import time

class OptimizedEnhancer(LlmContextEnhancer):
    def __init__(self):
        self.cache = {}
        self.cache_ttl = 300  # 5 minutes

    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        # Check cache first
        cache_key = f"{user.id}:{user_message[:50]}"
        if cache_key in self.cache:
            cached, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                return system_prompt + cached

        # Fetch and cache
        context = await self._fetch_context(user_message)
        self.cache[cache_key] = (context, time.time())

        return system_prompt + context
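
When an enhancer needs several independent lookups, asyncio.gather() runs them concurrently instead of sequentially, so total latency is roughly that of the slowest call. A standalone sketch with stand-in fetchers (the fetcher names are hypothetical):

```python
import asyncio

async def fetch_docs(query: str) -> str:
    # Stand-in for a real documentation search
    await asyncio.sleep(0.01)
    return f"docs for {query}"

async def fetch_schema(query: str) -> str:
    # Stand-in for a real schema lookup
    await asyncio.sleep(0.01)
    return f"schema for {query}"

async def build_context(query: str) -> str:
    # Both lookups run concurrently rather than back-to-back
    docs, schema = await asyncio.gather(fetch_docs(query), fetch_schema(query))
    return f"{docs}\n{schema}"
```

Inside enhance_system_prompt(), the gathered results would then be appended to the prompt as in the earlier examples.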

Error Handling

Always handle errors gracefully:

import logging

logger = logging.getLogger(__name__)

class SafeEnhancer(LlmContextEnhancer):
    async def enhance_system_prompt(
        self,
        system_prompt: str,
        user_message: str,
        user: User
    ) -> str:
        try:
            context = await self._fetch_context(user_message)
            return system_prompt + context
        except Exception as e:
            # Log error but don't fail the request
            logger.warning(f"Failed to enhance prompt: {e}")
            return system_prompt  # Return original prompt

    async def enhance_user_messages(
        self,
        messages: list[LlmMessage],
        user: User
    ) -> list[LlmMessage]:
        try:
            return await self._enhance_messages(messages, user)
        except Exception as e:
            logger.warning(f"Failed to enhance messages: {e}")
            return messages  # Return original messages

Best Practices

  1. Keep prompts focused - Only add relevant context
  2. Monitor token usage - Enhanced prompts use more tokens
  3. Cache aggressively - Avoid redundant API calls
  4. Handle failures - Return original prompt/messages on error
  5. Test thoroughly - Verify context actually helps LLM performance
  6. Use observability - Track enhancement duration and impact
  7. Consider privacy - Be careful with user data in prompts

Observability

The Agent automatically tracks enhancement metrics:

# Metrics tracked:
# - agent.llm_context.enhance_system_prompt.duration
# - agent.llm_context.enhance_user_messages.duration
# - Spans for each enhancement operation
