OSS vs Premium
The updated Python SDK submodule shows that every Vanna agent is a composition of services: an LlmService, a ToolRegistry, storage, prompts, lifecycle hooks, middleware, enrichers, filters, observability, evaluations, and user authentication. You can wire up each piece yourself by building against the abstract base classes in the SDK (for example `LlmService` in submodules/vanna-user-agents/python-sdk/vanna/core/llm/base.py, `Tool` in submodules/vanna-user-agents/python-sdk/vanna/core/tool/base.py, `Evaluator` in submodules/vanna-user-agents/python-sdk/vanna/core/evaluation/base.py, or `UserService` in submodules/vanna-user-agents/python-sdk/vanna/core/user/base.py), but the hosted platform ships opinionated counterparts that remove the undifferentiated heavy lifting.
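To make the DIY path concrete, here is a minimal sketch of a custom tool built against the `Tool` base class. The table below only says that a tool exposes a Pydantic args schema and an async `execute` method; the import path, the `name` and `args_schema` attribute names, and the `execute` signature used here are illustrative assumptions, so check submodules/vanna-user-agents/python-sdk/vanna/core/tool/base.py for the real contract.

```python
# Hedged sketch -- attribute names, the import path, and the execute signature
# are assumptions, not the SDK's confirmed API; see vanna/core/tool/base.py.
from pydantic import BaseModel, Field

from vanna.core.tool.base import Tool  # assumed import path


class RunQueryArgs(BaseModel):
    """Pydantic args schema the agent validates tool calls against."""
    sql: str = Field(description="SQL statement to execute")
    row_limit: int = Field(default=100, description="Maximum rows to return")


class SnowflakeQueryTool(Tool):
    """Wraps an existing Snowflake connection behind the SDK's Tool abstraction."""

    name = "run_snowflake_query"   # assumed: how the registry addresses the tool
    args_schema = RunQueryArgs     # assumed: hook the SDK uses for schema validation

    def __init__(self, connection):
        self._connection = connection

    async def execute(self, args: RunQueryArgs, context=None):
        # Run the query and return a bounded result set for the model to summarize.
        cursor = self._connection.cursor()
        cursor.execute(args.sql)
        return cursor.fetchmany(args.row_limit)
```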
| Capability | Compatible OSS / third-party adapters | Vanna-hosted option (batteries-included) | 
|---|---|---|
| LLM service | Anthropic Messages via `AnthropicLlmService`, OpenAI Chat via `OpenAILlmService`, or any endpoint you wrap to the `LlmService` async contract (`send_request`, `stream_request`, `validate_tools`) | Managed Vanna LLM endpoints with pre-tuned model routing, guardrails, and automatic regression evaluation on model updates |
| Tool execution | Python SDKs such as Snowflake, Slack, or Jira wrapped in a subclass of `Tool` that exposes a Pydantic args schema and an async `execute` method (see submodules/.../core/tool/base.py) | Vanna Tool Registry with role-aware permissions, audit trails, automatic schema validation, and centralized credentials management |
| Conversation store | Databases like Supabase/PostgreSQL or Redis that you access from a custom `ConversationStore` implementation (submodules/.../core/storage/base.py); the repo includes `MemoryConversationStore` as a working reference | Vanna Managed Conversation Store with encrypted retention, PII scrubbing, time-travel restore, and configurable retention policies |
| System prompt management | PromptLayer, FeatureForm, or Git-backed prompt repos that you read from a subclass of `SystemPromptBuilder` (submodules/.../core/system_prompt/base.py) to supply the string at runtime | Vanna System Prompt Builder that versions prompts per agent, ships safer defaults, and captures prompt→outcome analytics for continuous tuning |
| Lifecycle governance | LaunchDarkly/Unleash feature flags or internal quota services invoked from `LifecycleHook` implementations (submodules/.../core/lifecycle/base.py) to block, modify, or approve interactions | Policy-driven lifecycle hooks that enforce quotas, approvals, session limits, and compliance checks without custom glue code |
| LLM middleware & safety | Guardrails AI, Presidio scrubbing, or custom caching layers invoked from `LlmMiddleware` (submodules/.../core/middleware/base.py) to mutate requests/responses before they hit the model | Inline Vanna middleware with automatic caching, redaction, multi-provider failover, token accounting, and SOC 2-aligned audit logging |
| Context enrichment | Pinecone, Weaviate, or warehouse lookups fetched inside `ContextEnricher` subclasses (submodules/.../core/enricher/base.py) that attach metadata to `ToolContext` | Managed Vanna context enrichers with zero-copy connectors to your sources, adaptive retrieval, and enrichment telemetry |
| Observability | OpenTelemetry, Datadog, or Honeycomb exporters that satisfy `ObservabilityProvider` (submodules/.../core/observability/base.py) by implementing `record_metric` and `create_span` (see the sketch just below this table) | Vanna Observability with unified traces, live conversation playback, SLA alerting, and long-term cost analytics tuned to agent workflows |
| Error recovery | Tenacity-based retry logic or SQS retry queues wrapped in `ErrorRecoveryStrategy` (submodules/.../core/recovery/base.py) to issue `RecoveryAction` decisions | Vanna Recovery Orchestrator with adaptive retry budgets, safe fallbacks, and incident timelines surfaced directly in the admin console |
| Conversation filtering | OpenAI Moderation, AWS Comprehend, or custom regex/PII detectors called from `ConversationFilter` implementations (submodules/.../core/filter/base.py) to redact or summarize history (see the redaction sketch further below) | Built-in policy filters tuned to Vanna components that catch sensitive data, redact secrets, and route escalations before prompting |
| Evaluations | Custom harnesses that emit `AgentResult` and plug into the `Evaluator` interface (submodules/.../core/evaluation/base.py), plus storage/reporting you build around `EvaluationRunner` | Vanna Evaluation Suite with curated datasets, scoring templates, regression dashboards, and release gates wired into agent deployments |
| Authentication & users | Auth0, Clerk, Supabase, or Cognito adapters that fulfill the `UserService` contract (submodules/.../core/user/base.py) to fetch users, authenticate credentials, and resolve permissions | Vanna Identity layer with built-in session management, granular permissions, and audit-friendly user history shared across teams |
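As a concrete example of the observability row, a thin OpenTelemetry adapter might look roughly like the sketch below. The `record_metric` and `create_span` names come from the table, but their exact signatures, return types, and the import path are assumptions; consult submodules/.../core/observability/base.py for the actual interface.

```python
# Hedged sketch of an OpenTelemetry-backed ObservabilityProvider.
# record_metric and create_span are named in the table above, but their exact
# signatures here are assumptions -- see vanna/core/observability/base.py.
from opentelemetry import metrics, trace

from vanna.core.observability.base import ObservabilityProvider  # assumed path


class OtelObservability(ObservabilityProvider):
    def __init__(self, service_name: str = "vanna-agent"):
        self._tracer = trace.get_tracer(service_name)
        self._meter = metrics.get_meter(service_name)
        self._counters = {}

    def record_metric(self, name: str, value: float, attributes: dict | None = None):
        # Lazily create one counter per metric name, then record the value.
        counter = self._counters.setdefault(name, self._meter.create_counter(name))
        counter.add(value, attributes or {})

    def create_span(self, name: str, attributes: dict | None = None):
        # Return a context-manager span the agent can wrap a unit of work in.
        return self._tracer.start_as_current_span(name, attributes=attributes or {})
```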
Together, the hosted stack mirrors the same abstractions exported by the open-source SDK, but with production-grade defaults, compliance controls, and instrumentation that let teams ship faster and sleep better.
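Similarly, the conversation-filtering row can be as small as a regex-based redactor. In the sketch below, the `filter_messages` hook name and the message shape are assumptions rather than the SDK's confirmed contract (see submodules/.../core/filter/base.py).

```python
# Hedged sketch of a ConversationFilter that redacts obvious secrets before
# history reaches the model. The filter_messages hook and the message shape
# are assumptions -- see vanna/core/filter/base.py for the real contract.
import re

from vanna.core.filter.base import ConversationFilter  # assumed import path

SECRET_PATTERN = re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b")


class SecretRedactionFilter(ConversationFilter):
    async def filter_messages(self, messages: list[dict]) -> list[dict]:
        # Replace anything that looks like an API key before prompting.
        return [
            {**m, "content": SECRET_PATTERN.sub("[REDACTED]", m.get("content", ""))}
            for m in messages
        ]
```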
Quick Take (Non-technical)
| Capability | DIY vibe | Vanna hosted | 
|---|---|---|
| LLMs | Juggle API keys and model quirks | One endpoint that stays safe and fast | 
| Tools | Write wrappers, handle creds, hope permissions are right | Plug-and-play tools with guardrails built in | 
| Memory | Stand up databases, manage retention, scrub data manually | Secure conversation history with compliance defaults | 
| Prompts | Keep prompt files in sync across teams | Versioned prompts with analytics and safe defaults | 
| Governance | Build quota checks and approvals yourself | Policies and approvals ready to flip on | 
| Safety | Chain together redaction, caching, fallbacks | Middleware that handles it automatically | 
| Context | Maintain your own RAG pipelines | Managed connectors that inject the right context | 
| Observability | Wire up traces, dashboards, alerts | Full telemetry and playback out of the box | 
| Recovery | Script retries and fallbacks | Automated recovery flows with admin visibility | 
| Filtering | Glue together moderation and PII detectors | Built-in policies that block issues before they reach the model | 
| Evaluations | Spin up QA scripts and keep datasets fresh by hand | Regression tests, scoring, and launch gates already wired in | 
| Authentication | Stitch together login, session, and RBAC checks | Unified auth, permissions, and user audit logs | 
Enterprise Deployment Path
POC rollout
- Ship a standardized Vanna container that bundles LLM endpoint, registry, observability, and evaluation services.
- Deploy on a single VM in the customer’s cloud (EC2, Compute Engine, Azure VM); networking is limited to outbound calls to sanctioned data sources.
- Quick integration with customer SSO via a lightweight `UserService` adapter (a sketch follows this list); storage defaults to the managed conversation store inside the container.
- Goal: validate value in weeks, collect telemetry, and finalize the production requirements document.
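A rough sketch of that SSO adapter, assuming the customer's IdP issues OIDC/JWT access tokens: the `authenticate` and `resolve_permissions` method names, the return shapes, and the PyJWT-based validation are illustrative assumptions, not the SDK's confirmed `UserService` contract (see submodules/.../core/user/base.py).

```python
# Hedged sketch of a UserService adapter that trusts the customer's OIDC IdP.
# Method names and return shapes are assumptions -- see vanna/core/user/base.py.
import jwt  # PyJWT
from jwt import PyJWKClient

from vanna.core.user.base import UserService  # assumed import path


class OidcUserService(UserService):
    def __init__(self, jwks_url: str, audience: str):
        self._jwks = PyJWKClient(jwks_url)
        self._audience = audience

    async def authenticate(self, token: str) -> dict:
        # Verify the IdP-signed token and map its claims onto a simple user record.
        signing_key = self._jwks.get_signing_key_from_jwt(token)
        claims = jwt.decode(
            token, signing_key.key, algorithms=["RS256"], audience=self._audience
        )
        return {
            "id": claims["sub"],
            "email": claims.get("email"),
            "roles": claims.get("roles", []),
        }

    async def resolve_permissions(self, user: dict) -> list[str]:
        # Map IdP roles to the coarse permissions the POC container checks.
        return ["query:read"] if "analyst" in user.get("roles", []) else []
```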
Production build-out
- Stand up dedicated databases, object storage, and observability stacks inside the customer’s cloud account.
- Deploy Vanna components to separate subnets or Kubernetes clusters with autoscaling, HA, and customer-managed keys.
- Integrate with the enterprise IdP, logging (e.g., Splunk, Datadog), and ticketing systems for lifecycle hooks and approvals (a sketch follows this list).
- Migrate from the POC container to modular services (LLM, tools, evaluations, observability) that align with the customer's compliance policies.
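The ticketing integration above usually lands in a `LifecycleHook`. Below is a minimal sketch that blocks high-risk tool calls until an approval ticket is resolved; the hook method name, the decision shape, and the `ticket_client` helper are all hypothetical.

```python
# Hedged sketch of a LifecycleHook that gates high-risk tool calls on an
# approval ticket. The hook name, decision shape, and ticket_client helper
# are hypothetical -- see vanna/core/lifecycle/base.py for the SDK's contract.
from vanna.core.lifecycle.base import LifecycleHook  # assumed import path


class TicketApprovalHook(LifecycleHook):
    def __init__(self, ticket_client, high_risk_tools: set[str]):
        self._tickets = ticket_client          # e.g. a thin Jira/ServiceNow wrapper
        self._high_risk_tools = high_risk_tools

    async def before_tool_call(self, tool_name: str, user: dict, context: dict) -> dict:
        # Low-risk tools pass through untouched.
        if tool_name not in self._high_risk_tools:
            return {"allowed": True}

        # High-risk tools require an approved change ticket referenced in context.
        ticket_id = context.get("approval_ticket")
        if ticket_id and self._tickets.is_approved(ticket_id):
            return {"allowed": True}
        return {
            "allowed": False,
            "reason": "High-risk tool requires an approved change ticket.",
        }
```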
Commercial Options
| Offering | Scope | Hosting model | Support & SLAs | Indicative pricing* | 
|---|---|---|---|---|
| DIY + Vanna consulting | Customer builds against the OSS SDK; Vanna provides design reviews, implementation playbooks, and evaluation templates over a 6–8 week engagement | Customer-owned infrastructure | Slack/email support during engagement, optional quarterly health checks | From $45k one-time + optional $5k/month retainer | 
| Vanna managed in your cloud | Vanna deploys and operates the managed stack inside the customer’s AWS/GCP/Azure account, including production integrations and ongoing evaluations | Customer cloud account with Vanna-managed workloads | 24/5 support, 99.5% availability SLA, shared runbooks, quarterly optimization workshops | From $12k/month + usage-based LLM costs | 
*Pricing is indicative; final quotes depend on user counts, data residency, and compliance scope.