OSS vs Premium
The updated Python SDK submodule shows that every Vanna agent is a composition of services: an LlmService, a ToolRegistry, storage, prompts, lifecycle hooks, middleware, enrichers, filters, observability, evaluations, and user authentication. You can wire up each piece yourself by building against the abstract base classes in the SDK (for example `LlmService` in submodules/vanna-user-agents/python-sdk/vanna/core/llm/base.py, `Tool` in submodules/vanna-user-agents/python-sdk/vanna/core/tool/base.py, `Evaluator` in submodules/vanna-user-agents/python-sdk/vanna/core/evaluation/base.py, or `UserService` in submodules/vanna-user-agents/python-sdk/vanna/core/user/base.py), but the hosted platform ships opinionated counterparts that remove the undifferentiated heavy lifting.
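To make the DIY path concrete, here is a minimal sketch of a custom tool built against the `Tool` base class. The table below only says that a tool exposes a Pydantic args schema and an async `execute` method; the import path, the `name` and `args_schema` attribute names, and the `execute` signature used here are illustrative assumptions, so check submodules/vanna-user-agents/python-sdk/vanna/core/tool/base.py for the real contract.

```python
# Hedged sketch -- attribute names, the import path, and the execute signature
# are assumptions, not the SDK's confirmed API; see vanna/core/tool/base.py.
from pydantic import BaseModel, Field

from vanna.core.tool.base import Tool  # assumed import path


class RunQueryArgs(BaseModel):
    """Pydantic args schema the agent validates tool calls against."""
    sql: str = Field(description="SQL statement to execute")
    row_limit: int = Field(default=100, description="Maximum rows to return")


class SnowflakeQueryTool(Tool):
    """Wraps an existing Snowflake connection behind the SDK's Tool abstraction."""

    name = "run_snowflake_query"   # assumed: how the registry addresses the tool
    args_schema = RunQueryArgs     # assumed: hook the SDK uses for schema validation

    def __init__(self, connection):
        self._connection = connection

    async def execute(self, args: RunQueryArgs, context=None):
        # Run the query and return a bounded result set for the model to summarize.
        cursor = self._connection.cursor()
        cursor.execute(args.sql)
        return cursor.fetchmany(args.row_limit)
```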
| Capability | Compatible OSS / third-party adapters | Vanna-hosted option (batteries-included) | 
|---|---|---|
| LLM service | Anthropic Messages via `AnthropicLlmService`, OpenAI Chat via `OpenAILlmService`, or any endpoint you wrap to the `LlmService` async contract (`send_request`, `stream_request`, `validate_tools`) | Managed Vanna LLM endpoints with pre-tuned model routing, guardrails, and automatic regression evaluation on model updates |
| Tool execution | Python SDKs such as Snowflake, Slack, or Jira wrapped in a subclass of `Tool` that exposes a Pydantic args schema and an async `execute` method (see submodules/.../core/tool/base.py) | Vanna Tool Registry with role-aware permissions, audit trails, automatic schema validation, and centralized credentials management |
| Conversation store | Databases like Supabase/PostgreSQL or Redis that you access from a custom `ConversationStore` implementation (submodules/.../core/storage/base.py); the repo includes `MemoryConversationStore` as a working reference | Vanna Managed Conversation Store with encrypted retention, PII scrubbing, time-travel restore, and configurable retention policies |
| System prompt management | PromptLayer, FeatureForm, or Git-backed prompt repos that you read from a subclass of `SystemPromptBuilder` (submodules/.../core/system_prompt/base.py) to supply the string at runtime | Vanna System Prompt Builder that versions prompts per agent, ships safer defaults, and captures prompt→outcome analytics for continuous tuning |
| Lifecycle governance | LaunchDarkly/Unleash feature flags or internal quota services invoked from `LifecycleHook` implementations (submodules/.../core/lifecycle/base.py) to block, modify, or approve interactions | Policy-driven lifecycle hooks that enforce quotas, approvals, session limits, and compliance checks without custom glue code |
| LLM middleware & safety | Guardrails AI, Presidio scrubbing, or custom caching layers invoked from `LlmMiddleware` (submodules/.../core/middleware/base.py) to mutate requests/responses before they hit the model | Inline Vanna middleware with automatic caching, redaction, multi-provider failover, token accounting, and SOC 2-aligned audit logging |
| Context enrichment | Pinecone, Weaviate, or warehouse lookups fetched inside `ContextEnricher` subclasses (submodules/.../core/enricher/base.py) that attach metadata to `ToolContext` | Managed Vanna context enrichers with zero-copy connectors to your sources, adaptive retrieval, and enrichment telemetry |
| Observability | OpenTelemetry, Datadog, or Honeycomb exporters that satisfy `ObservabilityProvider` (submodules/.../core/observability/base.py) by implementing `record_metric` and `create_span` (see the sketch just below this table) | Vanna Observability with unified traces, live conversation playback, SLA alerting, and long-term cost analytics tuned to agent workflows |
| Error recovery | Tenacity-based retry logic or SQS retry queues wrapped in `ErrorRecoveryStrategy` (submodules/.../core/recovery/base.py) to issue `RecoveryAction` decisions | Vanna Recovery Orchestrator with adaptive retry budgets, safe fallbacks, and incident timelines surfaced directly in the admin console |
| Conversation filtering | OpenAI Moderation, AWS Comprehend, or custom regex/PII detectors called from `ConversationFilter` implementations (submodules/.../core/filter/base.py) to redact or summarize history (see the redaction sketch further below) | Built-in policy filters tuned to Vanna components that catch sensitive data, redact secrets, and route escalations before prompting |
| Evaluations | Custom harnesses that emit `AgentResult` and plug into the `Evaluator` interface (submodules/.../core/evaluation/base.py), plus storage/reporting you build around `EvaluationRunner` | Vanna Evaluation Suite with curated datasets, scoring templates, regression dashboards, and release gates wired into agent deployments |
| Authentication & users | Auth0, Clerk, Supabase, or Cognito adapters that fulfill the `UserService` contract (submodules/.../core/user/base.py) to fetch users, authenticate credentials, and resolve permissions | Vanna Identity layer with built-in session management, granular permissions, and audit-friendly user history shared across teams |
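As a concrete example of the observability row, a thin OpenTelemetry adapter might look roughly like the sketch below. The `record_metric` and `create_span` names come from the table, but their exact signatures, return types, and the import path are assumptions; consult submodules/.../core/observability/base.py for the actual interface.

```python
# Hedged sketch of an OpenTelemetry-backed ObservabilityProvider.
# record_metric and create_span are named in the table above, but their exact
# signatures here are assumptions -- see vanna/core/observability/base.py.
from opentelemetry import metrics, trace

from vanna.core.observability.base import ObservabilityProvider  # assumed path


class OtelObservability(ObservabilityProvider):
    def __init__(self, service_name: str = "vanna-agent"):
        self._tracer = trace.get_tracer(service_name)
        self._meter = metrics.get_meter(service_name)
        self._counters = {}

    def record_metric(self, name: str, value: float, attributes: dict | None = None):
        # Lazily create one counter per metric name, then record the value.
        counter = self._counters.setdefault(name, self._meter.create_counter(name))
        counter.add(value, attributes or {})

    def create_span(self, name: str, attributes: dict | None = None):
        # Return a context-manager span the agent can wrap a unit of work in.
        return self._tracer.start_as_current_span(name, attributes=attributes or {})
```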
Together, the hosted stack mirrors the same abstractions exported by the open-source SDK, but with production-grade defaults, compliance controls, and instrumentation that let teams ship faster and sleep better.
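Similarly, the conversation-filtering row can be as small as a regex-based redactor. In the sketch below, the `filter_messages` hook name and the message shape are assumptions rather than the SDK's confirmed contract (see submodules/.../core/filter/base.py).

```python
# Hedged sketch of a ConversationFilter that redacts obvious secrets before
# history reaches the model. The filter_messages hook and the message shape
# are assumptions -- see vanna/core/filter/base.py for the real contract.
import re

from vanna.core.filter.base import ConversationFilter  # assumed import path

SECRET_PATTERN = re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b")


class SecretRedactionFilter(ConversationFilter):
    async def filter_messages(self, messages: list[dict]) -> list[dict]:
        # Replace anything that looks like an API key before prompting.
        return [
            {**m, "content": SECRET_PATTERN.sub("[REDACTED]", m.get("content", ""))}
            for m in messages
        ]
```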
Quick Take (Non-technical)
| Capability | DIY vibe | Vanna hosted | 
|---|---|---|
| LLMs | Juggle API keys and model quirks | One endpoint that stays safe and fast | 
| Tools | Write wrappers, handle creds, hope permissions are right | Plug-and-play tools with guardrails built in | 
| Memory | Stand up databases, manage retention, scrub data manually | Secure conversation history with compliance defaults | 
| Prompts | Keep prompt files in sync across teams | Versioned prompts with analytics and safe defaults | 
| Governance | Build quota checks and approvals yourself | Policies and approvals ready to flip on | 
| Safety | Chain together redaction, caching, fallbacks | Middleware that handles it automatically | 
| Context | Maintain your own RAG pipelines | Managed connectors that inject the right context | 
| Observability | Wire up traces, dashboards, alerts | Full telemetry and playback out of the box | 
| Recovery | Script retries and fallbacks | Automated recovery flows with admin visibility | 
| Filtering | Glue together moderation and PII detectors | Built-in policies that block issues before they reach the model | 
| Evaluations | Spin up QA scripts and keep datasets fresh by hand | Regression tests, scoring, and launch gates already wired in | 
| Authentication | Stitch together login, session, and RBAC checks | Unified auth, permissions, and user audit logs | 
Enterprise Deployment Path
POC rollout
- Ship a standardized Vanna container that bundles LLM endpoint, registry, observability, and evaluation services.
- Deploy on a single VM in the customer’s cloud (EC2, Compute Engine, Azure VM); networking is limited to outbound calls to sanctioned data sources.
- Quick integration with customer SSO via a lightweight `UserService` adapter (a sketch follows this list); storage defaults to the managed conversation store inside the container.
- Goal: validate value in weeks, collect telemetry, and finalize the production requirements document.
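A rough sketch of that SSO adapter, assuming the customer's IdP issues OIDC/JWT access tokens: the `authenticate` and `resolve_permissions` method names, the return shapes, and the PyJWT-based validation are illustrative assumptions, not the SDK's confirmed `UserService` contract (see submodules/.../core/user/base.py).

```python
# Hedged sketch of a UserService adapter that trusts the customer's OIDC IdP.
# Method names and return shapes are assumptions -- see vanna/core/user/base.py.
import jwt  # PyJWT
from jwt import PyJWKClient

from vanna.core.user.base import UserService  # assumed import path


class OidcUserService(UserService):
    def __init__(self, jwks_url: str, audience: str):
        self._jwks = PyJWKClient(jwks_url)
        self._audience = audience

    async def authenticate(self, token: str) -> dict:
        # Verify the IdP-signed token and map its claims onto a simple user record.
        signing_key = self._jwks.get_signing_key_from_jwt(token)
        claims = jwt.decode(
            token, signing_key.key, algorithms=["RS256"], audience=self._audience
        )
        return {
            "id": claims["sub"],
            "email": claims.get("email"),
            "roles": claims.get("roles", []),
        }

    async def resolve_permissions(self, user: dict) -> list[str]:
        # Map IdP roles to the coarse permissions the POC container checks.
        return ["query:read"] if "analyst" in user.get("roles", []) else []
```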
Production build-out
- Stand up dedicated databases, object storage, and observability stacks inside the customer’s cloud account.
- Deploy Vanna components to separate subnets or Kubernetes clusters with autoscaling, HA, and customer-managed keys.
- Integrate with the enterprise IdP, logging (e.g., Splunk, Datadog), and ticketing systems for lifecycle hooks and approvals (a sketch follows this list).
- Migrate from the POC container to modular services (LLM, tools, evaluations, observability) that align with the customer's compliance policies.
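The ticketing integration above usually lands in a `LifecycleHook`. Below is a minimal sketch that blocks high-risk tool calls until an approval ticket is resolved; the hook method name, the decision shape, and the `ticket_client` helper are all hypothetical.

```python
# Hedged sketch of a LifecycleHook that gates high-risk tool calls on an
# approval ticket. The hook name, decision shape, and ticket_client helper
# are hypothetical -- see vanna/core/lifecycle/base.py for the SDK's contract.
from vanna.core.lifecycle.base import LifecycleHook  # assumed import path


class TicketApprovalHook(LifecycleHook):
    def __init__(self, ticket_client, high_risk_tools: set[str]):
        self._tickets = ticket_client          # e.g. a thin Jira/ServiceNow wrapper
        self._high_risk_tools = high_risk_tools

    async def before_tool_call(self, tool_name: str, user: dict, context: dict) -> dict:
        # Low-risk tools pass through untouched.
        if tool_name not in self._high_risk_tools:
            return {"allowed": True}

        # High-risk tools require an approved change ticket referenced in context.
        ticket_id = context.get("approval_ticket")
        if ticket_id and self._tickets.is_approved(ticket_id):
            return {"allowed": True}
        return {
            "allowed": False,
            "reason": "High-risk tool requires an approved change ticket.",
        }
```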
Commercial Options
| Offering | Scope | Hosting model | Support & SLAs | Indicative pricing* | 
|---|---|---|---|---|
| DIY + Vanna consulting | Customer builds against the OSS SDK; Vanna provides design reviews, implementation playbooks, and evaluation templates over a 6–8 week engagement | Customer-owned infrastructure | Slack/email support during engagement, optional quarterly health checks | From $45k one-time + optional $5k/month retainer | 
| Vanna managed in your cloud | Vanna deploys and operates the managed stack inside the customer’s AWS/GCP/Azure account, including production integrations and ongoing evaluations | Customer cloud account with Vanna-managed workloads | 24/5 support, 99.5% availability SLA, shared runbooks, quarterly optimization workshops | From $12k/month + usage-based LLM costs | 
*Pricing is indicative; final quotes depend on user counts, data residency, and compliance scope.