OSS vs Premium

The updated Python SDK submodule shows that every Vanna agent is really a composition of services: an LlmService, a ToolRegistry, storage, prompts, lifecycle hooks, middleware, enrichers, filters, observability, evaluations, and user authentication. You can wire up each piece yourself by building against the abstract base classes in the SDK (for example LlmService in submodules/vanna-user-agents/python-sdk/vanna/core/llm/base.py, Tool in submodules/vanna-user-agents/python-sdk/vanna/core/tool/base.py, Evaluator in submodules/vanna-user-agents/python-sdk/vanna/core/evaluation/base.py, or UserService in submodules/vanna-user-agents/python-sdk/vanna/core/user/base.py), but the hosted platform ships opinionated counterparts that remove the undifferentiated heavy lifting.
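As a rough illustration of that composition, the sketch below wires a toy agent from an injected LLM service. Everything here is a stand-in, not the real SDK: the `LlmService.send_request` signature, the `EchoLlmService` implementation, and the `Agent` constructor are all assumptions made to keep the example self-contained.

```python
import asyncio
from abc import ABC, abstractmethod

# Hypothetical stand-in for the SDK's LlmService contract
# (the real one lives in submodules/.../core/llm/base.py).
class LlmService(ABC):
    @abstractmethod
    async def send_request(self, prompt: str) -> str: ...

# A toy implementation you could swap for an Anthropic- or OpenAI-backed one.
class EchoLlmService(LlmService):
    async def send_request(self, prompt: str) -> str:
        return f"echo: {prompt}"

# An agent is just a composition of injected services.
class Agent:
    def __init__(self, llm: LlmService):
        self.llm = llm

    async def ask(self, question: str) -> str:
        return await self.llm.send_request(question)

answer = asyncio.run(Agent(EchoLlmService()).ask("hello"))
print(answer)  # echo: hello
```

Because every service arrives through the constructor, swapping an OSS adapter for a Vanna-hosted counterpart is a one-line change at wiring time.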

| Capability | Compatible OSS / third-party adapters | Vanna-hosted option (batteries-included) |
| --- | --- | --- |
| LLM service | Anthropic Messages via AnthropicLlmService, OpenAI Chat via OpenAILlmService, or any endpoint you wrap to the LlmService async contract (send_request, stream_request, validate_tools) | Managed Vanna LLM endpoints with pre-tuned model routing, guardrails, and automatic regression evaluation on model updates |
| Tool execution | Python SDKs such as Snowflake, Slack, or Jira wrapped in a subclass of Tool that exposes a Pydantic args schema and async execute method (see submodules/.../core/tool/base.py) | Vanna Tool Registry with role-aware permissions, audit trails, automatic schema validation, and centralized credentials management |
| Conversation store | Databases like Supabase/PostgreSQL or Redis that you access from a custom ConversationStore implementation (submodules/.../core/storage/base.py); the repo includes MemoryConversationStore as a working reference | Vanna Managed Conversation Store with encrypted retention, PII scrubbing, time-travel restore, and configurable retention policies |
| System prompt management | PromptLayer, FeatureForm, or Git-backed prompt repos that you read from a subclass of SystemPromptBuilder (submodules/.../core/system_prompt/base.py) to supply the string at runtime | Vanna System Prompt Builder that versions prompts per agent, ships safer defaults, and captures prompt→outcome analytics for continuous tuning |
| Lifecycle governance | LaunchDarkly/Unleash feature flags or internal quota services invoked from LifecycleHook implementations (submodules/.../core/lifecycle/base.py) to block, modify, or approve interactions | Policy-driven lifecycle hooks that enforce quotas, approvals, session limits, and compliance checks without custom glue code |
| LLM middleware & safety | Guardrails AI, Presidio scrubbing, or custom caching layers invoked from LlmMiddleware (submodules/.../core/middleware/base.py) to mutate requests/responses before they hit the model | Inline Vanna middleware with automatic caching, redaction, multi-provider failover, token accounting, and SOC2-aligned audit logging |
| Context enrichment | Pinecone, Weaviate, or warehouse lookups fetched inside ContextEnricher subclasses (submodules/.../core/enricher/base.py) that attach metadata to ToolContext | Managed Vanna context enrichers with zero-copy connectors to your sources, adaptive retrieval, and enrichment telemetry |
| Observability | OpenTelemetry, Datadog, or Honeycomb exporters that satisfy ObservabilityProvider (submodules/.../core/observability/base.py) by implementing record_metric and create_span | Vanna Observability with unified traces, live conversation playback, SLA alerting, and long-term cost analytics tuned to agent workflows |
| Error recovery | Tenacity-based retry logic or SQS retry queues wrapped in ErrorRecoveryStrategy (submodules/.../core/recovery/base.py) to issue RecoveryAction decisions | Vanna Recovery Orchestrator with adaptive retry budgets, safe fallbacks, and incident timelines surfaced directly in the admin console |
| Conversation filtering | OpenAI Moderation, AWS Comprehend, or custom regex/PII detectors called from ConversationFilter implementations (submodules/.../core/filter/base.py) to redact or summarize history | Built-in policy filters tuned to Vanna components that catch sensitive data, redact secrets, and route escalations before prompting |
| Evaluations | Custom harnesses that emit AgentResult and plug into the Evaluator interface (submodules/.../core/evaluation/base.py), plus storage/reporting you build around EvaluationRunner | Vanna Evaluation Suite with curated datasets, scoring templates, regression dashboards, and release gates wired into agent deployments |
| Authentication & users | Auth0, Clerk, Supabase, or Cognito adapters that fulfill the UserService contract (submodules/.../core/user/base.py) to fetch users, authenticate credentials, and resolve permissions | Vanna Identity layer with built-in session management, granular permissions, and audit-friendly user history shared across teams |
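To make the "Tool execution" row concrete, here is a hedged sketch of wrapping a third-party client behind an async execute contract. The `Tool` base class, `SlackPostTool`, and the return shape are illustrative assumptions, and a plain dataclass stands in for the SDK's Pydantic args schema to keep the example dependency-free; the real contract lives in submodules/.../core/tool/base.py.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Dataclass stand-in for what the SDK expresses as a Pydantic args schema.
@dataclass
class SlackPostArgs:
    channel: str
    text: str

# Hypothetical stand-in for the SDK's Tool base class.
class Tool(ABC):
    @abstractmethod
    async def execute(self, args) -> dict: ...

class SlackPostTool(Tool):
    """Wraps a (mocked) Slack call behind the async execute contract."""

    async def execute(self, args: SlackPostArgs) -> dict:
        # A real adapter would call the Slack SDK here and translate
        # its response; this mock just echoes the validated arguments.
        return {"ok": True, "channel": args.channel, "text": args.text}

result = asyncio.run(SlackPostTool().execute(SlackPostArgs("#general", "hi")))
print(result)
```

The same shape applies to Snowflake or Jira wrappers: the args schema gives the registry something to validate, and the async execute method keeps tool I/O off the event loop's critical path.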

Together, the hosted stack mirrors the same abstractions exported by the open-source SDK, but with production-grade defaults, compliance controls, and instrumentation that let teams ship faster and sleep better.

Quick Take (Non-technical)

| Capability | DIY vibe | Vanna hosted |
| --- | --- | --- |
| LLMs | Juggle API keys and model quirks | One endpoint that stays safe and fast |
| Tools | Write wrappers, handle creds, hope permissions are right | Plug-and-play tools with guardrails built in |
| Memory | Stand up databases, manage retention, scrub data manually | Secure conversation history with compliance defaults |
| Prompts | Keep prompt files in sync across teams | Versioned prompts with analytics and safe defaults |
| Governance | Build quota checks and approvals yourself | Policies and approvals ready to flip on |
| Safety | Chain together redaction, caching, fallbacks | Middleware that handles it automatically |
| Context | Maintain your own RAG pipelines | Managed connectors that inject the right context |
| Observability | Wire up traces, dashboards, alerts | Full telemetry and playback out of the box |
| Recovery | Script retries and fallbacks | Automated recovery flows with admin visibility |
| Filtering | Glue together moderation and PII detectors | Built-in policies that block issues before they reach the model |
| Evaluations | Spin up QA scripts and keep datasets fresh by hand | Regression tests, scoring, and launch gates already wired in |
| Authentication | Stitch together login, session, and RBAC checks | Unified auth, permissions, and user audit logs |

Enterprise Deployment Path

POC rollout

  • Ship a standardized Vanna container that bundles LLM endpoint, registry, observability, and evaluation services.
  • Deploy on a single VM in the customer’s cloud (EC2, Compute Engine, Azure VM); networking is limited to outbound calls to sanctioned data sources.
  • Quick integration with customer SSO via a lightweight UserService adapter; storage defaults to the managed conversation store inside the container.
  • Goal: validate value in weeks, collect telemetry, and finalize the production requirements document.
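The "lightweight UserService adapter" for SSO could be sketched roughly as follows. The method name `get_user`, the `User` fields, and the OIDC-style claims dict are illustrative assumptions, not the real contract in submodules/.../core/user/base.py.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class User:
    id: str
    email: str
    permissions: list

# Hypothetical stand-in for the SDK's UserService contract.
class UserService(ABC):
    @abstractmethod
    async def get_user(self, token: str) -> User: ...

# POC-grade adapter: map SSO (e.g. OIDC) claims onto the agent's user model.
class SsoUserService(UserService):
    def __init__(self, verify_token):
        # verify_token would be a real JWT validator for the customer's IdP.
        self._verify_token = verify_token

    async def get_user(self, token: str) -> User:
        claims = self._verify_token(token)  # raises on an invalid token
        return User(id=claims["sub"], email=claims["email"],
                    permissions=claims.get("roles", []))

# Fake validator standing in for the IdP during the POC.
fake_verify = lambda tok: {"sub": "u1", "email": "a@b.co", "roles": ["analyst"]}
user = asyncio.run(SsoUserService(fake_verify).get_user("jwt-token"))
```

Because the adapter only translates claims into the agent's user model, swapping the fake validator for the customer's IdP is the only production change.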

Production build-out

  • Stand up dedicated databases, object storage, and observability stacks inside the customer’s cloud account.
  • Deploy Vanna components to separate subnets or Kubernetes clusters with autoscaling, HA, and customer-managed keys.
  • Integrate with enterprise IdP, logging (e.g., Splunk, Datadog), and ticketing for lifecycle hooks and approvals.
  • Migrate from the POC container to modular services (LLM, tools, evaluations, observability) that meet the customer's compliance policies.
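The quota and approval checks mentioned above can be pictured as a lifecycle hook. This is a sketch under stated assumptions: the `LifecycleHook` base class, the `before_interaction` method name, and the "allow"/"block" decision values are all invented for illustration; the real contract is in submodules/.../core/lifecycle/base.py.

```python
import asyncio
from abc import ABC, abstractmethod

# Hypothetical stand-in for the SDK's LifecycleHook contract.
class LifecycleHook(ABC):
    @abstractmethod
    async def before_interaction(self, user_id: str) -> str:
        """Return 'allow' or 'block' (illustrative decision values)."""

class QuotaHook(LifecycleHook):
    def __init__(self, daily_limit: int):
        self.daily_limit = daily_limit
        self.counts: dict = {}

    async def before_interaction(self, user_id: str) -> str:
        self.counts[user_id] = self.counts.get(user_id, 0) + 1
        # A production hook could open a ticket for approval instead
        # of blocking outright once the quota is exhausted.
        return "allow" if self.counts[user_id] <= self.daily_limit else "block"

hook = QuotaHook(daily_limit=2)
decisions = [asyncio.run(hook.before_interaction("u1")) for _ in range(3)]
print(decisions)  # ['allow', 'allow', 'block']
```

Wiring the same hook to a ticketing system turns "block" into "pending approval" without touching the agent itself.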

Commercial Options

| Offering | Scope | Hosting model | Support & SLAs | Indicative pricing* |
| --- | --- | --- | --- | --- |
| DIY + Vanna consulting | Customer builds against the OSS SDK; Vanna provides design reviews, implementation playbooks, and evaluation templates over a 6–8 week engagement | Customer-owned infrastructure | Slack/email support during engagement, optional quarterly health checks | From $45k one-time + optional $5k/month retainer |
| Vanna managed in your cloud | Vanna deploys and operates the managed stack inside the customer's AWS/GCP/Azure account, including production integrations and ongoing evaluations | Customer cloud account with Vanna-managed workloads | 24/5 support, 99.5% availability SLA, shared runbooks, quarterly optimization workshops | From $12k/month + usage-based LLM costs |

*Pricing is indicative; final quotes depend on user counts, data residency, and compliance scope.