Start free, scale as you grow. Every plan includes TurboQuant compression and semantic search — upgrade for advanced models, team collaboration, and enterprise control.
Get started with AI agent memory — no credit card required.
Unlock advanced models, higher limits, and priority processing for production workloads.
Collaborate on memory projects, manage teams, and gain full infrastructure visibility.
For organizations that need full control, compliance, and dedicated support.
See exactly what's included in each plan
| Feature | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Memory storage | 1,000 | 50,000 | Unlimited | Unlimited |
| Projects | 1 | Unlimited | Unlimited | Unlimited |
| TurboQuant compression | ✓ | ✓ | ✓ | ✓ |
| Semantic search | ✓ | ✓ | ✓ | ✓ |
| REST API | ✓ | ✓ | ✓ | ✓ |
| Version history | ✓ | ✓ | ✓ | ✓ |
| Basic embedding model (768-dim) | ✓ | ✓ | ✓ | ✓ |
| Advanced models (BGE-M3, 1024-dim) | | ✓ | ✓ | ✓ |
| Priority embedding queue | | ✓ | ✓ | ✓ |
| TypeScript SDK | ✓ | ✓ | ✓ | ✓ |
| Python SDK | ✓ | ✓ | ✓ | ✓ |
| MCP server access | ✓ | ✓ | ✓ | ✓ |
| Organizations | | | ✓ | ✓ |
| Team collaboration | | | ✓ | ✓ |
| Role-based access control | | | ✓ | ✓ |
| Usage analytics & reporting | | | ✓ | ✓ |
| On-prem deployment | | | | ✓ |
| Bring your own embeddings (BYOE) | | | | ✓ |
| Custom compression configurations | | | | ✓ |
| Audit logging | | | | ✓ |
| SLA & uptime guarantees | | | | ✓ |
Yes. Upgrade, downgrade, or cancel at any time. When upgrading, you get immediate access to new features and higher limits. When downgrading, your current billing cycle completes before the change takes effect.
Each piece of text content stored via the API or SDK counts as one memory. Updating an existing memory creates a new version but does not count as an additional memory against your limit.
No. The Free plan requires no credit card and no payment information. Just sign up and start storing memories.
The Free plan uses the basic nomic-embed-text model (768 dimensions). Pro and above unlock BGE-M3 (1024 dimensions) for higher quality semantic search. Enterprise customers can bring their own embedding models.
TurboQuant uses a two-stage pipeline: PolarQuant applies Lloyd-Max quantization to reduce vector precision to 3 bits, then QJL residual correction preserves cosine similarity accuracy. The result is a 95% reduction in vector storage with minimal search quality loss.
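To make the storage math concrete, here is a simplified sketch of the first stage only: it uses uniform 3-bit scalar quantization as a stand-in for true Lloyd-Max level placement and omits the QJL residual step, so the numbers it produces are illustrative. Going from 32-bit floats to 3-bit codes is roughly a 91% raw reduction; the quoted 95% comes from the full pipeline.

```python
import numpy as np

def quantize_3bit(v):
    """Map each component of v to one of 8 levels (3 bits).

    Simplified stand-in for Lloyd-Max: real Lloyd-Max places the 8
    levels to minimize expected error for the data's distribution,
    but the per-dimension storage cost is the same.
    """
    lo, hi = float(v.min()), float(v.max())
    codes = np.round((v - lo) / (hi - lo) * 7).astype(np.uint8)  # codes 0..7
    return codes, lo, hi

def dequantize(codes, lo, hi):
    """Reconstruct an approximate float vector from the 3-bit codes."""
    return lo + codes.astype(np.float32) / 7 * (hi - lo)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
v = rng.standard_normal(768).astype(np.float32)  # one 768-dim embedding
codes, lo, hi = quantize_3bit(v)
v_hat = dequantize(codes, lo, hi)

# 3 bits/dim vs 32 bits/dim: ~91% smaller, cosine similarity stays high
print(cosine(v, v_hat))
```

Even this naive uniform quantizer keeps cosine similarity close to 1; Lloyd-Max level placement plus residual correction is what closes the remaining quality gap.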
Yes. MemTurbo is open-source and can be self-hosted with Docker Compose (PostgreSQL + Redis). Enterprise customers get dedicated support for on-prem deployments, custom configurations, and SLA guarantees.
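A minimal self-hosting sketch along those lines might look like the Compose file below. The image name, environment variables, and port are illustrative assumptions, not official MemTurbo artifacts; consult the project's own deployment docs for the real values.

```yaml
# docker-compose.yml (sketch; image name and env vars are hypothetical)
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me
    volumes:
      - pgdata:/var/lib/postgresql/data
  cache:
    image: redis:7
  memturbo:
    image: memturbo/memturbo:latest   # hypothetical image name
    depends_on: [db, cache]
    environment:
      DATABASE_URL: postgres://postgres:change-me@db:5432/postgres
      REDIS_URL: redis://cache:6379
    ports:
      - "8080:8080"
volumes:
  pgdata:
```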
MCP (Model Context Protocol) allows AI assistants like Claude to directly access MemTurbo memories as tools. This enables agents to store and retrieve memories natively without custom API integration code.
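For illustration, MCP clients such as Claude Desktop register servers in a JSON config (`claude_desktop_config.json`) by command. The package name and environment variable below are hypothetical placeholders, not an official MemTurbo release:

```json
{
  "mcpServers": {
    "memturbo": {
      "command": "npx",
      "args": ["-y", "@memturbo/mcp-server"],
      "env": { "MEMTURBO_API_KEY": "<your-api-key>" }
    }
  }
}
```

Once registered, the assistant can call the server's store and retrieve tools directly, with no custom API integration code.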