The persistent memory layer for AI

Your AI keeps forgetting.
We fix that.

memory.ms is the memory layer your AI agents have been missing. Store, structure, and retrieve everything your models need to know — without rebuilding context on every prompt.

Read the Docs

Trusted by developers building agents on Claude, GPT, Gemini, and open-source models.


Plug into the entire AI stack

The Problem

Every conversation starts from zero.
That is the most expensive bug in AI.

Today's AI models are brilliant in the moment and amnesiac the second the window closes. Every new session means re-explaining the same project, re-uploading the same documents, re-pasting the same instructions. Tokens get burned. Costs add up. Users get frustrated.

Worse, your agents cannot learn. A coding agent that fixed a bug on Monday cannot remember the fix on Tuesday. A support bot that resolved a customer issue last week treats the same customer like a stranger today.

Memory is not a feature you bolt on. It is the infrastructure that decides whether your AI is genuinely useful or just an expensive autocomplete.

Context resets every session

Re-paste the same instructions, every prompt, forever.

Tokens burn on repeated context

Bills scale with re-explanation, not real intelligence.

Agents can't learn over time

Yesterday's wins vanish. Same mistakes, again and again.

Multi-agent swarms have no shared brain

Every agent rediscovers the world from scratch.

What is memory.ms

A purpose-built memory store, accessible from any model, anywhere.

Under the hood: semantic search, structured tagging, and a graph layer that captures how memories relate. On the surface, just two API calls: Save and Recall.

Universal Memory API

One endpoint works with every major model and framework. Drop into LangChain, LlamaIndex, AutoGen, or your own stack in a few lines.
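As a concrete illustration of the JSON-first design, here is a minimal sketch of what a raw save request could look like from any stack. The `https://api.memory.ms/v1/save` endpoint path is hypothetical; the field names mirror the SDK examples elsewhere on this page but are assumptions, not a documented wire format.

```typescript
// Hypothetical payload shape, modeled on the SDK's save() fields.
type SavePayload = {
  text: string;
  space: string;
  tags?: string[];
  source?: string;
};

// Build a plain HTTP request description — any language that can
// produce JSON can construct the same thing.
function buildSaveRequest(payload: SavePayload, apiKey: string) {
  return {
    method: "POST",
    url: "https://api.memory.ms/v1/save", // hypothetical endpoint
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

const req = buildSaveRequest(
  { text: "Acme Corp uses snake_case in their API", space: "acme-coding-agent" },
  "mem_live_xxx" // placeholder key
);
console.log(req.method, JSON.parse(req.body).space);
```

Because the surface is plain JSON over HTTP, the framework adapters are thin wrappers around requests like this one.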

Semantic Recall

Search by meaning, not keywords. Ask the way a human would ask, and get the memories that actually answer the question.
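To make "search by meaning" concrete, here is a toy TypeScript sketch of similarity-based ranking. The three-dimensional vectors are invented stand-ins for real embeddings, which memory.ms computes for you on save.

```typescript
// A memory paired with its embedding vector (toy 3-d vectors here;
// real embeddings have hundreds or thousands of dimensions).
type ScoredMemory = { text: string; embedding: number[] };

// Cosine similarity: 1.0 for identical directions, near 0 for unrelated.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank memories by similarity to the query vector, highest first.
function rankBySimilarity(query: number[], memories: ScoredMemory[]): ScoredMemory[] {
  return [...memories].sort(
    (x, y) => cosineSim(query, y.embedding) - cosineSim(query, x.embedding)
  );
}

const memories: ScoredMemory[] = [
  { text: "Acme uses snake_case in their API", embedding: [0.9, 0.1, 0.0] },
  { text: "Standup moved to 10am",             embedding: [0.0, 0.2, 0.9] },
];
const query = [0.8, 0.2, 0.1]; // "What naming convention does Acme use?"
console.log(rankBySimilarity(query, memories)[0].text);
// the snake_case memory ranks first, with no keyword overlap required
```

The question and the stored memory share direction in embedding space, not exact words — which is why a human-phrased query still finds the right fact.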

Memory Graphs

See how facts, people, projects, and decisions connect. Reason across relationships, not just isolated facts.

Scoped Memory Spaces

Separate memories by user, project, team, or tenant. Each space gets its own access rules, retention, and audit trails.

Time-Aware Recall

Memories carry timestamps and recency weighting. Ask what your agent knew last Tuesday — or what's relevant right now.
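One plausible way to blend similarity with recency weighting, sketched below with an invented 30-day half-life and 70/30 blend; memory.ms's actual scoring function is not documented here.

```typescript
// Assumed knob: how fast a memory's weight decays (not a documented parameter).
const HALF_LIFE_DAYS = 30;

// Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life.
function recencyWeight(savedAt: Date, now: Date): number {
  const ageDays = (now.getTime() - savedAt.getTime()) / 86_400_000;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Blend semantic similarity with recency (weights are illustrative).
function timeAwareScore(similarity: number, savedAt: Date, now: Date): number {
  return 0.7 * similarity + 0.3 * recencyWeight(savedAt, now);
}

const now = new Date("2025-01-31");
const fresh = timeAwareScore(0.8, new Date("2025-01-30"), now); // saved yesterday
const stale = timeAwareScore(0.8, new Date("2024-01-30"), now); // saved a year ago
console.log(fresh > stale); // true — equal similarity, the newer memory wins
```

The same mechanism runs in reverse for point-in-time queries: filter by timestamp first, then rank what the agent knew at that moment.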

Forget on Command

Delete a memory, a session, or an entire user's history with one call. Right-to-be-forgotten across every region.
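A toy in-memory model of what one-call deletion over a scoped space implies; the store class, record shape, and space names are invented for illustration — in production this is an API call, not a local map.

```typescript
type MemoryRecord = { id: string; space: string; text: string };

class MemoryStore {
  private records: MemoryRecord[] = [];

  save(r: MemoryRecord) { this.records.push(r); }

  count(space?: string): number {
    return space
      ? this.records.filter(r => r.space === space).length
      : this.records.length;
  }

  // Forget an entire space — a user, session, or tenant — in one operation.
  forgetSpace(space: string): number {
    const before = this.records.length;
    this.records = this.records.filter(r => r.space !== space);
    return before - this.records.length; // how many memories were erased
  }
}

const store = new MemoryStore();
store.save({ id: "m1", space: "user-42", text: "prefers dark mode" });
store.save({ id: "m2", space: "user-42", text: "timezone is UTC+2" });
store.save({ id: "m3", space: "user-7",  text: "weekly digest on" });

console.log(store.forgetSpace("user-42")); // 2 — user-42's history is gone
console.log(store.count());                // 1 — other users untouched
```

Scoping deletion to a space is what makes right-to-be-forgotten a single call rather than a scan across every index.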

How It Works

Three steps. No infrastructure to manage.

Save. Recall. Evolve. That is the entire developer experience.

Send anything. We embed and index it.

One API call. Plain text, JSON objects, conversation snippets, file content, structured records. Encryption at rest and in transit, baked in.

  • Automatic embedding with the latest open and proprietary models
  • Structured metadata, tags, and namespaces on every write
  • End-to-end TLS, customer-managed encryption keys on Pro+
// 1. Save anything to memory
await memory.save({
  text: "Acme Corp uses snake_case in their API",
  space: "acme-coding-agent",
  tags: ["convention", "api"],
  source: "engineering-docs",
});
// → memory_8f3a2 stored in 42ms

Query in natural language. Get ranked results.

Memories return ranked by semantic similarity, recency, and relationship strength — back in your agent's context in under 100ms.

  • Hybrid semantic + keyword + graph ranking
  • Filter by space, tag, time range, source, or author
  • Provenance trail attached to every result
// 2. Recall what matters now
const memories = await memory.recall({
  query: "What naming convention does Acme use?",
  space: "acme-coding-agent",
  limit: 5,
});
// → 5 memories returned in 84ms (p95: 92ms)

Memories aren't static. They evolve.

memory.ms automatically merges duplicates, surfaces contradictions, and updates relationships as new information arrives. Your AI gets smarter every interaction.

  • Automatic deduplication and merging
  • Contradiction detection with conflict resolution hooks
  • Relationship graphs updated in real time
// 3. Watch it evolve
memory.on("contradiction", (e) => {
  console.log("⚠ ", e.old, "vs", e.new);
  return e.resolve("latest");
});
// merges, dedupes, updates graph — automatically
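The merge and contradiction behavior above can be sketched as a simplified "latest wins" policy over keyed facts. The key-based model is an invented simplification of the real graph-backed merging, kept minimal to show the three cases: new fact, duplicate, contradiction.

```typescript
type Fact = { key: string; value: string; savedAt: number };

// Fold one incoming fact into the existing set.
function evolve(existing: Fact[], incoming: Fact): Fact[] {
  const prior = existing.find(f => f.key === incoming.key);
  if (!prior) return [...existing, incoming];           // brand-new fact: append
  if (prior.value === incoming.value) return existing;  // duplicate: merge, no-op
  // Contradiction: same key, different value — resolve to the latest write,
  // mirroring the e.resolve("latest") hook above.
  return existing.map(f => (f.key === incoming.key ? incoming : f));
}

let facts: Fact[] = [{ key: "acme.naming", value: "snake_case", savedAt: 1 }];
facts = evolve(facts, { key: "acme.naming", value: "snake_case", savedAt: 2 }); // dedupe
facts = evolve(facts, { key: "acme.naming", value: "camelCase",  savedAt: 3 }); // contradiction
console.log(facts.length, facts[0].value); // 1 camelCase
```

Duplicates never inflate the store, and a contradiction replaces rather than accumulates — which is why the memory set stays small while staying current.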
Why teams choose memory.ms

Less infrastructure. More intelligence.

Sub-100ms recall

p95 latency, globally — context arrives before your model finishes thinking.

Model-agnostic by design

Switch from GPT to Claude tomorrow without losing a single memory.

Privacy by default

We don't train on your data. Customer-managed keys. SOC 2, GDPR, HIPAA-ready.

Run anywhere

Cloud, on-premise, hybrid, air-gapped. Same API, your infrastructure.

Transparent retrieval

Every result ships with a provenance trail. No black boxes — agents can explain themselves.

Cuts token costs

Stop paying to re-explain. Bills shrink as memory does the heavy lifting.

Built for real workloads

Memory that ships in production.

Coding Agents

Cursor, Claude Code, Cline

A vault that remembers project conventions, API contracts, and architectural decisions. Stop re-explaining your codebase every session.

Customer Support

Bots that actually remember

Every customer's history, preferences, and past issues. The fifth conversation feels like the fifth conversation, not a reset.

Research

Read once, remember forever

Papers, reports, briefings, meeting notes — queryable, connected, retrievable in context, on demand.

Personal AI

Assistants that know the user

Preferences, routines, relationships, ongoing goals. The memory layer is what makes a chatbot feel like an actual assistant.

Sales & CRM

AI account executives

Capture every meeting, email, objection, commitment. Walk into the next call with full context, not a blank page.

Multi-Agent

One shared brain for swarms

When agents collaborate they need shared memory. memory.ms is the single source of truth your swarm reads from and writes to.

Integrations

Works with what you already use.

Model-agnostic and framework-agnostic by design. If it speaks JSON, it can use memory.ms.

Claude
GPT
Gemini
Llama
Mistral
DeepSeek
LangChain
LlamaIndex
AutoGen
CrewAI
Vercel AI
MCP
Security & Compliance

Built on principles that do not bend.

Privacy by default. Compliance that travels. Transparent retrieval at every layer.

Capability by plan (Free · Pro · Team · Enterprise):

  • Encryption at rest & in transit (all plans)
  • Memory graphs & scoped spaces
  • Customer-managed encryption keys (Pro and above)
  • SOC 2 Type II report (available on request)
  • HIPAA-ready architecture (available as an add-on)
  • Regional data residency: US on Free; US/EU on Pro; all regions on Team; all regions plus custom on Enterprise
  • Self-hosted / air-gapped deployment (Enterprise)
What builders say

From the engineers shipping with memory.

"We dropped our token spend by 38% the week we shipped memory.ms. Our coding agent stopped re-reading the entire codebase on every prompt — it just remembers."
Ravi Krishnan
Head of AI · Shiprock Labs
"The memory graph is the unlock. Our support agents reason across customer history, account changes, and product updates without us writing a single line of retrieval code."
Elena Marquez
Director of Engineering · Helmwave
"We needed on-prem with our own keys for a regulated workload. memory.ms was the only platform that gave us the same API on our infrastructure. Painless."
Julian Tate
Principal Architect · NorthAxis Health
Frequently asked

Answers, before you ask.

How is memory.ms different from a vector database?
Vector databases store embeddings. memory.ms stores memories. We handle embedding, retrieval ranking, deduplication, contradiction detection, relationship mapping, scoping, and lifecycle management — every layer a vector DB leaves to you.

Does memory.ms lock me into a specific model?
No. memory.ms is a memory backend, not a model wrapper. Use any model you want. Switch tomorrow without losing a single memory.

What happens to my data if I leave?
You export everything in a standard format and we delete the rest. No hostage-taking, no migration penalties. Memory belongs to whoever wrote it.

Can I self-host memory.ms?
Yes, on the Enterprise plan. Same API, same SDKs, your servers, your encryption keys.

How fast is recall?
Sub-100 millisecond p95 latency on cloud deployments. Self-hosted performance scales with the hardware you give it.

Is there a free plan?
Yes. The Free plan never expires — 10,000 memories and 100,000 recalls per month, generous enough to run small production agents indefinitely.