Recursive Memory
Recursive memory implements the Recursive Language Model (RLM) pattern: the model iteratively queries a document store backed by Durable Object (DO) storage, deciding what to read at each step, rather than relying on a single RAG retrieval guess.
Why recursive instead of semantic?
Semantic (RAG) retrieval embeds your query and fetches the top-k similar chunks. It makes one guess before the model has started reasoning. Recursive memory lets the model earn its answer by reading what it needs, learning from it, and reading further.
| | Semantic (RAG) | Recursive (RLM) |
|---|---|---|
| Retrieval | Embedding similarity — a guess before reasoning | Model decides what to read during reasoning |
| Cross-references | Misses joins — top-k doesn't follow links | Iterative — reads lead to further reads |
| Structured data | Flattens tables and matrices into embeddings | Queries structure directly via keyword index |
| Token cost | One large context per call | Many small calls — reads only what's needed |
| Best for | Unstructured text, past conversations | Product docs, error codes, KB articles, version matrices |
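To make the cross-reference row concrete, here is a toy sketch contrasting a single top-k lookup with reads that follow links. Everything in it is an assumption for illustration: the `Chunk` shape, the sample chunks, and the word-overlap `score` function, which stands in for embedding similarity. In real recursive memory the model chooses each next read; here the "follow every reference" policy is hard-coded to keep the example deterministic.

```typescript
// Toy data: one chunk points at another via `refs` (hypothetical shape).
interface Chunk { id: string; text: string; refs: string[] }

const chunks: Chunk[] = [
  { id: "error-E42", text: "error E42 raised when the plugin version is unsupported, see compat-matrix", refs: ["compat-matrix"] },
  { id: "compat-matrix", text: "plugin 2.x requires runtime 5 or later", refs: [] },
  { id: "changelog", text: "release notes for the dashboard redesign", refs: [] },
];

// Stand-in for embedding similarity: number of words in the text that also appear in the query.
function score(query: string, text: string): number {
  const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return text.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length;
}

// Single-shot retrieval: rank once, return the k best chunks.
function topK(query: string, k: number): Chunk[] {
  return chunks.slice().sort((a, b) => score(query, b.text) - score(query, a.text)).slice(0, k);
}

// Iterative retrieval: start from the best match, then keep reading whatever it references.
function readFollowingRefs(query: string, maxReads: number): Chunk[] {
  const byId = new Map(chunks.map((c) => [c.id, c] as const));
  const queue = topK(query, 1).map((c) => c.id);
  const seen = new Set<string>();
  const read: Chunk[] = [];
  while (queue.length > 0 && read.length < maxReads) {
    const id = queue.shift()!;
    const chunk = byId.get(id);
    if (!chunk || seen.has(id)) continue;
    seen.add(id);
    read.push(chunk);
    queue.push(...chunk.refs);
  }
  return read;
}

const query = "why do I get error E42";
const oneShot = topK(query, 1).map((c) => c.id);
const iterative = readFollowingRefs(query, 3).map((c) => c.id);
console.log({ oneShot, iterative });
```

The single lookup returns only the error chunk, while the iterative read also reaches the compatibility matrix that the error chunk points to, which is the "join" the table describes.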
Setup
No extra bindings — recursive memory uses the agent's existing Durable Object storage.
For voice agents where latency is critical, use a tighter profile:
Loading documents
Load KB documents from a tool handler — the ctx.recursive instance is available to all tool handlers when recursive memory is enabled.
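As a rough mental model of what loading a document involves, the sketch below splits a document into paragraph chunks and builds a keyword index over them. It is a stand-alone toy, not the `ctx.recursive` API: `ToyDocStore`, `load`, `search`, and `readChunks` are hypothetical names, and an in-memory `Map` stands in for Durable Object storage.

```typescript
interface StoredChunk { id: string; text: string }

// Hypothetical in-memory store; the real store persists to Durable Object storage.
class ToyDocStore {
  private chunks = new Map<string, StoredChunk>();
  private index = new Map<string, Set<string>>(); // keyword -> chunk ids

  // Split on blank lines, store each paragraph, and index its lowercase words.
  load(docId: string, body: string): number {
    const parts = body.split(/\n\s*\n/).map((p) => p.trim()).filter((p) => p.length > 0);
    parts.forEach((text, i) => {
      const id = `${docId}#${i}`;
      this.chunks.set(id, { id, text });
      const words = Array.from(new Set(text.toLowerCase().split(/\W+/).filter(Boolean)));
      words.forEach((word) => {
        if (!this.index.has(word)) this.index.set(word, new Set());
        this.index.get(word)!.add(id);
      });
    });
    return parts.length;
  }

  // Exact keyword lookup; returns the ids of chunks containing the word.
  search(keyword: string): string[] {
    return Array.from(this.index.get(keyword.toLowerCase()) ?? []);
  }

  // Return the text of each known id, skipping ids that do not exist.
  readChunks(ids: string[]): string[] {
    const out: string[] = [];
    ids.forEach((id) => {
      const chunk = this.chunks.get(id);
      if (chunk) out.push(chunk.text);
    });
    return out;
  }
}

const kb = new ToyDocStore();
const count = kb.load("errors", "E42: token expired.\n\nE43: quota exceeded.");
const hits = kb.search("e43");
const texts = kb.readChunks(hits);
console.log(count, hits, texts);
```

Keeping chunks small and addressable by id is what lets the model read only the pieces it asks for instead of a whole document.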
How the loop works
When recursive memory is enabled, each request runs a REPL loop before the final streamed response:
- User message arrives at the DO.
- runLoop() fires: the model calls search(), read_chunks(), and get_index() iteratively via generateText with maxSteps.
- Each tool call executes against DO storage: sub-millisecond, no network hop.
- Loop ends when the model returns text instead of a tool call, or when maxDepth/timeoutMs is reached.
- Research result is injected into the system prompt.
- streamText produces the final streamed response using the enriched context.
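The steps above can be sketched as a small simulation. This is a simplified, synchronous model of the control flow only, under stated assumptions: the `Model` function is a scripted stub standing in for the `generateText` call, a `Map` stands in for DO storage, and the types, option shapes, and sample chunks are hypothetical. Only the tool names and the `maxDepth`/`timeoutMs` limits come from the description above.

```typescript
type ToolName = "search" | "read_chunks" | "get_index";
type ToolCall = { kind: "tool"; tool: ToolName; arg: string };
type ModelStep = ToolCall | { kind: "text"; text: string };
// Stub for the model: given the question and what it has seen so far, pick the next step.
type Model = (question: string, observations: string[]) => ModelStep;

interface LoopOptions { maxDepth: number; timeoutMs: number }
interface LoopResult { research: string; steps: number; stoppedBy: "text" | "maxDepth" | "timeoutMs" }

// In-memory stand-in for Durable Object storage (toy data).
const store = new Map<string, string>([
  ["errors#1", "E42: token expired. See auth#1 for the refresh flow."],
  ["auth#1", "Refresh flow: call /token/refresh with the refresh token."],
]);

const tools: Record<ToolName, (arg: string) => string> = {
  get_index: () => Array.from(store.keys()).join(", "),
  search: (term) =>
    Array.from(store.entries())
      .filter(([, text]) => text.toLowerCase().includes(term.toLowerCase()))
      .map(([id]) => id)
      .join(", ") || "no matches",
  read_chunks: (id) => store.get(id) ?? "not found",
};

function runLoop(model: Model, question: string, opts: LoopOptions): LoopResult {
  const started = Date.now();
  const observations: string[] = [];
  for (let depth = 0; depth < opts.maxDepth; depth++) {
    if (Date.now() - started >= opts.timeoutMs) {
      return { research: observations.join("\n"), steps: depth, stoppedBy: "timeoutMs" };
    }
    const step = model(question, observations);
    // The loop ends as soon as the model answers with text instead of a tool call.
    if (step.kind === "text") return { research: step.text, steps: depth, stoppedBy: "text" };
    observations.push(`${step.tool}(${step.arg}) -> ${tools[step.tool](step.arg)}`);
  }
  return { research: observations.join("\n"), steps: opts.maxDepth, stoppedBy: "maxDepth" };
}

// Scripted model: search, read the hit, follow its cross-reference, then answer.
const scriptedModel: Model = (_question, obs) => {
  if (obs.length === 0) return { kind: "tool", tool: "search", arg: "E42" };
  if (obs.length === 1) return { kind: "tool", tool: "read_chunks", arg: "errors#1" };
  if (obs.length === 2) return { kind: "tool", tool: "read_chunks", arg: "auth#1" };
  return { kind: "text", text: "E42 means the token expired; refresh it via /token/refresh." };
};

const result = runLoop(scriptedModel, "What does E42 mean?", { maxDepth: 8, timeoutMs: 2000 });
// The research result is injected into the system prompt for the final streamed call.
const systemPrompt = `You are a support agent.\n\nResearch:\n${result.research}`;
console.log(result.stoppedBy, result.steps, systemPrompt);
```

When a limit is hit before the model answers, this sketch falls back to the raw observations gathered so far; how the actual loop handles that case is not specified here.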
RecursiveMemory API
For direct use outside createAgent():