
Recursive Memory

Recursive memory implements the Recursive Language Model (RLM) pattern: the model iteratively queries a document store backed by Durable Object (DO) storage, deciding what to read at each step, rather than relying on a single up-front RAG retrieval guess.

Why recursive instead of semantic?

Semantic (RAG) retrieval embeds your query and fetches the top-k similar chunks. It makes one guess before the model has started reasoning. Recursive memory lets the model earn its answer by reading what it needs, learning from it, and reading further.

| | Semantic (RAG) | Recursive (RLM) |
| --- | --- | --- |
| Retrieval | Embedding similarity: a guess before reasoning | Model decides what to read during reasoning |
| Cross-references | Misses joins; top-k doesn't follow links | Iterative; reads lead to further reads |
| Structured data | Flattens tables and matrices into embeddings | Queries structure directly via keyword index |
| Token cost | One large context per call | Many small calls; reads only what's needed |
| Best for | Unstructured text, past conversations | Product docs, error codes, KB articles, version matrices |

Setup

No extra bindings — recursive memory uses the agent's existing Durable Object storage.

TypeScript
import { createAgent } from 'honidev'

export const agent = createAgent({
  name: 'support-agent',
  model: 'claude-sonnet-4-20250514',
  memory: {
    recursive: {
      enabled: true,
      maxDepth: 10,      // max REPL iterations (default: 10)
      timeoutMs: 30_000, // loop timeout in ms (default: 30s)
      chunkSize: 800,    // chars per chunk (default: 800)
    }
  }
})
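To make the chunkSize option concrete, here is a minimal sketch of fixed-size character chunking. The chunkText helper is hypothetical and not part of honidev; the library's actual chunker may split on different boundaries (sentences or headings, for example).

```typescript
// Hypothetical helper, not a honidev API: splits text into fixed-size
// character chunks, mirroring the "chars per chunk" meaning of chunkSize.
function chunkText(text: string, chunkSize = 800): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  const chunks: string[] = []
  for (let start = 0; start < text.length; start += chunkSize) {
    chunks.push(text.slice(start, start + chunkSize))
  }
  return chunks
}

// A 2,000-character document yields two full chunks and one remainder.
const exampleChunks = chunkText('x'.repeat(2000))
console.log(exampleChunks.map((c) => c.length)) // [ 800, 800, 400 ]
```

Smaller chunks give the model finer-grained reads at the cost of more read calls per answer.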

For voice agents where latency is critical, use a tighter profile:

TypeScript
recursive: { enabled: true, maxDepth: 5, timeoutMs: 5_000 }

Loading documents

Load KB documents from a tool handler — the ctx.recursive instance is available to all tool handlers when recursive memory is enabled.

TypeScript
import { tool } from 'honidev'
import { z } from 'zod'

const loadKb = tool({
  name: 'load_kb',
  description: 'Load a KB article into the document store',
  input: z.object({ id: z.string(), content: z.string(), title: z.string().optional() }),
  async handler({ id, content, title }, ctx) {
    // ctx.recursive is only set when recursive memory is enabled, hence the non-null assertion
    await ctx.recursive!.loadDocument(id, content, title)
    return { ok: true }
  }
})
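Loading a document chunks it and indexes it for keyword search. As a rough illustration of what a keyword index does, here is an in-memory sketch. KeywordIndexSketch, its methods, and the scoring rule (count of distinct matching query terms) are all assumptions made for this example; honidev's real index lives in DO storage and may rank differently.

```typescript
// Hypothetical in-memory keyword index, not a honidev API.
interface Hit { chunkId: number; score: number; snippet: string }

class KeywordIndexSketch {
  private chunks: string[] = []
  private postings = new Map<string, Set<number>>() // term -> chunk IDs

  private static tokenize(text: string): string[] {
    return text.toLowerCase().match(/[a-z0-9]+/g) ?? []
  }

  // Stores one chunk and records which terms appear in it.
  add(chunk: string): number {
    const chunkId = this.chunks.push(chunk) - 1
    for (const term of KeywordIndexSketch.tokenize(chunk)) {
      let ids = this.postings.get(term)
      if (!ids) {
        ids = new Set<number>()
        this.postings.set(term, ids)
      }
      ids.add(chunkId)
    }
    return chunkId
  }

  // Ranks chunks by how many distinct query terms they contain.
  search(query: string, limit = 5): Hit[] {
    const scores = new Map<number, number>()
    for (const term of new Set(KeywordIndexSketch.tokenize(query))) {
      const ids = this.postings.get(term)
      if (!ids) continue
      for (const id of ids) scores.set(id, (scores.get(id) ?? 0) + 1)
    }
    return Array.from(scores.entries())
      .sort((a, b) => b[1] - a[1] || a[0] - b[0]) // score desc, then chunk ID asc
      .slice(0, limit)
      .map(([chunkId, score]) => ({ chunkId, score, snippet: this.chunks[chunkId].slice(0, 80) }))
  }
}

const kbIndex = new KeywordIndexSketch()
kbIndex.add('Activation fails on ARM Macs after upgrade') // chunk 0
kbIndex.add('Upgrade guide for Windows')                  // chunk 1
kbIndex.add('ARM Mac activation error codes')             // chunk 2
console.log(kbIndex.search('arm mac activation error').map((h) => h.chunkId)) // [ 2, 0 ]
```

Exact-term matching like this is why keyword lookup suits error codes and version strings, which embeddings tend to blur.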

How the loop works

When recursive memory is enabled, each request runs a REPL loop before the final streamed response:

  1. User message arrives at the DO.
  2. runLoop() fires — the model calls search(), read_chunks(), and get_index() iteratively via generateText with maxSteps.
  3. Each tool call executes against DO storage — sub-millisecond, no network hop.
  4. Loop ends when the model returns text instead of a tool call, or when maxDepth / timeoutMs is reached.
  5. Research result is injected into the system prompt.
  6. streamText produces the final streamed response using the enriched context.
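The control flow in steps 2 to 4 can be sketched as a bounded loop. This is a simplified stand-in, not honidev's implementation: runLoopSketch, Step, Model, and ToolFn are hypothetical names, the "model" is a scripted function rather than a real LLM call, and the timeout is only checked between steps.

```typescript
// All names below are hypothetical stand-ins, not honidev APIs.
type Step =
  | { type: 'tool'; name: 'search' | 'read_chunks' | 'get_index'; args: unknown }
  | { type: 'text'; text: string }
type Model = (history: string[]) => Promise<Step>
type ToolFn = (args: unknown) => Promise<string>

interface LoopOutcome {
  answer: string
  iterations: number
  stoppedBy: 'text' | 'maxDepth' | 'timeout'
}

async function runLoopSketch(
  userMessage: string,
  model: Model,
  tools: Map<string, ToolFn>,
  opts = { maxDepth: 10, timeoutMs: 30_000 },
): Promise<LoopOutcome> {
  const deadline = Date.now() + opts.timeoutMs
  const history = [`user: ${userMessage}`]
  for (let i = 0; i < opts.maxDepth; i++) {
    // Timeout is checked between steps only; an in-flight step is not aborted.
    if (Date.now() >= deadline) return { answer: '', iterations: i, stoppedBy: 'timeout' }
    const step = await model(history)
    // The loop ends as soon as the model answers with text instead of a tool call.
    if (step.type === 'text') return { answer: step.text, iterations: i + 1, stoppedBy: 'text' }
    const tool = tools.get(step.name)
    const observation = tool ? await tool(step.args) : `unknown tool: ${step.name}`
    history.push(`tool ${step.name}: ${observation}`)
  }
  return { answer: '', iterations: opts.maxDepth, stoppedBy: 'maxDepth' }
}

// Scripted model: searches once, then answers.
const scriptedModel: Model = async (history): Promise<Step> => {
  if (history.length === 1) return { type: 'tool', name: 'search', args: { query: 'activation error' } }
  return { type: 'text', text: `Answer based on ${history.length - 1} tool result(s)` }
}
const demoTools = new Map<string, ToolFn>([['search', async () => 'chunk 3 matches']])

runLoopSketch('Why does activation fail?', scriptedModel, demoTools).then((r) => console.log(r))
// resolves with answer 'Answer based on 1 tool result(s)', iterations 2, stoppedBy 'text'
```

Capping both depth and wall-clock time keeps a model that never stops calling tools from stalling the final streamed response.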

RecursiveMemory API

For direct use outside createAgent():

TypeScript
import { RecursiveMemory } from 'honidev'

// doStorage, content, userMessage, model and systemPrompt are supplied by the caller
const mem = new RecursiveMemory(doStorage, { enabled: true })

// Load a document (chunks + indexes into DO storage)
await mem.loadDocument('bridge-kb', content, 'Bridge Upgrade Guide')

// Keyword search: returns ranked chunk IDs + snippets
const hits = await mem.search('arm mac activation error')

// Fetch full text for specific chunk IDs
const chunks = await mem.readChunks([0, 1, 2])

// List all loaded documents
const index = await mem.getIndex()

// Run the full REPL loop
const result = await mem.runLoop(userMessage, model, systemPrompt)
// → { answer: string, iterations: number, chunksRead: number[] }
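When calling runLoop() directly, you are responsible for what happens to the result. One plausible approach, mirroring the "injected into the system prompt" step described earlier, is sketched below. The buildEnrichedPrompt helper and its prompt layout are assumptions for illustration; only the result shape comes from the comment above.

```typescript
// Result shape as documented for runLoop(); the helper below is hypothetical.
interface RunLoopResult {
  answer: string
  iterations: number
  chunksRead: number[]
}

// Appends the research answer to the base system prompt. If the loop found
// nothing, the prompt is returned unchanged.
function buildEnrichedPrompt(systemPrompt: string, result: RunLoopResult): string {
  if (result.answer.trim() === '') return systemPrompt
  return [
    systemPrompt,
    '',
    `Research notes (${result.iterations} iterations, chunks read: ${result.chunksRead.join(', ')}):`,
    result.answer,
  ].join('\n')
}

const enriched = buildEnrichedPrompt('You are a support agent.', {
  answer: 'Placeholder research answer.',
  iterations: 3,
  chunksRead: [0, 2],
})
console.log(enriched)
```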