Recursive Memory

Recursive memory implements the Recursive Language Model (RLM) pattern: the model iteratively queries a document store backed by Durable Object (DO) storage, deciding what to read at each step, rather than relying on a single up-front RAG retrieval.

Why recursive instead of semantic?

Semantic (RAG) retrieval embeds your query and fetches the top-k similar chunks. It makes one guess before the model has started reasoning. Recursive memory lets the model earn its answer by reading what it needs, learning from it, and reading further.

	Semantic (RAG)	Recursive (RLM)
Retrieval	Embedding similarity — a guess before reasoning	Model decides what to read during reasoning
Cross-references	Misses joins — top-k doesn't follow links	Iterative — reads lead to further reads
Structured data	Flattens tables and matrices into embeddings	Queries structure directly via keyword index
Token cost	One large context per call	Many small calls — reads only what's needed
Best for	Unstructured text, past conversations	Product docs, error codes, KB articles, version matrices

Setup

No extra bindings — recursive memory uses the agent's existing Durable Object storage.

TypeScript

import { createAgent } from 'honidev'

export const agent = createAgent({
  name: 'support-agent',
  model: 'claude-sonnet-4-20250514',
  memory: {
    recursive: {
      enabled: true,
      maxDepth: 10,       // max REPL iterations (default: 10)
      timeoutMs: 30_000,  // loop timeout in ms (default: 30s)
      chunkSize: 800,     // chars per chunk (default: 800)
    }
  }
})

For voice agents where latency is critical, use a tighter profile:

TypeScript

recursive: { enabled: true, maxDepth: 5, timeoutMs: 5_000 }

Loading documents

Load KB documents from a tool handler — the ctx.recursive instance is available to all tool handlers when recursive memory is enabled.

TypeScript

import { tool } from 'honidev'
import { z } from 'zod'

const loadKb = tool({
  name: 'load_kb',
  description: 'Load a KB article into the document store',
  input: z.object({ id: z.string(), content: z.string(), title: z.string().optional() }),
  async handler({ id, content, title }, ctx) {
    await ctx.recursive!.loadDocument(id, content, title)
    return { ok: true }
  }
})
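
To make the chunkSize setting concrete, here is a minimal sketch of fixed-size chunking by character count. The chunkText helper is hypothetical and not part of honidev; the library's actual splitting strategy (for example, whether it respects sentence or paragraph boundaries) is not described on this page, so treat this only as an illustration of the general idea.

```typescript
// Hypothetical helper, not part of honidev: split text into fixed-size chunks.
interface Chunk {
  id: number   // position of the chunk within the document
  text: string // at most chunkSize characters
}

function chunkText(content: string, chunkSize = 800): Chunk[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive')
  const chunks: Chunk[] = []
  for (let start = 0; start < content.length; start += chunkSize) {
    chunks.push({ id: chunks.length, text: content.slice(start, start + chunkSize) })
  }
  return chunks
}

// A 2000-character document yields three chunks of 800, 800 and 400 characters.
const exampleChunks = chunkText('a'.repeat(2000))
console.log(exampleChunks.map((c) => c.text.length))
```

Smaller chunks keep each read cheap but increase the number of reads the model may need; the 800-character default above is taken from the configuration shown earlier.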

How the loop works

When recursive memory is enabled, each request runs a REPL loop before the final streamed response:

1. User message arrives at the DO.
2. runLoop() fires: the model calls search(), read_chunks(), and get_index() iteratively via generateText with maxSteps.
3. Each tool call executes against DO storage (sub-millisecond, no network hop).
4. Loop ends when the model returns text instead of a tool call, or when maxDepth / timeoutMs is reached.
5. Research result is injected into the system prompt.
6. streamText produces the final streamed response using the enriched context.
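
The termination rules above can be sketched as a small control loop. This is an illustration of the control flow only, not honidev's implementation: runLoopSketch, ModelStep, the stoppedBy field and the injectable clock are all hypothetical names, the model and storage are replaced by plain callbacks, and the loop is synchronous for brevity where the real calls would be asynchronous.

```typescript
// Hypothetical types and names: a sketch of the stop conditions, not honidev's code.
type ToolCall = { tool: 'search' | 'read_chunks' | 'get_index'; args: unknown }
type ModelStep = { kind: 'tool'; call: ToolCall } | { kind: 'text'; text: string }

interface LoopOptions { maxDepth: number; timeoutMs: number }
interface LoopResult {
  answer: string
  iterations: number
  stoppedBy: 'text' | 'maxDepth' | 'timeout'
}

function runLoopSketch(
  nextStep: (history: string[]) => ModelStep, // stands in for the model
  runTool: (call: ToolCall) => string,        // stands in for reads against DO storage
  opts: LoopOptions,
  now: () => number = Date.now,               // injectable clock, for testing
): LoopResult {
  const deadline = now() + opts.timeoutMs
  const history: string[] = []
  for (let i = 0; i < opts.maxDepth; i++) {
    // Stop before starting another step once the time budget is spent.
    if (now() >= deadline) {
      return { answer: history.join('\n'), iterations: i, stoppedBy: 'timeout' }
    }
    const step = nextStep(history)
    // Plain text from the model ends the loop.
    if (step.kind === 'text') {
      return { answer: step.text, iterations: i + 1, stoppedBy: 'text' }
    }
    history.push(runTool(step.call))
  }
  return { answer: history.join('\n'), iterations: opts.maxDepth, stoppedBy: 'maxDepth' }
}

// Usage with a stub model that searches once, reads once, then answers.
const loopResult = runLoopSketch(
  (history) =>
    history.length === 0
      ? { kind: 'tool', call: { tool: 'search', args: { query: 'activation error' } } }
      : history.length === 1
        ? { kind: 'tool', call: { tool: 'read_chunks', args: { ids: [0] } } }
        : { kind: 'text', text: `Answer based on ${history.length} tool results` },
  (call) => `result of ${call.tool}`,
  { maxDepth: 10, timeoutMs: 30_000 },
)
console.log(loopResult.stoppedBy, loopResult.iterations) // text 3
```

What the real loop returns when it hits maxDepth or timeoutMs is not stated here; the sketch simply returns whatever was gathered so far.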

RecursiveMemory API

For direct use outside createAgent():

TypeScript

import { RecursiveMemory } from 'honidev'

const mem = new RecursiveMemory(doStorage, { enabled: true })

// Load a document (chunks + indexes into DO storage)
await mem.loadDocument('bridge-kb', content, 'Bridge Upgrade Guide')

// Keyword search — returns ranked chunk IDs + snippets
const hits = await mem.search('arm mac activation error')

// Fetch full text for specific chunk IDs
const chunks = await mem.readChunks([0, 1, 2])

// List all loaded documents
const index = await mem.getIndex()

// Run the full REPL loop
const result = await mem.runLoop(userMessage, model, systemPrompt)
// → { answer: string, iterations: number, chunksRead: number[] }
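
For intuition about what a keyword search returning ranked chunk IDs and snippets might involve, here is a small in-memory sketch of an inverted index. Everything in it is an assumption for illustration: buildIndex, searchIndex, the Hit shape, and the scoring rule (number of distinct query terms a chunk contains) are hypothetical, and honidev's actual index format and ranking are not documented on this page.

```typescript
// Hypothetical sketch of a keyword index; not honidev's index format or scoring.
type Hit = { chunkId: number; score: number; snippet: string }

const tokenize = (text: string): string[] => text.toLowerCase().match(/[a-z0-9]+/g) ?? []

// Map each term to the set of chunk IDs that contain it.
function buildIndex(chunkTexts: string[]): Map<string, Set<number>> {
  const index = new Map<string, Set<number>>()
  chunkTexts.forEach((text, chunkId) => {
    tokenize(text).forEach((term) => {
      if (!index.has(term)) index.set(term, new Set<number>())
      index.get(term)!.add(chunkId)
    })
  })
  return index
}

// Score = number of distinct query terms found in the chunk; ties break by chunk ID.
function searchIndex(
  query: string,
  chunkTexts: string[],
  index: Map<string, Set<number>>,
  limit = 5,
): Hit[] {
  const scores = new Map<number, number>()
  const terms = Array.from(new Set(tokenize(query)))
  terms.forEach((term) => {
    index.get(term)?.forEach((chunkId) => {
      scores.set(chunkId, (scores.get(chunkId) ?? 0) + 1)
    })
  })
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0] - b[0])
    .slice(0, limit)
    .map(([chunkId, score]) => ({ chunkId, score, snippet: chunkTexts[chunkId].slice(0, 80) }))
}

const sampleChunks = [
  'Activation error E42 occurs on ARM Mac after the bridge upgrade.',
  'Windows installers are signed; no activation step is required.',
  'On ARM Mac, reinstall the bridge to clear a stale activation token.',
]
const sampleHits = searchIndex('arm mac activation error', sampleChunks, buildIndex(sampleChunks))
console.log(sampleHits.map((h) => h.chunkId)) // chunk 0 first, then 2, then 1
```

Because the lookup is an exact term match rather than an embedding comparison, identifiers such as error codes and version strings are found only when the query contains them verbatim.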
