Engineering 2026-03-04 by Rebyte Team

Agent Memory Done Right

The industry spent years overcomplicating AI agent memory. We took the simplest approach that works — plain text facts in the cloud.

Memory is the missing piece for AI agents. Without it, every task starts from zero — the agent doesn't know your preferences, your project conventions, or what it learned yesterday. The industry recognized this years ago. Then it spent years overcomplicating the solution.

The Landscape

Here's what the market built:

Mem0 — a managed memory layer with vector stores, knowledge graphs, and rerankers. Hidden LLM calls extract "memories" from conversations behind the scenes. Complex infrastructure: SOC 2 compliance, graph databases, decay mechanisms. They reported 26% better accuracy than OpenAI's built-in memory and 91% faster retrieval. Impressive numbers. Heavy machinery.

MemGPT / Letta — virtual context management inspired by operating system memory paging. A dual-layer architecture: in-context memory (editable blocks the agent can read/write) and out-of-context memory (archival storage for overflow). When context fills up, a compaction step flushes older information to archival. Now rearchitected as Letta V1 for the latest models. Clever, but complex.

Zep — enterprise knowledge graphs with temporal awareness, entity extraction, and relationship mapping. Designed for multi-agent teams that need to share structured knowledge. Enterprise-grade infrastructure.

LangMem — memory as tool calls within LangGraph. Semantic search plus graph memory, tightly coupled to the LangChain ecosystem.

OpenClaw — the outlier. Just markdown files in the agent's workspace. MEMORY.md for long-term facts, daily log files for recent context. Hybrid search with BM25 and vector similarity. Temporal decay with configurable half-life. Local-first — files are the source of truth. Simple, elegant, and it works. But everything lives on one machine.

Claude Code — a similar local approach. CLAUDE.md hierarchy for project instructions, auto-memory files in ~/.claude/projects/. Machine-local. Not shared across environments.

The Problem

Two failure modes. Every system falls into one or the other.

Too complex. Mem0, Letta, and Zep add hidden LLM calls, graph databases, decay algorithms, compaction logic. You end up with more infrastructure to maintain than the agent itself. Each layer is another thing that can break, another thing to debug, another thing to monitor.

Too local. OpenClaw and Claude Code are simple and they work great — until the machine changes, the workspace resets, or you need to share context across agents. A memory system that dies when you switch laptops isn't a memory system. It's a notepad.

[Diagram: the memory landscape. Too complex: Mem0 (graph + vector), Letta (OS paging), Zep (enterprise KG); hidden LLM calls, graph databases, decay algorithms, more infra than the agent. Too local: OpenClaw (MEMORY.md), Claude Code (CLAUDE.md); simple and effective, but dies on machine change, not shared across agents, and a workspace reset means it's gone.]

Rebyte's Design — Radically Simple

We wanted the simplicity of OpenClaw's approach with the persistence of a cloud system. Here's what we built:

  • Each memory = one plain-text fact, 500 characters max, with a unique key
  • Scoped to org + user — persists across all tasks, shared between agents
  • Five operations: save, search, list, get, delete
  • Vector embeddings (text-embedding-3-small, 1536 dimensions) for semantic search
  • PostgreSQL + pgvector — no extra infrastructure, just a table
  • The agent manages everything — no hidden LLM calls, no automatic extraction, no decay

That's the entire system.

Identity First

Before doing any work, the agent introduces itself to the memory system. This is how context gets scoped — the memory knows who is remembering.

[Diagram: an agent (Claude Code) asks "Who am I?" and learns: "I am Sarah Chen. I work for Acme Corp. All my memories are scoped to this identity."]
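What identity-first scoping implies can be sketched in a few lines; `Identity` and `ScopedMemory` are invented names for this example. Every memory is namespaced by org + user, so the partition follows the person, not the machine or the agent.

```python
# Hypothetical sketch of identity-first scoping: the agent declares who
# it is acting for, and every memory key is namespaced by (org, user).

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identity:
    org: str   # e.g. "acme-corp"
    user: str  # e.g. "sarah-chen"


@dataclass
class ScopedMemory:
    _facts: dict = field(default_factory=dict)

    def save(self, who: Identity, key: str, fact: str) -> None:
        # Memories are partitioned by identity, not by machine or agent,
        # so any agent acting for the same org + user sees them.
        self._facts[(who.org, who.user, key)] = fact

    def get(self, who: Identity, key: str):
        return self._facts.get((who.org, who.user, key))
```

Two different agents presenting the same identity read and write the same memories; a different user in the same org sees none of them.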

The Memory Lifecycle

Managing memory is left entirely to the agent's judgment. Two moments matter: task start and task end.

[Diagram: the memory lifecycle. Task start, recall: assigned "Add dark mode," the agent searches memory ("What do I already know?") and recalls preferred-framework (user prefers Tailwind CSS), project-theme (warm palette, peach tones), deploy-process (always run tests first), and more, then starts work with context: "I know this user prefers Tailwind CSS and uses a warm color palette. I should implement dark mode using Tailwind's dark: variant and keep the warm peach tones for the dark theme." Task end, save: the agent reflects on what it learned (the project uses CSS variables for theming, the dark mode toggle lives in the header, the user wants to persist the preference) and saves new memories: theming-approach (CSS variables for themes, toggle in header component) and user-pref-storage (persist UI preferences in localStorage). Three weeks later, a different agent (Agent B, Gemini CLI) assigned "Add settings page" recalls: themes use CSS variables, UI prefs are stored in localStorage, and the user likes warm peach tones.]

The agent decides what matters. At the start of every task, it searches for relevant context. At the end, it reflects on what it learned and saves anything worth remembering. No automatic extraction, no hidden LLM calls — purely the agent's judgment.

When Agent A (Claude Code) saves a memory in Task 1, Agent B (Gemini CLI) can recall it in Task 47 three weeks later. The memory belongs to the user, not the machine.
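The recall-then-reflect loop can be sketched as follows. `shared_memory`, `recall`, and `reflect_and_save` are hypothetical stand-ins for the cloud store and the agent's own judgment, and the overlap-based recall is a placeholder for semantic search.

```python
# Illustrative task lifecycle: recall at task start, save at task end,
# with one store shared by every agent acting for the same org + user.

shared_memory: dict[str, str] = {}  # stands in for the cloud store


def recall(query_terms: set[str]) -> dict[str, str]:
    # Task start: pull any facts relevant to the new task.
    # (Word overlap here; the real system searches by embedding.)
    return {
        k: v for k, v in shared_memory.items()
        if query_terms & set(v.lower().replace(",", " ").split())
    }


def reflect_and_save(learned: dict[str, str]) -> None:
    # Task end: the agent decides what mattered and saves it.
    shared_memory.update(learned)


# Agent A (e.g. Claude Code) finishes Task 1 and saves what it learned.
reflect_and_save({
    "theming-approach": "CSS variables for themes, toggle in header",
    "user-pref-storage": "Persist UI preferences in localStorage",
})

# Weeks later, Agent B (e.g. Gemini CLI) starts Task 47 with that context.
context = recall({"themes", "settings", "preferences"})
```

Nothing ties the second call to the first machine or the first agent: the store is keyed by identity, so Agent B simply finds what Agent A left behind.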

Two Types of Memory

Agent-managed memory handles what the agent chooses to remember. But there's a second kind of memory that's equally valuable: everything the agent said and did.

Every task on Rebyte produces a full conversation — the user's prompts, the agent's reasoning, code it wrote, errors it hit, solutions it found. All of this streams to cloud storage as it happens. That's hundreds of tasks, thousands of exchanges, accumulating over weeks and months.

This is long-term memory. Not curated facts — raw history. And it's searchable.

[Diagram: two types of memory. Active memory: agent-managed, explicit ("things I decided to remember"), e.g. preferred-framework: user prefers Next.js; deploy-process: always run tests before deploy; api-style: RESTful with /v1/ prefix. Small, curated, always relevant. Long-term memory: every conversation, indexed ("everything I ever said and did"), e.g. Task 12: "Fixed the auth bug by adding...", Task 34: "The deploy failed because...", Task 89: "Refactored the DB schema to...". Vast, complete, searchable on demand.]

Active memory is like a person's working knowledge — the facts you carry in your head. Long-term memory is like being able to search through every conversation you've ever had, every decision you've ever made, every mistake you've ever fixed.

How Long-term Memory Works

Every conversation is already stored in the cloud — that's how Rebyte works. The events stream to cloud storage as they happen. What we added is a search index on top of that data, so agents can query it semantically.

When an agent needs deeper context — "how did we handle rate limiting last time?" or "what was the schema migration strategy?" — it searches long-term memory. The index finds relevant conversations across all past tasks, and the agent gets the full context of what happened: the user's request, the agent's approach, the code that was written, the outcome.
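As a sketch of that flow: a toy bag-of-words embedding and cosine similarity stand in for text-embedding-3-small and pgvector, and the transcripts and task numbers below are invented examples.

```python
# Toy sketch of long-term memory search over past task transcripts.
# Bag-of-words counts stand in for real embeddings; the production
# index embeds with text-embedding-3-small and queries pgvector.

import math
from collections import Counter


def embed(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Invented transcript snippets keyed by task number.
transcripts = {
    23: "Added rate limiting to the API using Redis token buckets",
    45: "Fixed 429 errors by adjusting the rate limit window",
    89: "Refactored the DB schema to split users and accounts",
}


def search_history(query: str, top_k: int = 2) -> list[int]:
    # Rank every past task by similarity to the query, return the best.
    q = embed(query)
    ranked = sorted(
        transcripts,
        key=lambda task_id: cosine(q, embed(transcripts[task_id])),
        reverse=True,
    )
    return ranked[:top_k]
```

A query about rate limiting surfaces the two tasks that touched it and skips the unrelated schema refactor; with real embeddings the matching is semantic rather than lexical, but the shape of the lookup is the same.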

[Diagram: long-term memory search. The agent asks "How did we handle rate limiting before?"; the search index runs semantic search across all past conversations and finds the relevant ones (Task 23: added rate limiting to the API; Task 45: fixed 429 errors by adjusting limits; Task 67: user asked for per-user limits). Both memories work together: active memory says "We use Redis for rate limiting"; long-term memory shows the exact implementation, the edge cases hit, and the config chosen.]

Active memory gives the agent quick facts. Long-term memory gives it full context. Together, they mean an agent working on your project in month six has access to everything that happened in months one through five — not just what someone remembered to write down, but the complete history.

OpenClaw's Memory, in the Cloud

OpenClaw got the philosophy right. The agent decides what to remember. Memories are plain text. Search is semantic. No hidden processes running behind the scenes. We agree with all of it.

The difference is where it lives.

|             | OpenClaw                    | Rebyte Memory                  |
|-------------|-----------------------------|--------------------------------|
| Storage     | MEMORY.md on local disk     | PostgreSQL rows + vector index |
| Persistence | Dies when workspace resets  | Persists forever across tasks  |
| Sharing     | Single machine, single agent | All agents in the org         |
| Search      | BM25 + vector (SQLite)      | Semantic vector (pgvector HNSW) |
| Interface   | Agent writes markdown files | Agent calls HTTP API           |
| Philosophy  | Agent-driven, explicit      | Agent-driven, explicit         |

Same philosophy. Different address. Rebyte takes OpenClaw's agent-driven approach and moves it to the cloud — no local files to lose, no workspace to reset, shared across every agent. Then adds long-term memory on top: every conversation indexed and searchable, so agents have access to the complete history, not just what they chose to remember.

Why Simple Wins

No graph databases. No decay algorithms. No hidden LLM calls. No compaction hooks.

Two types of memory, both simple. Active memory: the agent decides what to remember, saves plain text facts, searches semantically. Long-term memory: every conversation is already stored — just index it and let agents search.

That's the entire system. It works because it doesn't try to be clever.

The best memory system is the one that's simple enough to reason about, persistent enough to survive across tasks, and complete enough that nothing is lost. Everything else is overhead.