AWS publishes architecture for company-wide AI agent memory using Bedrock, Neptune, and Mem0
AWS has published an architecture that combines Amazon Bedrock, the Amazon Neptune graph database, and the Mem0 framework to give AI agents persistent, company-wide memory, addressing the loss of context between sessions and between users.
This article was generated using artificial intelligence from primary sources.
The fundamental problem of agents
Today’s AI agents share a common limitation: they lose context between interactions. After a session ends, the agent forgets what the user said, which projects they manage, and which decisions have already been made. In an enterprise environment this is unacceptable, because users have to re-supply the same context in every session.
AWS has published a reference architecture that addresses this problem by combining three components. Amazon Bedrock provides the foundation models (Claude, Llama, Titan), Amazon Neptune serves as a graph database for long-term memory, and the Mem0 framework manages the memory lifecycle: what to remember, when to retrieve, and when to forget.
Why a graph database rather than a vector database
Most existing AI memory solutions use vector databases (Pinecone, Weaviate) that store text embeddings. This approach works well for semantic search ("find similar conversations") but poorly for structured relationships.
Amazon Neptune takes a different approach. Entities (employees, projects, documents, clients) are stored as nodes, and the relationships between them as edges. An agent can query “which documents are connected to project Alpha led by Ana” and get an answer read directly from stored relationships rather than inferred by the model, which reduces the room for hallucination. This is critical for enterprise environments where reliability is required.
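The node-and-edge model described above can be illustrated with a minimal in-memory sketch. This is a toy stand-in for a graph database, not the Neptune API; the class `TinyGraph`, the relationship names `LEADS` and `HAS_DOCUMENT`, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


class TinyGraph:
    """Toy in-memory property graph (hypothetical, not the Neptune API)."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"label": ..., other properties}
        self.edges = defaultdict(list)  # source id -> [(relationship, target id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, relationship, target):
        self.edges[source].append((relationship, target))

    def neighbors(self, source, relationship):
        """Follow one kind of edge outward from a node."""
        return [t for rel, t in self.edges[source] if rel == relationship]


def documents_for_projects_led_by(graph, person_id):
    """Two-hop traversal: person -LEADS-> project -HAS_DOCUMENT-> document."""
    docs = []
    for project in graph.neighbors(person_id, "LEADS"):
        docs.extend(graph.neighbors(project, "HAS_DOCUMENT"))
    return docs


# Illustrative data only.
graph = TinyGraph()
graph.add_node("ana", "Employee", name="Ana")
graph.add_node("alpha", "Project", name="Alpha")
graph.add_node("spec", "Document", title="Alpha spec")
graph.add_node("budget", "Document", title="Alpha budget")
graph.add_edge("ana", "LEADS", "alpha")
graph.add_edge("alpha", "HAS_DOCUMENT", "spec")
graph.add_edge("alpha", "HAS_DOCUMENT", "budget")

print(documents_for_projects_led_by(graph, "ana"))  # ['spec', 'budget']
```

The answer is assembled only from edges that were explicitly stored, which is the property the article attributes to the graph approach. A production graph database would express the same traversal in its own query language.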
The role of the Mem0 framework
Mem0 is an open-source framework that standardizes how agents manage memory. It provides APIs for three basic operations: writing new facts, retrieving relevant information in context, and forgetting outdated data. Without Mem0, every team has to write such logic themselves.
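To make the three operations concrete, here is a minimal sketch of a store that supports writing, retrieving, and forgetting. It is a hypothetical in-memory class and does not reproduce Mem0's actual API; the names `MemoryStore`, `write`, `retrieve`, and `forget` are assumptions, and retrieval uses naive keyword overlap where a real framework would use something more capable.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical store showing write / retrieve / forget (not Mem0's API)."""

    facts: dict = field(default_factory=dict)  # fact id -> text
    next_id: int = 1

    def write(self, text):
        """Store a new fact and return its id."""
        fact_id = self.next_id
        self.facts[fact_id] = text
        self.next_id += 1
        return fact_id

    def retrieve(self, query, limit=3):
        """Rank facts by how many words they share with the query."""
        query_words = set(query.lower().split())
        scored = []
        for fact_id, text in self.facts.items():
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, fact_id, text))
        # Highest overlap first; ties broken by insertion order (lower id first).
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, _, text in scored[:limit]]

    def forget(self, fact_id):
        """Delete a fact; return True if it existed."""
        return self.facts.pop(fact_id, None) is not None


# Illustrative data only.
store = MemoryStore()
old = store.write("Ana leads project Alpha")
store.write("Project Alpha deadline is in June")
store.write("Ivan prefers weekly summaries")

print(store.retrieve("who leads project Alpha"))
store.forget(old)  # e.g. the fact became outdated
print(store.retrieve("who leads project Alpha"))
```

Keeping these three operations behind one interface is what spares each team from reimplementing them, which is the benefit the article attributes to the framework.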
The AWS architecture shows how Mem0 works with Bedrock models and a Neptune database. When an agent receives a query, Mem0 first retrieves relevant memory from the Neptune graph, then injects it into the prompt for the Bedrock model. After the response, new facts are saved back to Neptune as new nodes and edges.
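The retrieve, inject, generate, and save loop described above can be sketched as follows. This is only an illustration under stated assumptions: memory is a plain list of (subject, relation, object) triples rather than a Neptune graph, `fake_model` stands in for a Bedrock model call, and `fake_extract_facts` is a hypothetical extractor that recognises a single sentence pattern. None of these names come from the AWS architecture.

```python
def retrieve_memory(triples, query):
    """Return stored triples whose subject or object appears in the query.

    Naive substring matching, good enough for a toy example only.
    """
    q = query.lower()
    return [t for t in triples if t[0].lower() in q or t[2].lower() in q]


def build_prompt(query, memories):
    """Inject retrieved facts into the prompt ahead of the user's question."""
    lines = [f"- {s} {r} {o}" for s, r, o in memories]
    context = "\n".join(lines) if lines else "- (no stored memory)"
    return f"Known facts:\n{context}\n\nUser: {query}"


def handle_query(triples, query, model, extract_facts):
    memories = retrieve_memory(triples, query)       # 1. retrieve
    prompt = build_prompt(query, memories)           # 2. inject
    response = model(prompt)                         # 3. generate
    for triple in extract_facts(query, response):    # 4. save new facts
        if triple not in triples:
            triples.append(triple)
    return response


# --- Stand-ins (assumptions, not real services) ---
seen_prompts = []


def fake_model(prompt):
    """Placeholder for a model call; records the prompt it was given."""
    seen_prompts.append(prompt)
    return "Acknowledged."


def fake_extract_facts(query, response):
    """Hypothetical extractor: only understands '<X> now leads <Y>.'"""
    words = query.rstrip(".").split()
    if len(words) == 4 and words[1:3] == ["now", "leads"]:
        return [(words[0], "LEADS", words[3])]
    return []


memory = [("Ana", "LEADS", "Alpha")]
handle_query(memory, "Ivan now leads Beta.", fake_model, fake_extract_facts)
handle_query(memory, "What is the status of Beta?", fake_model, fake_extract_facts)
print(seen_prompts[-1])
```

The second prompt contains the fact learned during the first call, which is the cross-session behaviour the architecture is meant to provide.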
Human-in-the-loop validation
A key element of the architecture is human-in-the-loop validation. The agent does not automatically save everything it learns. Instead, human reviewers approve important facts before they become part of permanent memory. This keeps incorrect or low-quality information from contaminating the memory.
This approach is particularly useful in regulated industries such as finance and healthcare, where auditability is mandatory. Each fact in memory has metadata about who validated it and when, making later audits easier.
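A minimal sketch of such a review gate, assuming a simple two-stage design: proposed facts wait in a pending queue, and only approved facts move to permanent memory, stamped with the reviewer and time. The class and field names (`ReviewedMemory`, `validated_by`, `validated_at`) and the sample facts are hypothetical and not taken from the AWS architecture.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(eq=False)  # compare facts by identity, so duplicate texts stay distinct
class Fact:
    text: str
    validated_by: Optional[str] = None
    validated_at: Optional[datetime] = None


class ReviewedMemory:
    """Hypothetical memory in which facts need human approval before persisting."""

    def __init__(self):
        self.pending = []    # proposed by the agent, not yet reviewed
        self.permanent = []  # approved facts with audit metadata

    def propose(self, text):
        fact = Fact(text)
        self.pending.append(fact)
        return fact

    def approve(self, fact, reviewer):
        """Move a fact to permanent memory and record who approved it and when."""
        self.pending.remove(fact)
        fact.validated_by = reviewer
        fact.validated_at = datetime.now(timezone.utc)
        self.permanent.append(fact)

    def reject(self, fact):
        """Discard a proposed fact so it never reaches permanent memory."""
        self.pending.remove(fact)


# Illustrative data only.
memory = ReviewedMemory()
good = memory.propose("Client X renewed the contract")
bad = memory.propose("Client X cancelled the contract")
memory.approve(good, reviewer="maria")
memory.reject(bad)

for fact in memory.permanent:
    print(fact.text, "| validated by", fact.validated_by)
```

Because the reviewer and timestamp are stored on the fact itself, an auditor can later list every permanent fact together with who approved it.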
What this means for enterprise
Combining Bedrock, Neptune, and Mem0 makes it possible to build agents that retain context for weeks or months. This is a prerequisite for real production use cases such as customer service automation, process agents in operations, and specialized agents for legal and financial teams.
Frequently Asked Questions
- Why do AI agents need persistent memory?
- Without memory, agents forget context as soon as a session ends. This means they must re-learn user preferences, project history, and relationships between entities every time. Persistent memory allows an agent to remember what it worked on last week, which employee leads which project, and which decisions have already been made.
- What does Amazon Neptune bring to this architecture?
- Neptune is a graph database that enables complex queries about connected entities. For example, an agent can ask 'which employees worked on projects led by Ivan'. Such a query requires multiple JOIN operations in a relational database, while in a graph it is a natural and fast traversal. The graph structure matches how people remember relationships.
- What is the role of the Mem0 framework?
- Mem0 is an open-source framework specialized in managing AI agent memory. It abstracts the writing, retrieval, and forgetting of information, so developers don't have to manually write logic for each type of memory. It combines with Neptune for long-term storage and Bedrock for response generation.