VentureBeat AI

New agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.

June 26, 2026 • 6 min read

Level: Advanced

For: AI Engineers

✦ TL;DR

Researchers at the National University of Singapore have developed MRAgent, a framework that lets AI agents build up their memory dynamically as evidence accumulates, cutting token consumption and runtime costs. MRAgent organizes its database with a "Cue-Tag-Content" mechanism that supports efficient, scalable active exploration of memory. This avoids a limitation of passive retrieval pipelines, which can fill the LLM's context window with noise and degrade reasoning. The framework uses 118K tokens per query, against 3.26M for LangMem, another agentic memory management approach. That is roughly 96% fewer tokens, which for engineers building AI systems can translate into cost savings and improved performance.

⚡ Key Takeaways

MRAgent uses 118K tokens per query, while LangMem uses 3.26M tokens.
The "Cue-Tag-Content" mechanism is a multi-layered associative graph with three node types: Cues, Tags, and Content. It is a key component of MRAgent's architecture and enables efficient, scalable active exploration of memory.
MRAgent's active memory reconstruction lets it revise its retrieval strategy mid-reasoning and prune irrelevant branches.
MRAgent uses the backbone LLM's reasoning abilities to explore multiple candidate retrieval paths, so it can piece together deeply buried information without filling the LLM's context with noise.
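
To make the three node types concrete, here is a minimal sketch of a Cue-Tag-Content graph. The summary only states that the graph has Cue, Tag, and Content nodes; the layering assumed here (cues point to tags, tags point to content) and every name in the code (`MemoryGraph`, `add`, `tags_for`, `content_for`) are illustrative assumptions, not MRAgent's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MemoryGraph:
    """Toy three-layer associative graph (assumed layout): cue -> tags -> content."""

    cue_to_tags: dict = field(default_factory=lambda: defaultdict(set))
    tag_to_content: dict = field(default_factory=lambda: defaultdict(set))
    content: dict = field(default_factory=dict)

    def add(self, content_id, text, tags, cues):
        """Store one content node and link it to its tags and cues."""
        self.content[content_id] = text
        for tag in tags:
            self.tag_to_content[tag].add(content_id)
            for cue in cues:
                self.cue_to_tags[cue].add(tag)

    def tags_for(self, cue):
        """Return the tags reachable from a cue, sorted for stable output."""
        return sorted(self.cue_to_tags.get(cue, ()))

    def content_for(self, tag):
        """Return the content ids filed under a tag, sorted for stable output."""
        return sorted(self.tag_to_content.get(tag, ()))


graph = MemoryGraph()
graph.add("c1", "Alice moved to Lisbon in March.", tags=["relocation"], cues=["alice", "lisbon"])
graph.add("c2", "Alice adopted a dog named Pixel.", tags=["pets"], cues=["alice"])

print(graph.tags_for("alice"))         # ['pets', 'relocation']
print(graph.content_for("relocation"))  # ['c1']
```

The point of the layering in this sketch is that an agent can inspect short tag labels first and only pull full content for the tags that look relevant, rather than loading every stored memory into context.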

💡 Why It Matters

MRAgent gives engineers building AI systems a more efficient and scalable approach to memory management. At 118K tokens per query against LangMem's 3.26M, it uses roughly 96% fewer tokens, which lowers runtime costs and can improve the performance and cost-effectiveness of AI systems, making them more viable for real-world applications.
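
The size of the gap follows directly from the two reported figures. The short calculation below works it out; the token counts come from the article, while the price of $1.00 per million tokens is a placeholder assumption chosen only to make the arithmetic easy to follow, not a real model price.

```python
# Token counts per query as reported in the article.
MRAGENT_TOKENS_PER_QUERY = 118_000
LANGMEM_TOKENS_PER_QUERY = 3_260_000

# Hypothetical price, for illustration only.
ASSUMED_USD_PER_MILLION_TOKENS = 1.00


def cost_per_query(tokens, usd_per_million_tokens):
    """Convert a per-query token count into a per-query dollar cost."""
    return tokens / 1_000_000 * usd_per_million_tokens


ratio = LANGMEM_TOKENS_PER_QUERY / MRAGENT_TOKENS_PER_QUERY
reduction = 1 - MRAGENT_TOKENS_PER_QUERY / LANGMEM_TOKENS_PER_QUERY

print(f"LangMem uses {ratio:.1f}x the tokens of MRAgent")  # 27.6x
print(f"Token reduction: {reduction:.1%}")                  # 96.4%
print(f"MRAgent: ${cost_per_query(MRAGENT_TOKENS_PER_QUERY, ASSUMED_USD_PER_MILLION_TOKENS):.3f} per query")
print(f"LangMem: ${cost_per_query(LANGMEM_TOKENS_PER_QUERY, ASSUMED_USD_PER_MILLION_TOKENS):.3f} per query")
```

Whatever the real price per token, the ratio between the two per-query costs stays the same as long as both frameworks are billed at the same rate.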

✅ Practical Steps

Implement a "Cue-Tag-Content" style memory structure in your AI system to enable efficient and scalable active exploration of memory.
Apply MRAgent's active memory reconstruction approach so the agent can revise its retrieval strategy mid-reasoning and prune irrelevant branches.
Integrate MRAgent with your backbone LLM so the agent can piece together deeply buried information without filling the LLM's context with noise.
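
The steps above can be sketched as a small exploration loop. This is a toy illustration under stated assumptions, not MRAgent's algorithm: `explore`, `is_relevant`, and `extract_cues` are hypothetical names, and the two callbacks stand in for judgments that the article attributes to the backbone LLM (here they are plain Python functions so the example runs on its own).

```python
from collections import deque


def explore(start_cues, cue_to_tags, tag_to_content, content,
            is_relevant, extract_cues, max_items=3):
    """Walk cue -> tag -> content, pruning tags judged irrelevant.

    Newly read content can surface further cues, which is how this sketch
    models revising the retrieval strategy mid-reasoning.
    """
    queue = deque(start_cues)
    seen_cues, seen_tags, evidence = set(), set(), []
    while queue and len(evidence) < max_items:
        cue = queue.popleft()
        if cue in seen_cues:
            continue
        seen_cues.add(cue)
        for tag in cue_to_tags.get(cue, []):
            if tag in seen_tags:
                continue
            seen_tags.add(tag)
            if not is_relevant(tag):
                continue  # prune this branch without reading its content
            for content_id in tag_to_content.get(tag, []):
                if content_id in evidence:
                    continue
                evidence.append(content_id)
                queue.extend(extract_cues(content[content_id]))
                if len(evidence) >= max_items:
                    return evidence
    return evidence


# Hypothetical memory store.
cue_to_tags = {"alice": ["relocation", "pets"], "lisbon": ["housing"]}
tag_to_content = {"relocation": ["c1"], "pets": ["c2"], "housing": ["c3"]}
content = {
    "c1": "Alice moved to Lisbon in March.",
    "c2": "Alice adopted a dog named Pixel.",
    "c3": "The Lisbon flat is near the river.",
}


def is_relevant(tag):
    # Stand-in for an LLM relevance judgment on a question about housing.
    return tag != "pets"


def extract_cues(text):
    # Stand-in for LLM cue extraction: keep words that are known cues.
    words = (word.strip(".").lower() for word in text.split())
    return [word for word in words if word in cue_to_tags]


print(explore(["alice"], cue_to_tags, tag_to_content, content,
              is_relevant, extract_cues))  # ['c1', 'c3']
```

Starting from the single cue `alice`, the loop skips the `pets` branch, reads `c1`, discovers the new cue `lisbon` in that text, and follows it to `c3`. Only two of the three stored items ever reach the evidence list, which is the context-saving behavior the practical steps describe.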

