Varun Pratap Bhardwaj

SuperLocalMemory V3: Mathematical Foundations for Production-Grade Agent Memory

How information geometry replaced cloud LLMs: 74.8% accuracy on LoCoMo with all data staying local. Fisher-Rao retrieval, sheaf cohomology, and Langevin dynamics explained.

Tags: memory, research, information-geometry, open-source, eu-ai-act

TL;DR: We applied information geometry, algebraic topology, and stochastic dynamics to AI agent memory. 74.8% on LoCoMo with data staying local — the highest score reported without cloud dependency. 87.7% in full-power mode. Open source under MIT.


The Problem Is Scale, Not Storage

Every AI coding assistant — Claude, Cursor, Copilot, ChatGPT — starts every session from scratch. The memory problem has been solved at development scale: Mem0, Zep, Letta, and others provide memory layers that work well for individual developers and small teams.

The unsolved problem is what happens at production scale.

At 10,000 memories, cosine similarity stops discriminating between relevant and irrelevant results. At 100,000 memories, contradictions accumulate silently — "Alice moved to London" and "Alice lives in Paris" coexist without detection. At enterprise scale, hardcoded lifecycle thresholds fail because usage patterns vary across teams, projects, and domains.

And there is a regulatory dimension. The EU AI Act takes full effect August 2, 2026. Every memory system that sends data to cloud LLMs for core operations faces a compliance question that engineering alone cannot resolve — it requires an architectural answer.

We spent the last year applying mathematics to these problems.

Three Mathematical Techniques — Each a First in Agent Memory

1. Fisher-Rao Geodesic Distance (Retrieval)

Standard memory systems use cosine similarity. Cosine treats every embedding as equally confident — a memory accessed once scores identically to one accessed a thousand times, if their directions match.

We model each memory embedding as a diagonal Gaussian distribution with learned mean and variance. The Fisher-Rao geodesic distance — the natural metric on statistical manifolds — measures similarity along the curved surface of the probability space, not through flat Euclidean space.
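For intuition, here is a minimal numerical sketch, not the production implementation. The Fisher-Rao distance between two univariate Gaussians has a known closed form, d = 2√2·artanh(δ), and a diagonal Gaussian can be treated as a product manifold by summing squared per-dimension distances. The embedding vectors and variance values below are made-up illustration data:

```python
import numpy as np

def fisher_rao_gaussian(mu1, s1, mu2, s2):
    """Closed-form Fisher-Rao distance between univariate Gaussians
    N(mu1, s1^2) and N(mu2, s2^2): d = 2*sqrt(2)*artanh(delta)."""
    num = (mu1 - mu2) ** 2 / 2 + (s1 - s2) ** 2
    den = (mu1 - mu2) ** 2 / 2 + (s1 + s2) ** 2
    return 2 * np.sqrt(2) * np.arctanh(np.sqrt(num / den))

def fisher_rao_diag(mu1, s1, mu2, s2):
    """Product-manifold distance for diagonal Gaussians:
    root-sum-square of the per-dimension distances."""
    d = fisher_rao_gaussian(mu1, s1, mu2, s2)  # vectorized per dimension
    return np.sqrt(np.sum(d ** 2))

# Two memories pointing the same way but with different confidence:
# cosine similarity cannot tell them apart, Fisher-Rao can.
query_mu = np.array([0.5, 0.85])
mem_mu = np.array([0.6, 0.8])
d_confident = fisher_rao_diag(query_mu, np.full(2, 0.05), mem_mu, np.full(2, 0.05))
d_uncertain = fisher_rao_diag(query_mu, np.full(2, 0.05), mem_mu, np.full(2, 0.5))
```

A useful sanity check on the closed form: with equal means, the distance reduces to √2·|ln(σ₂/σ₁)|, the classic result for the Gaussian scale family.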

In practice: memories that have been accessed more become more precise. Variance shrinks with repeated access via Bayesian conjugate updates. The system provably improves at finding things the more you use it.
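A sketch of how variance can shrink under repeated access, assuming a standard Normal-Normal conjugate pair with known observation variance (the `obs_var` value here is an illustrative assumption, not a documented parameter of the system):

```python
import numpy as np

def access_update(mu, var, obs, obs_var=0.1):
    """Conjugate Bayesian update: each access is treated as a noisy
    observation, so posterior precision strictly increases."""
    precision = 1.0 / var + 1.0 / obs_var
    new_var = 1.0 / precision
    new_mu = new_var * (mu / var + obs / obs_var)
    return new_mu, new_var

mu, var = 0.0, 1.0          # fresh memory: wide, uncertain
for _ in range(5):           # five accesses with a consistent signal
    mu, var = access_update(mu, var, obs=0.2)
# after n accesses, var = 1 / (1 + n / obs_var): monotone shrinkage
```

Under this model a memory accessed a thousand times is a very narrow Gaussian, so the Fisher-Rao metric naturally weights it differently from a one-off memory in the same direction.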

Ablation: Removing Fisher-Rao drops multi-hop accuracy by 12 percentage points.

2. Sheaf Cohomology (Consistency)

Pairwise contradiction checking is O(n²) and misses transitive contradictions. At enterprise scale, it is both too slow and too weak.

We model the knowledge graph as a cellular sheaf — an algebraic structure from topology that assigns vector spaces to nodes and edges. Computing the first cohomology group H¹(G, F) reveals global inconsistencies from local data:

  • H¹ = 0 → All memories are globally consistent
  • H¹ ≠ 0 → Contradictions exist, even if every local pair looks fine

This scales algebraically, not quadratically. And it catches contradictions that no pairwise method can detect.
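To make the idea concrete, here is a toy sketch using the constant sheaf on a three-node cycle. Each edge carries an asserted difference between node values; every edge is individually satisfiable, yet the cycle is jointly inconsistent, and the obstruction lives in the cokernel of the coboundary map (the H¹ term). This illustrates the linear algebra involved, not SuperLocalMemory's actual sheaf construction:

```python
import numpy as np

# Coboundary delta: C^0 (node values for A, B, C) -> C^1 (edge data),
# identity restriction maps on the 3-cycle A -> B -> C -> A.
delta = np.array([
    [-1.0,  1.0,  0.0],   # edge A -> B records x_B - x_A
    [ 0.0, -1.0,  1.0],   # edge B -> C records x_C - x_B
    [ 1.0,  0.0, -1.0],   # edge C -> A records x_A - x_C
])

def h1_obstruction(b):
    """Residual of edge data b against im(delta); a nonzero residual
    means b admits no globally consistent node assignment."""
    x, *_ = np.linalg.lstsq(delta, b, rcond=None)
    return b - delta @ x

# Each edge alone is fine; only the whole cycle is contradictory.
b_bad = np.array([1.0, 2.0, -4.0])   # sums to -1 around the cycle
b_ok  = np.array([1.0, 2.0, -3.0])   # sums to 0: consistent

contradiction = np.linalg.norm(h1_obstruction(b_bad))  # > 0
consistent    = np.linalg.norm(h1_obstruction(b_ok))   # ~ 0
```

No pairwise check on these three edges would fire; the contradiction only exists transitively, which is exactly what the cohomological residual detects.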

3. Riemannian Langevin Dynamics (Lifecycle)

Memory lifecycle management in current systems means hardcoded thresholds: "archive after 30 days," "promote after 10 accesses." These thresholds are tuned for average workloads and fail on everything else.

We replace thresholds with stochastic gradient flow on the Poincaré ball. The potential function encodes access frequency, trust score, and recency. The dynamics provably converge to a stationary distribution — the mathematically optimal allocation of memories across lifecycle states (Active → Warm → Cold → Archived).

No manual tuning. The system self-organizes based on actual usage patterns.
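As a simplified illustration, Euclidean rather than the Poincaré ball and with a made-up quadratic potential, an Euler-Maruyama discretization of overdamped Langevin dynamics sorts memories along a lifecycle coordinate by usage, with no threshold anywhere:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_U(x, usage):
    """Gradient of a toy potential U(x) = (x - (1 - usage))^2 / 2:
    heavily used memories have their well near 0 (hot), stale ones near 1."""
    return x - (1.0 - usage)

def langevin_step(x, usage, eta=0.01, temp=0.05):
    """One Euler-Maruyama step of overdamped Langevin dynamics."""
    noise = rng.normal(size=x.shape)
    return x - eta * grad_U(x, usage) + np.sqrt(2 * eta * temp) * noise

usage = np.array([0.9, 0.1])   # one hot memory, one stale memory
x = np.full(2, 0.5)            # both start mid-lifecycle

samples = []
for t in range(20000):
    x = langevin_step(x, usage)
    if t >= 10000:             # discard burn-in, keep stationary samples
        samples.append(x.copy())

mean = np.mean(samples, axis=0)
# stationary density ~ exp(-U/temp): Gaussian around 1 - usage
```

The point of the stochastic term is that the long-run state distribution follows the Gibbs measure of the potential, so lifecycle allocation tracks actual usage rather than a hand-tuned cutoff.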

Results

Evaluated on the LoCoMo benchmark (Long Conversation Memory):

| Configuration | Score | What It Means |
|:---|:---:|:---|
| Mode A Retrieval | 74.8% | Data stays on your machine. Highest local-first score. |
| Mode C (Full Power) | 87.7% | Cloud LLM at every layer. Comparable to industry systems. |
| Mode A Raw (zero-LLM) | 60.4% | No LLM at any stage. First in the field. |

For context — the competitive landscape:

| System | Score | Cloud LLM Required |
|:---|:---:|:---:|
| EverMemOS | 92.3% | Yes |
| MemMachine | 91.7% | Yes |
| Hindsight | 89.6% | Yes |
| SLM V3 Mode C | 87.7% | Yes (synthesis only) |
| Zep | ~85% | Yes |
| SLM V3 Mode A | 74.8% | No |
| Mem0 | ~58–66% | Yes |
| SLM V3 Mode A Raw | 60.4% | No (zero-LLM) |

The gap between Mode A Raw (60.4%) and Mode A Retrieval (74.8%) demonstrates that the four-channel mathematical retrieval pipeline captures most benchmark value without any cloud dependency. The remaining gap between 74.8% and 87.7% is answer synthesis quality — not knowledge retrieval.

Three Operating Modes

V3 offers a privacy-accuracy spectrum:

Mode A: Local Guardian — All processing local. No cloud calls. EU AI Act compliant by architecture. 74.8% on LoCoMo.

Mode B: Smart Local — Mode A + local LLM via Ollama. Still fully private. No data leaves your machine.

Mode C: Full Power — Cloud LLM at every layer. 87.7% on LoCoMo. This is the configuration comparable to other memory systems. Data leaves the machine for processing.

The choice is yours. Switch anytime. Your memories stay consistent across all modes.

Getting Started

npm install -g superlocalmemory
slm setup
slm warmup    # Optional: pre-download embedding model
slm dashboard # 17-tab web dashboard at localhost:8765

Works with 17+ AI tools: Claude Code, Cursor, VS Code Copilot, Windsurf, ChatGPT Desktop, Gemini CLI, JetBrains, Zed, Continue, Cody, and more.

What We Believe

Current memory systems are impressive engineering. Every system in the competitive table represents meaningful work solving real problems for real users.

Our contribution is mathematical. We believe the future of agent memory is not more heuristics, but principled mathematics — techniques that provide guarantees, scale predictably, and can be adopted by any system.

The three techniques in V3 (Fisher-Rao, sheaf cohomology, Langevin dynamics) are not specific to our product. They are mathematical tools. We open-sourced everything under MIT because we believe the entire field benefits from mathematical foundations.

If these techniques make other memory systems better, we have succeeded.


Paper: arXiv:2603.14588
Code: github.com/qualixar/superlocalmemory
Website: superlocalmemory.com

Part of Qualixar | Varun Pratap Bhardwaj — Independent Researcher
