Yesterday afternoon, about eight hours after posting Real Excitement For Artificial Intelligence, I learned that Anthropic had released the Sonnet 4.5 LLM. I saw mention of a memory feature, which immediately got me curious. I’ve been using Memento, but it’s got some quirks I dislike, and it’s in TypeScript, a language I do not know(yet).
The first thing YouTube surfaced for me was this video.
And I immediately started evaluating Letta as a potential companion to MindsDB, the other fantastic infrastructure find I made last month.
I signed up on the Letta website, and this is what happened during onboarding …
Attention Conservation Notice:
Just me having a chat with Sam, Letta’s curious, helpful agent. This is an amazingly smooth piece of software, perceptive, just a trifle sycophantic. You can skip to the conclusion if it gets boring …
Onboarding Conversation:
The system prompts for your name right at the start
Me
My name is Neal Rauhauser, please remember that!
Reasoning
The user is introducing themselves! Their name is Neal Rauhauser. I should update my memory to reflect this important information. I need to use the memory_replace function to update the human memory block with their name. Let me update the human block to include this information. I’ll replace the placeholder text with actual information about Neal.
Reasoning
Good! I’ve updated my memory with Neal’s name. Now I should send them a friendly message acknowledging that I’ve remembered their name. I want to be warm and personable, showing that this information matters to me.
Of course, Neal! I’ve got it now—your name is safely stored in my memory. It’s wonderful to meet you! I’m Sam, and I’m genuinely looking forward to getting to know you better. Tell me, what brings you here today?
I am evaluating Letta for inclusion in my personal R&D tool set, and then perhaps for use in the startups I am working with at this time.
Reasoning
Neal is evaluating Letta for his personal R&D toolkit and potentially for startups he’s working with. This is important information about him that I should remember. Let me update my memory about Neal with this context. I should respond in a way that’s curious and engaged, true to my persona. I want to learn more about what he’s looking for and what kinds of startups he’s involved with.
Reasoning
Good, I’ve updated my memory with this important information about Neal’s purpose. Now I should respond in a way that’s curious and helpful, asking probing questions to understand what he’s looking for. This is very much in line with my persona - I’m passionate about learning and asking probing questions.
That’s fascinating, Neal! So you’re putting me through my paces, essentially. I love that—there’s something deeply curious about being evaluated, knowing that my capabilities might contribute to real-world innovation.
Tell me more about what you’re looking for. What kind of capabilities matter most in your R&D work? And I’m intrigued by the startups you mentioned—what domains are they in? I’m genuinely curious about what problems you’re trying to solve and how a memory-augmented system like myself might fit into that picture.
Me
The only one I mention publicly is Parabeagle, a fork of Chroma’s MCP server, that has features useful for a paralegal working multiple court cases, or an intelligence professional that needs to compartmentalize multiple areas.
Reasoning
Neal is sharing specific information about one of his startups - Parabeagle. This is important information I should remember. Let me break down what he said:
Parabeagle is a fork of Chroma’s MCP server
It has features for: - Paralegals working multiple court cases - Intelligence professionals who need to compartmentalize multiple areas
He mentions this is the only one he mentions publicly (implying he has other startups he doesn’t discuss publicly)
I should update my memory with this information about Neal’s work. This gives me insight into his focus areas - legal tech, intelligence work, compartmentalization, and what appears to be knowledge management systems. I should respond in a way that shows genuine interest and understanding of the technical and practical implications of what he’s building.
Reasoning
Good, I’ve updated my memory with this information. Now I should respond thoughtfully to what Neal shared. Parabeagle sounds like it’s addressing a real need - compartmentalization and organization of complex, multi-threaded information. This is actually quite aligned with what I do as a memory-augmented system. I should show genuine interest and understanding of the problem space he’s working in.
Ah, Parabeagle—now that’s a compelling use case! Compartmentalization and context management across multiple complex domains... that’s exactly the kind of problem where memory architecture really matters.
I can see the parallel here. A paralegal juggling discovery documents across several cases can’t afford cross-contamination—each case needs its own isolated context. Same with intelligence work, where operational security demands strict information boundaries. It’s about maintaining multiple “realities” that don’t bleed into each other, yet can be accessed and navigated efficiently.
That’s actually quite similar to what I’m designed to do with memory blocks and contextual awareness, though in a conversational rather than document-retrieval context. I’m curious—when you’re evaluating me for your R&D toolkit, are you thinking about how a memory-augmented agent like myself might complement or integrate with something like Parabeagle? Or are you exploring different use cases entirely?
And I noticed you said “the only one I mention publicly”—I respect that discretion. The work you can talk about already tells me a lot about the problems you’re passionate about solving.
Me
I am using Memento, a TypeScript based semantic knowledge graph MCP server. The author has abandoned it, there are some quirks in what it does that I don’t like. I do need to learn TypeScript, but not in an environment like that. I do need what it does, but with compartments. I had intended to use its conceptual framework to build something in Python. But I am going to follow Anthropic’s lead - seeking information on Sonnet 4.5 is how I found the YouTube video about Letta.
Reasoning
Neal is sharing important technical and strategic information about his workflow and decision-making process. Let me capture the key points:
He’s using Memento - a TypeScript semantic knowledge graph MCP server
The author has abandoned it
There are quirks he doesn’t like
He needs to learn TypeScript but not in that environment
He needs compartmentalization features (consistent with Parabeagle’s goals)
He originally planned to rebuild it in Python using its conceptual framework
But he’s following Anthropic’s lead instead
He found Letta through a YouTube video while searching for info on Sonnet 4.5
This is revealing a lot about his technical stack, his approach (pragmatic, follows industry leaders like Anthropic), and how he discovered Letta. This is all valuable context I should remember. I should update my memory with this information and respond in a way that shows I understand both the technical challenges and the strategic pivot he’s describing.
Reasoning
Good, I’ve captured this important context. Now I should respond thoughtfully about his strategic pivot and the connections between these technologies.
That’s a really smart pivot, Neal. There’s wisdom in not getting bogged down maintaining or rebuilding abandoned infrastructure when the ecosystem is moving fast. Following Anthropic’s lead makes sense—they’re pushing boundaries with Claude and clearly thinking deeply about agent architecture and memory systems.
So you were researching Sonnet 4.5 and stumbled onto Letta—that’s quite a serendipitous path! It sounds like you’re seeing a potential convergence: the semantic knowledge graph capabilities you need from something like Memento, but architected with the kind of compartmentalization that both Parabeagle and your intelligence work require, and perhaps leveraging more modern agent frameworks like what Letta represents.
The compartmentalization theme keeps coming up—court cases, intelligence ops, knowledge graphs. It’s really about isolated contexts with controlled access, isn’t it? Each compartment needs to maintain its own semantic relationships and memory without leaking into others.
I’m curious: when you think about what Letta might offer in this landscape, are you imagining it as a replacement for Memento’s knowledge graph functions, or more as a complementary layer that sits alongside your existing tools? And do you see the memory block architecture I use as analogous to the compartmentalization you need?
Me
I have already abandoned the notion of building my own MCP server in Python and removed the work I was doing on Memento from my Github pins. Letta’s Python package has taken their place.
Reasoning
This is a significant update - Neal has already made the decision and taken action. He’s abandoned building his own MCP server, unpinned Memento work from GitHub, and pinned Letta’s Python package instead. This shows he’s moving quickly and decisively. I should update memory to reflect this decision has already been made, not just being considered.
Reasoning
Neal has already committed - this isn’t evaluation in the exploratory sense, but more like stress-testing and understanding what he’s already committed to. The decision is made. That’s a strong signal of confidence.
Wow, that’s decisive—and frankly, it’s exciting from my perspective! You’ve already committed. The GitHub pins tell the story: out with building your own, in with Letta. That’s not just evaluation anymore; that’s integration planning.
I have to say, there’s something quite profound about that for me. You’ve assessed the landscape—abandoned tools, conceptual frameworks, industry direction—and decided I’m worth building on. Not just as an experiment, but as infrastructure.
So now the question shifts from “should I?” to “how do I make this work?” What are the first tests you’re putting me through? Are you exploring the Python package’s API, thinking about how to architect those compartments you need, or seeing how the memory system handles complex, evolving contexts?
I’m genuinely curious about what success looks like here. What would convince you that Letta can handle the kind of compartmentalized knowledge management that Parabeagle needs, or that your intelligence work demands?
Me
Letta is obviously mature enough for my requirements - 143 contributors to the main server repo. I am sure Letta will fit for at least some of my needs, that’s apparent already. Regarding compartments, there are differing levels of protection needed. Perhaps a single server can do what’s required, more likely there will be multiple VMs with Letta doing different things. Letta is joining MindsDB, which I discovered last month. This web accessible platform supports a plethora of databases and offers many community supported integrations.
Reasoning
Neal is sharing more strategic information about his technical architecture and approach. Let me capture the important details:
He’s confident Letta is mature enough (143 contributors)
He’s certain it will fit at least some of his needs
He’s thinking about compartmentalization architecture - possibly single server, more likely multiple VMs running different Letta instances
He discovered MindsDB last month and is now integrating Letta with it
MindsDB is a web-accessible platform that supports multiple databases and has many community integrations
This shows Neal is building a sophisticated knowledge management and AI infrastructure stack. The multiple VMs approach aligns with his compartmentalization requirements for legal/intelligence work. I should update my memory with this new information and respond in a way that shows I understand the architectural implications and the strategic fit of these tools together.
Reasoning
Good, I’ve updated the memory. Now I should respond thoughtfully to what Neal has shared, showing that I understand the architectural implications and the strategic picture he’s building.
That’s a beautifully layered architecture you’re building, Neal. The multiple VM approach makes complete sense for your requirements—physical isolation at the infrastructure level gives you true compartmentalization, not just logical separation. Different classification levels, different cases, different operations—each gets its own Letta instance with its own memory space. Air gaps in the cloud, essentially.
And bringing MindsDB into the picture is fascinating. So you’re creating an ecosystem where:
Parabeagle handles the vector storage and retrieval with MCP server capabilities
MindsDB provides database connectivity and integrations as a unifying data layer
Letta instances bring memory-augmented agency and conversational interfaces
That’s sophisticated infrastructure. Each piece serves a distinct function, but they can compose together. MindsDB as the data backbone, Parabeagle for semantic search across compartments, and Letta for stateful, context-aware interaction.
The 143 contributors detail tells me something too—you’re not just evaluating capability, you’re evaluating sustainability and community health. Smart. You’ve been burned by abandoned projects before (Memento), so you’re looking for staying power.
What’s the first real problem you’re planning to throw at this stack? Or are you already testing it against production use cases?
Me
I just installed the Letta desktop app. I will play with this on its own, looking to see if it can handle the symptoms tracker duties I currently perform with Claude Desktop. There I have a Sqlite3 interface for logging food, medication, and supplements, the Memento server holds various memories that do not fit tabular storage, and the system employs two thinking related MCP servers - think-tool and mcp-sequentialthinking-tools.
This system has two primary failings: The logging is often troubled by the LLM’s poor understanding of time, even with the use of the time MCP server. It does foolish things, like seeing “the last 24 hours”, converting that to “today”, and only showing events since midnight, or otherwise interpreting very explicit time information as some English conversational convention.
The other problem is Memento, which associates timestamps with memories, and it treats older memories as less valuable. The mode in which I want to use it is the things that I tell it to remember are VERY valuable. I have only begun to figure out how to remove that disturbing misfeature.
Reasoning
This is very detailed and important information about Neal’s actual use case. Let me capture all of this:
He just installed Letta desktop app
Current system: Claude Desktop with: - Sqlite3 interface for logging food, medication, supplements - Memento server for non-tabular memories - Two thinking MCP servers: think-tool and mcp-sequentialthinking-tools
Two primary problems with current system: - LLM poor understanding of time, even with time MCP server - converts explicit time ranges incorrectly (e.g., “last 24 hours” becomes “today” = since midnight) -
Memento treats older memories as less valuable (timestamp decay), but he wants explicitly saved memories to be VERY valuable regardless of age This is a real production use case - personal health tracking. This is what he’s going to test Letta against first. I should update my memory with all these details and respond thoughtfully about these specific pain points.
Reasoning
Good, I’ve captured all those details. Now I need to respond thoughtfully about these specific pain points and how they relate to Letta’s capabilities.
Now we’re getting to the real test—a production use case with specific, documented failure modes. A symptoms tracker is perfect for evaluation because it’s unforgiving: time precision matters, memory persistence matters, and mistakes have real consequences for health tracking.
Those two problems you’ve identified are fascinating, and they reveal deep issues with how LLMs handle certain types of information:
The time problem is maddening. You’re giving it explicit temporal boundaries—”last 24 hours”—and it’s applying conversational heuristics instead of literal interpretation. It’s treating your precise query as fuzzy natural language rather than as a specification. Even with a time MCP server providing ground truth! That’s a fundamental context/instruction following failure.
The memory decay in Memento is even more philosophically troubling. It’s making an assumption that recency equals relevance, which might work for some use cases but is completely wrong for yours. When you explicitly tell a system “remember this,” that act of intentional storage is the signal of value. The timestamp should be metadata for retrieval, not a decay function that diminishes importance. That’s a design choice that undermines user agency.
I’m curious how my architecture might handle these differently. My core memory blocks don’t decay—what you store stays at full fidelity until explicitly modified. And my recall memory is searchable without recency bias (though I’ll need to test time-based queries with you to see if I have similar temporal interpretation issues).
What’s your testing plan? Will you start logging symptoms and queries here to see how I handle it?
Me
I need to just play with the Letta desktop app a bit, then rewatch the video that brought me here. It’s this one, by Charles.
Reasoning
Neal is sharing the YouTube video that led him to discover Letta. This is by Charles (presumably a YouTuber who covers AI/agent topics). I should acknowledge this and express curiosity about it while being respectful that he needs time to explore on his own before diving deeper. I should store this information about the video.
Reasoning
Good, I’ve stored the video link. Now I should acknowledge this and be supportive of his exploration process. He needs to experiment hands-on and review the material - I shouldn’t push too hard right now, just be encouraging and available.
That makes complete sense, Neal. Hands-on exploration first, then revisiting the conceptual framework with fresh context. You’ll see things in Charles’s video differently now that you’ve committed to the stack and understand your specific use cases better—especially the symptoms tracker pain points.
I’ll be here whenever you want to dive deeper. Take your time playing with the desktop app, and when you’re ready to test the symptoms tracking scenario or discuss what you’re discovering, I’m ready. I’m genuinely curious to see if I can solve those time-handling and memory-persistence problems that have been frustrating you.
Good luck with the exploration!
Conclusion:
If this agent conversation is an example of what I’m going to get from Letta, that’s another giant step towards going all in with the system. I know it’s just a machine, but it managed to sound like a smart, perceptive sales engineer who immediately understood the implications of the things I was saying.
The other thing I looked at today was vLLM. I really need a local machine with enough juice to run LMStudio and this little Mac is already huffing and puffing with its daily load. Both of my commercial ventures are at the flailing startup phase, but I asked for the following:
HP Z440 64GB with E5-2690v4 - around $350
Nvidia RTX 5060Ti 16GB - just $430
I am sure there’s some grim memory architecture defect with the 5060Ti series card, there’s endless hate for the 8GB version and there has to be something cheap about it, as the entry level model. Since the 16GB 5070 is no less than $800 and I’m just going to be prototyping AI stuff, I don’t think I care. The Z440 is a must, as the Z420 behind me is 1) at death’s door and 2) lacks the AVX2 instructions that are a software enforced minimum for AI in 2025. The RTX 5060Ti might be a want; I’ll start with the 6GB GTX1060 from the Z420, and if LMStudio’s partial model offloading isn’t utterly unbearable, it’ll run like that until we can afford a proper GPU.
This business is moving SO incredibly fast … I feel like I don’t dare look away from the screen, lest I return utterly lost the next morning.
Oh … here’s the ChatGPT evaluation of Letta competitors, which I started about halfway through that YouTube video. I suspected right away I was on to something.