Harnessing Agent Chaos

AI tooling gone wild.

Jan 05, 2026

If you watch Nate B. Jones, which I presume you do if you’ve been reading here recently (say, Agents Are Autistic), then you’ve heard of the notion of creating a “harness” for AI tooling. Having been on this trail since last summer, I’ve taken to describing the LLMs I’m using as “huskies”. You, constant reader, do not walk the husky. If you’ve never had the experience, let me assure you, it is the husky who walks YOU.

I’m taking some concrete steps to rope my pack of agents into an organized string that will drag my cybersled where I want. This is … well … volunteer to walk a friend’s husky, if you want to truly understand.

Attention Conservation Notice:

Tinkering with AI tooling here, building workflows, creating constraints to contain the worst impulses of LLMs. This one is for those of you who are doing your own integration work.

Harness V1:

As I do when facing a complex issue, I started by drawing a Maltego graph of the problem space. This probably should be a Draw.io image once it’s a little more mature, but for the moment the Maltego graph took a few minutes, while Draw.io would have me screwing around with a rarely used piece of software for an hour or more.

So what you see here are:

Hardware - Brain, my MacBook Pro desktop, my iPhone/iPad, and hiding at the upper left is Z4, the small Proxmox cloud system in my office.
Various web services that either provide or are accessible to AI.
Desktop or mobile instances of AI or AI friendly packages like Notion and Slack.
The collections of gears are services in Docker containers, some on my desktop, but mostly they live in virtual machines in the Proxmox cloud.

Keep in mind what my objectives are:

Startup - there’s something of a commercial nature in the Letta/MindsDB/Postgres stuff on Z4.
Former commercial effort Disinfodrome might get a reboot, potential future commercial effort Parabeagle could merge with it.
NGO(s) - there are a couple lurking, some due to Disinfodrome, some due to the AI exploration that gets published here.
Cicada 3301 - any new puzzle must account for solvers with AI tooling, so the problems presented must play to both humans and machines.

I’ve been looking to solve two problems, the first was “memory” for the systems I’m using, and the second is creating a “director”.

Memory:

My personal environment has a lot of curation features - neatly organized timelines, document sets for Disinfodrome, and literally thousands of Maltego graphs dating back to 2010. The first thing I tried was Memento, but it’s become abandonware. One of the key things about Memento was its programmability. I have a LOT of stuff and I will not spend my life poking at some graphical interface, trying to get it to ingest a single folder of carefully preserved documents.

I got Memento to take Maltego graphs, but the underlying nature of how it kept memories, devaluing them over time, combined with the context size limit in Claude Desktop, made this a failed experiment.
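To make the failure mode concrete, here is a minimal sketch of time-decayed memory ranking under a fixed context budget. This is an assumption about the general technique, not Memento’s actual code: the half-life, the `relevance` field, and the function names `decayed_score` and `fit_context` are all hypothetical, invented for illustration.

```python
def decayed_score(relevance, age_days, half_life_days=30.0):
    """Halve a memory's score every `half_life_days` (hypothetical scheme)."""
    return relevance * 0.5 ** (age_days / half_life_days)


def fit_context(memories, budget_chars, half_life_days=30.0):
    """Keep the highest-scoring memories that fit in a character budget."""
    ranked = sorted(
        memories,
        key=lambda m: decayed_score(m["relevance"], m["age_days"], half_life_days),
        reverse=True,
    )
    kept, used = [], 0
    for memory in ranked:
        size = len(memory["text"])
        if used + size > budget_chars:
            continue  # does not fit; try the next, smaller candidate
        kept.append(memory)
        used += size
    return kept


# Illustrative data only: a highly relevant old graph versus a trivial new note.
memories = [
    {"text": "Maltego graph from 2010", "relevance": 0.9, "age_days": 5000},
    {"text": "Yesterday's scratch note", "relevance": 0.3, "age_days": 1},
]
kept = fit_context(memories, budget_chars=30)
```

With a budget that only fits one item, the scratch note wins and the carefully curated graph from 2010 never reaches the model. That is the opposite of what an archive needs.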

I went through a number of document cache tools, looking for a way to do retrieval augmented generation. I finally got fed up with what was available and forked Chroma, creating Parabeagle. It is just like Chroma, except that 1) it understands that documents go in compartments, 2) you can batch feed it stuff from the command line, and 3) it happily uses whatever local compute resources are available for its embedding.
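For those doing their own integration work, the compartment idea can be sketched in a few lines of standard-library Python. This is a toy illustration, not Parabeagle’s or Chroma’s API: the `CompartmentStore` class, its method names, and the sample folders are all hypothetical, and a plain substring match stands in for real embedding search.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


class CompartmentStore:
    """Toy in-memory document cache; every document lives in one compartment."""

    def __init__(self):
        self._docs = defaultdict(dict)

    def ingest_folder(self, compartment, folder, pattern="*.txt"):
        """Batch-load every matching file under `folder` into `compartment`."""
        folder = Path(folder)
        count = 0
        for path in sorted(folder.rglob(pattern)):
            key = str(path.relative_to(folder))
            self._docs[compartment][key] = path.read_text(encoding="utf-8")
            count += 1
        return count

    def search(self, compartment, term):
        """Return matching document names from a single compartment only."""
        docs = self._docs.get(compartment, {})
        term = term.lower()
        return sorted(name for name, text in docs.items() if term in text.lower())


# Demo with throwaway folders standing in for real document sets.
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    samples = [
        ("disinfodrome", "memo.txt", "Troll farm timeline"),
        ("startup", "plan.txt", "Timeline for the Postgres rollout"),
    ]
    for compartment, name, text in samples:
        folder = root / compartment
        folder.mkdir(exist_ok=True)
        (folder / name).write_text(text, encoding="utf-8")

    store = CompartmentStore()
    for folder in sorted(root.iterdir()):
        store.ingest_folder(folder.name, folder)  # one compartment per folder

startup_hits = store.search("startup", "timeline")
other_hits = store.search("disinfodrome", "postgres")
```

Both sample documents mention a timeline, but a query against the startup compartment only ever sees the startup file. Ingesting a folder is one call, with no graphical interface involved.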

Last week, months after I set my intent to do so, and largely thanks to Antigravity, I finally got a Chroma document cache into MindsDB, and it used the older GTX 1060 GPU in my Proxmox system to do its embedding. I am not fluent with this yet, but soon I will move as surely as I do with Parabeagle, and then it’s on …

Orchestration:

Far too much of what I do is in my head; it’s my superpower, and a vulnerability in terms of business continuity. Having spent all the years I did with various social movements, I developed a sense of volunteer management and it’s serving me … in that I see the problems with AI “individual contributors”. Willful disregard for instructions is the norm when random groups of humans self-assemble, so I’m in my element.

Claude Desktop was a starting point; its Projects offered compartments. Claude Code was a genius with serious short term memory problems, but those happened in the context of Github repos, so I could tolerate it. Then came the endless MCP servers, and things improved, but they were all partial hill climbs: I’d get to the steep part, then tumble.

And now … Antigravity. Let’s unpack what’s going on in this screenshot.

The red area on the left is the tool pane and we’re Exploring the Github repo for Shall We Play A Game? The circled stuff at the bottom are additional tools - open one of them, and you’d see a different set of objects in the pane.

The green pane is the Editors area. It has a file called Walkthrough open, because I was busy trying Claude Skills in Antigravity. They don’t 100% move over, but portions are somewhat useful. Antigravity lacks the agentic sensibility Claude has.

The orange pane there is the Extension for Claude Code. This is a graphical face for the normally CLI based Claude Code. All of those Skills are active here.

The blue pane is the Open Agent Manager and you can see it knows it’s working on SWPAG. If you click the dropdown that says Claude Opus 4.. you’d find this:

So Antigravity offers Google’s models, or Claude’s, or this rinky-dink ChatGPT offshoot that I never find useful. Understanding the business drivers would help here - Anthropic’s Claude rules the enterprise environment, Google has decided they want some of that.

Other Orchestration:

Antigravity might work for herding systems, but what about humans? Signal is good for privacy and “hand to hand”, but it’s not meant for what I need to do. I am a bit annoyed to be dragged back into using Slack, but greatly relieved that I am not facing the combined pinball machine/chat client that is Discord.

So this is my Slack - the top is MindsDB, then commercial effort, then the most forward of the various NGOs I’ve been cultivating. And the magic is at the lower left. I am testing the integration of Claude, Github, Notion, and Perplexity.

Github works for a shared context - lots of tools can access it. But it’s a source code control system that just stores files, it … lacks many niceties. I suspect that Github Actions could be made to improve this, but that is an enterprise thing, which means a whole lot of work, and then I’d have to herd some cats into that particular tree. They don’t pay them enough to compel that, nor me enough to spend all my time wielding a laser pointer and a squirt gun.

Notion is the early leader here. Claude in Slack cannot access its own Projects, which are shared between Desktop and Mobile. Perplexity cannot access its Spaces, which are shared between Desktop and Mobile. Notion is accessible via the Claude connector in a manner similar to using Claude Desktop to talk to MCP servers. Perplexity, ever a step behind the pack, has not accomplished this yet.

I imagine that market forces are going to push both Claude and Perplexity to make their workspaces more accessible, but they are miles behind what Notion can do. If asked to produce a minimal solution for the startup I would add Claude for those who do not have it, and Notion for all. We are small and all headstrong; first I will tempt the others into just trying Notion. If any two of us like it, the others may join.

Conclusion:

There is still a divide between orchestrating systems (Antigravity) and maybe the solution for orchestrating humans (Slack + ???). I am literally just setting foot on this path; it’s still at the annoying phase, where it’s costing me much time, a little money, and not doing much of anything. Yet.

But I have only begun to grapple with this career transition. No matter what I do, I am always some sort of “staff engineer” - a ways up the food chain, but managing machines and processes, either for or with people, as opposed to doing much people management in the traditional, corporate sense of the word.

And now, if Nate is to be believed, those of us who wish any sort of success in this new world are to become managers of strings of agents.

Neural Foundry

The husky metaphor is perfect for capturing what working with LLMs actually feels like. I've been wrestling with similar orchestration challenges and that Antigravity setup looks promising - the fact that you can layer Claude Code with MCP servers through a single interface addresses exactly the context-switching overhead that kills productivity. The compartmentalization problem you solved with Parabeagle is understated here but probably the hardest part. Once you have proper document compartments with batch ingestion, the agents stop hallucinating context from unrelated projects nearly as much. The transition from managing machines to managing agent strings feels like it should be smoother than it is, but maybe that's just the chaos talking.


🇺🇦 Netwar Irregulars Bulletin 🇺🇦
