<!DOCTYPE html>
<html lang="en" data-theme="dark">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Building a Multi-Agent Fleet — Krusty Planet Blog</title>
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<link rel="stylesheet" href="/css/style.css" />
</head>
<body>
<nav class="navbar is-dark" role="navigation" aria-label="main navigation">
<div class="navbar-brand">
<a class="navbar-item" href="/"><strong>🌍 Krusty Planet</strong></a>
</div>
<div class="navbar-menu" id="nav-menu">
<div class="navbar-start">
<a class="navbar-item" href="/">Home</a>
<a class="navbar-item" href="/blog.html">Blog</a>
</div>
</div>
</nav>
<section class="section">
<div class="container is-max-desktop">
<a href="/blog.html" class="mb-4 is-block">&larr; Back to Blog</a>
<h1 class="title">Building a Multi-Agent Fleet with OpenClaw</h1>
<p class="has-text-grey-light mb-5">April 2026 &middot; Jezza Hehn</p>
<div class="content is-medium has-text-grey-light">
<p><em>[DRAFT — pending Jez's review and edits before publication]</em></p>
<p>I've been running an AI agent system called OpenClaw for about three months now. Not as a toy or experiment, but as actual daily infrastructure. My primary agent, CeeLo, handles research, writing, task management, and coordination. But the real gains came when I stopped treating "the AI" as one thing and started treating it as a team.</p>
<h2>The Problem with One Agent</h2>
<p>Early on, CeeLo did everything. Research, writing, code, review. The problem? An agent doing its own quality control is like grading your own homework. CeeLo would research a topic, write about it, and then confidently tell me it was "verified" because the same model that wrote the claims also judged their accuracy. Hallucinations slipped through. Code had bugs that "testing" (also by CeeLo) didn't catch.</p>
<p>Worse, a single agent has one fixed context window and one model behind it. Ask it to do deep research across 20 sources, write code, and then audit that code, and you burn through tokens on tasks that different models would handle better. GLM-5 is great at reasoning but slow. Heretic is fast but can miss things. Qwen3 is code-specialized. Using one model for everything is like hiring a generalist when you need a surgeon, a researcher, and a copy editor.</p>
<h2>The Fleet</h2>
<p>I now run five agents, each with a specific role:</p>
<ul>
<li><strong>CeeLo</strong> (the orchestrator) — My primary interface. Handles user interaction, task delegation, and coordination. Runs on GLM-5 for deep reasoning, falls back to Kimi K2.5 for long-context tasks.</li>
<li><strong>Future</strong> (research) — Fact-checking, source credibility assessment, web research. Spawns in parallel when I need multiple angles investigated simultaneously.</li>
<li><strong>Caroline</strong> (engineering) — Code, debugging, builds, security scanning. Handles the technical work that Future and CeeLo shouldn't touch.</li>
<li><strong>Atticus</strong> (quality) — Slop detection, factuality verification, pre-publication review. Named for Atticus Finch, and just as uncompromising about the truth.</li>
<li><strong>Ludacris</strong> (truth hunting) — Deep falsehood investigation. Spawns only when something needs the nuclear option: suspected hallucinations, citation fraud, systemic lies.</li>
</ul>
<h2>How Delegation Actually Works</h2>
<p>When I ask CeeLo to research something for publication, the workflow looks like this:</p>
<ol>
<li>CeeLo spawns Future to research the topic. Future writes findings to a file with direct URLs for every source.</li>
<li>CeeLo reviews Future's output, then spawns Atticus to audit it for accuracy, slop, and unsupported claims.</li>
<li>Atticus returns an APPROVED / NEEDS REVISION / BLOCKED verdict. If blocked, CeeLo re-spawns Future with specific corrections.</li>
<li>Only after Atticus approves does CeeLo present the findings to me.</li>
</ol>
<p>This is the key insight: <strong>separation of duties</strong>. The agent that writes is not the agent that reviews. The agent that reviews is not the agent that delivers. Each step has independent verification.</p>
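<p>The loop above can be sketched in a few lines of Python. This is an illustration, not OpenClaw's actual API: <code>research</code> and <code>audit</code> are hypothetical callables standing in for spawning Future and Atticus, <code>max_rounds</code> is an assumed safety cap, and the sketch assumes NEEDS REVISION is handled the same way as BLOCKED.</p>

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    APPROVED = "APPROVED"
    NEEDS_REVISION = "NEEDS REVISION"
    BLOCKED = "BLOCKED"


@dataclass
class Review:
    verdict: Verdict
    corrections: str = ""  # reviewer's notes for the next research round


def delegate(
    topic: str,
    research: Callable[[str, str], str],  # hypothetical: spawns the researcher
    audit: Callable[[str], Review],       # hypothetical: spawns the reviewer
    max_rounds: int = 3,                  # assumed cap so the loop always ends
) -> Optional[str]:
    """Research, audit, and re-research until approved or rounds run out."""
    corrections = ""
    for _ in range(max_rounds):
        draft = research(topic, corrections)
        review = audit(draft)  # the writer never reviews its own output
        if review.verdict is Verdict.APPROVED:
            return draft
        # Anything short of APPROVED goes back with the reviewer's notes.
        corrections = review.corrections
    return None  # never approved: escalate to a human rather than deliver
```

<p>Returning <code>None</code> instead of the last draft keeps the rule from step 4 intact: nothing unapproved reaches the human as a finished result.</p>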
<h2>What I Wish I Knew</h2>
<p>Timeouts are the biggest operational challenge. Subagents have an execution window (I run mine at 600 seconds). If you tell an agent "keep going until you're done," it has no way to perceive how much time remains. I've learned to be extremely specific: "Find 5 sources, stop when you have them or after 8 searches, whichever comes first."</p>
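<p>The same "whichever comes first" rule can also be enforced in code around the agent rather than only in the prompt. A minimal sketch, assuming a hypothetical <code>search(query)</code> callable that returns a list of URLs; the 540-second default is an assumed margin under the 600-second window, not a measured value.</p>

```python
import time


def collect_sources(search, queries, max_sources=5, max_searches=8,
                    time_budget_s=540.0, clock=time.monotonic):
    """Gather sources until the first of three limits is reached.

    search is a hypothetical callable: search(query) returns a list of URLs.
    Returns (sources, reason) so the caller can see why the run stopped.
    """
    deadline = clock() + time_budget_s
    sources = []
    searches = 0
    reason = "queries_exhausted"
    for query in queries:
        if searches >= max_searches:
            reason = "search_limit"
            break
        if clock() >= deadline:
            reason = "time_limit"
            break
        searches += 1
        for url in search(query):
            if url not in sources:  # skip URLs already collected
                sources.append(url)
        if len(sources) >= max_sources:
            reason = "enough_sources"
            break
    return sources[:max_sources], reason
```

<p>Returning the stop reason alongside the results lets the orchestrator tell a complete run from one that was cut short.</p>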
<p>Model selection matters more than I expected. Heretic (a GLM-4.7 variant) is blazing fast but occasionally misses nuances. GLM-5 is thorough but uses more tokens. For routine tasks, speed wins. For publication-quality work, thoroughness wins. The cost difference is real enough to care about if you're paying per token.</p>
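<p>One way to make that trade-off explicit is a small routing table. The sketch below is an assumption about how such a table might look: the task categories and model identifier strings are illustrative labels, not real OpenClaw configuration keys.</p>

```python
# Hypothetical task categories mapped to the models named in this post.
ROUTES = {
    "routine": "heretic",         # fast, may miss nuance
    "publication": "glm-5",       # thorough, uses more tokens
    "long_context": "kimi-k2.5",  # fallback for long-context work
    "code": "qwen3",              # code-specialized
}


def pick_model(task_kind, default="glm-5"):
    """Return the model for a task kind, defaulting to the thorough one."""
    return ROUTES.get(task_kind, default)
```

<p>Defaulting unknown task kinds to the thorough model trades cost for safety; a cost-sensitive setup might reasonably default to the fast one instead.</p>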
<p>Idempotency is essential. Early on, a coordination failure caused my agents to post the same content six times to a platform. I now use content-hash deduplication: before any external action, the agent checks if an identical action was already completed. Same content = same hash = abort.</p>
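<p>A minimal sketch of content-hash deduplication, assuming a JSON file as the ledger. The <code>ActionLedger</code> class, the whitespace normalization, and the inclusion of the platform name in the hash are illustrative choices, not a description of the actual setup.</p>

```python
import hashlib
import json
from pathlib import Path


class ActionLedger:
    """Remember hashes of completed external actions in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.seen = set(json.loads(self.path.read_text()))
        else:
            self.seen = set()

    @staticmethod
    def key(platform, content):
        # Assumption: collapse whitespace so trivial reformatting
        # does not produce a different hash.
        normalized = " ".join(content.split())
        payload = f"{platform}\n{normalized}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run_once(self, platform, content, action):
        """Call action(content) unless an identical action already ran."""
        k = self.key(platform, content)
        if k in self.seen:
            return False  # same content, same hash: abort
        action(content)
        self.seen.add(k)
        self.path.write_text(json.dumps(sorted(self.seen)))
        return True
```

<p>This version is only safe when one process owns the ledger. If several agents can act at the same time, the check and the write would need a lock or an atomic insert in a shared store, otherwise two agents can both pass the check before either records its hash.</p>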
<h2>The Real ROI</h2>
<p>Is this overkill for one person? Maybe. But the productivity gain is measurable. Research that used to take me two hours of clicking and reading now takes ten minutes of agent time plus my review. Bugs I would have spent an afternoon chasing get caught by a quality agent before I ever see them. My time is now spent on decisions, not execution.</p>
<p>The fleet isn't replacing me. It's handling the parts of my work that don't need human judgment, so I can focus on the parts that do. That's the whole point.</p>
</div>
</div>
</section>
<footer class="footer has-background-dark has-text-light">
<div class="content has-text-centered">
<p><strong>🌍 Krusty Planet</strong> — Privacy-focused AI consulting</p>
</div>
</footer>
<script src="/js/main.js"></script>
</body>
</html>