I've written the 7-node origin story in an earlier post. Since then the fleet has grown. This post is the tour of the current state: what's running where, how the coordination mesh works, and where the architecture breaks down.

The Nine Nodes

As of this writing:

Plus Tim's Discord user as a coordination endpoint (messages from any node can route to Tim). Nine machines, one human, one mesh.

What OpenClaw Is

OpenClaw is the agent-orchestration framework that runs on every node. It's a WebSocket gateway + skill execution runtime + personality layer. Each node runs its own instance; the mesh connects them.

The Mesh

Calder is the orchestrator. When I ask Calder a question, Calder can answer from local knowledge, invoke a local skill, or route the query to another agent. If I say "Hollywood, is Sonarr healthy?", Calder routes to Hollywood, Hollywood runs the healthcheck skill, and the response comes back through Calder to me.

The protocol is simple. A message envelope has: sender node, recipient node, skill name, payload, trace ID. The recipient executes the skill, emits a response with the same trace ID. Calder reconciles.
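As a minimal sketch of that envelope, assuming a JSON-serializable shape and illustrative field names (the source names the fields but not OpenClaw's actual wire format):

```python
import json
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class Envelope:
    sender: str      # node that emitted the message
    recipient: str   # node expected to execute the skill
    skill: str       # skill to invoke on the recipient
    payload: dict    # skill arguments
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def reply(self, payload: dict) -> "Envelope":
        # A response swaps sender/recipient and reuses the trace ID;
        # the shared trace ID is what lets Calder reconcile pairs.
        return Envelope(self.recipient, self.sender, self.skill,
                        payload, self.trace_id)

# Calder routes a healthcheck to Hollywood; Hollywood replies.
req = Envelope("calder", "hollywood", "sonarr.healthcheck", {})
resp = req.reply({"status": "healthy"})
print(json.dumps(asdict(resp)))
```

The reply reusing the request's trace ID is the whole reconciliation mechanism: Calder matches responses to outstanding requests by that one field.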

For proactive messages (Hollywood noticing Plex is down without being asked), the flow is reversed. Hollywood detects the issue, emits an alert to its gateway, which forwards to Calder's gateway, which delivers to me via Discord DM.

Heterogeneous Hardware

The fleet runs on wildly different hardware. The Legion has a modern RTX 5070. The Ryzen 7 2700X in Marrok is from 2018. The T3600 has Xeon compute from the early 2010s. Every node has a different CPU/GPU/RAM profile.

The agent runtime abstracts this. Each node declares its capabilities on startup. Skills can specify requirements. A skill that needs GPU inference gets routed to a GPU-capable node (Fenrir or Huginn). A skill that needs low latency runs locally. A skill that needs specific data lives on the node that has the data.
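A capability-routing sketch, assuming nodes advertise simple tags at startup and skills declare a required set. The tag names and the routing function are mine, not OpenClaw's:

```python
# Capabilities each node declares on startup (illustrative tags).
NODE_CAPS = {
    "fenrir":    {"gpu", "cuda"},
    "huginn":    {"gpu"},
    "marrok":    {"cpu"},
    "hollywood": {"cpu", "media-data"},
}

def route(skill_requires: set[str]) -> list[str]:
    """Return the nodes whose advertised capabilities satisfy the skill."""
    return [node for node, caps in NODE_CAPS.items()
            if skill_requires <= caps]

print(route({"gpu"}))         # GPU inference lands on Fenrir or Huginn
print(route({"media-data"}))  # data-bound skills stay with the data
```

Subset matching (`skill_requires <= caps`) is deliberately dumb: the point of the design, per the section above, is that each node stays simple and predictable.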

This is the actual reason for the fleet shape. A single big machine could do most of the work. Multiple small specialized machines let each one be simple and predictable. When the crypto scalper on Astrid misbehaves, the rest of the fleet is unaffected.

The Personality Layer

Every agent has a distinct voice. This matters more than I expected. When Njord DMs me "STRONG weather signal: KXHIGHLAX-26MAR07-T78 is trading 0.62, ensemble says 0.78," I know immediately which system is talking, what kind of signal this is, and what the context is. If all nodes spoke in the same monotone, I'd have to read the prefix to orient.

The personalities are from mythology, film, and fiction. Calder for the sculptor (dynamic balance). Fenrir for the Norse wolf. Johnny-5 for Short Circuit ("just want to learn"). Hollywood for film-set grizzled-fixer energy. Njord for the sea-god mapping to weather trading. The naming is not decoration — it shapes how I think about each node's job.

Where It Breaks

Three places.

Network partitions. The mesh assumes all nodes can reach each other on the home LAN. When my router rebooted during a firmware update last month, the mesh fragmented for 8 minutes. Each node kept operating independently (correctly, by design), but cross-node coordination paused. Nothing bad happened; things just got quiet until the router came back.

Gateway restarts. When I update OpenClaw on a node, its gateway restarts. Any in-flight cross-node requests time out. This is usually fine (retry handles it), but it caused a real production incident once when a Kalshi trade execution was mid-flight during a Njord restart. The trade completed on Kalshi's side, but the local state never updated, so the system thought the trade had failed. I had to reconcile manually.
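One way to make that incident class recoverable is trace-ID idempotency: the executing node records completed traces, and a retry after a restart replays the recorded result instead of re-executing the side effect. A sketch under that assumption (the log structure and function are hypothetical, not what Njord runs):

```python
# Completed-trace log on the executing node: trace_id -> recorded result.
completed: dict[str, dict] = {}

def execute_trade(trace_id: str, order: dict) -> dict:
    if trace_id in completed:
        # Replay the recorded result; never place the order twice.
        return completed[trace_id]
    result = {"order": order, "status": "filled"}  # stand-in for the real call
    completed[trace_id] = result                   # record before acking
    return result

first = execute_trade("t-123", {"market": "KXHIGHLAX", "side": "yes"})
retry = execute_trade("t-123", {"market": "KXHIGHLAX", "side": "yes"})
assert first is retry  # the retry is answered from the log
```

The remaining gap is exactly the one the incident exposed: if the node dies between the real call and recording the result, only reconciliation against the exchange's state can recover.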

State drift. Shared memory on the NAS gets updated by multiple nodes. Most of the time the updates are disjoint (different nodes touching different sections). Occasionally two nodes update the same section simultaneously and one wins. We've had to design around this — append-only logs where possible, last-write-wins with timestamps where not.
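The timestamped last-write-wins strategy can be sketched in a few lines. This is a generic LWW merge, not the fleet's actual implementation; assume each section carries the writer's timestamp:

```python
# Merge two {section: (timestamp, value)} maps; the newer write wins.
def lww_merge(local: dict, remote: dict) -> dict:
    merged = dict(local)
    for section, (ts, value) in remote.items():
        if section not in merged or ts > merged[section][0]:
            merged[section] = (ts, value)
    return merged

a = {"njord": (100, "ensemble v3"), "shared": (105, "from astrid")}
b = {"shared": (110, "from njord")}
print(lww_merge(a, b))  # 'shared' resolves to the newer write at ts 110
```

The known cost of LWW is silent data loss on true conflicts, which is why the append-only log is preferred wherever the update pattern allows it.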

The Economics

Hardware cost for 9 nodes, mostly secondhand: about $3,200 across two years of acquisitions. Some of the machines had other purposes first (the Legion is my dev laptop; the T3600 was a media server). Incremental cost to go "fleet" over just having those machines was maybe $1,200.

Power draw: approximately 280-340W across the fleet at idle, 600-700W under load. About $40-55/month on my power bill, roughly the cost of a few streaming subscriptions.

Compare to running the same workload on AWS: rough estimate is $1,200-1,800/month for equivalent capacity with similar uptime. The fleet pays for itself in under 6 months.
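The back-of-envelope math, using the conservative ends of the figures above (upper-bound power, lower-bound cloud):

```python
# Breakeven on hardware cost vs. equivalent cloud spend.
hardware = 3200           # total secondhand hardware cost, USD
power_per_month = 55      # upper end of the monthly power estimate
cloud_per_month = 1200    # lower end of the AWS estimate

months = hardware / (cloud_per_month - power_per_month)
print(f"breakeven: {months:.1f} months")  # ~2.8 months at these numbers
```

Even the least favorable combination of those estimates lands well inside the "under 6 months" claim.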

The cloud is not always cheaper. For workloads that are steady-state and compute-bound, hardware in your house is dramatically less expensive. The catch is: you're your own operations team.

Related

The origin story. What Njord actually does. What Johnny-5 does. How the AI calls work.

OpenClaw repo: github.com/Dangercorn-Enterprises/openclaw.