How to Build an AI Agent Stack That Actually Runs Your Business

John Aspinall · 18 min read

Most operators I talk to have built one or two AI automations. A daily briefing. A meeting-notes-to-tasks pipeline. Maybe a listing draft generator. Each one works. Each one saves real time. And each one is completely isolated: an island automation that doesn't know the other islands exist.

I've been building automations for about a year now. At this point I run roughly 30 of them across four ventures. The shift that changed everything wasn't building the 15th automation. It was connecting the first five into an AI agent stack: a system where agents share memory, feed each other context, and get better at their jobs without me touching them. The compound effect of a connected stack versus a folder of standalone scripts is the difference between hiring a team and hiring a series of temps who never talk to each other.

This is the architecture I use, what each layer does, how they connect, what it costs, and where most operators go wrong when they try to build one.

What Is an AI Agent Stack?

An AI agent stack is a connected system of AI automations that share a knowledge base, pass context to each other, run on triggers without human intervention, and feed results back into the system so future runs get smarter. It's the difference between "I have some AI tools" and "I have an AI operating system."

Think of it this way: a single automation is a contractor you hire for one job. An AI agent stack is a small team where each person knows what the others are doing, has access to the same institutional knowledge, and leaves notes for the next shift.

The key word is connected. Thirty standalone automations that each run in their own silo don't compound. Five automations that share context, update the same knowledge base, and trigger each other based on conditions do.

Why Standalone Automations Hit a Ceiling

Here's the pattern I see with every operator who starts building AI automations, including myself about eight months ago.

You build your first one. It works. You save 30 minutes a day. You think: what else can I automate? So you build a second, a third, a fifth. Each one is a standalone script or Claude Code session with its own prompt, its own context, its own schedule. By automation number eight, you have a maintenance problem. Prompts drift. Context is duplicated across five different CLAUDE.md files. One automation doesn't know about a decision you made in another. You're spending more time managing the automations than they're saving you.

The ceiling isn't capability; the models are plenty capable. The ceiling is architecture. Standalone automations have three structural problems:

Duplicated context. Your brand voice guidelines live in four different prompts. You update one, forget the others. Now three automations are producing output that doesn't match your current brand voice.

No shared memory. Your listing audit agent found a pattern: supplement listings with clinical study callouts convert 23% better in your category. Your content generation agent doesn't know that. It keeps producing generic copy. The insight dies in one automation's output and never reaches the others.

No feedback loops. Each automation runs, produces output, and stops. Nothing about the quality of that output feeds back into the system. If a daily briefing starts surfacing irrelevant stories, it'll keep doing it until you manually rewrite the prompt.

An AI agent stack solves all three. Shared context lives in one place. Agents read from and write to a common knowledge base. And the system has feedback paths where the output of one run improves the input for the next.

The Five Layers of an AI Agent Stack

I think of my stack in five layers. You don't need to build them all at once (I didn't), but understanding the architecture helps you build each piece so it connects to the others instead of becoming another island.

  1. Knowledge layer: where everything you know lives in machine-readable form
  2. Context layer: how agents get the right knowledge at the right moment
  3. Agent layer: the automations themselves
  4. Orchestration layer: triggers, scheduling, and inter-agent handoffs
  5. Feedback layer: how agents update the knowledge base, creating a compound loop

Let me walk through each one.

The Knowledge Layer: Your Agent's Shared Memory

I've written a full guide to building an AI second brain, so I won't rehash it here. The short version: your knowledge layer is a structured vault of markdown notes (brand profiles, client briefs, operating procedures, A/B test results, decision logs, meeting notes, competitive intel) that any agent in your stack can search and retrieve.

The critical design decision for an AI agent stack is that every agent reads from and writes to the same vault. Not separate databases per tool. Not one folder for your briefing agent and another for your content agent. One vault. One source of truth.

Mine is roughly 2,400 markdown notes with structured frontmatter (tags, date, source, linked entities). Every note follows the same template so agents can parse it consistently. The vault lives on a Mac mini that runs all my automations, and agents access it through file system reads: no API, no database, just markdown files in a folder.

That simplicity is a feature. Every layer of indirection you add between an agent and your knowledge is a layer where things break at 2am when you're not watching.
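To make the "same template" idea concrete, here is a minimal sketch of a helper that writes a note with the frontmatter fields listed above. The function name `write_note`, the exact key names, and the date-plus-slug file naming are assumptions for illustration, not a required format.

```shell
# Hypothetical helper: write a vault note with consistent frontmatter.
# Usage: write_note VAULT_DIR SLUG TAGS SOURCE ENTITIES < body.md
write_note() {
  vault_dir=$1; slug=$2; tags=$3; src=$4; entities=$5
  today=$(date +%Y-%m-%d)
  note_path="$vault_dir/$today-$slug.md"
  mkdir -p "$vault_dir"
  {
    printf '%s\n' '---'
    printf 'tags: [%s]\n' "$tags"
    printf 'date: %s\n' "$today"
    printf 'source: %s\n' "$src"
    printf 'entities: [%s]\n' "$entities"
    printf '%s\n' '---'
    cat   # the note body is read from stdin
  } > "$note_path"
  # Print the path so a calling script can log it or hand it to another agent.
  printf '%s\n' "$note_path"
}
```

Something like `echo "Key takeaways..." | write_note ~/vault/meetings kickoff "meeting, brand-x" fathom "Brand X"` would then drop a parseable note into the vault. Because every agent goes through the same helper, the format stays consistent without any shared code beyond this one function.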

The Context Layer: Getting the Right Knowledge to the Right Agent

I've also written a full guide to context engineering for operators. The short version: context engineering is how you structure what information each agent receives so it produces expert output instead of generic slop.

For an AI agent stack specifically, the context layer has two jobs:

Static context: CLAUDE.md files, skills files, and system prompts that give each agent its role, rules, and domain knowledge. My daily briefing agent has a CLAUDE.md that includes our brand profiles, our news sources, and our editorial voice guidelines. My listing audit agent has a different CLAUDE.md with our A/B test results database and our category-specific conversion benchmarks.

Dynamic context: retrieval from the knowledge vault at runtime. Before my content agent writes a product listing, it searches the vault for notes tagged with that brand, that category, and "conversion data." It pulls the three most relevant notes and injects them into its context window. The content it produces reflects what we've learned, not just what the model knows from training.

The difference between a connected stack and a pile of scripts is that dynamic retrieval layer. It's the reason automation number 25 produces better output than automation number 3: there are 22 more automations feeding insights into the vault that automation number 25 can now draw on.
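The dynamic retrieval step can be approximated with nothing more than `find` and `grep`. The sketch below is one possible version, under stated assumptions: each note has a single `tags:` line in its frontmatter, and "most relevant" is crudely approximated by "most recently modified". The function name `search_vault` is hypothetical.

```shell
# Hypothetical retrieval helper: print up to three notes whose `tags:` line
# mentions every requested tag, newest first. Assumes one `tags:` line per
# note and no newlines in file names.
# Usage: search_vault VAULT_DIR TAG [TAG...]
search_vault() {
  vault_dir=$1; shift
  matches=$(find "$vault_dir" -type f -name '*.md')
  for tag in "$@"; do
    [ -n "$matches" ] || return 0
    # Keep only the notes whose tags line contains this tag (substring match).
    matches=$(printf '%s\n' "$matches" | while IFS= read -r f; do
      if grep '^tags:' "$f" | grep -qF -- "$tag"; then printf '%s\n' "$f"; fi
    done)
  done
  [ -n "$matches" ] || return 0
  # Sort survivors by modification time and keep the three most recent.
  printf '%s\n' "$matches" | tr '\n' '\0' | xargs -0 ls -t | head -n 3
}
```

A call such as `search_vault "$VAULT_PATH" brand-x supplements "conversion data"` returns at most three file paths, which the run script can pass to the agent as context. Recency is a stand-in here; a smarter ranking can replace the last line without touching the agents that call it.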

The Agent Layer: What to Build and in What Order

This is where most people start, and it's the right place to start. You just need to build each agent so it connects to the knowledge and context layers instead of standing alone.

Here's the order I'd build in, based on what I've seen compound fastest:

Tier 1: Information capture agents (build these first). These fill the knowledge vault. Meeting-notes-to-structured-summaries. Email-to-vault. News monitoring and synthesis. Web research that files its findings. These are low-risk (they're read-only on your business), high-compound (every note they add makes every other agent smarter), and they build the habit of trusting unattended agents.

Tier 2: Analysis and synthesis agents. Daily briefings. Listing audits. Competitor monitoring. Inventory dashboards. These read from the vault and produce analysis you act on. They're the first ones where you feel the compound effect: your briefing agent surfaces a trend because a capture agent filed a note three days ago that you never read manually.

Tier 3: Action agents. Content generation. Task creation from meetings. Email sequence management. Anything that creates or modifies something in the real world. These need guardrails: human review gates, dry-run modes, detect-and-reverse patterns. I've written about building a detect-and-reverse agent that stops outbound emails when someone replies mid-sequence. That pattern belongs on every action agent.

Tier 4: Meta agents. Agents that monitor other agents. A weekly report on automation performance. An error watcher that alerts you when an agent fails silently. A context updater that notices when vault notes are stale and flags them for review. These only make sense once you have 10+ agents running, but once you do, they're what keep the stack healthy without constant manual checking.
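As an illustration of the simplest kind of meta agent, here is a sketch of the stale-note check. It only lists candidates; deciding what to do with them is left to a human or another agent. The function name and the 90-day default are assumptions, not thresholds from my own stack.

```shell
# Hypothetical meta-agent check: list vault notes not modified in more than N days.
# Usage: find_stale_notes VAULT_DIR [DAYS]   (DAYS defaults to 90, an assumed threshold)
find_stale_notes() {
  vault_dir=$1
  days=${2:-90}
  # -mtime +N matches files whose last modification is more than N days old.
  find "$vault_dir" -type f -name '*.md' -mtime +"$days"
}
```

Piping the result into a review file, for example `find_stale_notes "$VAULT_PATH" > "$VAULT_PATH/review/stale.txt"`, gives the weekly maintenance pass a concrete list to work through.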

Each agent is a Claude Code session (usually headless, run from a shell script or a cron trigger) with its own CLAUDE.md, its own skills files, and access to the shared vault. The CLAUDE.md is where you encode the agent's role, and the skills files give it specific capabilities. Here's a stripped-down example of how I structure an agent:

project-root/
├── CLAUDE.md              # Agent role, rules, voice, constraints
├── .claude/
│   └── skills/
│       ├── search-vault.md    # Skill: search and retrieve from the knowledge vault
│       ├── write-note.md      # Skill: write a structured note back to the vault
│       └── send-alert.md      # Skill: notify me via Slack/email if something needs attention
├── run.sh                 # The trigger script (cron calls this)
└── output/                # Where this agent drops its artifacts

The skills files are the connective tissue. The search-vault skill tells the agent how to find and read notes from the shared vault. The write-note skill tells it how to create a new note that follows the vault's formatting conventions so other agents can parse it. Same vault format. Same retrieval pattern. That's what makes it a stack instead of a collection.
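For the `run.sh` piece of that layout, a rough sketch might look like the following. It assumes the Claude Code CLI is installed and that `claude -p "<prompt>"` runs one non-interactive session; verify that against your installed version. The `AGENT_CMD` override, the `prompt.txt` file, and the timestamped output name are hypothetical conventions added for illustration.

```shell
# run.sh sketch. Assumes the Claude Code CLI is on PATH and that
# `claude -p "<prompt>"` runs a single non-interactive session.
# AGENT_CMD, prompt.txt, and the output file naming are illustrative assumptions.
run_agent() {  # run_agent PROJECT_DIR
  project_dir=$1
  prompt=$(cat "$project_dir/prompt.txt") || return 1
  mkdir -p "$project_dir/output"
  out_file="$project_dir/output/$(date +%Y-%m-%d-%H%M%S).md"
  # Run from the project root, where CLAUDE.md and .claude/skills/ live.
  if ( cd "$project_dir" && ${AGENT_CMD:-claude -p} "$prompt" ) > "$out_file"; then
    printf '%s\n' "$out_file"
  else
    echo "agent run failed in $project_dir" >&2
    return 1
  fi
}
```

Keeping the command behind `AGENT_CMD` makes it easy to dry-run the script with `AGENT_CMD=echo` or to swap in a different headless CLI later without rewriting every agent's trigger.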

The Orchestration Layer: Making Agents Talk to Each Other

Standalone automations run on a timer: every morning at 7am, every 10 minutes, once a week. An AI agent stack adds two more trigger types: event-based triggers and chain triggers.

Scheduled triggers are the simplest. My daily briefing runs at 6:30am. My email-to-vault agent runs every 10 minutes. My weekly performance digest runs Sunday nights. These are cron jobs or launchd plists on the Mac mini, nothing fancy.

Event-based triggers fire when something happens. A new Fathom transcript appears, so the meeting-notes agent runs. A client emails with a specific label, so the email-to-vault agent picks it up. A Seller Central notification arrives, so the inventory alert agent fires. These use MCP servers (Gmail MCP, Fathom MCP) as the event source, with a polling agent that checks for new items on a short interval.

Chain triggers are where the stack becomes a system. Agent A finishes and drops a file. Agent B watches for that file and runs when it appears. My meeting-notes agent writes structured notes to the vault. My task-creation agent watches for new meeting notes tagged "action-items" and creates Todoist tasks from them. Two agents, no shared code, connected only by a file in the vault and a naming convention.

The practical implementation is simple. Each agent's run script can check for a trigger condition before doing its work:

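# Only run if there are new meeting notes since the last run.
# First run: create the marker with an old timestamp so every note counts as new.
[ -f "$LAST_RUN_MARKER" ] || touch -t 200001010000 "$LAST_RUN_MARKER"
NEW_NOTES=$(find "$VAULT_PATH/meetings" -name "*.md" -newer "$LAST_RUN_MARKER")
if [ -z "$NEW_NOTES" ]; then
  exit 0  # nothing new, skip this run
fi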

That's it. No message queue. No orchestration framework. No Kubernetes. A file timestamp check in a bash script. I run a multi-venture AI agent stack on a Mac mini with cron and shell scripts. The sophistication is in the prompts and the knowledge architecture, not the infrastructure.
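A fuller version of that check also has to advance the marker, or the same notes get processed on every run. The sketch below shows one way to do it, filtering for the "action-items" tag used by the task-creation example. `process_note` is a hypothetical stand-in for the real agent call, and per-note error handling is left out for brevity.

```shell
# Sketch of a chain-triggered run: act only on notes newer than the marker,
# then advance the marker after the pass completes.
# process_note is a hypothetical stand-in for the real agent call.
process_note() { printf 'processing %s\n' "$1"; }

run_if_new_notes() {  # run_if_new_notes NOTES_DIR MARKER_FILE
  notes_dir=$1; marker=$2
  # First run: create a marker with an old timestamp so every note counts as new.
  [ -f "$marker" ] || touch -t 200001010000 "$marker"
  # Stamp the start time now, so notes that land mid-run are caught next time.
  next_marker="$marker.next"
  : > "$next_marker"
  new_notes=$(find "$notes_dir" -name '*.md' -newer "$marker")
  if [ -z "$new_notes" ]; then
    rm -f "$next_marker"
    return 0
  fi
  printf '%s\n' "$new_notes" | while IFS= read -r note; do
    # Only hand off notes tagged action-items, per the naming convention.
    if grep -q '^tags:.*action-items' "$note"; then process_note "$note"; fi
  done
  mv "$next_marker" "$marker"
}
```

Stamping the new marker before the scan rather than after it is the important design choice: a note written while the agent is working is newer than the marker and gets picked up on the following run instead of being skipped.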

The Feedback Layer: How the Stack Gets Smarter Over Time

This is the layer most people never build, and it's the one that makes everything else compound.

The feedback layer is any mechanism where the output of one agent run improves the input for future runs of any agent in the stack. Three concrete patterns I use:

Write-back to vault. My daily briefing agent doesn't just email me a briefing. It also writes a structured note to the vault summarizing the key stories and their relevance to our brands. Next time any agent searches the vault for recent industry context, those briefing notes are available. My listing audit agent can now reference a competitor's pricing change that the briefing agent surfaced three days ago.

Quality signals. When I manually edit an agent's output before using it (rewriting a paragraph, cutting a section, adding context it missed), I save a "correction note" to the vault tagged with the agent's name and the type of error. Periodically, I review those correction notes and update the agent's CLAUDE.md or skills files. The agent doesn't learn in real time, but the system learns on a weekly cycle. Over three months, my content agent went from producing output I edited 60% of to output I edit maybe 15% of. Same model. Better context, built from its own mistakes.

Cross-agent enrichment. Agent A surfaces an insight. Agent B uses that insight in a completely different context. My competitor monitoring agent noticed that a rival brand started using lifestyle images with human hands in every hero slot. That observation went into the vault. Two weeks later, my content strategy agent pulled that note when generating image direction for a new listing and recommended we test the same pattern. I didn't connect those dots manually. The vault did.

This is the compound effect that makes an AI agent stack fundamentally different from standalone automations. Each agent that writes to the vault makes every agent that reads from the vault slightly smarter. Over months, the difference is enormous.
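The quality-signals pattern is the easiest of the three to mechanize. Here is a minimal sketch, assuming correction notes live in a `corrections/` folder and carry `agent` and `error_type` keys in their frontmatter. Those names, and both function names, are illustrative conventions rather than a fixed schema.

```shell
# Hypothetical helpers for the correction-note loop. The frontmatter keys
# (agent, error_type) and the corrections/ folder are assumed conventions.
log_correction() {  # log_correction VAULT_DIR AGENT ERROR_TYPE DETAIL
  vault_dir=$1; agent=$2; error_type=$3; detail=$4
  today=$(date +%Y-%m-%d)
  mkdir -p "$vault_dir/corrections"
  # Pick the first unused sequence number so same-day notes never collide.
  n=1
  while [ -e "$vault_dir/corrections/$today-$agent-$n.md" ]; do n=$((n + 1)); done
  printf '%s\n' '---' 'type: correction' "agent: $agent" \
    "error_type: $error_type" "date: $today" '---' "$detail" \
    > "$vault_dir/corrections/$today-$agent-$n.md"
}

# Weekly review: count corrections per agent and error type, most frequent first.
summarize_corrections() {  # summarize_corrections VAULT_DIR
  for f in "$1"/corrections/*.md; do
    [ -f "$f" ] || continue
    a=$(sed -n 's/^agent: //p' "$f" | head -n 1)
    e=$(sed -n 's/^error_type: //p' "$f" | head -n 1)
    printf '%s %s\n' "$a" "$e"
  done | sort | uniq -c | sort -rn
}
```

The summary turns a pile of individual fixes into a ranked list, so the weekly review starts with whichever agent and error type keeps recurring and the CLAUDE.md update targets that first.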

What This Costs to Build and Run

I want to be honest about the investment because "just automate everything" glosses over real costs.

Build time. My first five agents took roughly 40 hours total to build, test, and stabilize. Each one after that takes 2-4 hours because the patterns (CLAUDE.md structure, skills files, vault integration, cron setup) are established. If you're starting from zero, budget a full weekend for your first two agents and the vault architecture.

API costs. My 30+ agents cost roughly $180-220/month in API calls. The heaviest is the daily briefing ($4-5/day because of web search tool calls). Most agents run for pennies per execution. The email-to-vault agent processes 8-15 emails a day for about $0.30.

Infrastructure. I run everything on a Mac mini that was already sitting on a shelf ($0 marginal cost). You could run the same stack on a $5/month VPS or a spare laptop. The infrastructure is not the bottleneck.

Maintenance. I spend roughly 2 hours per week on stack maintenance: reviewing correction notes, updating CLAUDE.md files, fixing the occasional broken cron job, adding new agents. That's real time, and it doesn't go to zero. But those 2 hours replace what used to be 20+ hours a week of manual work the agents now handle.

The ROI math: ~$200/month in API costs and ~8 hours/month in maintenance to replace ~80+ hours/month of manual work. Even if you value your time conservatively, that's a 5-10x return.
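The return depends on what you value an hour at, so it helps to run the numbers yourself. The sketch below uses only the figures above (about $200/month in API costs, 8 hours/month of maintenance, 80 hours/month replaced); the hourly rates in the loop are placeholder assumptions.

```shell
# ROI sketch using the figures above: ~$200/month API, ~8 h/month upkeep,
# ~80 h/month of manual work replaced. The hourly rate is your own assumption.
# Usage: roi_ratio HOURLY_RATE
roi_ratio() {
  awk -v rate="$1" 'BEGIN {
    api = 200; upkeep_h = 8; saved_h = 80
    cost  = api + upkeep_h * rate   # monthly cost, counting your own time
    value = saved_h * rate          # monthly value of the hours handed off
    printf "%.1f\n", value / cost
  }'
}

for rate in 25 50 100; do
  printf '$%s/hour -> %sx\n' "$rate" "$(roi_ratio "$rate")"
done
```

At $25/hour the ratio is 5.0x, and it climbs toward 10x as the rate rises, because the fixed API cost matters less and the 80-to-8 hours ratio dominates. That is where the 5-10x range comes from.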

Three Mistakes That Kill AI Agent Stacks

Building the orchestration layer before the knowledge layer. I see operators jump straight to complex agent chains and event triggers before they have a functioning vault with structured notes. Your agents are only as good as the context they can retrieve. Build the vault first. Stock it with 50+ notes. Then build agents that read from it. You'll immediately see why the vault architecture matters.

Over-engineering the infrastructure. You don't need a vector database, a message queue, a container orchestrator, or a custom API gateway. You need markdown files, shell scripts, and cron. I'm not being reductive; I'm telling you what actually runs my businesses. Every layer of infrastructure you add is a layer that can fail at 3am, and the whole point of unattended agents is that you're asleep at 3am. Simple infrastructure fails in simple, fixable ways. Complex infrastructure fails in complex, 4-hour-debugging ways.

No feedback loops. If your agents never update the vault, your stack is frozen at the quality level of its initial context. The agents that write back (the capture agents, the briefing agents, the ones that log their own corrections) are what make the stack compound over time. Without them, you have 30 automations that are exactly as smart in month 6 as they were in month 1.

FAQ

How long until an AI agent stack actually saves time versus costs time to maintain?

In my experience, the crossover happens around agents 4-5. The first three feel like you're spending more time building than you're saving. By the fifth agent, the vault is stocked enough that new agents produce good output on the first run, and the compound effect starts pulling its weight. Budget 4-6 weeks before you're net-positive on time.

Do I need to use Claude Code specifically?

No, but you need something that runs headless (no human babysitting), can read/write local files, and can use external tools (MCP servers, APIs, shell commands). Claude Code checks all three. OpenAI's Codex CLI works too, with a different syntax. The architecture (vault, context layer, agents, orchestration, feedback) works regardless of which model or CLI you use. I use Claude Code because the CLAUDE.md and skills file system maps directly onto the context layer I've described.

What if I only run one business, not four?

The stack still works; you just have fewer agents. A single-venture operator might run 8-12 agents instead of 30+. The architecture is identical. If anything, it's simpler because your vault has one domain instead of four, and your agents don't need to filter by venture. Start with one agent from each of the first three tiers: an email/meeting capture agent, a daily briefing, and one action agent for whatever task eats the most time in your week.

What's the biggest risk of running unattended agents?

An action agent doing something wrong in the real world: sending an email to the wrong person, creating duplicate tasks, publishing content with errors. The mitigation is simple: keep your first agents read-only (capture and analysis), add human review gates to action agents, and build detect-and-reverse patterns for anything that touches customers or clients. I never let an agent send external communication without a review step until it's proven itself over at least 30 runs.
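A review gate like that can be a few lines of shell. The sketch below is one possible shape, assuming each human-approved run is recorded as a line in a per-agent file. The file layout and the function names are hypothetical; the 30-run threshold is the one mentioned above.

```shell
# Hypothetical review gate: an action agent may act directly only after
# MIN_RUNS human-approved runs. File layout and names are assumptions.
MIN_RUNS=30

# Record one human-approved run for an agent.
record_approval() {  # record_approval STATE_DIR AGENT
  mkdir -p "$1"
  date +%Y-%m-%dT%H:%M:%S >> "$1/$2.approved"
}

# Print "send" if the agent has earned autonomy, otherwise "review".
gate_action() {  # gate_action STATE_DIR AGENT
  log="$1/$2.approved"
  count=0
  if [ -f "$log" ]; then count=$(wc -l < "$log" | tr -d ' '); fi
  if [ "$count" -ge "$MIN_RUNS" ]; then echo send; else echo review; fi
}
```

The agent's run script would branch on the result, for example routing drafts to a review folder while `gate_action` still prints `review`. Deleting the `.approved` file after a bad run resets the agent to supervised mode.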

Can I build this without coding?

The vault and context layer, yes: it's just markdown files in folders. The agents require some comfort with the command line and shell scripts. You don't need to be a developer, but you need to be the kind of person who can edit a crontab, run a shell command, and debug a file path error. If you can navigate a terminal, you can build this. If "terminal" means the airport to you, start with the vault and get comfortable with Claude Code's interactive mode before going headless.

The Three Things to Do This Week

An AI agent stack sounds like a big build, but the architecture is modular by design. You don't build it all at once. Here are the three moves that create the foundation:

  1. Build the vault. Create a folder of markdown notes with consistent frontmatter (tags, date, source, type). Start with 20 notes: your brand profiles, your operating procedures, your top 10 lessons learned. This is your AI agent stack's shared memory, and everything else connects to it.

  2. Build one capture agent. Pick the information stream that leaks the most (meeting notes, forwarded emails, industry news) and build an agent that captures it into structured vault notes automatically. This stocks the vault without requiring your manual effort, and it teaches you the patterns (CLAUDE.md, skills files, cron triggers) you'll use for every agent after it.

  3. Build one analysis agent that reads from the vault. A daily briefing, a weekly audit, a competitor summary: anything that searches the vault, synthesizes what it finds, and delivers a result. This is where you'll feel the compound effect for the first time: the capture agent filed a note two days ago, and now your analysis agent is using it without you lifting a finger. That moment, when agents start making each other useful, is when the stack clicks.
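Since step 1 depends on consistent frontmatter, a small lint script is worth having from day one. This is a sketch under two assumptions: frontmatter is the block between the first two `---` lines, and the required keys are the four named in step 1 (tags, date, source, type). The function name `lint_vault` is hypothetical.

```shell
# Hypothetical vault lint: report notes missing any required frontmatter key.
# Assumes frontmatter is the block between the first two '---' lines.
lint_vault() {  # lint_vault VAULT_DIR
  find "$1" -type f -name '*.md' | while IFS= read -r f; do
    # Extract the frontmatter: lines between the first and second '---'.
    fm=$(awk '/^---$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$f")
    for key in tags date source type; do
      printf '%s\n' "$fm" | grep -q "^$key:" || printf '%s: missing %s\n' "$f" "$key"
    done
  done
}
```

Running `lint_vault "$VAULT_PATH"` prints nothing for a clean vault and one line per missing key otherwise, which makes it easy to schedule alongside the capture agent and catch template drift early.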

The AI agent stack is the most valuable thing I've built in the past year. Not any individual automation. The system that connects them. If you've built one or two automations and they feel useful but isolated, the stack architecture is what turns them from hired help into a team.
