Part 4: Build it on your Mac Mini with Claude Code — step by step
Coding Agent Ready — feed this to Claude Code
📋 Prerequisites

| Requirement | Spec | Notes |
|---|---|---|
| Hardware | Mac Mini (Apple Silicon) | M1/M2/M3 — 16GB+ RAM recommended |
| OS | macOS 14+ | Sonoma or later |
| Claude Code | Latest version | `npm install -g @anthropic-ai/claude-code` |
| Node.js | 20+ | For Claude Code runtime |
| Python | 3.11+ | For orchestrator and tools |
| Git | 2.40+ | Version control |
| Docker | Desktop 4.x | Optional — for integration tests |
🔨 Phase 1: Foundation (Day 1)
Set Up Project Structure + Memory System ~2 hours
Create the directory structure and initialize the memory system that all agents will use.
Step 1.1: Create Project Scaffold
```bash
#!/bin/bash
# Run this to create the project structure
mkdir -p ~/agentic-dev/{agents,tools,pipelines,memory/skills,config,dashboard}
cd ~/agentic-dev

# Initialize git
git init
echo "__pycache__/" >> .gitignore
echo ".env" >> .gitignore
echo "node_modules/" >> .gitignore

# Create core config files
touch orchestrator.yaml agents.yaml
touch memory/MEMORY.md memory/DECISIONS.md
touch config/models.yaml config/quality.yaml
```
Step 1.2: Initialize Memory System
```markdown
# memory/MEMORY.md — This is the brain file
# Every agent reads this before starting work

# Project Memory

## Project: [YOUR PROJECT NAME]
## Stack: [e.g., Python/FastAPI/PostgreSQL/React]

## Conventions:
- Use type hints everywhere
- pytest for testing, always with fixtures
- 100 char line limit
- Docstrings: Google style

## Known Gotchas:
- [Document issues you've hit before]

## Active Tasks:
- [Current work in progress]
```
Step 1.3: Create Agent Configuration
```yaml
# agents.yaml
agents:
  orchestrator:
    model: claude-sonnet-4-20250514
    role: "Task decomposition and coordination"
    system_prompt: "You are the orchestrator. Break tasks into subtasks."
    tools: [file, terminal, session_search, delegate_task]

  researcher:
    model: claude-sonnet-4-20250514
    role: "Codebase analysis and context building"
    tools: [file, terminal, search_files]

  coder:
    model: claude-sonnet-4-20250514
    role: "Write, edit, and refactor code"
    tools: [file, terminal, patch]
    max_parallel: 3

  tester:
    model: claude-haiku-4-20250514  # cheaper model OK
    role: "Run tests, generate test cases"
    tools: [terminal, file]

  reviewer:
    model: claude-sonnet-4-20250514
    role: "Code review and security analysis"
    tools: [file, terminal]
```
Start simple: You don't need all 6 agents on Day 1. Start with Orchestrator + Coder + Tester. Add Reviewer and Deployer once the basics work.
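For that three-agent starting point, a stripped-down `agents.yaml` could be as small as this (same fields and model names as the full Step 1.3 config, just fewer agents):

```yaml
# agents.yaml — minimal Day 1 setup: orchestrator + coder + tester
agents:
  orchestrator:
    model: claude-sonnet-4-20250514
    role: "Task decomposition and coordination"
    tools: [file, terminal, delegate_task]
  coder:
    model: claude-sonnet-4-20250514
    role: "Write, edit, and refactor code"
    tools: [file, terminal, patch]
  tester:
    model: claude-haiku-4-20250514  # cheaper model OK
    role: "Run tests, generate test cases"
    tools: [terminal, file]
```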
🧠 Phase 2: Context Management (Day 2)
Build the RAG-like Context Injection System ~3 hours
This is THE key differentiator. Instead of dumping entire files into context, intelligently select what each agent needs.
Step 2.1: Codebase Indexer
```python
# tools/context_manager.py
import os
import json
from pathlib import Path


class ContextManager:
    """Smart context injection for coding agents."""

    def __init__(self, project_root: str):
        self.root = Path(project_root)
        self.index = {}
        self._build_index()

    def _build_index(self):
        """Index all relevant files with metadata."""
        for f in self.root.rglob("*"):
            if self._is_relevant(f):
                self.index[str(f)] = {
                    "type": f.suffix,
                    "size": f.stat().st_size,
                    "modified": f.stat().st_mtime,
                    "imports": self._extract_imports(f),
                }

    def _is_relevant(self, f: Path) -> bool:
        """Skip directories, hidden paths, and non-source files."""
        if not f.is_file() or any(p.startswith(".") for p in f.parts):
            return False
        return f.suffix in {".py", ".js", ".ts", ".md", ".yaml"}

    def _extract_imports(self, f: Path) -> list:
        """Naive scan for imported module names (Python files only)."""
        if f.suffix != ".py":
            return []
        imports = []
        for line in f.read_text(errors="ignore").splitlines():
            parts = line.split()
            if len(parts) >= 2 and parts[0] in ("import", "from"):
                imports.append(parts[1].split(".")[0])
        return imports

    def get_context(self, task: str, max_tokens: int = 50000) -> str:
        """Get relevant files for a task, ranked by relevance."""
        relevant = self._rank_files(task)
        context = []
        tokens_used = 0
        for filepath, score in relevant:
            content = Path(filepath).read_text()
            file_tokens = len(content) // 4  # rough estimate
            if tokens_used + file_tokens > max_tokens:
                break
            context.append(f"### {filepath}\n```\n{content}\n```")
            tokens_used += file_tokens
        return "\n".join(context)

    def _rank_files(self, task: str) -> list:
        """Rank files by relevance to task. Simple keyword matching."""
        scores = []
        task_words = set(task.lower().split())
        for path, meta in self.index.items():
            # Score based on: filename match, imports, recency
            score = 0
            name_words = set(Path(path).stem.lower().split("_"))
            score += len(task_words & name_words) * 10
            score += len(set(meta["imports"]) & task_words) * 5
            scores.append((path, score))
        return sorted(scores, key=lambda x: x[1], reverse=True)
```
Step 2.2: Skill Loader
```python
# tools/skill_loader.py
from pathlib import Path


class SkillLoader:
    """Load relevant skills based on task type."""

    SKILL_DIR = Path.home() / ".hermes" / "skills"

    def get_skills_for_task(self, task: str) -> list[str]:
        """Find and load skills relevant to the task."""
        skills = []
        for skill_file in self.SKILL_DIR.rglob("SKILL.md"):
            content = skill_file.read_text()
            # Match task keywords against skill tags/description
            if self._matches(task, content):
                skills.append(content)
        return skills

    def _matches(self, task: str, content: str) -> bool:
        """Simple keyword overlap between task and skill text."""
        return bool(set(task.lower().split()) & set(content.lower().split()))
```
🔗 Phase 3: Orchestrator (Day 3-4)
Build the Task Decomposition + Agent Spawning System ~6 hours
The orchestrator is the brain. It takes a high-level request and turns it into parallel agent workstreams.
Step 3.1: Task Decomposition
```python
# agents/orchestrator.py
from dataclasses import dataclass
from enum import Enum


class TaskType(Enum):
    FEATURE = "feature"
    BUGFIX = "bugfix"
    REFACTOR = "refactor"
    RESEARCH = "research"
    DEPLOY = "deploy"


@dataclass
class SubTask:
    description: str
    agent: str           # which agent handles this
    depends_on: list     # task IDs this waits for
    context_files: list  # files to inject into context
    priority: int        # execution order hint


class Orchestrator:
    """Decomposes requests into subtasks and manages execution."""

    def plan(self, request: str) -> list[SubTask]:
        """
        Use Claude to decompose request into subtasks.

        Example input: "Add user authentication with JWT"
        Example output: [
            SubTask("Design auth schema", "researcher", [], [...], 1),
            SubTask("Create User model", "coder", [0], [...], 2),
            SubTask("Implement JWT service", "coder", [0], [...], 2),
            SubTask("Write auth tests", "tester", [1, 2], [...], 3),
            SubTask("Review auth code", "reviewer", [2], [...], 4),
        ]
        """
        # Load memory + context
        memory = self._load_memory()
        context = self.context_mgr.get_context(request)

        # Call Claude with full context to generate plan
        plan_prompt = f"""
        Task: {request}
        Project Context: {context}
        Memory: {memory}

        Break this into subtasks. Each subtask needs:
        - description: what to do
        - agent: researcher/coder/tester/reviewer/deployer
        - depends_on: which subtasks must complete first
        - context_files: which files the agent needs
        """
        return self._call_llm(plan_prompt)

    def execute(self, plan: list[SubTask]):
        """Execute plan, running independent tasks in parallel."""
        # Topological sort for dependency resolution
        # Spawn agents via Claude Code ACP for each subtask
        # Collect results and feed to dependent tasks
        pass
```
Step 3.2: Agent Spawning (via Claude Code)
```python
# agents/spawner.py
# Spawns Claude Code sub-agents, one CLI process per subtask
import asyncio


async def spawn_agent(agent_type: str, task: str, context: str) -> str:
    """Spawn a Claude Code sub-agent for a specific task."""
    # Map agent types to system prompts
    prompts = {
        "coder": "You are a coding agent. Write clean, tested code.",
        "tester": "You are a testing agent. Find and fix bugs.",
        "reviewer": "You are a code reviewer. Be thorough and critical.",
    }

    # Spawn via Claude Code CLI. subprocess.run() is not awaitable,
    # so use asyncio's subprocess API to keep spawns truly parallel.
    proc = await asyncio.create_subprocess_exec(
        "claude", "--print",
        "--system", prompts[agent_type],
        "--model", "claude-sonnet-4-20250514",
        f"{task}\n\nContext:\n{context}",
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode()
```
✅ Phase 4: Quality Gates (Day 5)
Automated Verification Pipeline ~3 hours
Every piece of code goes through these gates. No exceptions.
Step 4.1: Quality Gate Pipeline
```python
# tools/quality_gates.py
class QualityPipeline:
    """Run code through quality gates before accepting it."""

    def run_all(self, changed_files: list[str]) -> dict:
        results = {}
        results["syntax"] = self.check_syntax(changed_files)
        results["types"] = self.check_types(changed_files)
        results["lint"] = self.run_linter(changed_files)
        results["tests"] = self.run_tests(changed_files)
        results["security"] = self.security_scan(changed_files)
        return results

    def check_syntax(self, files):
        """Quick syntax check — catches obvious errors."""
        # python: python -m py_compile
        # js/ts: npx tsc --noEmit
        pass

    def check_types(self, files):
        """Static type check."""
        # python: mypy
        pass

    def run_linter(self, files):
        """Style and lint rules."""
        # python: ruff; js/ts: eslint
        pass

    def run_tests(self, files):
        """Run tests related to changed files."""
        # pytest with coverage report
        # Fail if coverage drops below threshold
        pass

    def security_scan(self, files):
        """Static security analysis."""
        # bandit for Python
        # npm audit for JS
        # semgrep for cross-language
        pass
```
Quality Gate Checklist
📊 Phase 5: Monitoring Dashboard (Day 6)
Real-time Task Monitoring ~2 hours
See what your agents are doing. Track tasks, costs, and quality metrics.
What to Monitor
📈 Task Metrics
- Active/completed/failed tasks
- Average completion time
- Agent utilization
- Parallelism ratio

💰 Cost Metrics
- Tokens used per task
- Cost per agent type
- Model routing distribution
- Daily/weekly spend trends

✅ Quality Metrics
- Gate pass/fail rates
- Bugs found post-deploy
- Test coverage trends
- Review rejection rate

🔧 System Health
- Agent spawn success rate
- Context window utilization
- Memory hit rate
- API latency
📅 Implementation Timeline
| Phase | Duration | Deliverable | Can Use After |
|---|---|---|---|
| 1. Foundation | Day 1 (2h) | Project structure + memory | Immediate |
| 2. Context Mgmt | Day 2 (3h) | Smart file injection | Day 2 |
| 3. Orchestrator | Day 3-4 (6h) | Multi-agent coordination | Day 4 |
| 4. Quality Gates | Day 5 (3h) | Automated verification | Day 5 |
| 5. Dashboard | Day 6 (2h) | Monitoring UI | Day 6 |
| 6. Polish | Ongoing | Bug fixes + new skills | — |
Total: ~16 hours of setup. Then it runs forever.
🎯 Quick Start: What to Do RIGHT NOW
Don't want to build the whole system? Start with just this:
```bash
# 1. Create memory file for your project
cd /path/to/your/project
cat > MEMORY.md << 'EOF'
# Project Memory
## Stack: [your stack]
## Conventions: [your rules]
## Gotchas: [things to remember]
EOF

# 2. Start Claude Code with memory injection
claude --system "Read MEMORY.md first. Follow all conventions."

# 3. That's it. You now have persistent memory.
# Add skills/ directory as you discover reusable patterns.
```
The single most impactful thing: Create a MEMORY.md file in your project root. Write your conventions, stack choices, and gotchas in it. Tell Claude Code to read it first. This alone solves 50% of the "context amnesia" problem.