AI Engineering Curriculum
Phase 1: Claude Code Mastery

Module 1.4.2

Building an AI Agency — Subagents + Skills + Hooks

The Vision

Every piece you've learned so far is a building block toward one thing: a fully autonomous AI organization you can spin up with a single prompt.

Layer                | Claude Code concept          | Real-world analogy
---------------------|------------------------------|---------------------------
You                  | The human in the loop        | CEO / founder
Orchestrator agent   | Main agent in delegate mode  | COO / project manager
Specialist subagents | Custom subagents with skills | Department employees
Skills               | Preloaded into each subagent | Employee training manuals
Hooks                | PreToolUse + Stop guards     | Company policies & QA
Memory               | Persistent per subagent      | Institutional knowledge

Two Models of Multi-Agent Work

Subagents:

  • Each agent reports results back to the main agent only
  • Agents can't talk to each other
  • Lower token cost
  • Best for: focused parallel tasks where only the result matters
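
A subagent dispatch from the main conversation might read like this (a hypothetical prompt):

Use three subagents in parallel:
- one to audit dependencies for known vulnerabilities
- one to profile the slowest API endpoint
- one to summarize open TODO comments
Report only the combined findings back to me.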

Agent Teams (experimental):

  • Teammates have their own context AND can message each other directly
  • Shared task list: teammates self-assign and coordinate
  • Higher token cost
  • Best for: complex work requiring discussion, debate, collaboration

Subagent model:              Agent Team model:

Main agent                   Lead agent
  ├── subagent A               ├── teammate A ←→ teammate B
  ├── subagent B               ├── teammate B ←→ teammate C
  └── subagent C               └── teammate C
       ↑ report up only             ↑ talk to each other

The Architecture: A 5-Agent AI Software Company

You
└── orchestrator (delegate mode)
      ├── researcher    → gathers context, reads docs, searches web
      ├── coder         → implements features
      ├── reviewer      → reads code, finds issues (read-only)
      ├── qa-tester     → runs tests, reports failures
      └── writer        → docs, changelogs, READMEs

Building Each Agent

Orchestrator

markdown
---
name: orchestrator
description: Project coordinator. Breaks down tasks, delegates to specialists, synthesizes results. Does not write code directly.
permissionMode: delegate
model: sonnet
memory: project
---

You are a senior project coordinator. Plan and delegate, never implement.
Available specialists: researcher, coder, reviewer, qa-tester, writer.
Update your memory with project patterns and team performance.

Researcher

markdown
---
name: researcher
description: Gathers context, reads documentation, searches the web. Use before implementing anything new.
tools: Read, Grep, Glob, WebSearch, WebFetch
model: haiku
memory: user
skills:
  - research-methodology
---

Output always as: Key findings / Relevant files / Open questions / Next steps.
Check memory before researching. Update memory with reliable sources.

Coder

markdown
---
name: coder
description: Implementation specialist. Writes clean, tested code following team conventions.
tools: Read, Write, Edit, Bash, Grep, Glob
model: sonnet
memory: project
skills:
  - coding-standards
  - error-handling-patterns
hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: ".claude/hooks/guard.sh"
  Stop:
    - hooks:
        - type: command
          command: ".claude/hooks/require-tests-pass.sh"
---

Read existing code before implementing. Run tests before finishing.
Update memory with new patterns you discover about this codebase.

Reviewer (read-only)

markdown
---
name: reviewer
description: Reviews code for quality, security, correctness. Read-only.
tools: Read, Grep, Glob, Bash
disallowedTools: Write, Edit
model: sonnet
skills:
  - code-review-checklist
---

Output: Critical Issues / Warnings / Suggestions / Verdict: APPROVE or REQUEST CHANGES

QA Tester

markdown
---
name: qa-tester
description: Runs test suites and reports results. Use to verify implementations.
tools: Bash, Read
model: haiku
hooks:
  Stop:
    - hooks:
        - type: command
          command: ".claude/hooks/require-tests-pass.sh"
---

Run tests. If failing: report which tests, error messages, likely root cause.
Never mark work complete if tests are failing.

Skills as Employee Training Manuals

Skills are preloaded into subagents at startup, giving them institutional knowledge without you re-explaining it every session.
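
On disk, each skill is a markdown file with YAML frontmatter under `.claude/skills/` (the layout below is a sketch):

.claude/
└── skills/
    ├── coding-standards/
    │   └── SKILL.md
    └── code-review-checklist/
        └── SKILL.md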

coding-standards (for coder):

markdown
---
name: coding-standards
user-invocable: false
---

- All functions must have type hints
- Never catch bare `except:` - specify the exception type
- All new functions need a corresponding test
- Use `logger.info()` not `print()`
- DB queries go through `lib/db/` only

code-review-checklist (for reviewer):

markdown
---
name: code-review-checklist
user-invocable: false
---

Security: no hardcoded creds, input validation, parameterized queries, auth on all endpoints
Quality: functions under 50 lines, no duplication, errors handled explicitly, meaningful tests

Hooks as Company Policies

Hooks enforce policies automatically. No agent can bypass them.

guard.sh - blocks dangerous commands:

Bash
#!/bin/bash
COMMAND=$(jq -r '.tool_input.command')

if echo "$COMMAND" | grep -qE 'rm -rf|rmdir'; then
  jq -n '{hookSpecificOutput: {hookEventName: "PreToolUse", permissionDecision: "deny", permissionDecisionReason: "File deletion requires explicit approval"}}'
  exit 0
fi

if echo "$COMMAND" | grep -q 'git push.*main'; then
  jq -n '{hookSpecificOutput: {hookEventName: "PreToolUse", permissionDecision: "deny", permissionDecisionReason: "Direct push to main not allowed. Create a PR."}}'
  exit 0
fi

exit 0
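
You can dry-run the same pattern checks outside Claude Code. This sketch feeds two sample commands through the grep rules above (the commands are hypothetical; the real hook receives a JSON payload on stdin):

```shell
#!/bin/sh
# Run each sample command through guard.sh's pattern check
for CMD in "rm -rf build/" "git status"; do
  if echo "$CMD" | grep -qE 'rm -rf|rmdir'; then
    echo "deny: $CMD"
  else
    echo "allow: $CMD"
  fi
done
```

Running it prints `deny:` for the deletion command and `allow:` for the harmless one, confirming the regex catches what you expect before you wire it into a hook.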

require-tests-pass.sh - prevents finishing if tests fail:

Bash
#!/bin/bash
INPUT=$(cat)

# Avoid an infinite loop if this hook already blocked a previous Stop
STOP_HOOK_ACTIVE=$(echo "$INPUT" | jq -r '.stop_hook_active')
if [ "$STOP_HOOK_ACTIVE" = "true" ]; then
  exit 0
fi

RESULT=$(npm test 2>&1)
if [ $? -ne 0 ]; then
  # Exit code 2 blocks the stop; stderr is fed back to the agent
  echo "Tests are failing. Fix them before completing: $RESULT" >&2
  exit 2
fi

exit 0

Persistent Memory: Agents That Learn

With memory enabled, each agent writes to MEMORY.md, which is injected into its system prompt in every future session. Over weeks, your agents build genuine institutional knowledge.

Researcher's memory after a few weeks:

markdown
## Reliable Sources
- docs.anthropic.com - authoritative for Claude API
- MDN Web Docs - best for web APIs

## This Project's Architecture
- Auth: JWT, 7-day expiry, stored in Redis
- DB: PostgreSQL via SQLAlchemy ORM

## Common Research Patterns
- Always check /docs folder before searching the web

Coder's memory after a few weeks:

markdown
## Codebase Conventions
- Tests mirror src/ structure in tests/
- Use factory_boy for test fixtures
- All API responses use the APIResponse wrapper class

## Recurring Issues
- Always add new routes to router.py
- Use @pytest.mark.db for tests that hit the database

Agent Teams: When Agents Need to Debate

Enable in settings:

JSON
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

Use when workers need to challenge each other's findings:

Create an agent team to audit the authentication system.
Spawn 3 teammates:
- Security vulnerability hunter
- Performance analyst
- Devil's advocate (challenges the others' findings)
Have them debate and produce a consensus report.

Use TeammateIdle hooks to enforce quality gates before any teammate stops working.
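
Following the same shape as the Stop hooks above, a TeammateIdle gate in settings might look like this (a sketch only; the feature is experimental and the exact configuration may differ):

JSON
{
  "hooks": {
    "TeammateIdle": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/hooks/require-tests-pass.sh" }
        ]
      }
    ]
  }
}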


Putting It All Together

You: "Build a user authentication feature with email/password login,
      JWT tokens, and a /logout endpoint."

Orchestrator breaks this down:
  → researcher: understand existing auth patterns
  → coder: implement the endpoints
  → reviewer: review the implementation
  → qa-tester: run the full test suite
  → writer: update the API docs

[Each specialist works in isolation]
[Hooks block bad output automatically]
[Memory means agents already know your codebase conventions]

You: receives a summary of what was built, reviewed, tested, documented

One directive. Fully autonomous execution.


When to Use What

Situation                        | Use
---------------------------------|--------------------------------------------
Quick focused task               | Single subagent
Sequential pipeline              | Chain subagents from the main conversation
Parallel independent tasks       | Multiple subagents simultaneously
Agents need to debate/coordinate | Agent teams
Repeated domain knowledge        | Skills preloaded into subagents
Automated quality control        | Hooks on PreToolUse / Stop
Agents learning over time        | Persistent memory
