A spec-driven, agentic development system for Claude Code. Turn vague ideas into working software with persistent memory, specialist agents, and automatic quality gates.
Each command orchestrates multiple specialist agents, reads persistent memory, and enforces quality gates. You provide intent. The system handles everything else.
An orchestrator reads project memory, classifies your request, routes to the right specialist, and enforces gates between phases.
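The routing-with-gates idea can be sketched roughly as follows. This is a minimal illustration, not the system's actual orchestrator: the phase names, their order, `ProjectState`, and `route` are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical phase order, inferred from the roles described in this document.
PHASES = ["spec", "architecture", "build", "review", "qa", "release"]


@dataclass
class ProjectState:
    """Phases whose gate has already been passed (a stand-in for project memory)."""
    completed: set = field(default_factory=set)


def route(requested: str, state: ProjectState) -> str:
    """Return the phase to run: the requested one, or the earliest phase still blocking it."""
    if requested not in PHASES:
        raise ValueError(f"unknown phase: {requested}")
    for earlier in PHASES[: PHASES.index(requested)]:
        if earlier not in state.completed:
            return earlier  # gate not passed, so route back to the missing phase
    return requested


state = ProjectState()
print(route("build", state))  # spec
state.completed.update({"spec", "architecture"})
print(route("build", state))  # build
```

A build request on an empty project is redirected to the spec phase. Once the earlier gates have passed, the same request goes straight through.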
Each agent has a single responsibility, clear inputs/outputs, and strict rules about what it can and cannot do.
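One way to picture "clear inputs/outputs and strict rules" is a small contract object per agent. The sketch below is hypothetical: `AgentContract`, `check_write`, and the artifact labels are invented for illustration and are not taken from the system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContract:
    """One agent's boundaries: what it reads and what it is allowed to produce."""
    name: str
    inputs: frozenset
    outputs: frozenset

    def check_write(self, artifact: str) -> None:
        """Raise if the agent tries to produce something outside its role."""
        if artifact not in self.outputs:
            raise PermissionError(f"{self.name} may not write {artifact}")


# Artifact labels here are illustrative, not the system's real file names.
builder = AgentContract(
    name="builder-agent",
    inputs=frozenset({"spec", "architecture", "task"}),
    outputs=frozenset({"code", "tests"}),
)

builder.check_write("code")  # allowed, returns None
try:
    builder.check_write("spec")  # the builder must not redefine requirements
except PermissionError as err:
    print(err)  # builder-agent may not write spec
```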
Routes work to the right agent. Enforces gates between phases. Maintains project state. Never lets builder redefine requirements. Model: opus.
Translates vague ideas into structured specs with acceptance criteria. Makes reasonable assumptions instead of asking 10 questions. Model: sonnet.
Designs the simplest viable architecture. Every component traces to a spec requirement. No overengineering. Model: opus.
Implements one task at a time against approved spec and architecture. No silent scope creep. Tests required for core behavior. Model: sonnet.
Skeptical code review. Classifies findings as must-fix, should-fix, or optional. Checks security, edge cases, spec alignment. Model: opus.
Verifies work against documented acceptance criteria. "Code running" is not enough: observable behavior must match the spec. Model: sonnet.
Prepares release readiness. Deploy checklists, env var docs, rollback plans, monitoring notes. Flags missing items as blockers. Model: sonnet.
Seven markdown files in agent-memory/ persist project state across sessions. No more re-explaining context. Every session starts by reading these files.
Identity: name, goal, users, success metrics, constraints, non-goals
Requirements: problem, user flows, functional/non-functional reqs, acceptance criteria, risks
Design: system diagram, components, data flow, interfaces, tradeoffs, phases
Decision log: context, decision, alternatives, rationale, consequences
Work tracking: backlog, in progress, blocked, done
Quality: acceptance criteria pass/fail, test coverage, regression risks
Current state: objective, last step, blockers, assumptions, next actions
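As a rough illustration of "every session starts by reading these files", a session could load the memory directory like this. The helper name `load_memory` is an assumption. It reads whatever markdown files it finds instead of hard-coding file names, since only the `agent-memory/` directory is named here.

```python
from pathlib import Path


def load_memory(memory_dir: str = "agent-memory") -> dict:
    """Read every markdown file in the memory directory into a {filename: text} dict."""
    root = Path(memory_dir)
    if not root.is_dir():
        return {}  # nothing persisted yet, e.g. before the project is bootstrapped
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(root.glob("*.md"))
    }


memory = load_memory()
print(f"loaded {len(memory)} memory file(s)")
```

Returning an empty dict for a missing directory lets a first session proceed without special-casing a fresh project.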
Each command is a complete workflow that reads memory, routes to agents, and updates state.
The system enforces these automatically. No human discipline required.
Builder-agent refuses to start if acceptance criteria don't exist. Routes to spec-agent first.
Every /build-task runs reviewer-agent. Must-fix items block completion and go back to builder.
QA-agent verifies each acceptance criterion. Any FAIL sends work back. "It runs" is not enough.
Builder only implements what's in the current task. Extras go to TASKS.md backlog, not into the code.
Agents make reasonable assumptions and log them in DECISIONS.md. They escalate only when a wrong guess would create material risk.
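The spec, review, and QA gates above can be read as one decision function over a task's state. The following is a sketch under assumptions: the `Task` and `Finding` shapes, the `reviewed` flag, and the lowercase agent labels are illustrative and not the system's real data model.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    severity: str  # "must-fix", "should-fix", or "optional"
    note: str


@dataclass
class Task:
    acceptance_criteria: list = field(default_factory=list)
    reviewed: bool = False
    findings: list = field(default_factory=list)
    qa_results: dict = field(default_factory=dict)  # criterion -> "PASS" or "FAIL"


def next_step(task: Task) -> str:
    """Decide who handles the task next, applying each gate in order."""
    if not task.acceptance_criteria:
        return "spec-agent"  # spec gate: no criteria, no build
    if not task.reviewed:
        return "reviewer-agent"  # every build is reviewed
    if any(f.severity == "must-fix" for f in task.findings):
        return "builder-agent"  # review gate: must-fix items block completion
    results = [task.qa_results.get(c) for c in task.acceptance_criteria]
    if "FAIL" in results:
        return "builder-agent"  # QA gate: any FAIL sends work back
    if None in results:
        return "qa-agent"  # some criteria are still unverified
    return "done"


task = Task(acceptance_criteria=["share link is copied to clipboard"], reviewed=True)
task.findings.append(Finding("should-fix", "rename helper"))
print(next_step(task))  # qa-agent
```

Only must-fix findings block the task. Should-fix and optional findings let it continue to QA.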
Seven specialist roles, not twenty personas. Each has one job and clear boundaries.
Memory files, not conversation history. Every session reads the same persistent state.
Requirements and acceptance criteria must exist before implementation starts.
No task completes without review and QA. "It works" is insufficient.
Document reasonable assumptions. Ask only when a wrong guess would create material risk.
One command bootstraps the entire system into any repo. Agents are project-agnostic.
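The assumption-logging principle above could look something like the sketch below. The entry fields follow the decision-log description (context, decision, alternatives, rationale, consequences). The markdown layout, the function names, the default path under `agent-memory/`, and the choice to ask before logging when risk is material are all assumptions for illustration.

```python
from pathlib import Path


def format_decision(context, decision, alternatives, rationale, consequences):
    """Render one decision-log entry; the markdown layout is an assumption."""
    lines = [
        f"## {decision}",
        f"- Context: {context}",
        f"- Alternatives: {', '.join(alternatives) or 'none recorded'}",
        f"- Rationale: {rationale}",
        f"- Consequences: {consequences}",
    ]
    return "\n".join(lines) + "\n"


def record_or_escalate(entry, material_risk, log_path="agent-memory/DECISIONS.md"):
    """Ask a human when a wrong guess is costly; otherwise log the assumption and continue."""
    if material_risk:
        return "escalate"  # nothing is logged until the human answers
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as log:
        log.write(entry + "\n")
    return "proceed"


entry = format_decision(
    context="Spec does not name a clipboard library",
    decision="Use the platform clipboard command",
    alternatives=["third-party clipboard package"],
    rationale="Fewer dependencies",
    consequences="Needs a per-OS code path",
)
print(entry)
```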
Four commands from empty directory to structured project with specs, architecture, and a task backlog.
# Create your project
mkdir ~/my-app && cd ~/my-app && git init

# In Claude Code:
/start-project A CLI that watches for screenshots and uploads to R2 with a share link

# Plan the first feature
/plan-feature File watcher + R2 upload with clipboard copy

# Build it (auto review + QA)
/build-task

# Ready to ship?
/ship-check