Building an Automated Architecture Reviewer with Claude Opus 4.6
We built an Architect agent — a fully autonomous code reviewer powered by Claude Opus 4.6 — that explores a repository, runs audits, and produces a comprehensive architecture report. One command, zero human intervention, a professional-grade review in under 10 minutes.
This article covers how the agent is structured, how it leverages Anthropic's agentic tool-use loop, and what we learned shipping it.
Why an Architect Agent?
Architecture reviews are expensive. They require a senior engineer to read thousands of lines of code, trace data flows, audit dependencies, and synthesise findings into actionable recommendations. For a solo-contributor project like ours, that means reviewing your own work — the worst kind of audit.
We wanted something that could do a cold-start review of any repo and produce a structured report covering security, performance, reliability, code quality, and a prioritised roadmap. Claude Opus 4.6's extended thinking and tool-use capabilities made this possible.
Agent Architecture
The Architect agent is a single-turn agentic loop — one user prompt triggers a multi-step autonomous workflow where the model decides which tools to call and in what order.
The Runner Script
The entry point is a TypeScript script that configures the agent and streams its execution:
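A condensed sketch of what such a runner looks like. To keep it self-contained, `streamAgent` here is a stub generator standing in for the real SDK-driven loop, and the event shape is our assumption; the timeout, SIGINT handling, and tool-usage tracking mirror the design decisions listed below.

```typescript
// Sketch of a runner like scripts/run-architect-agent.ts. The streamAgent stub
// stands in for the real Anthropic-SDK loop so the runner shape is visible.
type AgentEvent =
  | { type: "tool_use"; name: string; input: unknown }
  | { type: "text"; text: string }
  | { type: "done"; turns: number };

// Stub: the real generator drives the model and yields events as they stream.
async function* streamAgent(_opts: { signal: AbortSignal }): AsyncGenerator<AgentEvent> {
  yield { type: "tool_use", name: "glob", input: { pattern: "**/*.ts" } };
  yield { type: "tool_use", name: "bash", input: { command: "pnpm audit" } };
  yield { type: "done", turns: 2 };
}

export async function runArchitect() {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 10 * 60 * 1000); // 10-minute hard cap
  const onSigint = () => controller.abort(); // graceful Ctrl-C
  process.on("SIGINT", onSigint);

  const toolUsage = new Map<string, number>(); // per-tool call counts
  const startedAt = Date.now();
  let turns = 0;

  try {
    for await (const event of streamAgent({ signal: controller.signal })) {
      if (event.type === "tool_use") {
        toolUsage.set(event.name, (toolUsage.get(event.name) ?? 0) + 1);
        // Log a short JSON preview of the tool input for visibility.
        console.log(`[tool] ${event.name}`, JSON.stringify(event.input).slice(0, 120));
      } else if (event.type === "done") {
        turns = event.turns;
      }
    }
  } finally {
    clearTimeout(timer);
    process.off("SIGINT", onSigint);
  }

  // Metadata for the summary JSON written alongside the report.
  return { turns, durationMs: Date.now() - startedAt, toolUsage: Object.fromEntries(toolUsage) };
}
```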
Key design decisions in the runner:
- Timeout + SIGINT — a 10-minute hard timeout and graceful SIGINT handling prevent runaway costs. The `AbortController` propagates cancellation into the streaming loop.
- Tool usage tracking — every tool call is counted in a `Map<string, number>` and logged with a JSON preview, giving visibility into what the agent is actually doing.
- Summary JSON — alongside the markdown report, a JSON file captures metadata (timestamp, byte count, turn count, duration, tool usage breakdown) for programmatic consumption.
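As an illustration, the summary JSON might look like this; the fields follow the metadata listed above, but the exact field names and values here are ours, not the production schema.

```json
{
  "timestamp": "2025-01-15T10:42:00Z",
  "reportBytes": 12288,
  "turns": 31,
  "durationMs": 412000,
  "toolUsage": { "glob": 4, "grep": 9, "read_file": 11, "bash": 5, "write_file": 2 }
}
```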
The Anthropic Agent SDK Pattern
The core of the system is streamAgent — a generator function that implements Anthropic's agentic loop pattern. Here's how it works:
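A minimal sketch of that loop, with `callModel` and `runTool` injected as stand-ins for the real SDK calls (names and types are ours, not the SDK's):

```typescript
// Hedged reconstruction of the agentic loop: the model is asked for a turn,
// any requested tools are executed, results are appended, and the loop repeats.
type ToolCall = { name: string; input: unknown };
type ModelTurn = { stopReason: "tool_use" | "end_turn"; text: string; toolCalls: ToolCall[] };
type Message = { role: string; content: string };

async function* streamAgent(opts: {
  callModel: (messages: Message[]) => Promise<ModelTurn>;
  runTool: (call: ToolCall) => Promise<string>;
  maxTurns: number;
}) {
  const messages: Message[] = [];
  for (let turn = 1; turn <= opts.maxTurns; turn++) {
    const reply = await opts.callModel(messages);        // model responds, possibly requesting tools
    messages.push({ role: "assistant", content: reply.text });
    yield { turn, stopReason: reply.stopReason };        // stream progress to the runner
    if (reply.stopReason === "end_turn") return;         // model signalled it is finished
    for (const call of reply.toolCalls) {
      const result = await opts.runTool(call);           // execute each requested tool
      messages.push({ role: "user", content: result });  // append result; extended conversation goes back
    }
  }
}
```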
This is the standard Anthropic tool-use loop: the model generates a response, the SDK executes any tool calls, appends results, and sends the extended conversation back. The loop continues until the model emits end_turn or the turn limit is reached.
Tool Configuration
The agent has access to six tools — chosen to give it full read access to the codebase plus the ability to run shell commands and write the final report:
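The sketch below shows one plausible tool set in Anthropic's tool-schema shape (`name`, `description`, `input_schema`). Note that `glob`, `grep`, and `bash` are named later in this article; the read, list, and write tool names are our assumptions.

```typescript
// Assumed tool set: glob/grep/bash appear in the article; the rest are guesses.
// Helper for a one-string-parameter JSON schema.
const oneString = (param: string) => ({
  type: "object",
  properties: { [param]: { type: "string" } },
  required: [param],
});

export const ARCHITECT_TOOLS = [
  { name: "read_file", description: "Read a file's contents", input_schema: oneString("path") },
  { name: "list_dir", description: "List a directory", input_schema: oneString("path") },
  { name: "glob", description: "Find files matching a glob pattern", input_schema: oneString("pattern") },
  { name: "grep", description: "Search file contents with a regex", input_schema: oneString("pattern") },
  { name: "bash", description: "Run a shell command (e.g. pnpm audit)", input_schema: oneString("command") },
  { name: "write_file", description: "Write the final report", input_schema: oneString("path") },
];
```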
The tool selection is intentionally minimal. The agent doesn't need code execution or test running — it's doing static analysis augmented by shell commands like pnpm audit, wc -l, and grep -c.
System Prompt Design
The system prompt is the most critical piece. buildArchitectSystemPrompt(outputFile) generates a structured prompt that describes the deliverable: which sections the report must cover, the expectation that findings cite specific files and quote the relevant code, and the path to write the finished report to.
The prompt doesn't micromanage the agent's exploration order — it describes the desired output and lets the model decide how to get there. This is key to the agentic pattern: you define the goal, not the steps.
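To make that principle concrete, here is a hedged reconstruction of what such a prompt builder might look like. The wording is ours, not the production prompt; the section list comes from the report structure described earlier.

```typescript
// Hypothetical reconstruction: the prompt describes the deliverable, not the steps.
export function buildArchitectSystemPrompt(outputFile: string): string {
  return [
    "You are a senior software architect performing a cold-start review of this repository.",
    "Explore the codebase however you judge best, then write a markdown report to " + outputFile + ".",
    "The report must cover security, performance, reliability, code quality,",
    "and a prioritised roadmap, with a severity rating for each finding.",
    "Cite specific files and quote the relevant code rather than relying on line numbers alone.",
    "Fill in every section; if one has no findings, say so explicitly.",
  ].join("\n");
}
```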
Agent Execution Flow
When run against our codebase (277 commits, ~40K LOC), the agent typically completes in 25–35 turns over 5–8 minutes. Its exploration follows a recognisable arc: it starts broad, listing directories and reading configuration, then drills into areas of concern, runs audits, and finally writes and revises the report.
The agent self-reviews — it reads its own output and revises it. This happens naturally within the agentic loop without any special prompting.
Configuration & Constants
The agent relies on shared constants that define models, tools, and effort levels:
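The article doesn't reproduce the constants module, so the following is a hedged sketch. `EFFORT_LEVELS`, `maxTurns: 40`, and the 10-minute timeout come from the text; the model id strings are placeholders, not real identifiers.

```typescript
// Values marked "assumed" are illustrative placeholders, not real model ids.
export const MODELS = {
  OPUS: "claude-opus-4-6",     // assumed id for Claude Opus 4.6
  SONNET: "claude-sonnet-4-5", // assumed id for the cheaper CI option
} as const;

export const EFFORT_LEVELS = {
  LOW: "low",
  MEDIUM: "medium",
  HIGH: "high", // enables extended thinking with Opus 4.6
} as const;

export const ARCHITECT_DEFAULTS = {
  model: MODELS.OPUS,
  effort: EFFORT_LEVELS.HIGH,
  maxTurns: 40,               // from the article: 20 rushes, 60 over-explores
  timeoutMs: 10 * 60 * 1000,  // 10-minute hard cap enforced by the runner
} as const;
```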
Using EFFORT_LEVELS.HIGH with Opus 4.6 enables extended thinking — the model takes more time to reason before each tool call, producing more thorough analysis.
Output: What the Agent Produces
The agent writes two files: ARCHITECTURE_REPORT.md, the report itself, and ARCHITECTURE_REPORT.json, the run metadata described above.
The markdown report from our latest run was ~12 KB covering 11 sections with specific file references, line numbers, and severity ratings. The agent identified issues we'd missed — like the fetch-all-then-filter anti-pattern in our resolvers and a missing auth check on a mutation.
Lessons Learned
What works well
Tool-use as exploration. Giving the agent glob, grep, and bash lets it explore a codebase the way a human would — starting broad, then drilling into areas of concern. It naturally runs pnpm audit without being told to.
Structured output via system prompt. Defining the report structure in the system prompt produces remarkably consistent output across runs. The agent fills in every section even when some are "nothing to report."
Self-review. The agent reads its own report and catches mistakes. On several runs, it rewrote entire sections after re-reading a file more carefully.
What to watch for
Turn budget matters. With maxTurns: 40, the agent has room to be thorough. At 20 turns, it rushes and skips sections. At 60, it over-explores and the timeout kicks in.
Cost awareness. A full run with Opus 4.6 HIGH effort costs roughly $2–5 depending on repo size. For CI integration, Sonnet 4.5 at MEDIUM effort produces 80% of the quality at 20% of the cost.
Hallucination risk on line numbers. The agent sometimes cites approximate line numbers. We mitigate this by having it quote the actual code in the report.
Running It Yourself
```bash
# Set your API key
echo 'ANTHROPIC_API_KEY="sk-ant-..."' >> .env

# Run the architect agent
npx tsx scripts/run-architect-agent.ts
```
The agent will explore your repo, run audits, and produce ARCHITECTURE_REPORT.md and ARCHITECTURE_REPORT.json in your repo root.
