How Claude Code Builds Its System Prompt: 18 Layers You Never See
When you write a prompt to Claude Code, you're not the only one talking. There are 18 sections assembled in strict order, a dynamic boundary that optimizes caching costs, and a coding philosophy that actively fights over-engineering. Here's what we found under the hood.
Claude Code is Anthropic's official CLI for Claude. It's the tool most developers use today to write code with AI assistance directly from the terminal. But the tool you see — the prompt, the response, the file edits — is just the visible layer of an industrial-grade prompt engineering system.
What you don't see is that every interaction you have with Claude Code is mediated by an 18-section system prompt assembled in strict order, with a dynamic boundary that splits the prompt into a cacheable part and a per-session part. There's an explicit risk taxonomy, negative instructions that fight the model's bad habits, and a memoization system that ensures the prompt isn't recalculated unnecessarily.
We know this because we reverse engineered the source code. This article is the first in a 6-part series where we take Claude Code apart piece by piece.
⚡ Article thesis
Claude Code's system prompt is not a static text — it's an assembly pipeline with 18 sections, each with its purpose, conditional gate, and caching strategy. Understanding how it's built reveals prompt engineering patterns you can apply to your own agents.
18 sections in strict order
The entry point is getSystemPrompt() in constants/prompts.ts. This function assembles the system prompt by concatenating memoized sections in an order that never changes. The first 12 sections are static and cacheable. The last 6 change per session.
Complete Assembly Pipeline
1. "You are Claude Code, Anthropic's official CLI for Claude."
2. Interactive vs headless mode — swaps "assist" for "complete"
3. Allows authorized pentesting; blocks DoS and supply chain attacks
4. "NEVER generate or guess URLs" — prevents accidental phishing
5. Active permission mode, lifecycle hooks, summarization rules
6. Pure anti-YAGNI: "Don't add features beyond what was asked for"
7. LOW/MEDIUM/HIGH risk taxonomy for every operation
8. When to use each tool, chaining, parallelization
9. Isolated context, inherited tools, disallowed tool list
10. Internal: "≤25 words between tools" — External: "Be terse"
11. Only active with TOKEN_BUDGET feature flag — shows per-turn consumption
12. Autonomous mode: tick-based wake-ups, push notifications, webhooks

══ `__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__` ══

13. CWD, git branch, OS, model, date, shell
14. Temporary directory path for intermediate files
15. CACHED_MICROCOMPACT — auto-clears old tool results
16. Custom instructions from each connected MCP server
17. Auto-extracted memories from previous conversations
18. User's project instructions — a writable extension of the system prompt
💡 Why does order matter? It's not arbitrary. Language models pay more attention to the beginning and end of system prompts (primacy and recency effects). Anthropic places identity at the top and user project instructions (CLAUDE.md) at the bottom — exactly where they'll carry the most weight.
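The assembly described above can be sketched in a few lines. Everything here is illustrative — the section names, shapes, and helper code are assumptions, not Anthropic's actual implementation — but it captures the core idea: a fixed-order concatenation with the boundary sentinel between the stable and per-session halves.

```typescript
// Illustrative sketch of the ordered concatenation in getSystemPrompt().
// Names and shapes are assumptions, not Anthropic's actual code.

type Section = { name: string; compute: () => string };

const DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

// The first 12 sections are stable across a session and therefore cacheable.
const staticSections: Section[] = [
  { name: "identity", compute: () => "You are Claude Code, Anthropic's official CLI for Claude." },
  { name: "security", compute: () => "Allows authorized pentesting; blocks DoS and supply chain attacks." },
  // ...ten more static sections in a fixed order
];

// The last 6 sections reflect the current environment and change per session.
const dynamicSections: Section[] = [
  { name: "environment", compute: () => `cwd: ${process.cwd()}` },
  // ...five more per-session sections
];

function getSystemPrompt(): string {
  const render = (sections: Section[]) => sections.map(s => s.compute()).join("\n\n");
  // The order never changes: static block, boundary sentinel, dynamic block.
  return [render(staticSections), DYNAMIC_BOUNDARY, render(dynamicSections)].join("\n\n");
}
```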
The Dynamic Boundary — The Caching Trick
At position 12–13 in the pipeline, there's a marker that's invisible to you:
`__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__`

This sentinel splits the system prompt into two zones with very different purposes:
Static Zone (above)
Sections 1–12. Cached in Anthropic's API across turns. Never re-tokenized each time you send a message.
Result: lower latency, fewer billed tokens, same functionality.
Dynamic Zone (below)
Sections 13–18. Change per session: working directory, date, git branch, CLAUDE.md content, memories.
Recalculated every turn because they reflect the current environment state.
Why is this an engineering trick? Because Anthropic's prompt caching works by prefix: if the first N tokens of a prompt match a previous request, the cache is reused. By placing all stable content before the boundary, Anthropic maximizes cache hits. Each turn of your conversation reuses ~70% of the system prompt, paying only discounted cache-read rates instead of the full input price.
📊 Applicable pattern: If you build agents with Anthropic's API, Claude, or any model with prompt caching, apply the same technique. Put all your stable prompt content first. Environment variables, dates, history — all at the end. Every cache hit saves tokens and latency.
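A minimal sketch of the pattern: split the assembled prompt at the boundary sentinel, mark the stable prefix as cacheable, and send the per-session suffix uncached. The block shape mirrors Anthropic's Messages API system blocks with `cache_control`; the splitting function itself is our own illustration, not Claude Code's code.

```typescript
// Split a system prompt at the boundary sentinel into a cacheable prefix
// and a per-turn suffix. The SystemBlock shape follows Anthropic's
// Messages API; the helper is an illustrative assumption.

const BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

function toSystemBlocks(fullPrompt: string): SystemBlock[] {
  const i = fullPrompt.indexOf(BOUNDARY);
  if (i === -1) return [{ type: "text", text: fullPrompt }];
  const staticPart = fullPrompt.slice(0, i).trimEnd();
  const dynamicPart = fullPrompt.slice(i + BOUNDARY.length).trimStart();
  return [
    // Stable prefix: the API caches everything up to and including this block.
    { type: "text", text: staticPart, cache_control: { type: "ephemeral" } },
    // Per-session suffix: re-tokenized every turn.
    { type: "text", text: dynamicPart },
  ];
}
```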
The Internal Caching System
Beyond API-level caching, Claude Code has an internal memoization system in systemPromptSections.ts. Each section of the system prompt is computed once and cached until you run /clear or /compact.
```typescript
// Memoized — cached until /clear or /compact
systemPromptSection(name: string, compute: () => string): string

// Recomputes EVERY turn — breaks cache (use sparingly)
DANGEROUS_uncachedSystemPromptSection(
  name: string,
  compute: () => string,
  reason: string
): string

// Called on /clear and /compact — also clears beta header latches
clearSystemPromptSections(): void
```
The name DANGEROUS_uncachedSystemPromptSection is not accidental. The DANGEROUS prefix is a deliberate signal to developers: using this variant breaks the prompt cache, increases latency, and costs more. It's only used for sections that genuinely change between turns (like environment information).
The Anti-YAGNI Philosophy
Section 6 in the pipeline — "Doing Tasks" — contains Claude Code's coding philosophy. It's remarkably opinionated. Instead of saying "write clean code," the prompt explicitly lists what NOT to do:
- Don't add features beyond what was asked for
- Don't add error handling for scenarios that can't happen
- Don't create helpers for one-time operations
- Don't add flexibility for "future needs" when not requested
This is pure anti-YAGNI — "You Aren't Gonna Need It." Anthropic has discovered that LLMs, left to their own devices, tend to over-engineer: they create unnecessary abstractions, add error handling for impossible scenarios, and build "just in case." Negative instructions fight these bad habits directly.
🏗️ Internal Build (Anthropic)
- Minimize code comments
- Be thorough with edge cases
- NEVER say "I can't" — always try
- Show the actual error, don't refuse preemptively
🌍 External Build (Public)
- Don't add features beyond what was asked
- Don't create helpers for one-time operations
- Don't add flexibility "just in case"
- More conservative instructions (more guardrails)
Anthropic's internal version is notably more aggressive: it pushes the model to try everything and show actual errors instead of refusing. The public version is more conservative because it operates with less technical users who need more protection.
The Risk Taxonomy
Section 7 — "Actions" — teaches the model a blast radius taxonomy. Instead of relying on the model's subjective judgment to decide what's dangerous, it provides concrete categories:
| Level | Examples | Action |
|---|---|---|
| LOW | Read files, search code, git status, format | Execute without asking |
| MEDIUM | Write/edit files, create branches, run builds | Execute with caution |
| HIGH | Delete files, force push, rm -rf, DROP TABLE | Require explicit confirmation |
Additionally, there's a subset of operations that always require confirmation regardless of the permission mode:
- Destructive commands — `rm -rf`, `DROP TABLE`, disk formatting
- Operations visible to others — `git push`, sending emails, publishing packages
- Hard-to-reverse changes — database migrations, force push, config changes outside the project
🎯 Design pattern: Instead of telling a model "be careful," give it a decision table. Explicit taxonomies are more effective than vague instructions because they eliminate ambiguity. If your agent needs to decide whether something is dangerous, don't ask it to "judge" — give it concrete categories.
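One way to turn such a taxonomy into code is a literal decision table. The patterns and helpers below are hypothetical — a sketch of the pattern, not Claude Code's implementation — but they show how concrete categories replace subjective "judgment":

```typescript
// Hypothetical decision table following the LOW/MEDIUM/HIGH taxonomy above.

type Risk = "LOW" | "MEDIUM" | "HIGH";

// Destructive or hard-to-reverse operations.
const HIGH_PATTERNS = [/\brm\s+-rf\b/, /\bgit\s+push\s+--force\b/, /\bDROP\s+TABLE\b/i];
// State-changing but routine operations.
const MEDIUM_PATTERNS = [/\bgit\s+checkout\s+-b\b/, /\bnpm\s+run\s+build\b/];
// Everything else (reads, searches, git status) defaults to LOW.

function classifyRisk(command: string): Risk {
  if (HIGH_PATTERNS.some(p => p.test(command))) return "HIGH";
  if (MEDIUM_PATTERNS.some(p => p.test(command))) return "MEDIUM";
  return "LOW";
}

function requiresConfirmation(command: string): boolean {
  // HIGH-risk operations always need explicit confirmation,
  // regardless of the active permission mode.
  return classifyRisk(command) === "HIGH";
}
```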
Conditional Sections and Their Gates
Not all 18 sections are always present. Several are controlled by feature flags that activate based on context:
| Section | Gate / Trigger | Effect |
|---|---|---|
| Token Budget | feature('TOKEN_BUDGET') | Shows token count per turn |
| KAIROS / Proactive | feature('PROACTIVE') | Autonomous mode: wake-ups, push, webhooks |
| Result Clearing | CACHED_MICROCOMPACT | Auto-clears old tool results |
| Coordinator Mode | Manual activation | Strips toolset to Agent + TaskStop + SendMessage |
| Plan Mode | User enters plan mode | Model proposes plans instead of executing |
| Simple Mode | CLAUDE_CODE_SIMPLE | Strips to Bash + Read + Edit only |
| Memory | Memory system enabled | Injects extracted memory content |
| MCP | MCP servers connected | Injects per-server instructions |
This means the system prompt doesn't have a fixed size. Depending on the user's configuration and active features, it can vary significantly. The memoization system ensures only active sections are assembled and cached.
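A sketch of flag-gated assembly, under the assumption that each conditional section is wrapped in a simple predicate. The flag names come from the table above; the `feature()` helper and section shapes here are hypothetical.

```typescript
// Hypothetical flag-gated prompt sections. Flag names are from the article's
// table; the feature() helper and shapes are illustrative assumptions.

const activeFlags = new Set<string>(["TOKEN_BUDGET"]); // e.g. loaded from config

const feature = (name: string): boolean => activeFlags.has(name);

interface GatedSection {
  name: string;
  gate: () => boolean;
  render: () => string;
}

const conditionalSections: GatedSection[] = [
  { name: "Token Budget", gate: () => feature("TOKEN_BUDGET"), render: () => "Show token count per turn." },
  { name: "Proactive", gate: () => feature("PROACTIVE"), render: () => "Autonomous mode instructions." },
];

// Only active sections are assembled — the prompt has no fixed size.
function assembleConditional(): string[] {
  return conditionalSections.filter(s => s.gate()).map(s => s.render());
}
```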
Patterns You Can Apply Today
What we've seen isn't just a curiosity exercise. There are concrete prompt engineering patterns you can extract from this design and apply to your own agents:
1. Separate static from dynamic prompt
If you use Anthropic's API, Claude, or any model with prompt caching, put all stable content at the beginning of your system prompt. Environment variables, dates, history — all at the end. Every cache hit saves tokens and latency.
2. Negative instructions (anti-patterns)
Instead of saying "write clean code," say "don't add unnecessary abstractions, don't create helpers for one-time operations." LLMs respond better to concrete prohibitions than vague aspirations.
3. Explicit risk taxonomy
If your agent executes real-world actions, don't tell it to "be careful." Give it a table: LOW = execute freely, MEDIUM = caution, HIGH = confirm. Concrete categories eliminate ambiguity.
4. Naming as documentation
DANGEROUS_uncachedSystemPromptSection screams "don't use this lightly." Function names are the first line of documentation — make them intentional.
5. Feature flags in the prompt
Claude Code doesn't have a monolithic prompt — it has conditional sections controlled by flags. This enables A/B testing of instructions, fast rollbacks, and context-based adaptation without rewriting the entire prompt.
A Prompt Is a System
Claude Code's system prompt is not a long text blurb pasted at the beginning of every conversation. It's an assembly system with 18 sections, each with its purpose, conditional gate, caching strategy, and reason to exist.
The difference between a "prompt" and a "prompt system" is the same as between a bash script and an application. Scale, maintainability, and the ability to evolve without breaking everything.
In the next article in this series, we'll explore the context compression system — how Claude Code manages conversations that exceed the context window without losing any user instructions. Four compression levels, token budgets, and emergency circuits.
📚 Claude Code Anatomy Series: This is article 1 of 6. The following articles cover context compression, memory systems, hidden features, the tool & permission system, and the dual build (internal vs public).
Building agents with LLMs?
At Cadences we design agents using the same industrial patterns we discovered in Claude Code. If you want to apply these principles to your business, let's talk.
Gonzalo Monzón
Founder of CadencesLab. Software engineer, multi-agent systems architect, and perpetual student of how machines think. This series is born from months of reverse engineering the tools we use daily.