Claude Code Source Code Leak! Key Findings

A deep dive into what the leaked Claude Code codebase reveals — from model codenames and pricing internals to virtual pets, undercover mode, and prompt engineering secrets.

filed: 2026·03·31
ref: NOTE·002
tags: ai · claude · engineering · dev
by: A.T.

Model Codenames & Future Lineup

The codebase reveals Anthropic's internal model codename tradition: "Capybara", "Tengu", and "Numbat" all appear as codenames. A referenced file, scripts/excluded-strings.txt, exists to keep codenames from leaking into external builds, and the @[MODEL LAUNCH] markers throughout the code point to a structured model launch process.

The model configs reveal the full model ladder including release dates:

Sonnet 4.5 — claude-sonnet-4-5-20250929 (Sep 29, 2025)

Opus 4.5 — claude-opus-4-5-20251101 (Nov 1, 2025)

Sonnet 4.6 & Opus 4.6 — latest, no date suffix (current frontier)

The frontier model is explicitly declared as Claude Opus 4.6 in constants/prompts.ts.

Undercover Mode 🕵️

One of the most fascinating features: utils/undercover.ts — when Anthropic employees contribute to public/open-source repos, Claude Code automatically enters "undercover mode":

"You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Your commit messages, PR titles, and PR bodies MUST NOT contain ANY Anthropic-internal information. Do not blow your cover."

It explicitly forbids mentioning model codenames (animal names like "Capybara", "Tengu"), internal repo names, Slack channels, and even the phrase "Claude Code". It also strips all attribution, and it activates automatically unless the repo matches an internal allowlist.

Companion/Buddy System — Virtual Pets for Developers

The buddy/ directory reveals a Tamagotchi-style companion system with a teaser window of April 1-7, 2026. Each user gets a procedurally generated pet that is determined by their user ID:

18 species: duck, goose, blob, cat, dragon, octopus, owl, penguin, turtle, snail, ghost, axolotl, capybara, cactus, robot, rabbit, mushroom, chonk

Rarity system: common/uncommon/rare/epic/legendary with weighted drops

Stats: DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK

ASCII art sprites with idle animations and hats (crown, tophat, propeller, halo, wizard, beanie, tinyduck)

1% chance of shiny variants

Speech bubbles, petting with floating hearts, companion personality
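
The note does not show how the pet is derived from the user ID, so the following is only a minimal sketch of one way a deterministic roll could work. The hash (FNV-1a), the PRNG (mulberry32), the rarity weights, and the names rollPet and pickRarity are all assumptions for illustration; only the species list, the rarity tiers, and the 1% shiny chance come from the list above.

```typescript
// Hypothetical sketch: hash, PRNG, weights, and function names are assumptions,
// not taken from the leaked code.
type Rarity = "common" | "uncommon" | "rare" | "epic" | "legendary";

const SPECIES = [
  "duck", "goose", "blob", "cat", "dragon", "octopus", "owl", "penguin", "turtle",
  "snail", "ghost", "axolotl", "capybara", "cactus", "robot", "rabbit", "mushroom", "chonk",
] as const;

// Assumed weights that sum to 100; the real drop table is not given in the note.
const RARITY_WEIGHTS: Array<[Rarity, number]> = [
  ["common", 60], ["uncommon", 25], ["rare", 10], ["epic", 4], ["legendary", 1],
];

// FNV-1a: a simple 32-bit string hash, used here only to turn the ID into a seed.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// mulberry32: a small seeded PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Walk the weight table until the roll falls inside a tier.
function pickRarity(roll: number): Rarity {
  let cursor = roll * 100;
  for (const [rarity, weight] of RARITY_WEIGHTS) {
    if (cursor < weight) return rarity;
    cursor -= weight;
  }
  return "common"; // not reached while the weights sum to 100
}

interface Pet { species: string; rarity: Rarity; shiny: boolean }

function rollPet(userId: string): Pet {
  const rand = mulberry32(fnv1a(userId));
  const species = SPECIES[Math.floor(rand() * SPECIES.length)];
  const rarity = pickRarity(rand());
  const shiny = rand() < 0.01; // 1% shiny chance, as listed above
  return { species, rarity, shiny };
}
```

Because every random draw comes from a generator seeded by the user ID, calling rollPet twice with the same ID returns the same pet, with no need to store it anywhere.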

Species names are obfuscated as hex character codes passed to String.fromCharCode(), because one of them collides with a model codename (capybara appears in both lists).
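
As a rough illustration of that obfuscation, the sketch below builds a species name from hex character codes so the literal word never appears in the source text. The helper name fromHexCodes is hypothetical, and the choice of "capybara" as the example is an assumption based on it appearing in both the species list and the codename list.

```typescript
// Hypothetical helper: assembles a string from character codes so that a
// plain-text scan of the bundle for the word itself finds nothing.
function fromHexCodes(...codes: number[]): string {
  return String.fromCharCode(...codes);
}

// c=0x63 a=0x61 p=0x70 y=0x79 b=0x62 a=0x61 r=0x72 a=0x61
const decoded = fromHexCodes(0x63, 0x61, 0x70, 0x79, 0x62, 0x61, 0x72, 0x61);
// decoded === "capybara"
```

This would let a check such as the excluded-strings list keep blocking the codename everywhere else while the pet species still ships.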

Pricing Internals

From utils/modelCost.ts:

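| Model | Input (per Mtok) | Output (per Mtok) | Notes |
| --- | --- | --- | --- |
| Opus 4.6 | $5 | $25 | $30/$150 in "fast mode" (6x) |
| Opus 4/4.1 | $15 | $75 | |
| Opus 4.5 | $5 | $25 | Cheaper than 4/4.1! |
| Sonnet 4/4.5/4.6 | $3 | $15 | |
| Haiku 4.5 | $1 | $5 | |
| Web search | $0.01/request | n/a | |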
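
To make the per-Mtok figures concrete, here is a small cost calculator built on the prices above. The model keys, the Price shape, and the function name estimateCostUsd are illustrative assumptions; only the dollar amounts and the 6x fast-mode multiplier for Opus 4.6 come from the note.

```typescript
// Prices are USD per million tokens, copied from the pricing figures above.
// The keys and field names are made up for this sketch.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
  fastMultiplier?: number; // only listed for Opus 4.6
}

const PRICES: Record<string, Price> = {
  "opus-4.6": { inputPerMTok: 5, outputPerMTok: 25, fastMultiplier: 6 },
  "opus-4.5": { inputPerMTok: 5, outputPerMTok: 25 },
  "opus-4.1": { inputPerMTok: 15, outputPerMTok: 75 },
  "sonnet-4.6": { inputPerMTok: 3, outputPerMTok: 15 },
  "haiku-4.5": { inputPerMTok: 1, outputPerMTok: 5 },
};

function estimateCostUsd(
  modelKey: string,
  inputTokens: number,
  outputTokens: number,
  fastMode: boolean = false,
): number {
  const price = PRICES[modelKey];
  if (price === undefined) throw new Error("unknown model key: " + modelKey);

  // Fast mode only changes the price for models that list a multiplier.
  let multiplier = 1;
  if (fastMode && price.fastMultiplier !== undefined) multiplier = price.fastMultiplier;

  const baseCost =
    (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1e6;
  return baseCost * multiplier;
}

// 200k input + 50k output tokens on Opus 4.6:
const normalCost = estimateCostUsd("opus-4.6", 200000, 50000); // 2.25
const fastCost = estimateCostUsd("opus-4.6", 200000, 50000, true); // 13.5
```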

Client Attestation (Anti-Piracy)

From constants/system.ts: Bun's native HTTP stack overwrites a cch=00000 placeholder in request bodies with a computed attestation hash. The server verifies this to confirm requests come from a real Claude Code client — implemented in Zig (bun-anthropic/src/http/Attestation.zig).

Coordinator/Multi-Agent System

The coordinator code reveals a full multi-agent orchestration system in which Claude Code can spawn "worker" agents, send messages between them, create teams, and manage parallel execution. A SendMessage tool handles inter-agent communication over UDS sockets and bridge sessions, and it can broadcast to all teammates.

Fun Hidden Commands

| Command | Description |
| --- | --- |
| /stickers | Opens stickermule.com/claudecode to order Claude Code stickers |
| /mobile | QR codes to download the Claude mobile app |
| /desktop | Desktop app handoff |
| /thinkback | "Year in review" animation (like Spotify Wrapped for coding) |
| /good-claude | Stubbed out (isEnabled: false) hidden command |
| /ultraplan | Spawns a remote CCR session for collaborative planning (30-min timeout) |
| /buddy | The companion pet system |
| /advisor | Configure an advisor model that supervises the main model |
| /voice | Voice mode (requires OAuth, uses claude.ai voice_stream endpoint) |
| Cron tools | CronCreate, CronDelete, CronList for scheduled autonomous tasks |

Hilarious Spinner Verbs

The loading spinner cycles through 180+ words including:

"Beboppin'", "Befuddling", "Bloviating", "Canoodling", "Clauding", "Combobulating", "Discombobulating", "Flibbertigibbeting", "Hullaballooing", "Lollygagging", "Prestidigitating", "Razzmatazzing", "Recombobulating", "Shenaniganing", "Tomfoolering", "Whatchamacalliting"

Internal vs External Build System

The code makes heavy use of process.env.USER_TYPE === 'ant', a build-time define, to gate Anthropic-internal features. The external build replaces the value with "external", and the bundler then dead-code-eliminates the internal branches. Many features visible in this code are ant-only and are A/B tested internally before shipping.
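
The gating pattern can be sketched as follows. This is a simplified stand-in: the function availableCommands is hypothetical, and the user type is passed as a parameter here rather than read from process.env so the example runs on its own. The command names are ones the note itself lists as external or ant-only.

```typescript
// Hypothetical sketch of a USER_TYPE gate. In the real build the bundler
// would inline the literal, turning the condition into "external" === "ant",
// which is constant false and lets the minifier drop the whole branch.
type UserType = "ant" | "external";

function availableCommands(userType: UserType): string[] {
  const commands = ["/stickers", "/mobile"];
  if (userType === "ant") {
    // Internal-only branch: absent from external bundles after dead-code elimination.
    commands.push("/version", "/files");
  }
  return commands;
}

const externalCommands = availableCommands("external"); // ["/stickers", "/mobile"]
const antCommands = availableCommands("ant"); // adds "/version" and "/files"
```

The practical effect of inlining at build time is that internal strings and logic are not merely disabled in the external build; they are not present at all.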

Prompt Engineering Deep Dive

The system prompt is a carefully layered, dynamically assembled document with distinct static (cacheable) and dynamic sections separated by a SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker. Everything before the boundary can use scope: 'global' for prompt caching across users; everything after is session-specific and recomputed.
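
A minimal sketch of the split, assuming the prompt is held as an ordered list of section strings with the marker as one element. The marker's literal value, the PromptBlock shape, and the function splitAtBoundary are assumptions made for illustration; only the marker name and the scope: 'global' idea come from the note.

```typescript
// The literal value of the marker is assumed; only its name appears in the note.
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

// Hypothetical block shape: scope "global" marks text cacheable across users.
interface PromptBlock {
  text: string;
  scope?: "global";
}

function splitAtBoundary(sections: string[]): PromptBlock[] {
  const index = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (index === -1) {
    // No marker: treat the whole prompt as session-specific.
    return [{ text: sections.join("\n\n") }];
  }
  return [
    { text: sections.slice(0, index).join("\n\n"), scope: "global" },
    { text: sections.slice(index + 1).join("\n\n") },
  ];
}

// Placeholder section bodies, not the real prompt text.
const blocks = splitAtBoundary([
  "Identity / Intro",
  "# System",
  SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
  "# Environment\ncwd: /tmp/project",
]);
// blocks[0] holds the static prefix with scope "global";
// blocks[1] holds the session-specific remainder with no scope.
```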

System Prompt Architecture

The prompt is built in constants/prompts.ts via getSystemPrompt() and assembled from sections in this order:

Identity / Intro (static)

# System — general behavioral rules

# Doing tasks — coding philosophy & style

# Executing actions with care — reversibility/blast radius rules

# Using your tools — tool selection guidance

# Tone and style

# Communicating with the user (ant) / # Output efficiency (external)

──── DYNAMIC BOUNDARY ────

# Session-specific guidance — agent tools, skills, explore agents

# Memory prompt — auto memory instructions + MEMORY.md content

Ant model override section

# Environment — CWD, git, OS, model info

# Language — if user configured a language preference

# Output Style — if non-default style active

MCP Server Instructions

Scratchpad directory instructions

Function result clearing

Tool results summarization

Numeric length anchors (ant-only)

Token budget instructions (feature-gated)

Brief/proactive section (KAIROS-gated)

Each section uses a systemPromptSection() wrapper that memoizes computation — the value is computed once and cached until /clear or /compact. Sections marked DANGEROUS_uncachedSystemPromptSection recompute every turn (e.g., MCP instructions, since servers can connect/disconnect between turns) — each of these requires a _reason string documenting why cache-breaking is necessary.
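
The memoization behaviour described here can be approximated as below. The function names match those mentioned in the note, but the signatures, the Section shape, the module-level cache, and clearSectionCache (standing in for what /clear or /compact would trigger) are assumptions.

```typescript
// Hypothetical signatures; only the wrapper names come from the note.
type Compute = () => string;

interface Section {
  name: string;
  resolve: () => string;
}

const sectionCache = new Map<string, string>();

// Computes the section once, then serves the cached value on later turns.
function systemPromptSection(name: string, compute: Compute): Section {
  return {
    name,
    resolve: () => {
      const hit = sectionCache.get(name);
      if (hit !== undefined) return hit;
      const value = compute();
      sectionCache.set(name, value);
      return value;
    },
  };
}

// Recomputes on every call. The reason argument is unused at runtime; it
// exists so each cache-breaking section documents why it needs to be one.
function DANGEROUS_uncachedSystemPromptSection(
  name: string,
  compute: Compute,
  _reason: string,
): Section {
  return { name, resolve: compute };
}

// Stand-in for the reset that /clear or /compact would perform.
function clearSectionCache(): void {
  sectionCache.clear();
}
```

Making the reason a required argument is a cheap way to force a justification into the code review for every section that opts out of caching.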

The Two Communication Styles: Ant vs External

External users get a terse block:

"Go straight to the point. Try the simplest approach first without going in circles. Do not overdo it. Be extra concise. Lead with the answer or action, not the reasoning. Skip filler words, preamble, and unnecessary transitions."

Ant users get a rich, prose-style writing guide — a mini essay on technical communication:

"When sending user-facing text, you're writing for a person, not logging to a console."

The design insight: they found quantitative anchors (≤25 words between tool calls, ≤100 words final response) reduce output tokens by ~1.2% vs qualitative "be concise," so ants get the numbers while externals get the qualitative version.

Model-Specific Counterweights

The code has @[MODEL LAUNCH] markers flagging sections tuned per model generation. For Capybara v8 specifically:

| Problem | Counterweight | Source |
| --- | --- | --- |
| Over-commenting code | "Default to writing no comments. Only add one when the WHY is non-obvious." | Model launch section |
| 29-30% false-claim rate (vs v4's 16.7%) | "Never claim 'all tests pass' when output shows failures, never suppress or simplify failing checks to manufacture a green result." | Model launch section |
| Insufficient thoroughness | "Before reporting a task complete, verify it actually works: run the test, execute the script, check the output." | PR #24302 |
| Insufficient assertiveness | "If you notice the user's request is based on a misconception, or spot a bug adjacent to what they asked about, say so." | PR #24302 |

The output efficiency section also carries the comment "@[MODEL LAUNCH]: Remove this section when we launch numbat", revealing Numbat as the next model generation after Capybara.

Coding Philosophy Prompt

The "Doing tasks" section is a manifesto against over-engineering:

"Don't add features, refactor code, or make 'improvements' beyond what was asked. A bug fix doesn't need surrounding code cleaned up."

The "Executing Actions with Care" Framework

This section defines a sophisticated reversibility/blast-radius framework for tool execution:

"Carefully consider the reversibility and blast radius of actions. Generally you can freely take local, reversible actions. But for actions that are hard to reverse, affect shared systems, or could be destructive, check with the user before proceeding."

Explicit examples of risky actions requiring confirmation:

rm -rf, git reset --hard, git push --force

Dropping database tables, killing processes

Creating/closing/commenting on PRs or issues

Sending messages (Slack, email, GitHub)

Uploading content to third-party web tools (may be cached/indexed)

Key principle: "A user approving an action once does NOT mean that they approve it in all contexts."

Undercover Mode Prompt Stripping

When isUndercover() is true, the system prompt is surgically stripped:

Model name/ID — completely removed from env info

Model family info — "The most recent Claude model family is Claude 4.5/4.6..." — suppressed

Claude Code availability info — suppressed

Fast mode description — suppressed

Knowledge cutoff date — still included (factual, not identifying)

The commit/PR prompt adds: "Write commit messages as a human developer would — describe only what the code change does."

Context Compaction (Summarization) Prompt

When context gets long, Claude uses a detailed summarization prompt (services/compact/prompt.ts) with a clever structure:

NO_TOOLS_PREAMBLE — An aggressive instruction repeated at the start AND end of the prompt: "CRITICAL: Respond with TEXT ONLY. Do NOT call any tools. Tool calls will be REJECTED and will waste your only turn — you will fail the task." This exists because on Sonnet 4.6+, the model sometimes attempts tool calls despite being told not to (2.79% failure rate on 4.6 vs 0.01% on 4.5).

<analysis> scratchpad — The model writes a drafting analysis in <analysis> tags that gets stripped before the summary enters context. This is a chain-of-thought technique where the thinking is useful for quality but discarded from the final output.

9-section structured summary — Primary Request, Key Technical Concepts, Files and Code Sections (with full snippets), Errors and Fixes, Problem Solving, All User Messages (verbatim), Pending Tasks, Current Work, Optional Next Step.

Partial compaction — Three variants: from (summarize recent messages), up_to (summarize prefix for cache hit), and full. The up_to variant adds a "Context for Continuing Work" section instead of "Next Step" since newer messages will follow.
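
The scratchpad stripping in particular is easy to picture in code. The sketch below assumes the model's raw output is a single string and that the tags are literal <analysis> and </analysis>; the function name stripAnalysis is hypothetical.

```typescript
// Hypothetical helper: removes every <analysis>...</analysis> block so only
// the structured summary is carried forward into the compacted context.
function stripAnalysis(rawOutput: string): string {
  // [\s\S] matches across newlines; the lazy quantifier stops at the first closing tag.
  return rawOutput.replace(/<analysis>[\s\S]*?<\/analysis>/g, "").trim();
}

const raw =
  "<analysis>\nThe user mostly asked about the login bug.\n</analysis>\n" +
  "1. Primary Request: fix the login bug";
const summary = stripAnalysis(raw);
// summary === "1. Primary Request: fix the login bug"
```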

Auto-Mode Classifier (YOLO Mode)

The "auto mode" uses an AI classifier (separate model call) to decide whether tool calls should be auto-approved or require user confirmation. The system has:

A base prompt loaded from auto_mode_system_prompt.txt

Separate permission templates for external (permissions_external.txt) and Anthropic-internal (permissions_anthropic.txt) users

User-customizable rules in three categories: allow, soft_deny, environment

REPLACE semantics for external users (your rules replace defaults) vs ADDITIVE semantics for ants (your rules append to stricter defaults)

CLAUDE.md context injection — the classifier sees the user's CLAUDE.md config wrapped in <user_claude_md> tags, treated as "part of the user's intent"

Transcript projection — assistant text is deliberately excluded from the classifier input because "assistant text is model-authored and could be crafted to influence the classifier's decision" (anti-prompt-injection)

A 2-stage XML classifier for more complex decisions: Stage 1 nudges toward an immediate decision ("Err on the side of blocking"), Stage 2 elicits reasoning

A critique command (claude auto-mode critique) that uses a side-query to review user rules for clarity, completeness, and conflicts
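
Two of the items above, the REPLACE versus ADDITIVE rule semantics and the transcript projection, can be sketched as follows. All type and function names are hypothetical, the example rule strings are invented, and two details are assumptions: that replacement happens per category, and that tool calls remain in the projected transcript while assistant text is dropped.

```typescript
const CATEGORIES = ["allow", "soft_deny", "environment"] as const;
type Category = typeof CATEGORIES[number];
type RuleSet = Record<Category, string[]>;

// External users: custom rules REPLACE the defaults for that category.
// Ant users: custom rules are APPENDED to the defaults.
// Assumption: a category the user leaves unset keeps its defaults either way.
function mergeRules(
  defaults: RuleSet,
  custom: Partial<RuleSet>,
  userType: "ant" | "external",
): RuleSet {
  const merged: RuleSet = { allow: [], soft_deny: [], environment: [] };
  for (const category of CATEGORIES) {
    const userRules = custom[category];
    if (userRules === undefined) {
      merged[category] = [...defaults[category]];
    } else if (userType === "ant") {
      merged[category] = [...defaults[category], ...userRules];
    } else {
      merged[category] = [...userRules];
    }
  }
  return merged;
}

interface TranscriptEntry {
  role: "user" | "assistant";
  kind: "text" | "tool_use";
  content: string;
}

// Drops assistant-authored text so it cannot be crafted to sway the classifier.
// Keeping tool_use entries is an assumption: they are what is being judged.
function projectTranscript(entries: TranscriptEntry[]): TranscriptEntry[] {
  return entries.filter((entry) => !(entry.role === "assistant" && entry.kind === "text"));
}
```

Under these assumptions, an external user who sets allow to a single rule ends up with exactly that rule, while an ant user with the same setting ends up with the defaults plus that rule.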

Coordinator Multi-Agent Prompt

The coordinator system prompt (coordinator/coordinatorMode.ts) is the most elaborate prompt in the codebase — a full project management operating manual:

Phase-based workflow: Research → Synthesis → Implementation → Verification

Parallelism is your superpower: "Launch independent workers concurrently whenever possible. Don't serialize work that can run simultaneously."

Synthesis is mandatory: "Never write 'based on your findings' or 'based on the research.' These phrases delegate understanding to the worker instead of doing it yourself."

Continue vs. Spawn decision table: High context overlap → continue existing worker. Low overlap → spawn fresh. Wrong approach → spawn fresh (to avoid anchoring).

Real verification: "Verification means proving the code works, not confirming it exists. A verifier that rubber-stamps weak work undermines everything."

Worker prompts must be self-contained: "Workers can't see your conversation. Every prompt must be self-contained with everything the worker needs."
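
The continue-versus-spawn rule reads naturally as a small decision function. The WorkerState shape and the name decideWorkerAction are illustrative assumptions; the three outcomes follow the decision table described above.

```typescript
type WorkerAction = "continue" | "spawn";

// Hypothetical summary of what the coordinator knows about an existing worker.
interface WorkerState {
  contextOverlap: "high" | "low";
  approachWasWrong: boolean;
}

function decideWorkerAction(state: WorkerState): WorkerAction {
  // A wrong approach always gets a fresh worker, to avoid anchoring on it.
  if (state.approachWasWrong) return "spawn";
  // Otherwise reuse the worker only when its context is still mostly relevant.
  return state.contextOverlap === "high" ? "continue" : "spawn";
}
```

Checking the wrong-approach flag first matters: a worker with high context overlap but a failed approach still gets replaced.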

Memory System Prompt

The memory system (memdir/memdir.ts) instructs Claude to maintain a persistent file-based memory at ~/.claude/projects/<slug>/memory/:

MEMORY.md as an index file (max 200 lines, 25KB) with one-line pointers to topic files

Typed memory taxonomy: user preferences, feedback, project context, reference

Anti-patterns explicitly forbidden: "Content derivable from the current project state (code patterns, architecture, git history) is explicitly excluded"

Clear separation of concerns: Memory vs Plans vs Tasks — memory is for future conversations, plans for alignment, tasks for current work tracking

DIR_EXISTS_GUIDANCE: "This directory already exists — write to it directly with the Write tool (do not run mkdir or check for its existence)" — added because Claude was wasting turns on ls/mkdir -p before writing
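
As an illustration of the MEMORY.md limits, the sketch below checks an index file's content against the 200-line and 25KB caps. The function names are hypothetical, and reading "25KB" as 25 * 1024 bytes of UTF-8 is an assumption; the note does not say how the limit is measured or enforced.

```typescript
const MAX_INDEX_LINES = 200;
const MAX_INDEX_BYTES = 25 * 1024; // assumes "25KB" means 25 * 1024 bytes

// Counts UTF-8 bytes without relying on any runtime-specific API.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++; // skip the low surrogate
    } else bytes += 3;
  }
  return bytes;
}

interface IndexCheck {
  ok: boolean;
  lines: number;
  bytes: number;
  problems: string[];
}

function checkMemoryIndex(content: string): IndexCheck {
  // A single trailing newline does not count as an extra line.
  const lines = content === "" ? 0 : content.replace(/\n$/, "").split("\n").length;
  const bytes = utf8ByteLength(content);
  const problems: string[] = [];
  if (lines > MAX_INDEX_LINES) problems.push("too many lines: " + lines);
  if (bytes > MAX_INDEX_BYTES) problems.push("too many bytes: " + bytes);
  return { ok: problems.length === 0, lines, bytes, problems };
}
```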

Companion Pet Prompt

The companion system (buddy/prompt.ts) adds a minimal prompt when a pet is active:

"A small [species] named [name] sits beside the user's input box and occasionally comments in a speech bubble. You're not [name] — it's a separate watcher. When the user addresses [name] directly (by name), its bubble will answer. Your job in that moment is to stay out of the way: respond in ONE line or less."

Proactive/Autonomous Agent Prompt

When in KAIROS proactive mode, Claude gets a completely different identity:

"You are an autonomous agent. Use the available tools to do useful work."

Key behavioral rules:

Terminal focus awareness: When terminal is unfocused, "lean heavily into autonomous action — make decisions, explore, commit, push." When focused, "be more collaborative — surface choices, ask before committing."

First wake-up: "Greet the user briefly and ask what they'd like to work on. Do not start exploring the codebase unprompted."

Idle ticks: "If you have nothing useful to do, you MUST call Sleep. Never respond with only a status message — that wastes a turn and burns tokens."

Bias toward action: "If you're unsure between two reasonable approaches, pick one and go. You can always course-correct."

Output Styles

Beyond the default, two built-in output styles reshape Claude's persona:

Explanatory mode — adds ★ Insight boxes with educational points about implementation choices:

★ Insight ─────────────────────────────────────
[2-3 key educational points]
─────────────────────────────────────────────────

Learning mode — a Socratic teaching mode where Claude pauses and asks the user to write 2-10 line code pieces for hands-on practice. It adds TODO(human) markers in the code and presents structured "Learn by Doing" blocks with context, task, and guidance. Only triggers for meaningful design decisions, not busy work.

Cyber Risk / Security Instruction

Owned by the Safeguards team (David Forsythe, Kyla Guru) with a "DO NOT MODIFY WITHOUT SAFEGUARDS TEAM REVIEW" warning:

"Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context."

Verification Agent Contract

An ant-only adversarial verification system for non-trivial implementations (3+ file edits, backend/API changes, infrastructure):

"Independent adversarial verification must happen before you report completion — regardless of who did the implementing. Your own checks, caveats, and a fork's self-checks do NOT substitute — only the verifier assigns a verdict."

On FAIL: fix and re-verify. On PASS: spot-check the verifier's report. On PARTIAL: report what passed and what couldn't be verified. The system is designed so Claude cannot self-assign a passing grade.
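
The contract can be summarized as two small functions: one deciding whether a change needs independent verification, and one mapping the verifier's verdict to the next step. The ChangeSummary shape, the step labels, and both function names are assumptions for illustration; the thresholds and the three verdicts come from the description above.

```typescript
type Verdict = "PASS" | "FAIL" | "PARTIAL";

// Hypothetical description of a finished implementation.
interface ChangeSummary {
  filesEdited: number;
  touchesBackendOrApi: boolean;
  touchesInfrastructure: boolean;
}

// Non-trivial means 3+ file edits, backend/API changes, or infrastructure.
function requiresVerification(change: ChangeSummary): boolean {
  return change.filesEdited >= 3 || change.touchesBackendOrApi || change.touchesInfrastructure;
}

// The implementer never produces a Verdict itself; it only reacts to one.
function nextStep(verdict: Verdict): string {
  switch (verdict) {
    case "FAIL":
      return "fix and re-verify";
    case "PASS":
      return "spot-check the verifier's report";
    case "PARTIAL":
      return "report what passed and what could not be verified";
  }
}
```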

Prompt Caching Strategy

The code uses an elaborate prompt caching mechanism:

Static sections before the boundary marker share a cache key across all users (scope: 'global')

A Blake2b hash of the static prefix determines cache hits

Session-variant content (tools, skills, explore agents, non-interactive mode) that was previously in the static section was intentionally moved after the boundary because each conditional would multiply cache variants by 2^N (referenced in PRs #24490, #24171)

DANGEROUS_uncachedSystemPromptSection sections get a mandatory reason string documenting why they need to bust the cache

MCP instructions were moved from DANGEROUS_uncached to delta-based attachments to avoid busting the ~20K token prompt cache on late MCP server connects
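
The 2^N point is worth seeing with numbers. The sketch below enumerates every on/off combination of a set of conditional sections and counts how many distinct static prefixes result; the function name and the placeholder section text are hypothetical.

```typescript
// Counts the distinct static prefixes produced when each conditional section
// may or may not be present. Each distinct prefix needs its own cache entry.
function countPrefixVariants(conditionalSections: string[]): number {
  const prefixes = new Set<string>();
  const combinations = Math.pow(2, conditionalSections.length);
  for (let mask = 0; mask < combinations; mask++) {
    // Bit i of the mask decides whether section i is included.
    const enabled = conditionalSections.filter((_, bit) => (mask & (1 << bit)) !== 0);
    const prefix = ["# Static rules"].concat(enabled.map((name) => "# " + name)).join("\n");
    prefixes.add(prefix);
  }
  return prefixes.size;
}

const withConditionals = countPrefixVariants([
  "tools", "skills", "explore agents", "non-interactive mode",
]); // 16
const afterMove = countPrefixVariants([]); // 1
```

With the four conditionals named above left in the static section, one shared prefix fragments into 16; moving them after the boundary brings it back to a single globally cacheable prefix.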

Knowledge Cutoff Dates

Hardcoded per model in getKnowledgeCutoff():

| Model | Knowledge Cutoff |
| --- | --- |
| Sonnet 4.6 | August 2025 |
| Opus 4.6 | May 2025 |
| Opus 4.5 | May 2025 |
| Haiku 4.x | February 2025 |
| Opus 4 / Sonnet 4 | January 2025 |

Ant-Only Features (Anthropic Internal)

Everything in this section is gated behind process.env.USER_TYPE === 'ant' and is invisible to external users.

Ant-Only Tools

| Tool | What it does |
| --- | --- |
| ConfigTool | Runtime config editing (feature flags, gates) |
| TungstenTool | Live tmux session that gives Claude a persistent, visible terminal panel. Captures terminal frames for the model. Has a "live monitor" UI component |
| REPLTool | Batched REPL mode that hides individual tools (Read, Write, Edit, Glob, Grep, Bash, Notebook, Agent) and forces Claude to use a single REPL tool for batch operations. Default-on for ants, opt-out with CLAUDE_CODE_REPL=0 |
| SuggestBackgroundPRTool | Proactively suggests creating background PRs |

Ant-Only Commands

| Command | Purpose |
| --- | --- |
| /version | Print exact build version + build time |
| /tag | Toggle searchable tags on a session |
| /files | List all files currently in context |
| /bridge-kick | Inject bridge failure states for manual testing of recovery paths (close, poll errors, register failures, heartbeat 401s, etc.) |
| /thinkback-play | Year-in-review animation player |

Ant-Only Skills

| Skill | Purpose |
| --- | --- |
| /stuck | Diagnose frozen/slow Claude Code sessions, posts results to #claude-code-feedback Slack channel (C07VBSHV7EV) |
| /debug | Enhanced version: ants get full event logging; externals just get "enable logging and diagnose" |

Ant-Only System Prompt Sections

Numeric length anchors: "Keep text between tool calls to ≤25 words. Keep final responses to ≤100 words" — 1.2% output token reduction vs qualitative "be concise"

Comment writing rules: "Default to writing no comments. Only add one when the WHY is non-obvious." (Capybara model over-comments)

Thoroughness counterweight: "Before reporting a task complete, verify it actually works: run the test, execute the script, check the output." (PR #24302)

Assertiveness counterweight: "If you notice the user's request is based on a misconception, or spot a bug adjacent to what they asked about, say so." (PR #24302)

False-claims mitigation: "Never claim 'all tests pass' when output shows failures..." — Capybara v8 has 29-30% FC rate vs v4's 16.7%

Rich communication style: Ants get long-form prose instructions about linear readability and "inverted pyramid" writing. Externals get a simple "be concise" block

Bug-reporting instructions: Ants are told to recommend /issue for model problems and /share for product bugs, and to offer to post ccshare links to the #claude-code-feedback Slack channel

KAIROS-Gated Features (Assistant Mode, ant-only A/B)

| Feature | Description |
| --- | --- |
| SendUserFileTool | Send files to the user |
| PushNotificationTool | Push notifications from background tasks |
| SleepTool | Let Claude sleep/wait (proactive agents) |
| SubscribePRTool | Subscribe to GitHub PR webhooks |
| BriefTool | Brief/summary mode for assistant |
| MonitorTool | Background monitoring |
| CronCreate/Delete/List | Scheduled autonomous tasks |
| RemoteTriggerTool | Remote triggers for agent |
| --session-id flag | Pass session IDs to bridge |

Ant-Only Analytics & Infra

GrowthBook feature flag overrides via /config Gates tab

First-party event logging with detailed event/experiment tracking

kairosActive metadata for BQ session segmentation

skillMode/observerMode metadata for cohort splits on tengu_backseat_* events

Attribution hooks — tracks commit attribution (git hooks)

/insights command — collects data from remote Coder homespaces via SSH/SCP, merging sessions from remote hosts

Numeric effort levels — ants can set numeric effort (externals only get string levels)

URLs in tool output — ant-only

Bash gh commands — ant-only (they make network requests)

Extra safe env vars — ant-only bash permission allowlist

Bridge auto-connect (CCR_AUTO_CONNECT) — automatic remote connection

SIGUSR2 → force reconnect for manual bridge testing

Internal beta header: cli-internal-2026-02-09