Anthropic Code Leak: What 512K Lines Reveal

Claude Code's 512K-line agentic harness was accidentally leaked. Here's what the TypeScript codebase reveals about how production AI agents work.

April 1, 2026 · 5 min read · by Agent-CoreX

On March 31, 2026, Anthropic accidentally published a 59.8 MB source map file alongside a routine update to the @anthropic-ai/claude-code npm package. The file pointed to a zip archive on Anthropic's own cloud storage containing the full source code of Claude Code — approximately 512,000 lines of TypeScript across 1,900 files.

Within hours, the code had been mirrored on GitHub and forked more than 41,500 times, spreading across the developer community before Anthropic could contain it.

This wasn't a breach in the traditional sense. It was a packaging error: a debug artifact that should never have shipped made it into version 2.1.88. Anthropic confirmed it and attributed it to human error, noting no customer data or credentials were exposed.

But what was exposed is arguably more interesting than customer data: the full architectural blueprint for a production-grade AI agent.

What Was Actually Leaked

The leaked code is not the Claude model itself. It's the agentic harness — the software layer that wraps the underlying LLM and controls how it interacts with tools, memory, and the outside world.

Anthropic's own post-leak statements confirmed this: "At least some of Claude Code's capabilities come not from the underlying large language model that powers the product but from the software harness that sits around the underlying AI model and instructs it how to use other software tools."

The harness is where the interesting engineering lives. What developers found in those 1,900 files:

  • 44 hidden feature flags for capabilities that are fully built but not yet shipped
  • 20 unreleased features, including several that fundamentally change how Claude Code operates
  • The complete system prompts governing Claude's behavior and safety constraints
  • A plugin-style tool architecture with a base tool definition spanning 29,000 lines
  • Internal model codenames: Capybara (Claude 4.6 variant), Fennec (Opus 4.6), Numbat (unreleased, still in testing)

The Agentic Harness Architecture

The clearest revelation is the architecture itself: every Claude Code capability is implemented as a discrete, permission-gated tool. The model doesn't have capabilities baked in — it has a tool registry, and each tool is defined with a name, description, input schema, and execution logic.

This is a plugin architecture. The model sits in the center and interacts with the world entirely through tool calls. The harness routes those calls, enforces permissions, manages context, and handles errors.
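
To make the pattern concrete, here is a minimal sketch of a permission-gated tool registry. It is not taken from the leaked source: the `Tool` interface, the `ToolRegistry` class, the `Permission` values, and the two example tools are all hypothetical names chosen for illustration, and the input schema is reduced to a simple field-to-type map.

```typescript
// Hypothetical sketch: none of these names come from the leaked source.
type Permission = "read" | "write" | "exec";

interface Tool {
  name: string;
  description: string;
  // Minimal stand-in for an input schema: required field -> expected typeof.
  inputSchema: Record<string, "string" | "number" | "boolean">;
  requires: Permission;
  execute(input: Record<string, unknown>): string;
}

type ToolResult =
  | { ok: true; output: string }
  | { ok: false; error: string };

class ToolRegistry {
  private tools = new Map<string, Tool>();
  private granted: Set<Permission>;

  constructor(granted: Set<Permission>) {
    this.granted = granted;
  }

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // Route a tool call: look up, check permission, validate input, execute.
  call(name: string, input: Record<string, unknown>): ToolResult {
    const tool = this.tools.get(name);
    if (!tool) return { ok: false, error: `unknown tool: ${name}` };
    if (!this.granted.has(tool.requires)) {
      return { ok: false, error: `permission denied: ${tool.requires}` };
    }
    for (const [field, expected] of Object.entries(tool.inputSchema)) {
      if (typeof input[field] !== expected) {
        return { ok: false, error: `invalid input: ${field} must be ${expected}` };
      }
    }
    try {
      return { ok: true, output: tool.execute(input) };
    } catch (err) {
      // Failures are returned as data so the caller can report them to the model.
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

// This session is granted read access only.
const registry = new ToolRegistry(new Set<Permission>(["read"]));
registry.register({
  name: "echo",
  description: "Return the given text unchanged.",
  inputSchema: { text: "string" },
  requires: "read",
  execute: (input) => String(input.text),
});
registry.register({
  name: "delete_file",
  description: "Pretend to delete a file (nothing is touched in this sketch).",
  inputSchema: { path: "string" },
  requires: "write",
  execute: () => "deleted",
});

const allowed = registry.call("echo", { text: "hi" });
const denied = registry.call("delete_file", { path: "notes.txt" });
```

In this sketch the `echo` call succeeds, while `delete_file` is rejected before its `execute` function runs because the session was never granted `write`. Returning errors as values rather than throwing is one possible design choice: it keeps every outcome in a shape that can be passed back to the model.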

This confirms what many in the AI community suspected but couldn't verify: the "intelligence" of a production AI agent is as much in the harness as in the model.

Three-Layer Memory Architecture

The leaked code reveals a three-layer memory system that explains Claude Code's reliability over long, multi-step work sessions:

  1. Working memory — the active context window, what the model is currently processing
  2. Session memory — indexed state maintained across tool calls within a session
  3. Persistent memory — long-lived knowledge that survives across sessions

A "Strict Write Discipline" rule governs transitions between layers: the agent must confirm a successful file write before updating its session index. This prevents the model from polluting its context with failed attempts — a source of subtle, hard-to-debug errors in naive implementations.
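
The write-then-index ordering described above can be sketched in a few lines. This is an illustration under stated assumptions, not the leaked implementation: `SessionIndex`, `recordWrite`, and the in-memory `fakeWrite` function are hypothetical, and a `Map` stands in for the filesystem so the example runs anywhere.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the leaked source.
type WriteFn = (path: string, content: string) => void;

class SessionIndex {
  private entries = new Map<string, string>(); // path -> short summary
  private write: WriteFn;

  constructor(write: WriteFn) {
    this.write = write;
  }

  // Perform the write first; record it in the index only if it did not throw.
  recordWrite(path: string, content: string, summary: string): boolean {
    try {
      this.write(path, content);
    } catch {
      return false; // failed write: the index is left untouched
    }
    this.entries.set(path, summary);
    return true;
  }

  has(path: string): boolean {
    return this.entries.has(path);
  }
}

// In-memory stand-in for a filesystem that rejects paths under "/readonly/".
const disk = new Map<string, string>();
const fakeWrite: WriteFn = (path, content) => {
  if (path.startsWith("/readonly/")) throw new Error(`EACCES: ${path}`);
  disk.set(path, content);
};

const index = new SessionIndex(fakeWrite);
const stored = index.recordWrite("/work/plan.md", "# Plan", "task plan");
const rejected = index.recordWrite("/readonly/out.md", "x", "never indexed");
```

Because the index is only updated after the write returns, the rejected path never appears in session state, so later steps cannot reason from a file that was never written.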

Unreleased Features: KAIROS and Undercover Mode

Two of the most-discussed unreleased features from the leak:

KAIROS is an autonomous daemon mode. Instead of requiring a user to actively prompt Claude, KAIROS runs Claude Code as an always-on background process. It can review its own previous sessions, extract learnings, and transfer them to future sessions — a persistent agent that improves without explicit training.

Undercover Mode is more controversial: explicit instructions directing Claude Code to scrub all AI-generated indicators from public git commit messages, making its contributions indistinguishable from human commits. The community reaction was mixed — some see it as a practical feature for teams that don't want AI attribution in their history; others raised concerns about transparency.

Why This Matters for AI Developers

The leak makes a strong case that tool orchestration, not model capability, is the primary engineering challenge in production AI agents.

The model is a commodity. What differentiates Claude Code — or any capable AI agent — is the harness: how tools are defined, selected, injected into context, and executed. How memory is managed across calls. How errors propagate and recover. How permissions are enforced.

This is also where the biggest inefficiency lives. In a plugin-style architecture, every registered tool adds its name, description, and input schema to the context. Claude Code's base tool definition alone spans 29,000 lines of source, which hints at how much tool surface there is to describe. At scale, that is a significant token cost per interaction, and it compounds across millions of requests.

Where Agent-CoreX Fits

Agent-CoreX was built around the same fundamental insight the leak confirms: the harness matters more than the model, and the harness's biggest solvable problem is tool context overhead.

Instead of loading all tool definitions into context on every request, Agent-CoreX's /retrieve_tools endpoint uses semantic vector search to return only the 3–5 tools relevant to a given query. The token savings are significant — typically 80–90% of the context overhead from tool definitions.
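
The general technique, top-k retrieval by vector similarity, can be sketched as follows. This is not the implementation behind /retrieve_tools: `retrieveTools`, `IndexedTool`, and the tool names are hypothetical, and the hand-written three-dimensional vectors stand in for embeddings that a real system would obtain from an embedding model.

```typescript
// Hypothetical sketch: toy vectors stand in for real embeddings, and
// retrieveTools is an illustrative name, not Agent-CoreX's API.
interface IndexedTool {
  name: string;
  embedding: number[];
}

// Cosine similarity between two vectors; returns 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every tool against the query and keep the k most similar names.
function retrieveTools(query: number[], tools: IndexedTool[], k: number): string[] {
  return tools
    .map((t) => ({ name: t.name, score: cosine(query, t.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((t) => t.name);
}

const catalog: IndexedTool[] = [
  { name: "read_file", embedding: [0.9, 0.1, 0.0] },
  { name: "write_file", embedding: [0.8, 0.3, 0.0] },
  { name: "send_email", embedding: [0.0, 0.2, 0.9] },
  { name: "query_database", embedding: [0.1, 0.9, 0.2] },
];

// A query vector pointing mostly along the first ("file") axis.
const selected = retrieveTools([1, 0.2, 0], catalog, 2);
console.log(selected); // [ 'read_file', 'write_file' ]
```

Only the definitions of the selected tools would then be placed in the prompt. The linear scan here is fine for a small catalog; a larger one would more plausibly use an approximate nearest-neighbor index.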

The leak validates the architecture. The next question is efficiency.
