Every production AI agent hits the same wall: the model is stateless, but the task isn't. Managed Agents gives you a runtime — containerized execution environments, append-only event logs, and session lifecycle management — so you can build agents that run for minutes or hours without hacking around API limits.
−60%
p50 TTFT drop
−90%
p95 TTFT drop
4
core primitives
$0.08
per session-hour
Monthly Cost Calculator — Adjust Sliders to Estimate Your Cost
Sessions per day: 50
Avg task duration (min): 5
Avg tokens per session: 10k
Hover a component to see what it does inside a Managed Agent.
The key insight: Traditional agents stuff everything into one massive prompt and wait for the model to respond. Managed Agents decouples the model (brain) from execution (hands) so the model can start streaming before all tools have run.
The Stateless Problem
Traditional LLM calls are stateless round-trips. Long tasks require re-sending the entire conversation each turn, so the prompt grows with every action and the total tokens sent grow quadratically with turn count.
What Changes
Sessions persist state server-side. The model receives only what's new. Tool results stream in as events rather than blocking the next model call.
What Stays the Same
The Claude API, prompt engineering, and tool-use patterns you already know. Managed Agents is an orchestration layer on top — not a different model.
You Build the Recipe. Anthropic Manages the Kitchen.
The word "Managed" means Anthropic runs the infrastructure underneath — containers, session state, security, scaling. You only write the agent logic. Here is exactly what that difference looks like.
The Problem: The Growing Prompt Wall
Every time your agent takes a new action, you must re-send the entire conversation history to the model — because it has no memory between API calls. The prompt grows with every turn. Eventually it hits the context limit and the task dies.
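The arithmetic behind the wall is simple. Here is a rough sketch with illustrative numbers (not measured figures): if each turn adds about 500 tokens of new content, re-sending the full history makes total tokens sent grow quadratically with turn count, while sending only new events grows linearly.

```python
def tokens_sent_traditional(turns, per_turn=500):
    # Turn t re-sends the entire history so far: per_turn * t tokens.
    return sum(per_turn * t for t in range(1, turns + 1))

def tokens_sent_managed(turns, per_turn=500):
    # Each turn sends only the new events.
    return per_turn * turns

# After 20 turns the traditional agent has sent 105,000 prompt tokens;
# the event-based agent has sent 10,000.
print(tokens_sent_traditional(20), tokens_sent_managed(20))
```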
Click "Watch it grow" — see how traditional agents re-send the full history every turn, while Managed Agents sends only new events.
What "Managed" Actually Means — Click a Row to Compare
Click any row to see what building it yourself looks like vs what Anthropic handles for you.
The restaurant analogy: A self-managed agent is like owning the restaurant building — you buy the equipment, hire the staff, handle maintenance, pay the utilities. Managed Agents is a managed kitchen space — you just bring your recipes (agent logic). The kitchen, staff, and utilities are handled for you. You focus on the food, not the plumbing.
Without Managed Agents — You Build
📄 Session state storage (database)
💻 Tool execution server (VM or container)
🔄 Retry + timeout logic (custom code)
🤖 Multi-agent orchestration (custom scheduler)
🔒 Credential isolation (your own vault/proxy)
📈 Scaling + monitoring (DevOps work)
Managed Agents is not an abstract framework — it's built for tasks that take minutes, not milliseconds. Here are four real patterns, each showing which tools fire and how the session unfolds.
Everything in Managed Agents reduces to four nouns: Agent, Environment, Session, Event. Understand these and the entire API makes sense.
Click a node to explore its role in the runtime.
Agent
The blueprint. Defines which model, system prompt, tool list, and MCP servers an agent will use. Creating an Agent does nothing — it just registers configuration. Think of it as a class definition.
Environment
The container template. Specifies what execution resources a session gets — filesystem, network access, installed binaries, memory limits. Environments are reusable across agents.
Session
The running instance. Combines an Agent with an Environment and starts executing. Sessions have lifecycle states: created → running → paused → complete → failed. This is where computation happens.
Event
The message unit. Everything that happens in a session is an event: user messages, assistant turns, tool invocations, tool results, status changes. The event log is append-only and queryable.
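A minimal sketch of how the four nouns relate, using hypothetical Python classes — the names mirror the API, but the fields are illustrative, not the real schema:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:                       # the blueprint: configuration only
    model: str
    system: str
    tools: list

@dataclass
class Environment:                 # the container template, reusable
    container_template: str
    memory_mb: int

@dataclass
class Session:                     # Agent + Environment, running
    agent: Agent
    environment: Environment
    state: str = "created"
    events: list = field(default_factory=list)   # append-only log

    def append(self, type, **payload):
        self.events.append({"type": type, **payload})

reviewer = Agent("claude-opus-4-6", "You review code.", ["bash", "files"])
env = Environment("python3.11", memory_mb=2048)

# One Agent, many Sessions: each isolated, each with its own event log.
s1, s2 = Session(reviewer, env), Session(reviewer, env)
s1.append("user_message", content="Review PR #1")
```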
Session lifecycle — click any state to trigger a transition
Agent vs Session: An Agent is like a Docker image — a recipe. A Session is like a running container. You create many Sessions from one Agent, each isolated, each with its own event log.
What is an event log and why is it append-only? ▼
The event log is a sequential record of everything that happened in a session. Append-only means events are never modified or deleted — only new events are added. This gives you a complete audit trail, enables time-travel debugging, and allows resuming a session exactly where it left off after a crash. The model always sees the log as its context.
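The append-only property is what makes replay and crash recovery cheap. A toy sketch of the idea (the event shape is illustrative):

```python
class EventLog:
    """Append-only: events get a monotonically increasing id and are
    never mutated, so any point in the session can be replayed exactly."""
    def __init__(self):
        self._events = []

    def append(self, type, **payload):
        event = {"id": len(self._events), "type": type, **payload}
        self._events.append(event)
        return event

    def replay(self, after_id=-1):
        # Resume after a crash: everything past the last id you saw.
        return [e for e in self._events if e["id"] > after_id]

log = EventLog()
log.append("user_message", content="Analyze sales_data.csv")
log.append("tool_use", tool="bash", command="head sales_data.csv")
checkpoint = log.append("tool_result", output="csv preview...")["id"]
log.append("assistant_turn", content="Top products are...")

# A crashed consumer resumes from its checkpoint with no data loss.
print(len(log.replay(after_id=checkpoint)))
```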
Can I run multiple agents in one session? ▼
Not directly — a session belongs to one agent. But you can have multiple sessions communicate via the multi-agent API (callable_agents). One session acts as orchestrator and spawns sub-sessions, each with their own agent, environment, and event log. Results flow back as events to the orchestrator.
How does pausing a session work? ▼
A session can be paused mid-execution — for example when it needs human approval before proceeding, or when waiting for an async external event. The environment is preserved (filesystem, process state). When you resume, the agent picks up from the exact event it paused on, as if nothing happened.
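The lifecycle above can be sketched as a small transition table. The states come from this section; the exact set of legal transitions is an assumption for illustration:

```python
TRANSITIONS = {
    "created":  {"running"},
    "running":  {"paused", "complete", "failed"},
    "paused":   {"running", "failed"},   # resume picks up at the paused event
    "complete": set(),                   # terminal
    "failed":   set(),                   # terminal
}

def transition(state, to):
    if to not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {to}")
    return to

# A pause-for-approval flow: run, pause, resume, finish.
state = "created"
for step in ("running", "paused", "running", "complete"):
    state = transition(state, step)
```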
Agent → Session → Event: The One-to-Many Relationships
Click any node to understand the cardinality: how one Agent spawns many Sessions, one Environment serves many Sessions, and each Session produces many Events.
Managed Agents achieves its latency gains through a clean three-way split: a stateless model harness (Brain), isolated execution containers (Hands), and a persistent event log (Session). Each can scale independently.
Click a component to see its responsibilities and scaling properties.
🧠 Brain — Stateless Harness
The model layer. Receives events, generates the next assistant turn, emits tool calls. Completely stateless — it reads the event log, produces output, done. Can be scaled horizontally with zero coordination. No session state stored here.
👔 Hands — Execution Env
The container layer. Runs tools: shell commands, file I/O, web fetches, MCP servers. Each session gets its own isolated environment. Results are written back to the event log. The Brain never directly touches the environment.
📄 Session — Event Log
The memory layer. An append-only ledger of every event. Both Brain and Hands read from it; both write to it. Durable, queryable, resumable. This decoupling is what enables the -60%/-90% TTFT improvements.
// Why TTFT drops so dramatically:
//
// BEFORE (traditional agents):
// [collect all tool results] → [build full prompt] → [model starts streaming]
// p50 first token: ~4.2s p95 first token: ~18s
//
// AFTER (managed agents):
// [model reads partial event log] → [starts streaming immediately]
// [tool results arrive as new events while model is running]
// p50 first token: ~1.7s (-60%) p95 first token: ~1.8s (-90%)
Why is statelessness in the Brain an advantage? ▼
Stateless services are trivially scalable — you can add more Brain instances without coordination, routing, or sticky sessions. If a Brain instance crashes mid-generation, the next one picks up from the event log with no data loss. It also simplifies reasoning about correctness: the model always operates on the canonical log.
How are containers isolated between sessions? ▼
Each session gets a fresh container from the Environment template. Filesystem namespaces, network policies, and resource limits (CPU/memory) are applied per-session. Sessions cannot see each other's filesystems. When a session ends, the container is destroyed (though you can snapshot the filesystem before teardown).
The event loop is the heartbeat of a Managed Agent session. User messages, model turns, tool invocations, tool results, and status changes all flow as typed events through the same append-only log.
Session Replay — Drag to Scrub Through a Real Agent Session
Task: "Analyze sales_data.csv and find the top 5 products by revenue." Drag the scrubber to move through every event as it happened.
Drag the scrubber to step through each event in the session.
Click an event type to see its JSON schema and an example payload.
How do I stream events in real time? ▼
Use the GET /v1/beta/sessions/{id}/events?stream=true endpoint with Server-Sent Events (SSE). Each event is delivered as a JSON line as soon as it's written to the log. You can also poll with after_event_id for simpler integrations that don't need sub-second delivery.
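The cursor logic behind after_event_id polling can be sketched locally, without a server (the event shape is illustrative):

```python
def poll(log, cursor):
    """One poll cycle: return events newer than cursor, plus the new cursor."""
    new = [e for e in log if e["id"] > cursor]
    return new, (new[-1]["id"] if new else cursor)

log = [{"id": 0, "type": "user_message"},
       {"id": 1, "type": "assistant_turn"}]
cursor = -1

events, cursor = poll(log, cursor)            # first poll sees both events
log.append({"id": 2, "type": "tool_result"})  # a new event arrives
events, cursor = poll(log, cursor)            # second poll sees only the new one
```

Because the log is append-only and ids are monotonic, polling is idempotent: repeating a poll with the same cursor never drops or duplicates events.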
Can I inject events from outside the session? ▼
Yes — POST a user_message event to a running session to send a new human turn. You can also post structured tool_result events for tools that run outside the managed environment (e.g., calling your own API). This is how human-in-the-loop approval flows work.
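Only certain event types can be posted from outside the session; a sketch of that guard, with the allowed set taken from the answer above:

```python
INJECTABLE = {"user_message", "tool_result"}

def make_injectable_event(type, **payload):
    # Assistant turns, tool invocations, and status changes are produced
    # by the runtime itself and cannot be posted externally.
    if type not in INJECTABLE:
        raise ValueError(f"cannot inject event type: {type}")
    return {"type": type, **payload}

# Human-in-the-loop approval: an external actor supplies the verdict
# as a tool_result for a tool that ran outside the managed environment.
approval = make_injectable_event("tool_result",
                                 tool="request_approval", output="approved")
```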
Complex tasks decompose naturally into parallel workstreams. An orchestrator session can spawn multiple subagent sessions, each with its own agent definition, isolated environment, and independent event log. Results flow back as events.
// Multi-agent API pattern (TypeScript SDK; assumes the client is set up):
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();

const agent = await client.beta.agents.create({
model: "claude-opus-4-6",
system: "You are a research orchestrator.",
callable_agents: [
{ agent_id: "web-researcher", alias: "search" },
{ agent_id: "code-analyst", alias: "analyze" },
{ agent_id: "summarizer", alias: "summarize" }
]
});
// The model calls sub-agents like tools:
// { type: "agent_use", agent: "search", input: { query: "..." } }
Orchestrator Pattern
One session acts as coordinator. It decomposes the task, dispatches subtasks to specialist subagents via callable_agents, collects results as agent_result events, and synthesizes the final response.
Parallel Isolation
Subagent sessions run concurrently in fully isolated containers. A crash in one subagent doesn't affect others. The orchestrator receives a failed event and can retry, skip, or escalate.
Session Threading
Each subagent call creates a child session with a full event log. You can inspect, replay, and debug the subagent's reasoning independently — without re-running the entire orchestration.
Rate limits: 60 session create/min · 600 event read/min. For high-throughput orchestration, batch subagent calls and use event streaming rather than polling individual sessions.
How does the orchestrator wait for all subagents? ▼
The orchestrator model emits multiple agent_use calls in a single turn (like parallel tool calls). The runtime dispatches all of them concurrently and writes the results back as agent_result events once each subagent completes. The orchestrator's next turn only begins when all pending agent calls have resolved.
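The dispatch-and-join behavior can be sketched with a thread pool, with plain functions standing in for subagent sessions:

```python
from concurrent.futures import ThreadPoolExecutor

def run_turn(agent_calls, log):
    """Dispatch all agent_use calls concurrently; the next orchestrator
    turn begins only after every agent_result has been appended."""
    with ThreadPoolExecutor() as pool:
        futures = {alias: pool.submit(fn, task)
                   for alias, fn, task in agent_calls}
        for alias, fut in futures.items():
            log.append({"type": "agent_result",
                        "agent": alias, "output": fut.result()})
    return log

log = []
calls = [
    ("search",  lambda q: f"results for {q}", "llm agents"),
    ("analyze", lambda p: f"analysis of {p}", "repo"),
]
run_turn(calls, log)   # both results are in the log before the turn ends
```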
Can subagents call other subagents? ▼
Yes — the tree can be arbitrarily deep. A subagent can itself have callable_agents, spawning grandchildren. Anthropic recommends limiting depth to 3 levels for practical observability. Deep trees are fully inspectable — each node has its own queryable event log.
Managed Agents provides a rich tool set out of the box. Security is enforced at the container level — tools run inside the isolated execution environment, not the model harness.
💻
Bash
Run shell commands in the container
📄
File I/O
Read, write, create, delete files
🔍
Web Search
Query the web, return structured results
🌎
Web Fetch
Fetch and parse any URL's content
🔌
MCP Servers
Connect to any MCP-compatible service
🛠
Code Run
Execute Python, JS, and more in sandbox
📷
Screenshot
Capture browser or desktop screenshots
🤖
Agents
Call subagent sessions as tools
Click a tool to see its capabilities, input schema, and security boundaries.
Click a security layer to learn how it protects the host system.
Resource-bound auth: Credentials are never injected into the container environment directly. Instead, a Vault+MCP proxy mediates all auth flows — the agent requests access, the proxy evaluates the resource policy, and issues a scoped token. The container never sees the master credentials.
How do I restrict which tools an agent can use? ▼
Define the tools array in your Agent config — only listed tools are available in sessions. For finer control, add a tool_policy to the Environment: specify allowed paths for file operations, allowed domains for web fetch, or a denylist of shell commands. Policies are enforced at the container level, not by the model.
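A sketch of what container-level enforcement of such a policy might look like. The policy keys mirror the description above, but the exact field names are assumptions:

```python
from urllib.parse import urlparse

POLICY = {
    "allowed_paths": ("/workspace",),
    "allowed_domains": {"docs.python.org"},
    "denied_commands": {"rm -rf /", "shutdown"},
}

def check_write(path):
    # File operations confined to allowed path prefixes.
    return path.startswith(POLICY["allowed_paths"])

def check_fetch(url):
    # Web fetch restricted to an allowlist of domains.
    return urlparse(url).hostname in POLICY["allowed_domains"]

def check_shell(command):
    # Shell commands screened against a denylist.
    return command not in POLICY["denied_commands"]
```

Because these checks run in the container runtime rather than the model, a prompt-injected instruction cannot talk its way past them.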
What happens if a tool exceeds its resource limit? ▼
The container's cgroup enforces CPU, memory, and disk limits. If a tool call exceeds them (e.g., a runaway process), the container runtime sends SIGKILL to the offending process. A tool_result event with error: "resource_limit_exceeded" is written to the event log. The session continues — the model receives the error and can decide how to proceed.
The Managed Agents API follows the same conventions as the Anthropic Messages API. Beta header required during preview. Five steps from setup to a running session.
Click "Run Steps" to walk through the API flow
import anthropic
client = anthropic.Anthropic()
# Step 1: Create an agent blueprint
agent = client.beta.agents.create(
model="claude-opus-4-6",
name="code-reviewer",
system="You are an expert code reviewer.",
tools=["bash", "files", "web_search"],
betas=["managed-agents-2026-04-01"]
)
# Step 2: Create an execution environment
env = client.beta.environments.create(
name="review-env",
container_template="python3.11",
memory_mb=2048,
betas=["managed-agents-2026-04-01"]
)
# Step 3: Start a session
session = client.beta.sessions.create(
agent_id=agent.id,
environment_id=env.id,
betas=["managed-agents-2026-04-01"]
)
# Step 4: Send a message
client.beta.sessions.events.create(
session_id=session.id,
type="user_message",
content="Review this PR: [paste diff here]",
betas=["managed-agents-2026-04-01"]
)
# Step 5: Stream events
for event in client.beta.sessions.events.stream(
session_id=session.id,
betas=["managed-agents-2026-04-01"]
):
if event.type == "assistant_turn":
print(event.content)
elif event.type == "status_change" and event.status == "complete":
break
$0.08 per session-hour of active execution + standard model token pricing. Sessions that are paused or idle do not count toward the hourly rate. Environments are free to create; you only pay when sessions are running.
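The cost calculator at the top of the page reduces to this arithmetic. Only the $0.08 session-hour rate comes from the pricing above; the $15/Mtok blended token rate is an assumption for illustration:

```python
def monthly_cost(sessions_per_day, minutes_per_session, tokens_per_session,
                 runtime_rate=0.08, blended_token_rate=15.0):
    """runtime_rate: $/session-hour; blended_token_rate: assumed $/Mtok."""
    sessions = sessions_per_day * 30
    runtime_hours = sessions * minutes_per_session / 60
    runtime_cost = runtime_hours * runtime_rate
    token_cost = sessions * tokens_per_session / 1_000_000 * blended_token_rate
    return runtime_cost + token_cost

# Defaults matching the calculator sliders: 50 sessions/day, 5 min, 10k tokens.
print(round(monthly_cost(50, 5, 10_000), 2))
```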
Rate Limits
60 session create/min · 600 event read/min · 20 concurrent active sessions per workspace (beta). Orchestration patterns that dispatch many subagents in parallel should fan-out through a single orchestrator session.
Beta Access
Available on the Anthropic Console under "Managed Agents (Beta)". Requires including anthropic-beta: managed-agents-2026-04-01 in every request. All endpoints are under /v1/beta/ during the preview period.
What this enables: Long-running research agents, automated code review pipelines, multi-step data analysis, parallelized content generation — all without managing your own container infrastructure or building custom session state.
If you're already using one of these frameworks, here's exactly where Managed Agents fits and where it differs. Click any cell for the full explanation.
Click any cell in the matrix to see a detailed comparison for that feature and framework.
When to choose Managed Agents
Tasks that run for minutes, need real tool execution (shell, files, web), require session persistence, or involve parallel subagents. You want Anthropic to manage the infrastructure so your team can focus on agent logic.
When to stick with DIY / LangChain
Tasks that complete in a single round-trip, teams with existing container infrastructure, or cases where you need fine-grained control over every layer of the stack. LangChain excels at rapid prototyping across many model providers.