Claude Agent SDK: Building Persistent AI Teammates

When Anthropic launched Claude Tag on June 23, 2026, it replaced the old "Claude in Slack" app with something architecturally new: one shared Claude identity per channel, persistent memory across teammates, and ambient mode where Claude posts proactively without being tagged. It is the same agent loop that powers Claude Code, repackaged as a long-lived teammate inside a chat surface.

The good news for developers is that the building blocks are public. The Claude Agent SDK exposes the same loop, tools, sessions, and MCP plumbing that Claude Tag uses internally. You can use it to ship persistent teammates inside your own product — a Discord bot that triages incidents, a WhatsApp agent that follows up on quotes, a workspace agent that watches your CRM and pings sales when a deal goes quiet.

This guide walks through the four primitives you need to assemble a persistent teammate: identity, memory, tools, and triggers.

The persistent teammate pattern

A teammate is not a chatbot. Three properties separate them:

Shared identity, not per-user. A Slack channel has one @Claude that every teammate talks to and shares context with — not a separate instance per person. This is the opposite of the ChatGPT or "AI assistant per user" model.
Memory that survives the conversation. What the agent learns in Monday's thread is available Friday, to a different teammate, without re-prompting.
Ambient agency. The agent can act without being addressed. It watches the environment and decides what humans need to know.

These three properties are surprisingly hard to retrofit onto a stateless LLM API. The Agent SDK provides them as first-class concepts.

Install and bootstrap

The SDK ships in TypeScript and Python. The TypeScript package bundles a native Claude Code binary, so a single npm install is enough.

npm install @anthropic-ai/claude-agent-sdk
# or
pip install claude-agent-sdk

Set your API key as an environment variable. The SDK also works with third-party providers (Amazon Bedrock, Google Vertex AI, Microsoft Foundry, Claude Platform on AWS), each enabled by its own environment flag, such as CLAUDE_CODE_USE_BEDROCK=1 or CLAUDE_CODE_USE_VERTEX=1, alongside that provider's credentials. For MENA teams routing through Bahrain (AWS) or UAE North (Azure) for data residency, this is the path of least resistance.

export ANTHROPIC_API_KEY=your-api-key

A minimal agent loop in TypeScript looks like this:

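import { query } from "@anthropic-ai/claude-agent-sdk";

// query() runs the whole agent loop and streams every message it produces.
for await (const message of query({
  prompt: "Summarize the open issues in #engineering this week",
  // Tools the agent may use without asking for approval.
  options: { allowedTools: ["Read", "WebFetch", "Grep"] }
})) {
  console.log(message);
}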

query returns an async iterator over the message stream — assistant turns, tool calls, tool results, and a final result message. The SDK handles the entire tool loop for you. Compare this to the bare Anthropic Client SDK, where you would write your own while (response.stop_reason === "tool_use") loop, marshal results, and pass them back manually.
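
Because the stream is a plain async iterable, the code that consumes it can be exercised without the SDK. The sketch below assumes only the message shape this guide relies on (a final message carrying a result field). StreamMessage, collectResult, and fakeStream are hypothetical names for illustration, not SDK exports.

```typescript
// Hypothetical minimal shape: only the fields this guide uses.
// The real SDK message union is richer.
type StreamMessage = { type: string; subtype?: string; result?: string };

// Drain any async iterable of messages and return the final result text,
// or undefined if the stream ends without one.
async function collectResult(
  stream: AsyncIterable<StreamMessage>
): Promise<string | undefined> {
  let finalText: string | undefined;
  for await (const message of stream) {
    if (typeof message.result === "string") {
      finalText = message.result;
    }
  }
  return finalText;
}

// A fake stream standing in for query(...) so the helper runs offline.
async function* fakeStream(): AsyncGenerator<StreamMessage> {
  yield { type: "system", subtype: "init" };
  yield { type: "assistant" };
  yield { type: "result", subtype: "success", result: "3 open issues" };
}
```

In a real worker you would pass the return value of query(...) where fakeStream() is used here, provided its messages are assignable to the narrowed type.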

Channel-scoped identity

To turn this one-shot query into a channel teammate, you need a stable identity tied to a chat surface. Two patterns work:

One process per channel. Long-running worker keeps the agent loop warm.
One session per channel. Stateless worker, but every message resumes a saved session ID.

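For most teams the second pattern is cheaper and easier to operate. The SDK persists sessions to JSONL on your filesystem, and you resume them by ID. Because the log is a local file, every worker that handles a channel needs access to the same session directory (a shared volume, or one worker pinned to each channel).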

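import { query } from "@anthropic-ai/claude-agent-sdk";

// postToChannel and saveSessionId are your own helpers (chat API call, DB write).
async function handleMessage(channelId: string, userText: string, savedSessionId?: string) {
  // Start from the saved ID so a run that never emits init does not erase it.
  let sessionId = savedSessionId;

  for await (const message of query({
    prompt: userText,
    options: {
      resume: savedSessionId, // undefined starts a fresh session
      allowedTools: ["Read", "WebFetch", "Grep", "Bash"]
    }
  })) {
    // The init message carries the ID of the session this run writes to.
    if (message.type === "system" && message.subtype === "init") {
      sessionId = message.session_id;
    }
    // Only the final result message has a result field.
    if ("result" in message) {
      await postToChannel(channelId, message.result);
    }
  }

  // Persist only when there is an ID to store.
  if (sessionId) await saveSessionId(channelId, sessionId);
}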

Now the agent in #engineering has its own session, and therefore its own memory, separate from #support, and each one is durable across messages. To give each channel different tools or a different system prompt as well, vary the options you pass per channel. Multiple teammates posting into the same channel share the same Claude.
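
The handler above leaves the session store to you. A minimal sketch of that piece, assuming an in-memory Map as a stand-in for a real database table, might look like the following. ChannelSessionStore is a hypothetical name, not part of the SDK.

```typescript
// In-memory stand-in for a database table keyed by channel ID.
// Hypothetical helper: swap the Map for your real store.
class ChannelSessionStore {
  private readonly sessions = new Map<string, string>();

  // Returns the saved session ID, or undefined for a channel's first message.
  load(channelId: string): string | undefined {
    return this.sessions.get(channelId);
  }

  // Ignores undefined so a failed run never erases a good session ID.
  save(channelId: string, sessionId: string | undefined): void {
    if (sessionId !== undefined) {
      this.sessions.set(channelId, sessionId);
    }
  }
}
```

Keying by channel rather than by user is what makes the identity shared: whoever posts in the channel resumes the same session.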

Memory: CLAUDE.md plus the session log

The SDK exposes two memory layers and you should use both.

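Project memory lives at .claude/CLAUDE.md (or CLAUDE.md at the working-directory root). When project settings are enabled (the settingSources option; recent SDK versions do not load filesystem settings by default, so check yours), the SDK loads it into every session as system context. Use it for facts that rarely change between conversations: who the team is, what the product does, your conventions, escalation rules. For a channel teammate, write a different CLAUDE.md per channel, which in practice means a separate working directory (the cwd option) per channel.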

Session memory is the JSONL log of every turn in a resumed session. It carries the model's view of past conversations, files read, tool results received. The SDK manages compaction as the context window fills.

A pragmatic third layer is your own database, which holds the channel-to-session mapping and any summaries you generate. Sessions can grow without bound, and you do not want a runaway log to become the agent's whole worldview. A common pattern is to summarize sessions older than a week into a few paragraphs, write those paragraphs into CLAUDE.md, and start a fresh session ID. The agent keeps the durable knowledge and drops the noise.
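
The rollover rule can be sketched as two small pure functions: one decides whether a session is old enough to retire, the other appends a dated summary to the CLAUDE.md text. Both names, the seven-day default, and the heading format are assumptions for illustration. Producing the summary itself (another model call) is out of scope here.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the session started more than maxAgeDays ago.
function shouldRollOver(startedAt: Date, now: Date, maxAgeDays = 7): boolean {
  return now.getTime() - startedAt.getTime() > maxAgeDays * MS_PER_DAY;
}

// Append a dated summary section to the existing CLAUDE.md text.
// The heading format is an arbitrary choice, not an SDK convention.
function appendSummary(claudeMd: string, summary: string, weekOf: Date): string {
  const heading = `## Summary for week of ${weekOf.toISOString().slice(0, 10)}`;
  return `${claudeMd.trimEnd()}\n\n${heading}\n\n${summary.trim()}\n`;
}
```

After writing the new CLAUDE.md, clear the channel's saved session ID so the next message starts a fresh session that loads the updated file.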

Tool sandboxing with MCP

Support for the Model Context Protocol (MCP) is the part of the Agent SDK that makes persistent teammates worth running. MCP is an open standard for connecting agents to external systems (databases, ticket trackers, browsers, your internal APIs) without writing custom tool implementations.

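Connecting an MCP server takes one config entry per server:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "What changed in the repo since yesterday?",
  options: {
    mcpServers: {
      // -y stops npx from pausing on an install prompt in a headless worker.
      // Both servers also need their credentials in the environment.
      github: { command: "npx", args: ["-y", "@modelcontextprotocol/server-github"] },
      slack: { command: "npx", args: ["-y", "@modelcontextprotocol/server-slack"] }
    },
    // MCP tools are named mcp__<server>__<tool> and must be allowed explicitly.
    // The wildcard form is an assumption to confirm against your SDK version.
    allowedTools: ["Read", "Grep", "mcp__github__*"]
  }
})) {
  if ("result" in message) console.log(message.result);
}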

For a channel teammate you almost always want per-channel tool scoping. The #finance agent should not have GitHub write access. The #engineering agent should not see customer PII. The pattern is to compute mcpServers and allowedTools from channel metadata before each query — not to ship one mega-agent with everything turned on.
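
One way to compute the scope is a deny-by-default lookup that runs before each query. The sketch below is an assumption about how you might structure it: the policy table, channel names, and scopeForChannel are hypothetical, and only the server entry shape comes from the config shown earlier.

```typescript
// Shape of one stdio MCP server entry, matching the config used earlier.
type McpServer = { command: string; args: string[] };

type ChannelScope = {
  mcpServers: Record<string, McpServer>;
  allowedTools: string[];
};

// Every server the deployment knows about.
const ALL_SERVERS = new Map<string, McpServer>([
  ["github", { command: "npx", args: ["@modelcontextprotocol/server-github"] }],
  ["slack", { command: "npx", args: ["@modelcontextprotocol/server-slack"] }],
]);

// Hypothetical policy table: which servers and tools each channel may use.
const CHANNEL_POLICY = new Map<string, { servers: string[]; tools: string[] }>([
  ["engineering", { servers: ["github"], tools: ["Read", "Grep", "Bash"] }],
  ["finance", { servers: [], tools: ["Read"] }],
]);

// Unknown channels get no servers and no tools (deny by default).
function scopeForChannel(channel: string): ChannelScope {
  const policy = CHANNEL_POLICY.get(channel) ?? { servers: [], tools: [] };
  const mcpServers: Record<string, McpServer> = {};
  for (const name of policy.servers) {
    const server = ALL_SERVERS.get(name);
    if (server) mcpServers[name] = server;
  }
  return { mcpServers, allowedTools: [...policy.tools] };
}
```

The returned object can be spread into the options passed to query, so adding a channel means adding a row to the policy table.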

The other key control is permissionMode. The default mode asks for approval before a tool runs, acceptEdits auto-approves file edits, and bypassPermissions skips approval entirely, so reserve it for trusted automation. For an ambient channel agent, expose a narrow set of pre-approved tools and require human confirmation for anything that crosses a boundary (a payment, a public post, a deletion).
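
The boundary rule can be written down as a three-way decision. This is a sketch only: the tool names in both sets are hypothetical, and how you surface a "confirm" to a human (for instance from a permission callback, if your SDK version provides one) is left to your integration.

```typescript
type Decision = "allow" | "confirm" | "deny";

// Hypothetical tool names for illustration; replace with your own.
const PRE_APPROVED = new Set(["Read", "Grep", "Glob"]);
const NEEDS_HUMAN = new Set([
  "mcp__payments__create_charge",
  "mcp__slack__post_public",
  "mcp__crm__delete_record",
]);

// Pre-approved tools run, boundary-crossing tools wait for a human,
// and anything not listed is refused.
function decide(toolName: string): Decision {
  if (PRE_APPROVED.has(toolName)) return "allow";
  if (NEEDS_HUMAN.has(toolName)) return "confirm";
  return "deny";
}
```

Refusing unlisted tools keeps a newly added MCP server from silently widening what the ambient agent can do.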

Hooks: the audit trail you will need

Every persistent teammate eventually needs an answer to "what did the agent actually do last week?" Hooks are the SDK's lifecycle callbacks for this: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, UserPromptSubmit. Each fires at a fixed point in the agent loop, and the SDK waits for your callback to finish, so you can log or validate, and in hooks such as PreToolUse, block the call.

import { query, type HookCallback } from "@anthropic-ai/claude-agent-sdk";
import { appendFile } from "fs/promises";

// Append one line per tool call before the tool runs.
const logToolUse: HookCallback = async (input) => {
  // Loose cast for brevity; narrow to the hook's input type in real code.
  const tool = (input as any).tool_name;
  const args = JSON.stringify((input as any).tool_input);
  await appendFile("./audit.log", `${new Date().toISOString()} ${tool} ${args}\n`);
  return {}; // returning {} leaves the tool call unchanged
};

for await (const message of query({
  prompt: "Process this week's expense reports",
  options: {
    permissionMode: "acceptEdits",
    hooks: {
      // ".*" matches every tool name.
      PreToolUse: [{ matcher: ".*", hooks: [logToolUse] }]
    }
  }
})) {
  if ("result" in message) console.log(message.result);
}

For teams subject to Tunisia's data protection rules (overseen by the INPDP) or Saudi Arabia's PDPL, this is not optional. Every tool call against a system holding personal data (your CRM, your invoicing tool, your support inbox) needs a record of what was read, by which agent, in which channel, on whose behalf. A PreToolUse hook records what the agent attempted; a PostToolUse hook that streams to your audit table records what actually ran, and covers the standard case.
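
A flat log line does not capture the channel or the person the agent acted for. The sketch below builds a structured row from a hook input plus context you supply. The field names tool_name and tool_input follow the example above; session_id, the row layout, the 2000-character cap, and buildAuditRow itself are assumptions.

```typescript
// Only the hook fields this sketch reads; session_id is assumed to be present.
type HookInputLike = { tool_name?: string; tool_input?: unknown; session_id?: string };
type AuditContext = { channelId: string; onBehalfOf: string };
type AuditRow = {
  at: string;
  channelId: string;
  onBehalfOf: string;
  sessionId: string;
  tool: string;
  input: string;
};

// Cap stored input so one huge payload cannot bloat the audit table.
const MAX_INPUT_CHARS = 2000;

function buildAuditRow(input: HookInputLike, ctx: AuditContext, now: Date): AuditRow {
  // JSON.stringify can return undefined for unserializable values.
  const serialized = JSON.stringify(input.tool_input ?? null) ?? "null";
  return {
    at: now.toISOString(),
    channelId: ctx.channelId,
    onBehalfOf: ctx.onBehalfOf,
    sessionId: input.session_id ?? "unknown",
    tool: input.tool_name ?? "unknown",
    input: serialized.slice(0, MAX_INPUT_CHARS),
  };
}
```

Passing the clock in as an argument keeps the function deterministic, which makes the audit path easy to test.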

Ambient mode: triggers, not prompts

The hardest part of a persistent teammate is the part that has nothing to do with the Agent SDK. The SDK gives you the loop; you have to give it reasons to wake up.

Three trigger sources cover most use cases:

Webhooks. New Slack message, new GitHub issue, new Stripe charge. The webhook handler resumes the channel's session with a prompt like "A new high-priority issue was filed; review and triage."
Cron. Every weekday at 9am, resume the session with "Summarize the overnight on-call logs and post anything worth flagging." This is how you get end-of-day rollups and morning briefs.
Internal monitors. A small process that watches a queue, a metric, or a calendar, and resumes the session when a threshold is crossed. This is the closest match to Claude Tag's "ambient mode."

The common thread is that the trigger is your code. The Agent SDK does not subscribe to anything on your behalf. You decide what counts as an event, the trigger fires query({ prompt: "...", options: { resume: sessionId } }), and the agent reasons over its accumulated memory before responding.
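
The three trigger sources can share one code path if each is normalized into a prompt before the session is resumed. The sketch below models that with a discriminated union. The Trigger type, its fields, and the prompt wording are all hypothetical.

```typescript
// One variant per trigger source described above.
type Trigger =
  | { kind: "webhook"; source: string; summary: string }
  | { kind: "cron"; schedule: string; task: string }
  | { kind: "monitor"; metric: string; value: number; threshold: number };

// Turn an event into the prompt used to resume the channel's session.
function promptForTrigger(t: Trigger): string {
  switch (t.kind) {
    case "webhook":
      return `Event from ${t.source}: ${t.summary}. Review and decide whether the channel needs to know.`;
    case "cron":
      return `Scheduled task (${t.schedule}): ${t.task}`;
    case "monitor":
      return `${t.metric} is ${t.value}, past the threshold of ${t.threshold}. Investigate and report.`;
  }
}
```

The resulting string becomes the prompt argument, with the channel's saved session ID passed as the resume option.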

To keep ambient mode from becoming spam, gate every proactive post on a confidence check. A simple pattern: ask the model "should I post this to the channel, yes or no, and why" as a second pass, parse the answer, and only post on yes. Add a rate limit per channel.
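
Both halves of that gate are small enough to sketch. The parser below treats anything other than a leading "yes" as a refusal, and the limiter allows a fixed number of posts per channel inside a sliding window. The names, the reply format, and the limits are assumptions, not SDK features.

```typescript
// Parse the second-pass reply; anything unparseable means "do not post".
function parseVerdict(reply: string): { post: boolean; reason: string } {
  const match = /^\s*(yes|no)\b[\s,.:;-]*([\s\S]*)$/i.exec(reply);
  if (!match) return { post: false, reason: "unparseable reply" };
  return {
    post: (match[1] ?? "").toLowerCase() === "yes",
    reason: (match[2] ?? "").trim(),
  };
}

// Sliding-window limiter: at most maxPosts per channel per windowMs.
class ChannelRateLimiter {
  private readonly posts = new Map<string, number[]>();
  private readonly maxPosts: number;
  private readonly windowMs: number;

  constructor(maxPosts: number, windowMs: number) {
    this.maxPosts = maxPosts;
    this.windowMs = windowMs;
  }

  // Records the post and returns true if the channel is under its limit.
  tryPost(channelId: string, nowMs: number): boolean {
    const recent = (this.posts.get(channelId) ?? []).filter(
      (t) => nowMs - t < this.windowMs
    );
    const allowed = recent.length < this.maxPosts;
    if (allowed) recent.push(nowMs);
    this.posts.set(channelId, recent);
    return allowed;
  }
}
```

Post only when parseVerdict returns post: true and tryPost also returns true. Failing closed on an unparseable reply keeps a malformed model answer from turning into channel noise.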

Subagents for parallel work

Long-running teammates often need to fan out. The triage agent in #support might need to read the customer profile, search the docs, and check the recent deploys all at once. The SDK ships subagents — delegated child agents with their own context — to model this.

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Triage the ticket from acme.corp posted in this channel",
  options: {
    allowedTools: ["Read", "Glob", "Grep", "Agent"],
    agents: {
      "doc-searcher": {
        description: "Searches product docs and changelogs.",
        prompt: "Find the most relevant doc passages for the user's question.",
        tools: ["Read", "Glob", "Grep"]
      },
      "deploy-checker": {
        description: "Checks recent deploys and incidents.",
        prompt: "List deploys and incidents in the past 24 hours that could affect this customer.",
        tools: ["Bash", "WebFetch"]
      }
    }
  }
})) {
  if ("result" in message) console.log(message.result);
}

Each subagent runs in its own context window, so the parent does not pay token cost for the full doc search transcript — only the summary the subagent returns. Messages from inside a subagent carry a parent_tool_use_id so you can attribute logs and costs correctly.
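
Attribution then comes down to grouping by that field. The sketch below counts messages per originating subagent call, treating a null parent_tool_use_id as the parent agent. The AttributedMessage shape is a hypothetical narrowing of the real message type, and a cost report would sum usage figures in the same loop.

```typescript
// Only the fields needed for attribution; the real message type has more.
type AttributedMessage = { type: string; parent_tool_use_id: string | null };

// Count messages per subagent call; null means the parent agent itself.
function countBySubagent(messages: AttributedMessage[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const m of messages) {
    const key = m.parent_tool_use_id ?? "parent";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```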

SDK or Managed Agents

If running session storage and sandboxes in your own infra sounds like more than you want to operate, the Claude Platform also offers Managed Agents — a hosted REST API where Anthropic runs the agent loop and a per-session sandbox for you. You send events, you stream results back.

The split is straightforward:

Agent SDK when the agent should touch your filesystem, your services, your VPN-only databases. Local prototyping. Custom tools that are native code.
Managed Agents when you want zero infra, long-running sessions out of the box, and Anthropic operating the sandbox. Production scale without building your own session store.

A common path is to prototype on the Agent SDK locally, then move the hot agents to Managed Agents once the shape is clear.

What to build first

If you are starting from scratch, the smallest useful persistent teammate is roughly:

One channel, one session ID stored in your DB keyed by channel.
One CLAUDE.md describing the channel's purpose and rules.
Two or three MCP servers, scoped to the channel.
A webhook that resumes the session on new messages.
A PostToolUse hook that writes an audit row.
A cron that posts a morning summary.

That is roughly two days of work and it gives you a teammate that remembers what you talked about yesterday, knows which tools it is allowed to touch, and can post without being asked. From there, every additional channel is configuration, not code.
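
"Configuration, not code" can be made concrete with a per-channel record and a sanity check run at startup. This is one possible layout under stated assumptions: ChannelConfig, its fields, and the checks in checkChannelConfig are hypothetical and mirror the checklist above instead of any SDK schema.

```typescript
// One record per channel; adding a channel means adding a record.
type ChannelConfig = {
  channelId: string;
  claudeMdPath: string;        // the channel's own CLAUDE.md
  mcpServers: string[];        // names of the servers scoped to this channel
  morningSummaryCron?: string; // optional schedule for the proactive summary
};

// Return human-readable problems; an empty list means the config is usable.
function checkChannelConfig(c: ChannelConfig): string[] {
  const problems: string[] = [];
  if (c.channelId.trim() === "") {
    problems.push("channelId is empty");
  }
  if (!c.claudeMdPath.endsWith("CLAUDE.md")) {
    problems.push("claudeMdPath should point at a CLAUDE.md file");
  }
  if (c.mcpServers.length > 3) {
    problems.push("more than three MCP servers; consider narrowing the scope");
  }
  return problems;
}
```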

The architectural shift that Claude Tag represents — from per-user assistants to shared, persistent, ambient teammates — is open infrastructure now. The interesting question is not whether to build one, but where in your product a shared identity with a long memory would change how the team actually works.