Claude Sonnet 5 launched on June 30 2026 as the most capable Sonnet yet for agentic workloads. At 63.2% on agentic coding benchmarks — up from 58.1% for Sonnet 4.6 — it sits just below Opus 4.8 (69.2%) but at a fraction of the cost. Until August 31 2026, introductory pricing is $2 per million input tokens and $10 per million output tokens, making this the right moment to migrate agentic pipelines that previously ran on Opus.
This tutorial walks you through the full stack: project setup, basic completions, streaming, tool calling, and a production-ready agentic loop. Everything uses the official Anthropic TypeScript SDK and runs against the hosted API — no GPU required.
Prerequisites
Before starting, make sure you have:
- Node.js 20.6+ (check with node --version); the --env-file flag used in Step 1 requires 20.6 or later
- Basic familiarity with TypeScript and async/await
- An Anthropic account with an API key from console.anthropic.com
- A code editor — VS Code is recommended
You do not need any GPU hardware. All calls go to Anthropic's hosted API.
What You'll Build
By the end of this tutorial you will have:
- A typed Anthropic client configured for claude-sonnet-5.
- A chat completion function with proper error handling.
- A streaming wrapper that prints tokens as they arrive.
- A tool-calling layer with concrete file-system tool implementations.
- A complete agentic loop that iterates until the model signals it is done.
- A usage tracker that calculates cost against the introductory pricing.
The final project is roughly 200 lines of TypeScript — small enough to understand fully, large enough to be a real template.
Understanding Claude Sonnet 5
A quick model overview before you write any code.
What changed from Sonnet 4.6
Sonnet 5 introduces a new tokenizer. For most English text the expansion ratio is around 1.0, but for code-heavy or multilingual content it can reach 1.35 — meaning the same prompt uses up to 35% more tokens than on Sonnet 4.6. This does not hurt quality; it reflects the tokenizer being re-trained for a broader vocabulary. The practical implication: always benchmark token counts against the live API rather than assuming your Sonnet 4.6 estimates carry over.
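One way to act on that advice is to measure the ratio on your own prompts and rescale any budgets that were tuned for Sonnet 4.6. The sketch below is plain arithmetic: `expansionRatio` and `scaleBudget` are hypothetical helper names, and the sample counts are placeholders. In practice both counts would come from `usage.input_tokens` on real responses.

```typescript
// Hypothetical helpers for comparing token counts across model versions.
// The counts are assumed to come from real API responses; nothing here
// calls the API.

interface TokenSample {
  label: string;
  baselineTokens: number; // count observed on the older model
  observedTokens: number; // count observed on the newer model
}

// Ratio of new to old token counts, weighted by sample size.
export function expansionRatio(samples: TokenSample[]): number {
  const baseline = samples.reduce((sum, s) => sum + s.baselineTokens, 0);
  const observed = samples.reduce((sum, s) => sum + s.observedTokens, 0);
  if (baseline === 0) throw new Error("Need at least one non-empty baseline sample");
  return observed / baseline;
}

// Scale a token budget that was tuned for the older model, rounding up.
export function scaleBudget(oldBudget: number, ratio: number): number {
  return Math.ceil(oldBudget * ratio);
}

// Placeholder numbers for illustration only.
const tokenSamples: TokenSample[] = [
  { label: "english-docs", baselineTokens: 1000, observedTokens: 1000 },
  { label: "typescript-source", baselineTokens: 1000, observedTokens: 1250 },
];

const measuredRatio = expansionRatio(tokenSamples);
console.log(`Expansion ratio: ${measuredRatio}`); // Expansion ratio: 1.125
console.log(`Adjusted budget: ${scaleBudget(4000, measuredRatio)}`); // Adjusted budget: 4500
```

Weighting by sample size keeps one short prompt from skewing the ratio; a per-label breakdown may be more useful if your traffic mixes prose and code.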
Safety defaults have also tightened. Sycophancy and hallucination rates are lower, and cyber safeguards are enabled by default. You will notice Sonnet 5 pushes back more firmly on edge-case requests — this is intentional and consistent across all API clients.
Pricing during the introductory window
| Tier | Input | Output |
|---|---|---|
| Introductory (until 2026-08-31) | $2 / MTok | $10 / MTok |
| Standard (from 2026-09-01) | $3 / MTok | $15 / MTok |
Opus 4.8 is roughly 5-7x more expensive per token. If you have agentic pipelines on Opus, the introductory window is an excellent time to evaluate a Sonnet 5 migration.
Model ID
The API identifier is claude-sonnet-5. On Amazon Bedrock the cross-region inference profile is us.anthropic.claude-sonnet-5-20260630-v1:0 — check the Bedrock console for regional availability. In Claude Code, the model is already available as the default Free and Pro tier model as of July 1 2026.
Step 1: Project Setup
Create a fresh TypeScript project and install the Anthropic SDK.
mkdir sonnet5-agent && cd sonnet5-agent
npm init -y
npm install @anthropic-ai/sdk
npm install -D typescript tsx @types/node
npx tsc --init --target ES2022 --module NodeNext --moduleResolution NodeNext --strict
Create a .env file for your API key:
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
Add the dev and test scripts to package.json, and set "type" to "module" so the top-level await in src/index.ts runs under tsx:
{
  "type": "module",
  "scripts": {
    "dev": "tsx --env-file=.env src/index.ts",
    "test": "node --test --import tsx/esm --env-file=.env src/agent.test.ts"
  }
}
Your final directory layout will look like this:
sonnet5-agent/
├── src/
│   ├── client.ts
│   ├── stream.ts
│   ├── tools.ts
│   ├── agent.ts
│   ├── agent.test.ts
│   └── index.ts
├── .env
├── package.json
└── tsconfig.json
Step 2: Typed Client and First Completion
Create src/client.ts with a shared client singleton and a simple chat wrapper:
import Anthropic from "@anthropic-ai/sdk";
export const MODEL = "claude-sonnet-5";
// Shared client; the key is read from the environment (see .env).
export const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
// Send a single user prompt and return the text of the first content block.
export async function chat(prompt: string): Promise<string> {
  const message = await client.messages.create({
    model: MODEL,
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });
  const block = message.content[0];
  // Guard against an empty response as well as a non-text first block.
  if (!block || block.type !== "text") throw new Error("Unexpected content type");
  return block.text;
}
Test it immediately in src/index.ts:
import { chat } from "./client.js";
const answer = await chat("What is 2 + 2? Reply with just the number.");
console.log(answer); // 4
Run it:
npm run dev
The round-trip for a short prompt is typically under 800 ms on the hosted API.
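The chat wrapper lets every API failure propagate to the caller. If you want an explicit retry policy around it, a wrapper along these lines is one option. This is a sketch, not SDK code: `withRetry` and `isRetryable` are hypothetical helpers, and the assumption is that failed requests throw an error carrying a numeric HTTP `status` property.

```typescript
// A minimal retry sketch. `withRetry` and `isRetryable` are hypothetical
// helpers, not part of the Anthropic SDK. Assumption: failed requests throw
// an error object that carries a numeric HTTP `status` property.

type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Rate limits (429) and server-side errors (5xx) are usually worth retrying;
// authentication and validation errors (other 4xx) are not.
function isRetryable(err: unknown): boolean {
  if (typeof err !== "object" || err === null || !("status" in err)) return false;
  const status = (err as { status?: unknown }).status;
  return typeof status === "number" && (status === 429 || status >= 500);
}

export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  sleep: Sleep = defaultSleep
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors that will not go away.
      if (attempt >= maxAttempts || !isRetryable(err)) throw err;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

A call site would look like `await withRetry(() => chat("..."))`. The SDK client has its own `maxRetries` option, so if both layers retry, the attempts multiply; you may want to lower one of them.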
Step 3: Streaming Responses
For longer outputs — code generation, detailed explanations, documentation drafts — streaming lets you display tokens as they arrive instead of waiting for the full response. This dramatically improves perceived latency.
Create src/stream.ts:
import { client, MODEL } from "./client.js";
export async function streamChat(
  prompt: string,
  onText?: (chunk: string) => void
): Promise<string> {
  let full = "";
  const stream = client.messages.stream({
    model: MODEL,
    max_tokens: 4096,
    messages: [{ role: "user", content: prompt }],
  });
  // Forward each text delta to the callback, or to stdout by default.
  stream.on("text", (text) => {
    full += text;
    if (onText) onText(text);
    else process.stdout.write(text);
  });
  // Resolves once the stream has ended and the full message is assembled.
  await stream.finalMessage();
  // Only terminate the line when this function did the printing itself.
  if (!onText) process.stdout.write("\n");
  return full;
}
The .stream() helper on the Anthropic SDK handles text_delta server-sent events automatically. The optional onText callback lets you pipe tokens wherever you need them: a WebSocket, a React state update, a log file.
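The chunks passed to onText are arbitrary fragments, not whole lines. If the consumer expects complete lines (a log writer, say), a small buffer can sit in between. The helper below is a hypothetical sketch and is fed simulated chunks so it runs without the API.

```typescript
// Hypothetical helper: turns arbitrary text fragments into whole lines.
// It is meant to sit between a chunk callback (such as `onText`) and a
// consumer that expects complete lines.

interface LineBuffer {
  push(chunk: string): void; // feed one streamed fragment
  flush(): void; // emit whatever is left once the stream ends
}

export function createLineBuffer(onLine: (line: string) => void): LineBuffer {
  let pending = "";
  return {
    push(chunk) {
      pending += chunk;
      const parts = pending.split("\n");
      // The last element is an incomplete line (possibly empty); keep it.
      pending = parts.pop() ?? "";
      for (const line of parts) onLine(line);
    },
    flush() {
      if (pending.length > 0) onLine(pending);
      pending = "";
    },
  };
}

// Simulated chunks; a real stream would call `push` from `onText`.
const collectedLines: string[] = [];
const lineBuffer = createLineBuffer((line) => {
  collectedLines.push(line);
});
for (const chunk of ["const a", " = 1;\ncons", "t b = 2;\n", "done"]) {
  lineBuffer.push(chunk);
}
lineBuffer.flush();
console.log(collectedLines); // [ 'const a = 1;', 'const b = 2;', 'done' ]
```

With the wrapper from this step, the wiring would be roughly `await streamChat(prompt, (t) => lineBuffer.push(t))` followed by `lineBuffer.flush()`.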
Step 4: Tool Calling
Tool calling is where Sonnet 5's agentic improvements are most visible. You define a JSON Schema for each function, pass the tools array to the API, and let the model decide when and which tools to invoke. The model never calls your code directly — it returns a structured tool_use block, and you execute the call, then feed the result back.
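To make that round trip concrete, here is a sketch of the two block shapes involved. The interfaces are local stand-ins for the SDK types (Anthropic.ToolUseBlock and Anthropic.ToolResultBlockParam), redeclared so the example runs without dependencies; the id and the fake executor are placeholders.

```typescript
// Local stand-in types that mirror the block shapes described above.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
}

// Build the reply for one tool call. `run` is whatever executes the tool.
function answerToolCall(
  call: ToolUseBlock,
  run: (name: string, input: Record<string, unknown>) => string
): ToolResultBlock {
  return {
    type: "tool_result",
    tool_use_id: call.id, // must echo the id from the tool_use block
    content: run(call.name, call.input),
  };
}

// A made-up tool_use block, shaped like what the model returns.
const exampleCall: ToolUseBlock = {
  type: "tool_use",
  id: "toolu_example_123", // placeholder id
  name: "read_file",
  input: { file_path: "src/client.ts" },
};

// The executor here only pretends to run the tool.
const exampleResult = answerToolCall(
  exampleCall,
  (name, input) => `pretend output of ${name}(${JSON.stringify(input)})`
);
console.log(exampleResult);
```

The result block goes back to the API inside a user message, which is what lets the model continue from the tool's output.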
Create src/tools.ts:
import Anthropic from "@anthropic-ai/sdk";
import * as fs from "node:fs";
import * as path from "node:path";
export const TOOLS: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read the text content of a local file.",
    input_schema: {
      type: "object",
      properties: {
        file_path: {
          type: "string",
          description: "Relative path to the file from the project root.",
        },
      },
      required: ["file_path"],
    },
  },
  {
    name: "write_file",
    description: "Write or overwrite a local file with the given text content.",
    input_schema: {
      type: "object",
      properties: {
        file_path: {
          type: "string",
          description: "Relative path for the output file.",
        },
        content: {
          type: "string",
          description: "Full text content to write.",
        },
      },
      required: ["file_path", "content"],
    },
  },
  {
    name: "list_directory",
    description: "List the files and subdirectories inside a directory.",
    input_schema: {
      type: "object",
      properties: {
        dir_path: {
          type: "string",
          description: "Relative path of the directory to list.",
        },
      },
      required: ["dir_path"],
    },
  },
];
type ToolInput = Record<string, string>;
// Run one tool by name and return its result, or an error message, as text.
// Paths are resolved against the current working directory.
export function executeTool(name: string, input: ToolInput): string {
  switch (name) {
    case "read_file": {
      const absPath = path.resolve(input.file_path);
      if (!fs.existsSync(absPath)) return `File not found: ${input.file_path}`;
      return fs.readFileSync(absPath, "utf-8");
    }
    case "write_file": {
      const absPath = path.resolve(input.file_path);
      // Create missing parent directories before writing.
      fs.mkdirSync(path.dirname(absPath), { recursive: true });
      fs.writeFileSync(absPath, input.content, "utf-8");
      return `Wrote ${input.content.length} characters to ${input.file_path}`;
    }
    case "list_directory": {
      const absPath = path.resolve(input.dir_path);
      if (!fs.existsSync(absPath))
        return `Directory not found: ${input.dir_path}`;
      return fs.readdirSync(absPath).join("\n");
    }
    default:
      return `Unknown tool: ${name}`;
  }
}
These three tools cover the core agentic primitives: discover, read, and write. You can extend the same pattern with HTTP fetch, shell execution, database queries, or any third-party API.
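The tool descriptions promise paths relative to the project root, but path.resolve alone does not enforce that: an absolute path or a run of ../ segments would reach files outside the project. A guard along the following lines is one way to close that gap. `resolveInsideRoot` is a hypothetical helper, and the check is lexical only.

```typescript
import * as path from "node:path";

// Hypothetical guard: resolve `userPath` against `root` and refuse anything
// that lands outside it. This is a lexical check only; it does not follow
// symlinks, so treat it as a first line of defence rather than a sandbox.
export function resolveInsideRoot(root: string, userPath: string): string {
  const absRoot = path.resolve(root);
  const absTarget = path.resolve(absRoot, userPath);
  const relative = path.relative(absRoot, absTarget);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path escapes project root: ${userPath}`);
  }
  return absTarget;
}
```

Inside executeTool, each `path.resolve(input.file_path)` could be swapped for `resolveInsideRoot(process.cwd(), input.file_path)`, assuming the agent is always started from the project root.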
Step 5: The Agentic Loop
The agentic loop is the heart of any multi-step AI system. It sends a task to Sonnet 5, processes tool calls when the model requests them, and keeps looping until stop_reason is "end_turn".
Create src/agent.ts:
import Anthropic from "@anthropic-ai/sdk";
import { client, MODEL } from "./client.js";
import { TOOLS, executeTool } from "./tools.js";
export interface AgentResult {
  output: string;
  inputTokens: number;
  outputTokens: number;
  iterations: number;
}
export async function runAgent(
  task: string,
  maxIterations = 15
): Promise<AgentResult> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: task },
  ];
  let totalInput = 0;
  let totalOutput = 0;
  let iteration = 0;
  while (iteration < maxIterations) {
    iteration++;
    const response = await client.messages.create({
      model: MODEL,
      max_tokens: 4096,
      tools: TOOLS,
      messages,
    });
    totalInput += response.usage.input_tokens;
    totalOutput += response.usage.output_tokens;
    // Always append the assistant turn to the history
    messages.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      // Done: collect the text blocks of the final turn.
      const finalText = response.content
        .filter((b): b is Anthropic.TextBlock => b.type === "text")
        .map((b) => b.text)
        .join("\n");
      return {
        output: finalText,
        inputTokens: totalInput,
        outputTokens: totalOutput,
        iterations: iteration,
      };
    }
    if (response.stop_reason === "tool_use") {
      // Execute every requested tool and send the results back as one user turn.
      const toolResults: Anthropic.ToolResultBlockParam[] = response.content
        .filter((b): b is Anthropic.ToolUseBlock => b.type === "tool_use")
        .map((b) => ({
          type: "tool_result" as const,
          tool_use_id: b.id,
          content: executeTool(b.name, b.input as Record<string, string>),
        }));
      messages.push({ role: "user", content: toolResults });
      continue;
    }
    // Unexpected stop reason (max_tokens, etc.): break to avoid silent hang
    break;
  }
  // Reached via the break above or by exhausting maxIterations.
  return {
    output:
      "Agent stopped before completing the task (iteration limit or unexpected stop reason).",
    inputTokens: totalInput,
    outputTokens: totalOutput,
    iterations: iteration,
  };
}
Key design decisions worth noting:
- Always append assistant turns. The Anthropic API requires every assistant message that contains tool_use blocks to be in the history before you send tool results.
- Explicit break on unexpected stop reasons. If stop_reason is "max_tokens" or anything else the loop does not handle, the break exits immediately rather than sending further requests. The function then falls through to the final return, so the caller gets the token totals and iteration count but not the partial text.
- maxIterations guard. A safety valve against runaway agents. 15 iterations covers most practical tasks; raise it for complex multi-file operations.
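One gap in the loop above: it passes executeTool's return value straight into the tool_result block, so an exception thrown by a tool (for instance, read_file pointed at a directory) rejects runAgent and discards the conversation. A possible refinement is to catch the exception and report it to the model through the `is_error` flag on the tool_result block. The wrapper below is a hypothetical sketch with a local stand-in for the SDK type.

```typescript
// Local stand-in for the SDK's tool_result shape, including the optional
// `is_error` flag that tells the model the call failed.
interface ToolResult {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

type ToolExecutor = (name: string, input: Record<string, string>) => string;

// Hypothetical wrapper: run a tool and convert any thrown exception into
// an error result instead of letting it abort the whole agent loop.
export function safeToolResult(
  toolUseId: string,
  name: string,
  input: Record<string, string>,
  execute: ToolExecutor
): ToolResult {
  try {
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      content: execute(name, input),
    };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      content: `Tool ${name} failed: ${message}`,
      is_error: true,
    };
  }
}
```

In the loop, the `.map` callback would then return `safeToolResult(b.id, b.name, b.input as Record<string, string>, executeTool)`. Reporting the failure gives the model a chance to try a different path instead of ending the run.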
Step 6: Tracking Token Usage and Cost
Add cost calculation to src/agent.ts:
// Introductory pricing valid until 2026-08-31
const INTRO_INPUT_PER_TOKEN = 2 / 1_000_000;
const INTRO_OUTPUT_PER_TOKEN = 10 / 1_000_000;
export function calculateCost(inputTokens: number, outputTokens: number): number {
  return (
    inputTokens * INTRO_INPUT_PER_TOKEN +
    outputTokens * INTRO_OUTPUT_PER_TOKEN
  );
}
Wire everything together by replacing the contents of src/index.ts:
import { runAgent, calculateCost } from "./agent.js";
const result = await runAgent(
  "List the files in the src directory, read client.ts, and summarize " +
    "what it does in two sentences."
);
console.log("\n=== Agent Output ===");
console.log(result.output);
console.log("\n=== Usage ===");
console.log(`Iterations : ${result.iterations}`);
console.log(`Input tokens: ${result.inputTokens.toLocaleString()}`);
console.log(`Output tokens: ${result.outputTokens.toLocaleString()}`);
console.log(
  `Estimated cost: $${calculateCost(result.inputTokens, result.outputTokens).toFixed(6)}`
);
Run the full agent:
npm run dev
Sonnet 5 will typically call list_directory, then read_file, then return a two-sentence summary. Because every iteration resends the tool definitions and the growing history, input tokens dominate; even so, a task this small should cost on the order of a cent or less at introductory rates. The usage summary prints the exact estimate, so you can check it against your budget for automated pipelines running hundreds of tasks per day.
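The cost function in this step hard-codes the introductory rates, so its estimates will be low once standard pricing starts. A date-aware variant might look like the sketch below. The rates come from the pricing table earlier in this tutorial; `ratesFor` and `costAt` are hypothetical names, and treating midnight UTC as the switchover instant is an assumption, since only the date is given.

```typescript
// Rates in USD per million tokens, taken from the pricing table.
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

const INTRO_RATES: Rates = { inputPerMTok: 2, outputPerMTok: 10 };
const STANDARD_RATES: Rates = { inputPerMTok: 3, outputPerMTok: 15 };

// First instant at which standard pricing applies. Midnight UTC is an
// assumption; the table gives only the date (2026-09-01).
const STANDARD_FROM = Date.UTC(2026, 8, 1); // months are 0-based: 8 = September

export function ratesFor(when: Date): Rates {
  return when.getTime() < STANDARD_FROM ? INTRO_RATES : STANDARD_RATES;
}

export function costAt(inputTokens: number, outputTokens: number, when: Date): number {
  const rates = ratesFor(when);
  return (
    (inputTokens * rates.inputPerMTok + outputTokens * rates.outputPerMTok) /
    1_000_000
  );
}

// One million input and one million output tokens under each tier.
console.log(costAt(1_000_000, 1_000_000, new Date("2026-08-15T00:00:00Z"))); // 12
console.log(costAt(1_000_000, 1_000_000, new Date("2026-09-15T00:00:00Z"))); // 18
```

Passing the date in explicitly, rather than reading the clock inside the function, keeps historical cost reports reproducible.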
Step 7: Testing
Write a minimal integration test using Node's built-in test runner:
// src/agent.test.ts
import { test } from "node:test";
import assert from "node:assert";
import { chat } from "./client.js";
import { executeTool } from "./tools.js";
test("chat returns a string response", async () => {
  const response = await chat("Say the exact string 'test-ok' and nothing else.");
  assert.ok(
    response.includes("test-ok"),
    `Expected 'test-ok' in response, got: ${response}`
  );
});
test("executeTool list_directory returns file names", () => {
  const result = executeTool("list_directory", { dir_path: "src" });
  assert.ok(result.includes("client.ts"), `Unexpected listing: ${result}`);
});
test("executeTool returns error for missing path", () => {
  const result = executeTool("read_file", { file_path: "nonexistent.ts" });
  assert.ok(result.startsWith("File not found"), `Unexpected: ${result}`);
});
Run:
npm test
The chat test makes a live API call, so it costs a few micro-dollars and takes under a second. Keep it in CI on a low-frequency schedule rather than every commit.
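For the tests that should run on every commit, one option is to exercise the loop logic against a scripted stand-in for the model, so nothing touches the network. The sketch below is not the tutorial's runAgent: `runLoop` is a condensed, hypothetical variant of the Step 5 loop that takes its dependencies as arguments, and the types are simplified stand-ins for the SDK's.

```typescript
// Stand-in types; the real ones come from the SDK.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, string> };

interface FakeResponse {
  stop_reason: "end_turn" | "tool_use";
  content: Block[];
}

type CreateMessage = (history: unknown[]) => Promise<FakeResponse>;

// A scripted "model": returns canned responses in order, no network involved.
function scriptedModel(script: FakeResponse[]): CreateMessage {
  let turn = 0;
  return async () => {
    const next = script[turn++];
    if (!next) throw new Error("Script exhausted");
    return next;
  };
}

// Condensed variant of the Step 5 loop with its dependencies injected.
async function runLoop(
  create: CreateMessage,
  execute: (name: string, input: Record<string, string>) => string,
  task: string,
  maxIterations = 5
): Promise<{ output: string; toolCalls: string[] }> {
  const history: unknown[] = [{ role: "user", content: task }];
  const toolCalls: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const response = await create(history);
    history.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      const output = response.content
        .flatMap((b) => (b.type === "text" ? [b.text] : []))
        .join("\n");
      return { output, toolCalls };
    }
    // Execute each requested tool and feed the results back as a user turn.
    const results = response.content.flatMap((b) => {
      if (b.type !== "tool_use") return [];
      toolCalls.push(b.name);
      return [{ type: "tool_result", tool_use_id: b.id, content: execute(b.name, b.input) }];
    });
    history.push({ role: "user", content: results });
  }
  return { output: "iteration limit reached", toolCalls };
}

// Two scripted turns: one tool request, then a final answer.
const demoModel = scriptedModel([
  {
    stop_reason: "tool_use",
    content: [{ type: "tool_use", id: "t1", name: "list_directory", input: { dir_path: "src" } }],
  },
  { stop_reason: "end_turn", content: [{ type: "text", text: "Found client.ts" }] },
]);

runLoop(demoModel, () => "client.ts", "List src").then((result) => {
  console.log(result.output, result.toolCalls);
});
```

To apply the same idea to the real runAgent, you would need to let it accept the client (or a create function) as a parameter; that refactor is left as a design choice.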
Troubleshooting
AuthenticationError: 401 — Your API key is missing or invalid. Confirm ANTHROPIC_API_KEY is set in .env and that the --env-file=.env flag is passed to tsx.
Error: Unexpected content type — The chat() function expects a text block as the first content item. If the prompt triggers a tool-use response, switch to runAgent() instead.
Agent loops to the iteration limit — The task description is likely too open-ended for the model to declare completion. Add explicit success criteria: "Do X, then Y, then reply with 'done' when finished."
Higher token counts than expected — Sonnet 5's tokenizer can expand up to 1.35x on code-dense content versus Sonnet 4.6. Log response.usage on each turn to identify which iterations are token-heavy.
stop_reason: "max_tokens" — Raise max_tokens in the messages.create call. Sonnet 5 supports up to 200K input tokens; the max_tokens parameter controls the output budget per turn, not the total context.
Tool result format errors — Each tool_result block must include tool_use_id matching the corresponding tool_use block. The SDK enforces this at the type level; if you see shape errors, ensure you are using the SDK types, not raw JSON.
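For the token-count case above, logging usage per turn is easier to act on with a little structure. The class below is a hypothetical sketch: it records the `usage` object from each response and reports which turn was heaviest. The numbers in the usage example are placeholders.

```typescript
// Shape of the `usage` object on each response (only the fields used here).
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

interface TurnUsage extends Usage {
  turn: number; // 1-based iteration number
}

// Hypothetical tracker: record usage per turn, then ask which turn was heaviest.
export class UsageLog {
  private turns: TurnUsage[] = [];

  record(usage: Usage): void {
    this.turns.push({
      turn: this.turns.length + 1,
      input_tokens: usage.input_tokens,
      output_tokens: usage.output_tokens,
    });
  }

  // Turn with the highest combined token count, or undefined if empty.
  heaviest(): TurnUsage | undefined {
    return this.turns.reduce<TurnUsage | undefined>((max, t) => {
      const total = t.input_tokens + t.output_tokens;
      const maxTotal = max ? max.input_tokens + max.output_tokens : -1;
      return total > maxTotal ? t : max;
    }, undefined);
  }

  totals(): Usage {
    return this.turns.reduce<Usage>(
      (sum, t) => ({
        input_tokens: sum.input_tokens + t.input_tokens,
        output_tokens: sum.output_tokens + t.output_tokens,
      }),
      { input_tokens: 0, output_tokens: 0 }
    );
  }
}

// Placeholder numbers for three turns.
const usageLog = new UsageLog();
usageLog.record({ input_tokens: 900, output_tokens: 120 });
usageLog.record({ input_tokens: 2400, output_tokens: 300 });
usageLog.record({ input_tokens: 2600, output_tokens: 80 });
console.log(usageLog.heaviest()?.turn); // 2
console.log(usageLog.totals()); // { input_tokens: 5900, output_tokens: 500 }
```

In the agent loop, a call like `usageLog.record(response.usage)` right after each messages.create would feed it.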
Next Steps
- Add a system prompt to give the agent a persistent role, output format, or constraints. Pass it through the top-level system parameter on messages.create rather than as a message; the Messages API has no "system" role.
- Implement additional tools (HTTP fetch, database queries, shell execution) following the same input-schema pattern.
- Persist conversation history across sessions by serializing the messages array to a JSON file or database row.
- Migrate from Opus 4.8 by swapping the model ID. Run both in parallel on a sample task set and compare outputs before committing to the migration.
- Explore batch processing via client.beta.messages.batches.create() for bulk workloads where latency is not critical; batch mode offers up to 50% cost reduction.
- Connect to MCP servers via the Anthropic SDK's experimental MCP client to give your agent access to any Model Context Protocol tool without writing custom tool schemas.
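As a starting point for the history-persistence item, here is a minimal sketch that round-trips a messages array through a JSON file. `saveHistory` and `loadHistory` are hypothetical helpers, `StoredMessage` is a simplified stand-in for the SDK's MessageParam type, and the temporary file location is arbitrary.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Minimal stand-in for the SDK's MessageParam type.
interface StoredMessage {
  role: "user" | "assistant";
  content: unknown; // a string or an array of content blocks
}

export function saveHistory(file: string, messages: StoredMessage[]): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(messages, null, 2), "utf-8");
}

// Returns an empty history when the file does not exist yet.
export function loadHistory(file: string): StoredMessage[] {
  if (!fs.existsSync(file)) return [];
  const parsed: unknown = JSON.parse(fs.readFileSync(file, "utf-8"));
  if (!Array.isArray(parsed)) throw new Error(`Not a message array: ${file}`);
  return parsed as StoredMessage[];
}

// Round trip through a temporary directory (the file name is arbitrary).
const historyDir = fs.mkdtempSync(path.join(os.tmpdir(), "history-"));
const historyFile = path.join(historyDir, "session.json");
saveHistory(historyFile, [{ role: "user", content: "Summarize client.ts" }]);
console.log(loadHistory(historyFile).length); // 1
```

Long sessions grow the file and the prompt together, so some form of truncation or summarization is likely needed before this is production-ready.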
Conclusion
Claude Sonnet 5 brings Opus-class agentic capabilities to the Sonnet price tier. The patterns covered here — typed client, streaming, tool-calling, and an explicit agentic loop — form the core of most production AI systems. The introductory pricing window until the end of August 2026 makes this the right moment to build, benchmark, and optimize before the standard rates take effect. Everything in this tutorial scales cleanly from a local prototype to a serverless function or a long-running background worker.