OpenAI shipped the Agents SDK to give TypeScript developers a first-party, production-grade primitive for building autonomous AI systems. No more wiring tool loops by hand, no more reinventing handoff logic between specialized agents, no more guessing whether your guardrails actually fired.

In this hands-on tutorial, you will build a complete multi-agent customer support system using the OpenAI Agents SDK in TypeScript — covering tools, handoffs, guardrails, streaming, and tracing — and finish with patterns you can take straight to production.

Why the Agents SDK over a custom loop? OpenAI engineered the SDK around four primitives (Agents, Handoffs, Guardrails, and Tracing) that compose into complex workflows while staying readable. You spend far less time on plumbing and more of it on the agent's behavior.

What You Will Learn

By the end of this tutorial, you will be able to:

Set up a TypeScript project with @openai/agents
Define agents with instructions, models, and tools
Add type-safe tools backed by Zod schemas
Orchestrate multi-agent workflows with handoffs
Validate inputs and outputs with guardrails
Stream agent output to the user in real time
Inspect every step with the built-in tracing dashboard
Deploy agents with retry, cost controls, and observability

Prerequisites

Before starting, make sure you have:

Node.js 20.6+ installed (node --version); the --env-file flag used by the dev script needs 20.6 or newer
An OpenAI API key with access to the Responses API
Working knowledge of TypeScript and async/await
A code editor (VS Code recommended)
About 30 minutes of focused time

What You Will Build

A modular AI customer support platform with three specialized agents:

A triage agent that routes inquiries
A billing agent that answers invoice questions using a tool
A refunds agent that processes refund requests with guardrails

The system streams responses to the user, hands off between agents based on intent, and logs every decision through OpenAI's tracing dashboard.

Step 1: Project Setup

Create a fresh TypeScript project and install the SDK.

mkdir openai-agents-tutorial && cd openai-agents-tutorial
npm init -y
npm install @openai/agents zod
npm install -D typescript tsx @types/node
npx tsc --init

Update tsconfig.json to use modern Node settings:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ES2022",
    "moduleResolution": "bundler",
    "esModuleInterop": true,
    "strict": true,
    "skipLibCheck": true,
    "outDir": "./dist"
  },
  "include": ["src/**/*"]
}

Set "type": "module" in your package.json, then create a .env file and add your API key:

OPENAI_API_KEY=sk-proj-...

Add a dev script:

{
  "scripts": {
    "dev": "tsx --env-file=.env src/index.ts"
  }
}

Step 2: Your First Agent

Create src/index.ts and define a minimal agent.

import { Agent, run } from "@openai/agents";
 
const helpAgent = new Agent({
  name: "Help Assistant",
  instructions:
    "You are a friendly support assistant for Noqta. Answer concisely in two sentences or fewer.",
  model: "gpt-5",
});
 
async function main() {
  const result = await run(helpAgent, "How do I reset my password?");
  console.log(result.finalOutput);
}
 
main();

Run it:

npm run dev

You should see a short, on-topic answer. The run function executes the agent loop until it produces a final output or hits its turn limit.
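
To make the idea of an agent loop concrete, here is a conceptual sketch. It is not the SDK's implementation: the Step type, runLoop, and the scripted model are hypothetical names invented for illustration, and the real loop also deals with handoffs, guardrails, and streaming.

```typescript
// Conceptual illustration only: these names and shapes are hypothetical,
// not part of @openai/agents.
type Step =
  | { kind: "final"; output: string }
  | { kind: "tool_call"; tool: string; args: string };

type Model = (transcript: string[]) => Step;
type Tools = Record<string, (args: string) => string>;

function runLoop(model: Model, tools: Tools, input: string, maxTurns = 10): string {
  const transcript = [`user: ${input}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model(transcript);
    if (step.kind === "final") return step.output; // a final output ends the loop
    const impl = tools[step.tool];
    const result = impl ? impl(step.args) : `unknown tool ${step.tool}`;
    transcript.push(`tool ${step.tool}: ${result}`); // feed the result back to the model
  }
  throw new Error(`No final output after ${maxTurns} turns`);
}

// A scripted stand-in for the model: call a tool once, then answer.
const scriptedModel: Model = (transcript) =>
  transcript.length === 1
    ? { kind: "tool_call", tool: "lookup", args: "reset password" }
    : { kind: "final", output: `Answer based on: ${transcript[1]}` };

const answer = runLoop(
  scriptedModel,
  { lookup: (q) => `docs for "${q}"` },
  "How do I reset my password?",
);
console.log(answer);
```

The turn cap is what keeps a model that never produces a final answer from looping forever.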

Model choice matters. Use gpt-5 for complex reasoning and tool orchestration, and gpt-5-mini for cheap, fast triage agents. The SDK lets you mix models per agent so you only pay for intelligence where it matters.

Step 3: Add a Type-Safe Tool with Zod

Real agents need to call your APIs. Tools in the OpenAI Agents SDK are defined with Zod schemas, giving you compile-time safety and runtime validation.

Create src/tools.ts:

import { tool } from "@openai/agents";
import { z } from "zod";
 
const fakeInvoices = new Map([
  ["INV-001", { amount: 89.0, status: "paid", date: "2026-04-12" }],
  ["INV-002", { amount: 240.5, status: "pending", date: "2026-05-03" }],
]);
 
export const getInvoice = tool({
  name: "get_invoice",
  description: "Fetch invoice details by invoice ID.",
  parameters: z.object({
    invoiceId: z.string().describe("Invoice ID like INV-001"),
  }),
  async execute({ invoiceId }) {
    const invoice = fakeInvoices.get(invoiceId);
    if (!invoice) return { error: `Invoice ${invoiceId} not found.` };
    return invoice;
  },
});

Now wire it into a billing agent in src/billing.ts:

import { Agent } from "@openai/agents";
import { getInvoice } from "./tools.js";
 
export const billingAgent = new Agent({
  name: "Billing Agent",
  instructions: [
    "You answer billing questions for Noqta customers.",
    "Always call get_invoice before quoting amounts.",
    "If a tool returns an error, apologize and suggest contacting support.",
  ].join(" "),
  model: "gpt-5",
  tools: [getInvoice],
});

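Replace the contents of src/index.ts with the following to test it (top-level await works here because the project is an ES module):

import { run } from "@openai/agents";
import { billingAgent } from "./billing.js";
 
const result = await run(
  billingAgent,
  "What is the status of invoice INV-002?"
);
console.log(result.finalOutput);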

The agent will autonomously call get_invoice, parse the result, and respond with the correct status and amount.

Step 4: Multi-Agent Handoffs

Handoffs are the SDK's killer primitive. Instead of one giant prompt, you compose specialized agents and let them transfer control based on the user's intent.

Create src/refunds.ts:

import { Agent, tool } from "@openai/agents";
import { z } from "zod";
 
const issueRefund = tool({
  name: "issue_refund",
  description: "Issue a refund for a specific invoice.",
  parameters: z.object({
    invoiceId: z.string(),
    reason: z.string().describe("Customer-facing reason for the refund"),
  }),
  async execute({ invoiceId, reason }) {
    return {
      ok: true,
      refundId: `RF-${Date.now()}`,
      invoiceId,
      reason,
    };
  },
});
 
export const refundsAgent = new Agent({
  name: "Refunds Agent",
  instructions:
    "You process refund requests. Always confirm the invoice ID and reason before issuing.",
  model: "gpt-5",
  tools: [issueRefund],
});

Now build a triage agent in src/index.ts that hands off to either billing or refunds:

import { Agent, run } from "@openai/agents";
import { billingAgent } from "./billing.js";
import { refundsAgent } from "./refunds.js";
 
const triageAgent = new Agent({
  name: "Triage Agent",
  instructions: [
    "You are the front desk for Noqta support.",
    "Hand off to the Billing Agent for invoice and payment questions.",
    "Hand off to the Refunds Agent if the user wants money back.",
    "Otherwise, answer directly in one polite sentence.",
  ].join(" "),
  model: "gpt-5-mini",
  handoffs: [billingAgent, refundsAgent],
});
 
const result = await run(
  triageAgent,
  "I want a refund for invoice INV-002, the service did not work."
);
console.log(result.finalOutput);

The triage agent detects refund intent and hands off to the refunds agent, which calls issue_refund and returns the confirmation, all without you writing a state machine.

Step 5: Guardrails

Guardrails run in parallel with the main agent and can short-circuit dangerous or off-policy requests. Use them to enforce safety, scope, or compliance.

Create an input guardrail that blocks requests outside the support scope:

import { Agent, InputGuardrail, run } from "@openai/agents";
import { z } from "zod";
 
const scopeCheckAgent = new Agent({
  name: "Scope Check",
  instructions:
    "Decide if the user input is a customer support question. Return JSON.",
  model: "gpt-5-mini",
  outputType: z.object({
    isSupport: z.boolean(),
    reason: z.string(),
  }),
});
 
const supportScopeGuardrail: InputGuardrail = {
  name: "support_scope_guardrail",
  async execute({ input }) {
    const result = await run(scopeCheckAgent, input);
    return {
      outputInfo: result.finalOutput,
      tripwireTriggered: !result.finalOutput?.isSupport,
    };
  },
};

Attach the guardrail to your triage agent:

const triageAgent = new Agent({
  name: "Triage Agent",
  instructions: "...",
  model: "gpt-5-mini",
  handoffs: [billingAgent, refundsAgent],
  inputGuardrails: [supportScopeGuardrail],
});

When a user asks something off-topic, the guardrail's tripwireTriggered flips to true and the SDK throws an InputGuardrailTripwireTriggered error you can catch and handle gracefully.
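
A minimal sketch of that handling follows, kept free of SDK imports so it runs on its own. The names answerOrDecline, runAgent, and isTripwire are hypothetical; in the real app runAgent would wrap run(triageAgent, message) and isTripwire would be an instanceof InputGuardrailTripwireTriggered check.

```typescript
// Hypothetical helper: turns a guardrail trip into a polite refusal and
// lets every other error propagate.
async function answerOrDecline(
  runAgent: (message: string) => Promise<string>,
  isTripwire: (err: unknown) => boolean,
  message: string,
): Promise<string> {
  try {
    return await runAgent(message);
  } catch (err) {
    if (isTripwire(err)) {
      return "Sorry, I can only help with Noqta support questions.";
    }
    throw err; // not a guardrail trip: surface it to the caller
  }
}

// Stand-ins for the SDK pieces so the sketch is self-contained.
class FakeTripwire extends Error {}
const fakeRun = async (message: string): Promise<string> => {
  if (message.includes("poem")) throw new FakeTripwire("off-topic");
  return `Handled: ${message}`;
};
const isFakeTripwire = (err: unknown): boolean => err instanceof FakeTripwire;
```

Rethrowing anything that is not a tripwire matters: swallowing all errors here would hide rate limits and bugs behind the same refusal message.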

Output guardrails work the same way — they wrap the final agent output before it reaches the user. Use them to scrub PII, enforce JSON shape, or block disallowed content.
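
As an illustration of the PII case, the sketch below shows the kind of check an output guardrail might run on the final text. The function name and both regular expressions are assumptions made for this example and are far from exhaustive; wiring the check into the SDK's output guardrail interface is left out.

```typescript
// Hypothetical PII helper; the patterns are deliberately simple and will
// miss many real-world formats.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// 13 to 16 digits, optionally separated by single spaces or hyphens.
const CARD = /\b\d(?:[ -]?\d){12,15}\b/g;

function scrubPii(text: string): { clean: string; found: boolean } {
  const clean = text.replace(EMAIL, "[email]").replace(CARD, "[card]");
  return { clean, found: clean !== text };
}

const sample = scrubPii(
  "Contact jane@example.com about card 4242 4242 4242 4242 today.",
);
console.log(sample.clean);
```

Inside a guardrail, found could drive the tripwire, or clean could be what you show the user instead; which one fits depends on your policy.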

Step 6: Streaming Responses

For chat UIs you want to stream tokens as they arrive. The SDK exposes a streaming event API:

import { run } from "@openai/agents";
 
const stream = await run(triageAgent, "Tell me about invoice INV-001.", {
  stream: true,
});
 
for await (const event of stream) {
  if (event.type === "raw_model_stream_event") {
    if (event.data.type === "output_text_delta") {
      process.stdout.write(event.data.delta);
    }
  }
  if (event.type === "agent_updated_stream_event") {
    console.error(`\n[handoff to: ${event.agent.name}]`);
  }
}
 
await stream.completed;
console.log("\n\nFinal:", stream.finalOutput);

You get fine-grained events for token deltas, tool calls, agent handoffs, and final output. Pipe output_text_delta to your front end via Server-Sent Events or WebSockets.
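
For the Server-Sent Events route, each delta has to be framed as an SSE message. The helper below is a sketch with hypothetical names (toSseFrame, SSE_DONE). It JSON-encodes the delta so embedded newlines cannot break the data: framing, and the [DONE] sentinel is only a convention chosen for this example.

```typescript
// Frame one text delta as a Server-Sent Events message.
// JSON encoding keeps newlines inside the delta from splitting the frame.
function toSseFrame(delta: string): string {
  return `data: ${JSON.stringify(delta)}\n\n`;
}

// Hypothetical end-of-stream marker the client can watch for.
const SSE_DONE = "data: [DONE]\n\n";

// Example: the bytes a client would receive for three deltas.
const frames = ["Invoice ", "INV-001\n", "is paid."].map(toSseFrame);
console.log(frames.join("") + SSE_DONE);
```

In the streaming loop you would write toSseFrame(event.data.delta) to an HTTP response served with Content-Type: text/event-stream, and the client would JSON.parse each data payload before appending it to the chat view.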

Step 7: Tracing and Observability

Every agent run is automatically traced to the OpenAI dashboard at platform.openai.com/traces. You see the full tree of LLM calls, tool invocations, handoffs, and guardrail evaluations — color-coded and timeline-aligned.

To add custom metadata for filtering later:

import { withTrace, run } from "@openai/agents";
 
await withTrace(
  "support_session",
  async () => {
    await run(triageAgent, userMessage);
  },
  {
    metadata: {
      userId: "user_42",
      tier: "pro",
      sessionId: "sess_abc123",
    },
  }
);

For production, you can also forward traces to Logfire, Langfuse, Braintrust, or AgentOps by installing their processor packages. The tracing pipeline is pluggable.

Step 8: Production Patterns

Here are the patterns we use at Noqta when shipping agents to real traffic.

Limit Turns and Cost

Set a maxTurns cap so a misbehaving agent cannot burn budget in a loop:

await run(triageAgent, userMessage, { maxTurns: 8 });

Combine that with usage limits set at the OpenAI organization or project level for a hard ceiling.

Persist Conversation History

The SDK accepts an array of prior items as the run input, and every result exposes the full transcript as result.history, so users can have multi-turn conversations:

const result = await run(triageAgent, [
  ...previousMessages,
  { role: "user", content: userMessage },
]);
 
const updatedHistory = result.history;

Persist updatedHistory to your database (Postgres, Redis, Convex) keyed by session ID.
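
As a sketch of the persistence step, the in-memory store below keys history by session ID. SessionStore and its methods are hypothetical, and the item type is left generic because the exact shape of the SDK's history items is not spelled out here; a real deployment would swap the Map for a database table or a Redis key.

```typescript
// Hypothetical in-memory history store, keyed by session ID.
class SessionStore<Item> {
  private sessions = new Map<string, Item[]>();

  // Return a copy so callers cannot mutate stored history by accident.
  load(sessionId: string): Item[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }

  // Replace the stored history, keeping at most the last `maxItems` entries
  // (maxItems is expected to be a positive number).
  save(sessionId: string, history: Item[], maxItems = 50): void {
    this.sessions.set(sessionId, history.slice(-maxItems));
  }
}

type Message = { role: "user" | "assistant"; content: string };
const store = new SessionStore<Message>();
store.save("sess_abc123", [
  { role: "user", content: "What is the status of INV-002?" },
  { role: "assistant", content: "INV-002 is pending." },
]);
console.log(store.load("sess_abc123").length);
```

The trimming is optional and naive: cutting history between a tool call and its result may confuse the model, so a production version would trim on conversation-turn boundaries.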

Retry on Transient Failures

Wrap run() with a retry helper for rate-limit and network errors:

import { run } from "@openai/agents";
 
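// Retry fn with exponential backoff (1s, 2s, 4s, ...) between attempts.
async function runWithRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Do not sleep after the final attempt; fall through and throw.
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, 1000 * 2 ** i));
      }
    }
  }
  throw lastError;
}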
 
await runWithRetry(() => run(triageAgent, userMessage));
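
The helper above retries every error, including ones that will never succeed on a second try, such as a guardrail trip. A variant that retries only errors that look transient could be sketched as follows. It assumes, without confirmation from the SDK, that HTTP failures carry a numeric status property and network failures a string code; isTransient and retryTransient are hypothetical names.

```typescript
// Assumption: HTTP errors expose `status` and network errors expose `code`.
// Adjust this predicate to whatever your client actually throws.
function isTransient(err: unknown): boolean {
  const e = (err ?? {}) as { status?: unknown; code?: unknown };
  if (typeof e.status === "number") return e.status === 429 || e.status >= 500;
  if (typeof e.code === "string") return ["ECONNRESET", "ETIMEDOUT"].includes(e.code);
  return false;
}

async function retryTransient<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors that will not heal.
      if (i >= attempts - 1 || !isTransient(err)) throw err;
      // Exponential backoff with up to 25% random jitter.
      const delay = baseDelayMs * 2 ** i * (1 + Math.random() * 0.25);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The jitter spreads out retries from many concurrent sessions so they do not all hit the API again at the same instant.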

Use Cheaper Models for Triage

Route requests through a fast triage agent on gpt-5-mini, and only escalate to gpt-5 when handoffs happen. This drops average cost by 60 to 80 percent in real workloads.

Sandbox Tool Side Effects

If a tool writes to a database or charges a card, gate it behind a dry-run mode during development and require an environment flag to enable real writes in production.
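
One way to sketch that gate is to wrap the side-effecting function so it only runs when an explicit flag is set. Everything named here is hypothetical: withDryRun, the ENABLE_REAL_WRITES flag mentioned in the comment, and the chargeRefund stand-in.

```typescript
type RefundArgs = { invoiceId: string; reason: string };
type RefundResult = { ok: boolean; refundId: string; dryRun: boolean };

// Wrap a side-effecting function so it is skipped unless writes are enabled.
function withDryRun(
  realWrite: (args: RefundArgs) => Promise<string>,
  writesEnabled: boolean,
) {
  return async (args: RefundArgs): Promise<RefundResult> => {
    if (!writesEnabled) {
      // Report what would have happened without touching real systems.
      return { ok: true, refundId: `DRY-${args.invoiceId}`, dryRun: true };
    }
    return { ok: true, refundId: await realWrite(args), dryRun: false };
  };
}

// Stand-in for a real payment call.
const chargeRefund = async ({ invoiceId }: RefundArgs): Promise<string> =>
  `RF-${invoiceId}`;

// In an app the flag might come from the environment, for example
// process.env.ENABLE_REAL_WRITES === "true".
const issueRefundSafely = withDryRun(chargeRefund, false);
```

The wrapped function can then serve as the body of a tool's execute handler, so the agent sees the same result shape in both modes and the dryRun field makes the mode visible in traces.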

Testing Your Implementation

Quick sanity checks, assuming src/index.ts passes its command-line argument to run() as the user message:

# Triage to billing
npm run dev -- "What is the status of INV-001?"
 
# Triage to refunds
npm run dev -- "Refund my invoice INV-002 please."
 
# Guardrail trip
npm run dev -- "Write me a poem about croissants."

Open the OpenAI tracing dashboard and confirm the trace tree matches your expectations: triage agent at the root, child runs for each handoff, tool calls beneath the responsible agent.
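
The sanity-check commands pass the user message as a command-line argument, which the index.ts shown earlier does not read yet. A minimal sketch of that piece, with promptFromArgs as a hypothetical helper name:

```typescript
// Pick the user message from CLI arguments, falling back to a default.
// In a Node script, pass process.argv.slice(2) as `args`.
function promptFromArgs(
  args: string[],
  fallback = "How do I reset my password?",
): string {
  const text = args.join(" ").trim();
  return text.length > 0 ? text : fallback;
}

console.log(promptFromArgs(["What is the status of INV-001?"]));
console.log(promptFromArgs([]));
```

In src/index.ts you would then call run(triageAgent, promptFromArgs(process.argv.slice(2))) in place of the hard-coded message.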

Troubleshooting

Agent never calls the tool. Make sure your tool description starts with a verb ("Fetch", "Issue", "Create") and that the Zod schema matches the parameters the LLM is producing. Inspect the trace to see the raw tool arguments.

Handoffs are ignored. The triage agent needs explicit instructions telling it when to hand off and to which agent. Vague instructions like "delegate appropriately" rarely work — be specific.

Guardrail throws but you cannot catch it. Wrap your run() call in a try/catch and check for InputGuardrailTripwireTriggered from @openai/agents. The SDK exports these error classes explicitly.

Token usage is exploding. Lower maxTurns, make sure the triage agent runs on gpt-5-mini, and audit the trace for redundant tool calls. A common bug is an agent calling the same tool repeatedly because the tool's output is unclear.
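
If you log tool calls yourself, a small helper can flag the repeated-call pattern. The sketch below assumes a hypothetical ToolCall record of your own making (a name plus arguments); it does not read the SDK's trace format.

```typescript
// Hypothetical log record; adapt it to whatever you actually capture.
type ToolCall = { name: string; args: unknown };

// Count identical (name, args) pairs and return those seen more than once.
function findRepeatedCalls(calls: ToolCall[]): Array<{ key: string; count: number }> {
  const counts = new Map<string, number>();
  for (const call of calls) {
    // JSON.stringify is key-order sensitive, which is fine for a quick audit.
    const key = `${call.name}(${JSON.stringify(call.args)})`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([key, count]) => ({ key, count }));
}

const repeated = findRepeatedCalls([
  { name: "get_invoice", args: { invoiceId: "INV-002" } },
  { name: "get_invoice", args: { invoiceId: "INV-002" } },
  { name: "issue_refund", args: { invoiceId: "INV-002", reason: "broken" } },
  { name: "get_invoice", args: { invoiceId: "INV-002" } },
]);
console.log(repeated);
```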

Next Steps

Now that the core works, here is where to go next:

Add voice support with the OpenAI Realtime API — agents can be wrapped in a voice loop
Connect MCP servers as tool sources using the SDK's built-in MCP support
Layer Langfuse or Braintrust evaluations on top of your traces
Move conversation state to Convex or Postgres for multi-session persistence
Deploy the agent runtime on Vercel, Cloudflare Workers, or AWS Lambda

If you have not seen it yet, our companion guide on the Claude Agent SDK in TypeScript covers the equivalent Anthropic primitives, useful when comparing trade-offs between providers.

Conclusion

You built a multi-agent customer support platform with the OpenAI Agents SDK (type-safe tools, handoffs, input guardrails, streaming, and full tracing) in fewer than 300 lines of TypeScript. The SDK removes the boilerplate so you can focus on what the agent should do instead of how it executes.

Ship it, instrument it, and iterate on the trace data. That is the loop that turns a clever prototype into an agent your customers actually rely on.