What Is AI SDK 5?

Vercel AI SDK 5 is the most significant update to the library since its launch. With more than 30 million combined weekly npm downloads across the ai core package and its provider packages, it is the standard TypeScript toolkit for building AI-powered products. Version 5 introduces a fundamental architectural redesign that separates UI concerns from model concerns, making full-stack AI applications dramatically easier to build, debug, and maintain.

The headline changes in AI SDK 5:

UIMessage vs ModelMessage — two distinct message types with different responsibilities
Transport-based useChat — replaces internal HTTP management with an explicit, swappable transport layer
SSE streaming — Server-Sent Events replace the custom data stream protocol of AI SDK 4, so streams can be inspected with native browser DevTools
Agentic loop control — stopWhen and prepareStep give you surgical control over multi-step tool calls
Type-safe custom messages — infer full TypeScript types from your tool and schema definitions
Speech generation — a unified generateSpeech function for text-to-audio, exported under an experimental_ prefix in the 5.x releases

Prerequisites

Before starting, make sure you have:

Node.js 18.18 or higher installed (the minimum that Next.js 15 supports)
A Next.js 15 project (run npx create-next-app@latest my-ai-app --typescript --app)
An OpenAI API key stored as OPENAI_API_KEY in .env.local
Basic familiarity with React, TypeScript, and async/await

What You'll Build

By the end of this tutorial you will have a full-stack AI chat application that:

Streams responses in real time from an OpenAI model
Uses the UIMessage/ModelMessage architecture correctly
Calls a custom tool and handles its result inside the stream
Controls the agentic loop with stopWhen and prepareStep
Defines a fully type-safe custom message shape
Converts any text to speech with generateSpeech

The completed project works with any AI SDK 5-compatible provider — Anthropic, Google Gemini, Mistral, and more than 20 others — by changing just two lines of code.

Step 1: Install AI SDK 5

In your Next.js project root, install the core package, the OpenAI provider, the React hooks package that exports useChat, and Zod for tool schemas:

npm install ai @ai-sdk/openai @ai-sdk/react zod

Verify the installed version:

npm list ai

You should see ai@5.x.x in the output. If you are upgrading from AI SDK 4, review the official migration guide — the useChat API and message shape changed significantly.

Add your API key to .env.local:

OPENAI_API_KEY=sk-...

Step 2: Understand UIMessage vs ModelMessage

This is the most important concept in AI SDK 5. Before version 5, a single Message type served double duty — it held both the UI state rendered in the browser and the raw content payload sent to the LLM. That design created serialization complexity and made conversation persistence error-prone.

AI SDK 5 cleanly separates these into two distinct types:

Type	Lives on	Contains
`UIMessage`	Client and server boundary	Full message state: text parts, tool results, metadata, custom data
`ModelMessage`	Server only — sent to the LLM	Stripped-down payload optimized for model consumption

The data flow in every request:

Client sends UIMessage[] to your API route
Server calls convertToModelMessages(messages) to produce ModelMessage[]
ModelMessage[] goes to the LLM via streamText
The result streams back to the client through toUIMessageStreamResponse()
useChat on the client updates local UIMessage state from the stream

This separation makes persistence straightforward. The onFinish callback of toUIMessageStreamResponse() provides a ready-to-store UIMessage[] with no manual conversion needed.
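To make the split concrete, here is a minimal sketch of the kind of work the conversion step does. The types SketchUIMessage and SketchModelMessage and the function toModelMessagesSketch are simplified stand-ins invented for illustration; they are not the SDK's own types or the implementation of convertToModelMessages, which handles many more part kinds.

```typescript
// Simplified stand-ins for the SDK types; the real UIMessage and ModelMessage carry more fields.
type SketchUIPart =
  | { type: 'text'; text: string }
  | { type: 'data-status'; data: { label: string } }; // hypothetical UI-only part

interface SketchUIMessage {
  id: string;
  role: 'user' | 'assistant';
  metadata?: { createdAt: number };
  parts: SketchUIPart[];
}

interface SketchModelMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Narrowing helper so the filter below yields text parts only.
function isTextPart(part: SketchUIPart): part is { type: 'text'; text: string } {
  return part.type === 'text';
}

// Keep only what a model needs: the role and the concatenated text parts.
function toModelMessagesSketch(messages: SketchUIMessage[]): SketchModelMessage[] {
  return messages.map(message => ({
    role: message.role,
    content: message.parts
      .filter(isTextPart)
      .map(part => part.text)
      .join(''),
  }));
}

const uiHistory: SketchUIMessage[] = [
  {
    id: 'm1',
    role: 'user',
    metadata: { createdAt: 1 },
    parts: [{ type: 'text', text: 'Weather in Tunis?' }],
  },
  {
    id: 'm2',
    role: 'assistant',
    parts: [
      { type: 'data-status', data: { label: 'Looking up weather' } },
      { type: 'text', text: 'It is sunny.' },
    ],
  },
];

const modelHistory = toModelMessagesSketch(uiHistory);
```

In this sketch the ids, metadata, and UI-only data part stay on the UI side, and only role and text reach the model-facing list.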

Step 3: Build the Server Route

Create the API route at app/api/chat/route.ts:

import { openai } from '@ai-sdk/openai';
import {
  convertToModelMessages,
  streamText,
  UIMessage,
  tool,
  stepCountIs,
} from 'ai';
import { z } from 'zod';
 
export const maxDuration = 30;
 
export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
 
  const result = streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant. Use tools when appropriate.',
    messages: convertToModelMessages(messages),
    stopWhen: stepCountIs(5),
    tools: {
      getWeather: tool({
        description: 'Get the current weather for a city',
        inputSchema: z.object({
          city: z.string().describe('The city name'),
        }),
        execute: async ({ city }) => {
          // Replace with a real weather API call in production
          return { city, temperature: 22, condition: 'Sunny' };
        },
      }),
    },
  });
 
  return result.toUIMessageStreamResponse({
    // Pass the incoming history so onFinish receives the whole conversation
    originalMessages: messages,
    onFinish: async ({ messages: finalMessages }) => {
      // Persist conversation here; finalMessages is UIMessage[]
      // await db.saveMessages(finalMessages);
      console.log('Conversation finished:', finalMessages.length, 'messages');
    },
  });
}

Key decisions in this route:

convertToModelMessages(messages) translates the incoming UIMessage[] into ModelMessage[] before the LLM call
stopWhen: stepCountIs(5) acts as a safety cap: the agentic loop halts after at most 5 steps, where each step is one model generation plus any tool executions it triggers
toUIMessageStreamResponse() packages the stream as SSE that useChat understands natively
The onFinish option of toUIMessageStreamResponse() receives the final UIMessage[], which makes it the natural place to persist conversations
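The route leaves db.saveMessages as a commented placeholder. As a rough sketch of what could sit behind it, the following uses an in-memory Map. The names chatStore, saveMessages, loadMessages, and the chatId parameter are all hypothetical, and StoredMessage is a simplified stand-in for UIMessage.

```typescript
// Simplified stand-in for UIMessage; only the fields this sketch needs.
interface StoredMessage {
  id: string;
  role: 'user' | 'assistant' | 'system';
  parts: unknown[];
}

// Hypothetical in-memory store; swap the Map for a real database in production.
const chatStore = new Map<string, StoredMessage[]>();

// Overwrite the stored conversation with the latest full message list.
function saveMessages(chatId: string, messages: StoredMessage[]): void {
  // Copy through JSON so later mutations of `messages` do not leak into the store.
  chatStore.set(chatId, JSON.parse(JSON.stringify(messages)) as StoredMessage[]);
}

// Return the stored conversation, or an empty list for an unknown chat.
function loadMessages(chatId: string): StoredMessage[] {
  return chatStore.get(chatId) ?? [];
}

const finished: StoredMessage[] = [
  { id: 'm1', role: 'user', parts: [{ type: 'text', text: 'Hi' }] },
];
saveMessages('chat-1', finished);
```

Because the final message list is already in its UI shape, storing it as JSON and loading it back is enough to restore a conversation in this sketch.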

Step 4: Build the Client Chat UI

Create the page at app/chat/page.tsx:

'use client';
 
import { useChat } from '@ai-sdk/react';
import { useState } from 'react';
 
export default function ChatPage() {
  const [input, setInput] = useState('');
  const { messages, sendMessage, status } = useChat();
 
  return (
    <div className="flex flex-col max-w-2xl mx-auto h-screen p-4">
      <div className="flex-1 overflow-y-auto space-y-4 py-4">
        {messages.map(message => (
          <div key={message.id} className="space-y-1">
            <strong className="capitalize text-sm text-gray-500">
              {message.role}
            </strong>
            <div>
              {message.parts.map((part, i) => {
                if (part.type === 'text') {
                  return <p key={i} className="text-gray-800">{part.text}</p>;
                }
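                if (part.type === 'tool-getWeather') {
                  // Tool parts pass through several states; render once the output exists
                  if (part.state !== 'output-available') return null;
                  // Untyped useChat() leaves output as unknown, so cast to the tool's return shape
                  const weather = part.output as { city: string; temperature: number; condition: string };
                  return (
                    <div key={i} className="bg-blue-50 border border-blue-200 rounded p-3 text-sm">
                      <span className="font-medium">Weather in {weather.city}:</span>{' '}
                      {weather.temperature}°C, {weather.condition}
                    </div>
                  );
                }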
                return null;
              })}
            </div>
          </div>
        ))}
        {(status === 'submitted' || status === 'streaming') && (
          <div className="text-gray-400 text-sm italic">Thinking...</div>
        )}
      </div>
 
      <form
        onSubmit={e => {
          e.preventDefault();
          sendMessage({ text: input });
          setInput('');
        }}
        className="flex gap-2 pt-4 border-t"
      >
        <input
          value={input}
          onChange={e => setInput(e.target.value)}
          placeholder="Ask about the weather in any city..."
          className="flex-1 border rounded-lg p-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
          disabled={status === 'submitted' || status === 'streaming'}
        />
        <button
          type="submit"
          disabled={status === 'submitted' || status === 'streaming'}
          className="px-4 py-2 bg-blue-600 text-white rounded-lg disabled:opacity-50 hover:bg-blue-700"
        >
          Send
        </button>
      </form>
    </div>
  );
}

The key difference from AI SDK 4 is how message content is accessed. Each UIMessage now has a parts array where every element has an explicit type discriminant:

Text content: { type: 'text', text: '...' }
Tool parts: { type: 'tool-getWeather', state: 'output-available', input: { city }, output: { city, temperature, condition } }

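Tool part types are namespaced as tool-<toolName>, so each tool gets its own discriminant and TypeScript can narrow its input and output types from part.type alone. The state field tracks the call through input-streaming, input-available, output-available, and output-error; output is only present in the output-available state.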
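The narrowing that the type discriminant enables can be sketched without React. SketchPart and describePart below are illustrative names, and the part shape is a reduced assumption (two states only, with state and output fields), not the SDK's full part union.

```typescript
// Reduced, assumed part union for illustration; the SDK's union has more members and states.
type SketchPart =
  | { type: 'text'; text: string }
  | {
      type: 'tool-getWeather';
      state: 'input-available' | 'output-available';
      output?: { city: string; temperature: number; condition: string };
    };

// Turn a part into display text; checking `type` narrows the remaining fields.
function describePart(part: SketchPart): string {
  if (part.type === 'text') {
    return part.text;
  }
  // From here on, TypeScript knows this is the weather tool part.
  if (part.state === 'output-available' && part.output) {
    return `Weather in ${part.output.city}: ${part.output.temperature}°C, ${part.output.condition}`;
  }
  return 'Fetching weather...';
}

const pendingLabel = describePart({ type: 'tool-getWeather', state: 'input-available' });
const doneLabel = describePart({
  type: 'tool-getWeather',
  state: 'output-available',
  output: { city: 'Tunis', temperature: 22, condition: 'Sunny' },
});
```

Handling the pending state explicitly avoids reading output before the tool has finished.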

Step 5: Control the Agentic Loop

The stopWhen and prepareStep primitives give you fine-grained control over multi-step agent behavior — you no longer need to implement your own loop logic.

Stopping Conditions

stopWhen accepts a single condition or an array. The loop halts when any condition is satisfied:

import { stepCountIs, hasToolCall } from 'ai';
 
const result = streamText({
  model: openai('gpt-4o'),
  messages: convertToModelMessages(messages),
  stopWhen: [
    stepCountIs(10),                  // hard cap: stop after 10 steps
    hasToolCall('submitFinalAnswer'), // stop when model signals it is done
    // custom condition: stop on a sentinel string in the latest step's text
    ({ steps }) => steps[steps.length - 1]?.text.includes('TASK_COMPLETE') ?? false,
  ],
  tools: { /* ... */ },
});

stepCountIs and hasToolCall are the built-in helpers exported from 'ai'; any other condition is written as a function over the steps completed so far:

Condition	Triggers when
`stepCountIs(n)`	Step count reaches n
`hasToolCall(name)`	The latest step calls the named tool
`({ steps }) => boolean`	Your custom predicate returns true
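The "any condition halts the loop" rule can be sketched with plain predicates. Everything here is a local stand-in written for illustration: SketchStep, SketchStopCondition, maxSteps, calledTool, lastTextIncludes, and shouldStop are not SDK exports, and the step shape is an assumption reduced to two fields.

```typescript
// Assumed, reduced step shape: the text produced and the tools called in that step.
interface SketchStep {
  text: string;
  toolCalls: { toolName: string }[];
}

// A stop condition is a predicate over the steps completed so far.
type SketchStopCondition = (options: { steps: SketchStep[] }) => boolean;

// Stop once at least n steps have run.
const maxSteps = (n: number): SketchStopCondition => ({ steps }) => steps.length >= n;

// Stop when the most recent step called the named tool.
const calledTool = (name: string): SketchStopCondition => ({ steps }) =>
  steps[steps.length - 1]?.toolCalls.some(call => call.toolName === name) ?? false;

// Stop when the most recent step's text contains a sentinel string.
const lastTextIncludes = (needle: string): SketchStopCondition => ({ steps }) =>
  steps[steps.length - 1]?.text.includes(needle) ?? false;

// An array of conditions behaves like a logical OR.
function shouldStop(conditions: SketchStopCondition[], steps: SketchStep[]): boolean {
  return conditions.some(condition => condition({ steps }));
}

const demoSteps: SketchStep[] = [
  { text: 'Checking weather', toolCalls: [{ toolName: 'getWeather' }] },
  { text: 'TASK_COMPLETE', toolCalls: [] },
];

const stopNow = shouldStop([maxSteps(3), lastTextIncludes('TASK_COMPLETE')], demoSteps);
```

Keeping a step cap in the array alongside the semantic conditions gives a fallback if the model never emits the sentinel or calls the final tool.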

Per-Step Configuration with prepareStep

prepareStep runs before each step and lets you modify the model, available tools, or message history on a per-step basis:

const result = streamText({
  model: openai('gpt-4o'),
  messages: convertToModelMessages(messages),
  prepareStep: async ({ stepNumber, messages }) => {
    // Force the first step to call the weather tool immediately
    if (stepNumber === 0) {
      return {
        toolChoice: { type: 'tool', toolName: 'getWeather' },
        activeTools: ['getWeather'],
      };
    }
 
    // Trim history for long-running agents to control token costs
    if (messages.length > 20) {
      return { messages: messages.slice(-10) };
    }
 
    // Returning empty object uses default settings for this step
    return {};
  },
  tools: { /* ... */ },
});

A common pattern is to use prepareStep to swap in a cheaper model for intermediate tool-call steps and switch back to a more capable model only for the final synthesis step.
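The history-trimming branch in the prepareStep example uses a bare slice, which would also discard any system message at the start of the list. A slightly more careful variant might look like the sketch below. SketchMessage and trimHistory are hypothetical names, and the message shape is a simplification of ModelMessage.

```typescript
// Simplified message shape for illustration.
interface SketchMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Keep every system message plus the most recent `keepLast` non-system messages.
function trimHistory(messages: SketchMessage[], keepLast: number): SketchMessage[] {
  if (messages.length <= keepLast) {
    return messages; // nothing to trim
  }
  const systemMessages = messages.filter(message => message.role === 'system');
  const nonSystem = messages.filter(message => message.role !== 'system');
  // slice(-0) would return the whole array, so handle non-positive counts explicitly.
  const recent = keepLast > 0 ? nonSystem.slice(-keepLast) : [];
  return [...systemMessages, ...recent];
}

const longHistory: SketchMessage[] = [
  { role: 'system', content: 'sys' },
  { role: 'user', content: 'u1' },
  { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'u2' },
  { role: 'assistant', content: 'a2' },
];

const trimmed = trimHistory(longHistory, 2);
```

One caveat with any trimming approach: cutting between a tool call and its result can leave an orphaned tool message, so a production version would likely need to trim on turn boundaries.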

Step 6: Define a Type-Safe Custom UIMessage

AI SDK 5 lets you define the exact TypeScript shape of your messages using Zod and the InferUITools helper. Create lib/ai-types.ts:

import { InferUITools, ToolSet, UIMessage, tool } from 'ai';
import { z } from 'zod';
 
// 1. Metadata attached to every message (model info, latency, etc.)
const metadataSchema = z.object({
  model: z.string(),
  latencyMs: z.number().optional(),
  tokensUsed: z.number().optional(),
});
export type MessageMetadata = z.infer<typeof metadataSchema>;
 
// 2. Custom streaming data parts sent mid-response
const dataPartSchema = z.object({
  weatherCard: z.object({
    city: z.string(),
    temperature: z.number(),
    condition: z.string(),
  }),
});
export type MessageDataPart = z.infer<typeof dataPartSchema>;
 
// 3. Shared tool definitions; import these in your API route too
// `satisfies` checks the shape without widening it to ToolSet,
// so InferUITools below can see each tool's exact input and output types
export const appTools = {
  getWeather: tool({
    description: 'Get the current weather for a city',
    inputSchema: z.object({ city: z.string() }),
    execute: async ({ city }) => ({
      city,
      temperature: 22,
      condition: 'Sunny',
    }),
  }),
} satisfies ToolSet;
 
// 4. Infer the full tool result types automatically
type AppTools = InferUITools<typeof appTools>;
 
// 5. Your fully typed message — use this everywhere instead of UIMessage
export type AppUIMessage = UIMessage<MessageMetadata, MessageDataPart, AppTools>;

Substitute AppUIMessage for UIMessage throughout your codebase:

// API route — typed request body
const { messages }: { messages: AppUIMessage[] } = await req.json();
 
// Client component — typed hook
const { messages } = useChat<AppUIMessage>();
 
// Now TypeScript knows:
// message.metadata?.model        → string | undefined (metadata is optional)
// message.metadata?.latencyMs    → number | undefined
// part.type === 'tool-getWeather' and part.state === 'output-available' → part.output.city is a string

This eliminates an entire category of runtime errors where tool result shapes diverge between the server and the client rendering logic.
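The mechanism behind this kind of inference can be approximated with a mapped type and a template literal type. The sketch below is an illustration of the idea only: SketchTools, SketchToolPart, AppToolPart, and summarize are made-up names, the getTime tool is hypothetical, and the SDK's InferUITools works on real tool definitions rather than a hand-written shape map.

```typescript
// Hypothetical tool shapes written by hand; the SDK infers these from tool definitions.
type SketchTools = {
  getWeather: { input: { city: string }; output: { city: string; temperature: number } };
  getTime: { input: { zone: string }; output: { iso: string } };
};

// Build one part type per tool, keyed by the literal `tool-<name>`, then union them.
type SketchToolPart<T extends Record<string, { input: unknown; output: unknown }>> = {
  [K in keyof T & string]: {
    type: `tool-${K}`;
    input: T[K]['input'];
    output: T[K]['output'];
  };
}[keyof T & string];

type AppToolPart = SketchToolPart<SketchTools>;

// Checking `type` selects the matching output shape at compile time.
function summarize(part: AppToolPart): string {
  if (part.type === 'tool-getWeather') {
    return `${part.output.city}: ${part.output.temperature}`;
  }
  // Only the getTime part remains here, so `iso` is known to exist.
  return part.output.iso;
}

const weatherSummary = summarize({
  type: 'tool-getWeather',
  input: { city: 'Tunis' },
  output: { city: 'Tunis', temperature: 22 },
});
```

If a tool's output shape changes in one place, every consumer that narrows on the part type fails to compile until it is updated, which is the benefit the section describes.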

Step 7: Add Speech Generation

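AI SDK 5 provides a unified generateSpeech function for text-to-audio. In the 5.x releases it is exported under an experimental_ prefix, so import it with an alias. Add a dedicated route at app/api/speech/route.ts:

import { experimental_generateSpeech as generateSpeech } from 'ai';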
import { openai } from '@ai-sdk/openai';
 
export async function POST(req: Request) {
  const { text } = await req.json();
 
  const { audio } = await generateSpeech({
    model: openai.speech('tts-1-hd'),
    text,
    voice: 'nova',
  });
 
  return new Response(audio.uint8Array, {
    headers: {
      'Content-Type': 'audio/mpeg',
      'Cache-Control': 'no-store',
    },
  });
}

Call it from the client to speak any assistant message:

async function speakMessage(text: string) {
  const response = await fetch('/api/speech', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
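  if (!response.ok) throw new Error(`Speech request failed: ${response.status}`);
  const audioBlob = await response.blob();
  const audioUrl = URL.createObjectURL(audioBlob);
  const audio = new Audio(audioUrl);
  // Release the blob URL once playback ends to avoid leaking memory
  audio.onended = () => URL.revokeObjectURL(audioUrl);
  await audio.play();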
}
 
// In your chat UI:
{message.parts.map((part, i) => {
  if (part.type === 'text') {
    return (
      <div key={i} className="flex items-start gap-2">
        <p>{part.text}</p>
        <button
          onClick={() => speakMessage(part.text)}
          className="text-xs text-gray-400 hover:text-gray-600"
        >
          🔊
        </button>
      </div>
    );
  }
  return null;
})}

OpenAI TTS voices include alloy, echo, fable, onyx, nova, and shimmer. The tts-1-hd model offers higher audio quality at slightly higher latency compared to tts-1.

Testing Your Implementation

Start the development server:

npm run dev

Open http://localhost:3000/chat and send the message: "What is the weather in Tunis?"

You should observe:

The assistant's text streaming word by word
The getWeather tool call firing and its result appearing in the UI
The assistant's follow-up text summarizing the weather data

Open your browser's Network DevTools, select the /api/chat request, and open its EventStream tab. You will see the SSE events as human-readable text: each event is a standard SSE frame with a JSON payload, rather than the custom data stream format used by AI SDK 4.
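Because the stream is plain SSE, it can also be decoded in a few lines when debugging outside the browser. The sketch below parses the text of an SSE body into JSON payloads. parseSseData and isTextDelta are hypothetical helpers, and the sample payloads (a text-delta chunk with a delta field, and a [DONE] terminator) are an assumed shape for illustration; check the actual events in DevTools before relying on them.

```typescript
// Split an SSE body into events and JSON-parse each event's data lines.
function parseSseData(body: string): unknown[] {
  const payloads: unknown[] = [];
  // SSE events are separated by a blank line.
  for (const frame of body.split('\n\n')) {
    const data = frame
      .split('\n')
      .filter(line => line.startsWith('data:'))
      .map(line => line.slice('data:'.length).trim())
      .join('\n');
    // Skip empty frames and the assumed end-of-stream marker.
    if (data === '' || data === '[DONE]') continue;
    payloads.push(JSON.parse(data));
  }
  return payloads;
}

// Type guard for the assumed text-delta chunk shape.
function isTextDelta(value: unknown): value is { type: 'text-delta'; delta: string } {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { type?: unknown; delta?: unknown };
  return candidate.type === 'text-delta' && typeof candidate.delta === 'string';
}

// Illustrative sample body, not captured from a real response.
const sampleBody =
  'data: {"type":"text-delta","id":"t1","delta":"Hel"}\n\n' +
  'data: {"type":"text-delta","id":"t1","delta":"lo"}\n\n' +
  'data: [DONE]\n\n';

const streamedText = parseSseData(sampleBody)
  .filter(isTextDelta)
  .map(chunk => chunk.delta)
  .join('');
```

This sketch assumes the whole body is available as one string; a streaming client would need to buffer partial frames across network chunks.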

Troubleshooting

"messages is not iterable" — Your API route is receiving an empty body. Verify that the client sends Content-Type: application/json and that the body shape is { messages: [...] }.

Tool result missing from the UI — The part.type must match tool-<toolName> exactly. If your tool is named get_weather with an underscore, the part type is tool-get_weather.

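Stream ends after one step — Without stopWhen, streamText stops after a single step by default, so the model never gets to respond to a tool result; set stopWhen: stepCountIs(n) with n greater than 1. If stopWhen is set and no tool call appears, check that your system prompt encourages tool use and that the user's message is relevant to the tool's description.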

TypeScript errors on message parts — Import UIMessage and InferUITools from 'ai' (the core package), not from '@ai-sdk/react'. The core types must come from the root package.

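onFinish not firing — The callback runs when the stream finishes. If the client disconnects or the component unmounts early, the stream may be cancelled before it completes. Calling result.consumeStream() on the server (without awaiting it) before returning the response keeps the stream running to completion so that onFinish still fires.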
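For the "messages is not iterable" case above, a small guard at the top of the route can turn a confusing runtime error into a clear 400 response. The following is a minimal sketch: ChatBody and parseChatBody are hypothetical names, and the check only confirms that messages is an array, not that each entry is a valid UIMessage.

```typescript
// The minimal body shape the route expects.
interface ChatBody {
  messages: unknown[];
}

// Return the body if it has a `messages` array, otherwise null.
function parseChatBody(raw: unknown): ChatBody | null {
  if (typeof raw !== 'object' || raw === null) {
    return null;
  }
  const messages = (raw as { messages?: unknown }).messages;
  return Array.isArray(messages) ? { messages } : null;
}

const validBody = parseChatBody({ messages: [{ id: 'm1', role: 'user', parts: [] }] });
const invalidBody = parseChatBody({ prompt: 'hello' });
```

In the route, a null result could be answered with new Response('Invalid body', { status: 400 }) before any model call is made.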

Next Steps

Add conversation persistence: save UIMessage[] from onFinish to a Postgres or SQLite database using Drizzle ORM
Explore the lightweight Agent class (exported as Experimental_Agent in AI SDK 5), which bundles a model, tools, and loop settings into a reusable object
Add LLM observability with Langfuse to trace every tool call, step, and token count
Swap OpenAI for Anthropic Claude by changing @ai-sdk/openai to @ai-sdk/anthropic — the rest of the code is identical
Build a multi-provider setup with the Vercel AI Gateway for fallback and cost routing

Conclusion

Vercel AI SDK 5 makes building production AI applications in TypeScript significantly more robust. The UIMessage/ModelMessage separation eliminates a whole class of serialization bugs, SSE streaming makes debugging natural with standard browser tools, and the agentic loop primitives — stopWhen and prepareStep — give you the control you need without rolling your own loop logic. With 30 million+ weekly downloads and support for more than 25 providers, AI SDK 5 is the most practical foundation for full-stack AI development in the JavaScript ecosystem today.