Build an AI Chatbot with OpenAI Assistants API and Next.js

By Noqta Team

The OpenAI Assistants API is fundamentally different from the Chat Completions API. Instead of managing conversation history yourself, the Assistants API gives you persistent threads, built-in tools like file search and code interpreter, and run-based execution that handles everything from context windows to tool calls automatically.

In this tutorial, you will build a full-featured AI chatbot with Next.js that leverages these capabilities — from streaming responses to uploading documents your assistant can search through.

Why the Assistants API? The Chat Completions API requires you to manage conversation state, implement RAG pipelines, and build tool execution loops manually. The Assistants API handles all of this out of the box — threads persist automatically, files are indexed for search, and code runs in a sandboxed environment.

What You Will Learn

By the end of this tutorial, you will be able to:

  • Create and configure an OpenAI Assistant with custom instructions
  • Manage persistent conversation threads
  • Stream assistant responses in real time
  • Enable File Search so your assistant can answer questions from uploaded documents
  • Enable Code Interpreter so your assistant can write and execute Python code
  • Build a polished chat interface with Next.js and React
  • Handle errors and implement production best practices

Prerequisites

Before starting, ensure you have:

  • Node.js 20+ installed (node --version)
  • TypeScript knowledge (generics, async/await)
  • An OpenAI API key with Assistants API access (get one at platform.openai.com)
  • Next.js fundamentals (App Router, Server Actions)
  • A code editor — VS Code recommended

What You Will Build

A fully functional AI chatbot that supports:

  • Persistent conversations — users can return to previous chats
  • Streaming responses — tokens appear as the assistant generates them
  • Document Q&A — upload PDFs and ask questions about their contents
  • Code execution — the assistant can write and run Python code, returning results and charts
  • Multiple threads — manage separate conversation contexts

Step 1: Project Setup

Create a new Next.js project and install the required dependencies:

npx create-next-app@latest openai-chatbot --typescript --tailwind --app --src-dir
cd openai-chatbot

Install the OpenAI SDK:

npm install openai

Create your environment file:

# .env.local
OPENAI_API_KEY=sk-proj-your-api-key-here
# We'll fill this in Step 3
OPENAI_ASSISTANT_ID=

Add the environment variable types in src/env.d.ts:

declare namespace NodeJS {
  interface ProcessEnv {
    OPENAI_API_KEY: string;
    OPENAI_ASSISTANT_ID: string;
  }
}

Step 2: Understanding the Assistants API Architecture

Before writing code, it is important to understand the key concepts:

Assistant — A configured AI entity with specific instructions, a model, and enabled tools. Think of it as a persona that persists across conversations.

Thread — A conversation session. Threads store the full message history and are persistent — they survive server restarts and can be resumed at any time.

Message — A single user or assistant message within a thread. Messages can include text, images, and file attachments.

Run — An execution of the assistant on a thread. When you create a run, the assistant reads the thread, decides whether to call tools, and generates a response.

Run Step — A granular action within a run (tool call, message creation). Useful for debugging and showing progress.

The flow looks like this:

Create Assistant → Create Thread → Add Message → Create Run → Stream Response
                                        ↑                          |
                                        └──────────────────────────┘
                                           (conversation continues)

Step 3: Creating the Assistant

You can create an assistant via the API or the OpenAI dashboard. Let us do it programmatically so the configuration lives in code.

Create src/lib/openai.ts:

import OpenAI from "openai";
 
export const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

Create a setup script at scripts/create-assistant.ts:

import OpenAI from "openai";
 
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
 
async function createAssistant() {
  const assistant = await openai.beta.assistants.create({
    name: "Noqta Assistant",
    instructions: `You are a helpful, knowledgeable assistant. Follow these rules:
- Be concise but thorough
- Use code examples when explaining technical concepts
- When using file search results, cite the source document
- When asked to analyze data, use code interpreter to create visualizations
- Always respond in the same language the user writes in`,
    model: "gpt-4o",
    tools: [
      { type: "file_search" },
      { type: "code_interpreter" },
    ],
  });
 
  console.log("Assistant created:", assistant.id);
  console.log("Add this to your .env.local:");
  console.log(`OPENAI_ASSISTANT_ID=${assistant.id}`);
}
 
createAssistant();

Run it:

npx tsx scripts/create-assistant.ts

Copy the assistant ID into your .env.local file.

Tip: You can also create and manage assistants from the OpenAI Playground. The dashboard gives you a visual interface to tweak instructions and test tools interactively.


Step 4: Building the Thread Management API

Threads are the backbone of the Assistants API. Create the API routes to manage them.

Create src/app/api/threads/route.ts:

import { NextResponse } from "next/server";
import { openai } from "@/lib/openai";
 
export async function POST() {
  try {
    const thread = await openai.beta.threads.create();
 
    return NextResponse.json({ threadId: thread.id });
  } catch (error) {
    return NextResponse.json(
      { error: "Failed to create thread" },
      { status: 500 }
    );
  }
}

This endpoint creates a new conversation thread. Each thread gets a unique ID that you will store on the client to resume conversations.


Step 5: Implementing Streaming Responses

This is where the magic happens. The Assistants API supports streaming via Server-Sent Events, which gives your users a real-time typing effect.

Create src/app/api/chat/route.ts:

import { openai } from "@/lib/openai";
 
export async function POST(request: Request) {
  const { threadId, message } = await request.json();
 
  if (!threadId || !message) {
    return new Response("Missing threadId or message", { status: 400 });
  }
 
  await openai.beta.threads.messages.create(threadId, {
    role: "user",
    content: message,
  });
 
  const stream = openai.beta.threads.runs.stream(threadId, {
    assistant_id: process.env.OPENAI_ASSISTANT_ID,
  });
 
  const encoder = new TextEncoder();
 
  const readable = new ReadableStream({
    async start(controller) {
      try {
        for await (const event of stream) {
          if (event.event === "thread.message.delta") {
            const delta = event.data.delta;
            if (delta.content) {
              for (const block of delta.content) {
                if (block.type === "text" && block.text?.value) {
                  controller.enqueue(
                    encoder.encode(`data: ${JSON.stringify({
                      type: "text",
                      content: block.text.value,
                    })}\n\n`)
                  );
                }
              }
            }
          }
 
          if (event.event === "thread.run.completed") {
            controller.enqueue(
              encoder.encode(`data: ${JSON.stringify({ type: "done" })}\n\n`)
            );
          }
 
          if (event.event === "thread.run.failed") {
            controller.enqueue(
              encoder.encode(`data: ${JSON.stringify({
                type: "error",
                content: "Run failed",
              })}\n\n`)
            );
          }
        }
      } catch (err) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({
            type: "error",
            content: "Stream error",
          })}\n\n`)
        );
      } finally {
        controller.close();
      }
    },
  });
 
  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}

This route:

  1. Adds the user message to the thread
  2. Creates a streaming run against the assistant
  3. Pipes text deltas to the client as Server-Sent Events
  4. Signals completion or errors
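On the wire, each event is a `data: <json>` line followed by a blank line. A small pure helper illustrates how a client can split buffered chunks into events (the `ChatEvent` type and `parseSseChunk` name are ours; the hook in Step 8 inlines the same logic):

```typescript
type ChatEvent =
  | { type: "text"; content: string }
  | { type: "code"; content: string }
  | { type: "image"; fileId: string }
  | { type: "done" }
  | { type: "error"; content: string };

// Split a buffered SSE chunk into parsed events plus the unfinished remainder.
export function parseSseChunk(buffer: string): {
  events: ChatEvent[];
  rest: string;
} {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? ""; // the last piece may be incomplete
  const events: ChatEvent[] = [];

  for (const frame of frames) {
    if (!frame.startsWith("data: ")) continue;
    events.push(JSON.parse(frame.slice(6)) as ChatEvent);
  }

  return { events, rest };
}
```

Keeping the trailing partial frame in `rest` matters: network chunks can split a JSON payload anywhere, so only complete `\n\n`-terminated frames are safe to parse.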

Step 6: Enabling File Search (RAG)

File Search lets your assistant answer questions from uploaded documents. The API automatically chunks, embeds, and indexes your files — no external vector database needed.

Create src/app/api/files/route.ts:

import { NextResponse } from "next/server";
import { openai } from "@/lib/openai";
 
export async function POST(request: Request) {
  const formData = await request.formData();
  const file = formData.get("file") as File;
  const threadId = formData.get("threadId") as string;
 
  if (!file || !threadId) {
    return NextResponse.json(
      { error: "Missing file or threadId" },
      { status: 400 }
    );
  }
 
  const uploadedFile = await openai.files.create({
    file,
    purpose: "assistants",
  });
 
  await openai.beta.threads.messages.create(threadId, {
    role: "user",
    content: "I've uploaded a document for reference.",
    attachments: [
      {
        file_id: uploadedFile.id,
        tools: [{ type: "file_search" }],
      },
    ],
  });
 
  return NextResponse.json({
    fileId: uploadedFile.id,
    fileName: file.name,
  });
}

When a user uploads a file:

  1. The file is uploaded to OpenAI's storage
  2. It is attached to a message in the thread with the file_search tool
  3. The assistant can now search through the document's contents when answering questions

Supported file types: PDF, DOCX, TXT, MD, JSON, CSV, HTML, and more. Each file can be up to 512 MB. The API automatically handles chunking and embedding — you do not need to manage a vector store manually for thread-level attachments.


Step 7: Enabling Code Interpreter

Code Interpreter lets your assistant write and execute Python code in a sandboxed environment. This is powerful for data analysis, math, chart generation, and file processing.

The code interpreter is already enabled in the assistant configuration from Step 3. Now let us handle the output in the streaming route.

Update the streaming handler in src/app/api/chat/route.ts to handle code interpreter events:

for await (const event of stream) {
  if (event.event === "thread.message.delta") {
    const delta = event.data.delta;
    if (delta.content) {
      for (const block of delta.content) {
        if (block.type === "text" && block.text?.value) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({
              type: "text",
              content: block.text.value,
            })}\n\n`)
          );
        }
 
        if (block.type === "image_file" && block.image_file) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({
              type: "image",
              fileId: block.image_file.file_id,
            })}\n\n`)
          );
        }
      }
    }
  }
 
  if (event.event === "thread.run.step.delta") {
    const stepDelta = event.data.delta;
    if (stepDelta.step_details?.type === "tool_calls") {
      for (const toolCall of stepDelta.step_details.tool_calls ?? []) {
        if (
          toolCall.type === "code_interpreter" &&
          toolCall.code_interpreter?.input
        ) {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({
              type: "code",
              content: toolCall.code_interpreter.input,
            })}\n\n`)
          );
        }
      }
    }
  }
 
  if (event.event === "thread.run.completed") {
    controller.enqueue(
      encoder.encode(`data: ${JSON.stringify({ type: "done" })}\n\n`)
    );
  }
 
  if (event.event === "thread.run.failed") {
    controller.enqueue(
      encoder.encode(`data: ${JSON.stringify({
        type: "error",
        content: "Run failed",
      })}\n\n`)
    );
  }
}

Now the stream emits three types of content:

  • text — regular assistant messages
  • code — Python code being executed by code interpreter
  • image — generated charts or visualizations (returned as file IDs)

To display images generated by code interpreter, add an endpoint to retrieve files:

// src/app/api/files/[fileId]/route.ts
import { NextResponse } from "next/server";
import { openai } from "@/lib/openai";
 
export async function GET(
  _request: Request,
  { params }: { params: Promise<{ fileId: string }> }
) {
  const { fileId } = await params;
 
  const response = await openai.files.content(fileId);
  const buffer = Buffer.from(await response.arrayBuffer());
 
  return new NextResponse(buffer, {
    headers: {
      "Content-Type": "image/png",
      "Cache-Control": "public, max-age=3600",
    },
  });
}

Step 8: Building the Chat UI

Now let us build the frontend, starting with a custom hook that manages the streaming connection and thread state.

Create src/hooks/use-assistant-chat.ts:

"use client";
 
import { useState, useCallback, useRef } from "react";
 
interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
  codeBlocks?: string[];
  images?: string[];
}
 
export function useAssistantChat() {
  const [messages, setMessages] = useState<ChatMessage[]>([]);
  const [isLoading, setIsLoading] = useState(false);
  const [threadId, setThreadId] = useState<string | null>(null);
  const abortRef = useRef<AbortController | null>(null);
 
  const initThread = useCallback(async () => {
    const res = await fetch("/api/threads", { method: "POST" });
    const { threadId: id } = await res.json();
    setThreadId(id);
    return id;
  }, []);
 
  const sendMessage = useCallback(
    async (content: string) => {
      const currentThreadId = threadId ?? (await initThread());
      setIsLoading(true);
 
      const userMessage: ChatMessage = {
        id: crypto.randomUUID(),
        role: "user",
        content,
      };
 
      const assistantMessage: ChatMessage = {
        id: crypto.randomUUID(),
        role: "assistant",
        content: "",
        codeBlocks: [],
        images: [],
      };
 
      setMessages((prev) => [...prev, userMessage, assistantMessage]);
 
      abortRef.current = new AbortController();
 
      try {
        const res = await fetch("/api/chat", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({
            threadId: currentThreadId,
            message: content,
          }),
          signal: abortRef.current.signal,
        });
 
        const reader = res.body?.getReader();
        const decoder = new TextDecoder();
 
        if (!reader) throw new Error("No reader available");
 
        let buffer = "";
 
        while (true) {
          const { done, value } = await reader.read();
          if (done) break;
 
          buffer += decoder.decode(value, { stream: true });
          const lines = buffer.split("\n\n");
          buffer = lines.pop() ?? "";
 
          for (const line of lines) {
            if (!line.startsWith("data: ")) continue;
 
            const data = JSON.parse(line.slice(6));
 
            if (data.type === "text") {
              setMessages((prev) => {
                const updated = [...prev];
                const last = updated[updated.length - 1];
                // Copy the last message instead of mutating it in place,
                // so React state updates stay predictable.
                updated[updated.length - 1] = {
                  ...last,
                  content: last.content + data.content,
                };
                return updated;
              });
            }

            if (data.type === "code") {
              setMessages((prev) => {
                const updated = [...prev];
                const last = updated[updated.length - 1];
                updated[updated.length - 1] = {
                  ...last,
                  codeBlocks: [...(last.codeBlocks ?? []), data.content],
                };
                return updated;
              });
            }

            if (data.type === "image") {
              setMessages((prev) => {
                const updated = [...prev];
                const last = updated[updated.length - 1];
                updated[updated.length - 1] = {
                  ...last,
                  images: [...(last.images ?? []), `/api/files/${data.fileId}`],
                };
                return updated;
              });
            }

            if (data.type === "done" || data.type === "error") {
              setIsLoading(false);
            }
          }
        }
      } catch (err) {
        // Aborts are user-initiated; log anything else.
        if ((err as Error).name !== "AbortError") console.error(err);
      } finally {
        setIsLoading(false);
      }
    },
    [threadId, initThread]
  );
 
  const stopGeneration = useCallback(() => {
    abortRef.current?.abort();
    setIsLoading(false);
  }, []);
 
  const resetChat = useCallback(() => {
    setMessages([]);
    setThreadId(null);
    setIsLoading(false);
  }, []);
 
  return {
    messages,
    isLoading,
    threadId,
    sendMessage,
    stopGeneration,
    resetChat,
  };
}

Now build the chat component. Create src/components/chat.tsx:

"use client";
 
import { useState, useRef, useEffect } from "react";
import { useAssistantChat } from "@/hooks/use-assistant-chat";
 
export function Chat() {
  const { messages, isLoading, sendMessage, stopGeneration, resetChat } =
    useAssistantChat();
  const [input, setInput] = useState("");
  const messagesEndRef = useRef<HTMLDivElement>(null);
 
  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);
 
  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim() || isLoading) return;
    sendMessage(input.trim());
    setInput("");
  };
 
  return (
    <div className="flex flex-col h-screen max-w-3xl mx-auto">
      <header className="flex items-center justify-between p-4 border-b">
        <h1 className="text-lg font-semibold">AI Assistant</h1>
        <button
          onClick={resetChat}
          className="text-sm text-gray-500 hover:text-gray-700"
        >
          New Chat
        </button>
      </header>
 
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.length === 0 && (
          <div className="text-center text-gray-400 mt-20">
            <p className="text-xl mb-2">Start a conversation</p>
            <p className="text-sm">
              Ask questions, upload files, or request data analysis.
            </p>
          </div>
        )}
 
        {messages.map((msg) => (
          <div
            key={msg.id}
            className={`flex ${
              msg.role === "user" ? "justify-end" : "justify-start"
            }`}
          >
            <div
              className={`max-w-[80%] rounded-2xl px-4 py-3 ${
                msg.role === "user"
                  ? "bg-blue-600 text-white"
                  : "bg-gray-100 text-gray-900"
              }`}
            >
              <p className="whitespace-pre-wrap">{msg.content}</p>
 
              {msg.codeBlocks?.map((code, i) => (
                <pre
                  key={i}
                  className="mt-2 p-3 bg-gray-900 text-green-400 rounded-lg text-sm overflow-x-auto"
                >
                  <code>{code}</code>
                </pre>
              ))}
 
              {msg.images?.map((src, i) => (
                <img
                  key={i}
                  src={src}
                  alt="Generated chart"
                  className="mt-2 rounded-lg max-w-full"
                />
              ))}
            </div>
          </div>
        ))}
 
        {isLoading && messages[messages.length - 1]?.content === "" && (
          <div className="flex justify-start">
            <div className="bg-gray-100 rounded-2xl px-4 py-3">
              <div className="flex space-x-1">
                <span className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                <span className="w-2 h-2 bg-gray-400 rounded-full animate-bounce [animation-delay:0.1s]" />
                <span className="w-2 h-2 bg-gray-400 rounded-full animate-bounce [animation-delay:0.2s]" />
              </div>
            </div>
          </div>
        )}
 
        <div ref={messagesEndRef} />
      </div>
 
      <form onSubmit={handleSubmit} className="p-4 border-t">
        <div className="flex gap-2">
          <input
            type="text"
            value={input}
            onChange={(e) => setInput(e.target.value)}
            placeholder="Type your message..."
            className="flex-1 rounded-xl border border-gray-300 px-4 py-3 focus:outline-none focus:ring-2 focus:ring-blue-500"
            disabled={isLoading}
          />
          {isLoading ? (
            <button
              type="button"
              onClick={stopGeneration}
              className="px-6 py-3 rounded-xl bg-red-500 text-white font-medium hover:bg-red-600 transition-colors"
            >
              Stop
            </button>
          ) : (
            <button
              type="submit"
              disabled={!input.trim()}
              className="px-6 py-3 rounded-xl bg-blue-600 text-white font-medium hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
            >
              Send
            </button>
          )}
        </div>
      </form>
    </div>
  );
}

Finally, add the chat to your page. Update src/app/page.tsx:

import { Chat } from "@/components/chat";
 
export default function Home() {
  return <Chat />;
}

Step 9: Adding File Upload to the UI

Let us add a file upload button to the chat interface so users can attach documents.

Create src/components/file-upload.tsx:

"use client";
 
import { useRef, useState } from "react";
 
interface FileUploadProps {
  threadId: string | null;
  onUploadComplete: (fileName: string) => void;
  disabled?: boolean;
}
 
export function FileUpload({
  threadId,
  onUploadComplete,
  disabled,
}: FileUploadProps) {
  const fileRef = useRef<HTMLInputElement>(null);
  const [uploading, setUploading] = useState(false);
 
  const handleUpload = async (e: React.ChangeEvent<HTMLInputElement>) => {
    const file = e.target.files?.[0];
    if (!file || !threadId) return;
 
    setUploading(true);
 
    const formData = new FormData();
    formData.append("file", file);
    formData.append("threadId", threadId);
 
    try {
      const res = await fetch("/api/files", {
        method: "POST",
        body: formData,
      });
 
      if (res.ok) {
        const { fileName } = await res.json();
        onUploadComplete(fileName);
      }
    } finally {
      setUploading(false);
      if (fileRef.current) fileRef.current.value = "";
    }
  };
 
  return (
    <>
      <input
        ref={fileRef}
        type="file"
        onChange={handleUpload}
        accept=".pdf,.docx,.txt,.md,.csv,.json"
        className="hidden"
      />
      <button
        type="button"
        onClick={() => fileRef.current?.click()}
        disabled={disabled || uploading || !threadId}
        className="p-3 rounded-xl border border-gray-300 hover:bg-gray-50 disabled:opacity-50 transition-colors"
        title="Upload a file"
      >
        {uploading ? (
          <svg
            className="w-5 h-5 animate-spin text-gray-500"
            viewBox="0 0 24 24"
            fill="none"
          >
            <circle
              className="opacity-25"
              cx="12"
              cy="12"
              r="10"
              stroke="currentColor"
              strokeWidth="4"
            />
            <path
              className="opacity-75"
              fill="currentColor"
              d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z"
            />
          </svg>
        ) : (
          <svg
            className="w-5 h-5 text-gray-500"
            fill="none"
            viewBox="0 0 24 24"
            strokeWidth={1.5}
            stroke="currentColor"
          >
            <path
              strokeLinecap="round"
              strokeLinejoin="round"
              d="M18.375 12.739l-7.693 7.693a4.5 4.5 0 01-6.364-6.364l10.94-10.94A3 3 0 1119.5 7.372L8.552 18.32m.009-.01l-.01.01m5.699-9.941l-7.81 7.81a1.5 1.5 0 002.112 2.13"
            />
          </svg>
        )}
      </button>
    </>
  );
}

Integrate it into the chat form by adding the FileUpload component next to the input field in chat.tsx:

<form onSubmit={handleSubmit} className="p-4 border-t">
  <div className="flex gap-2">
    <FileUpload
      threadId={threadId}
      onUploadComplete={(name) =>
        sendMessage(`I uploaded a file: ${name}. Please review it.`)
      }
      disabled={isLoading}
    />
    <input
      type="text"
      value={input}
      onChange={(e) => setInput(e.target.value)}
      placeholder="Type your message..."
      className="flex-1 rounded-xl border border-gray-300 px-4 py-3 focus:outline-none focus:ring-2 focus:ring-blue-500"
      disabled={isLoading}
    />
    {/* Send/Stop button */}
  </div>
</form>

Step 10: Thread Persistence with Local Storage

To let users resume conversations across page reloads, persist thread IDs in local storage.

Update src/hooks/use-assistant-chat.ts to add persistence:

const STORAGE_KEY = "openai-chat-threads";
 
interface ThreadInfo {
  id: string;
  title: string;
  createdAt: string;
}
 
function getStoredThreads(): ThreadInfo[] {
  if (typeof window === "undefined") return [];
  const stored = localStorage.getItem(STORAGE_KEY);
  return stored ? JSON.parse(stored) : [];
}
 
function storeThread(thread: ThreadInfo) {
  const threads = getStoredThreads();
  threads.unshift(thread);
  localStorage.setItem(STORAGE_KEY, JSON.stringify(threads.slice(0, 50)));
}
 
export function useAssistantChat() {
  // ... existing state ...
 
  const initThread = useCallback(async () => {
    const res = await fetch("/api/threads", { method: "POST" });
    const { threadId: id } = await res.json();
    setThreadId(id);
 
    storeThread({
      id,
      title: "New conversation",
      createdAt: new Date().toISOString(),
    });
 
    return id;
  }, []);
 
  const loadThread = useCallback(async (id: string) => {
    setThreadId(id);
    setMessages([]);
 
    const res = await fetch(`/api/threads/${id}/messages`);
    const { messages: history } = await res.json();
 
    setMessages(
      history.map(
        (msg: {
          id: string;
          role: "user" | "assistant";
          content: { type: string; text?: { value: string } }[];
        }) => ({
          id: msg.id,
          role: msg.role,
          content: msg.content
            .filter((c) => c.type === "text")
            .map((c) => c.text?.value ?? "")
            .join(""),
        })
      )
    );
  }, []);
 
  return {
    messages,
    isLoading,
    threadId,
    sendMessage,
    stopGeneration,
    resetChat,
    loadThread,
    storedThreads: getStoredThreads(),
  };
}

Add the thread history API route. Create src/app/api/threads/[threadId]/messages/route.ts:

import { NextResponse } from "next/server";
import { openai } from "@/lib/openai";
 
export async function GET(
  _request: Request,
  { params }: { params: Promise<{ threadId: string }> }
) {
  const { threadId } = await params;
 
  try {
    const messages = await openai.beta.threads.messages.list(threadId, {
      order: "asc",
    });
 
    return NextResponse.json({ messages: messages.data });
  } catch {
    return NextResponse.json(
      { error: "Thread not found" },
      { status: 404 }
    );
  }
}

Step 11: Production Considerations

Before deploying, address these important concerns:

Rate Limiting

Protect your API routes from abuse:

// src/lib/rate-limit.ts
const rateLimitMap = new Map<string, { count: number; resetTime: number }>();
 
export function rateLimit(
  identifier: string,
  maxRequests = 20,
  windowMs = 60_000
): boolean {
  const now = Date.now();
  const entry = rateLimitMap.get(identifier);
 
  if (!entry || now > entry.resetTime) {
    rateLimitMap.set(identifier, { count: 1, resetTime: now + windowMs });
    return true;
  }
 
  if (entry.count >= maxRequests) {
    return false;
  }
 
  entry.count++;
  return true;
}

Apply it in your chat route:

import { rateLimit } from "@/lib/rate-limit";
import { headers } from "next/headers";
 
export async function POST(request: Request) {
  const headersList = await headers();
  const ip = headersList.get("x-forwarded-for") ?? "unknown";
 
  if (!rateLimit(ip)) {
    return new Response("Rate limit exceeded", { status: 429 });
  }
 
  // ... rest of the handler
}

Error Handling

Handle common Assistants API errors gracefully:

import OpenAI from "openai"; // for the APIError type check below

try {
  const stream = openai.beta.threads.runs.stream(threadId, {
    assistant_id: process.env.OPENAI_ASSISTANT_ID,
  });
  // ... process stream
} catch (error) {
  if (error instanceof OpenAI.APIError) {
    if (error.status === 429) {
      return new Response("Too many requests to OpenAI", { status: 429 });
    }
    if (error.status === 400) {
      return new Response("Invalid thread or assistant", { status: 400 });
    }
  }
  return new Response("Internal server error", { status: 500 });
}

Cost Management

The Assistants API charges for:

  • Token usage — input and output tokens at the model's rate
  • File storage — $0.20 per GB per day
  • Code interpreter sessions — $0.03 per session

To control costs:

  • Set max_prompt_tokens and max_completion_tokens on runs
  • Clean up old threads and files periodically
  • Monitor usage via the OpenAI dashboard

For example, capping token usage on a run:

const stream = openai.beta.threads.runs.stream(threadId, {
  assistant_id: process.env.OPENAI_ASSISTANT_ID,
  max_prompt_tokens: 50000,
  max_completion_tokens: 4096,
});

Testing Your Implementation

Start the development server:

npm run dev

Test these scenarios:

  1. Basic conversation — Send a message and verify streaming works
  2. Follow-up questions — Confirm the assistant remembers context from earlier in the thread
  3. File upload — Upload a PDF and ask questions about its contents
  4. Code interpreter — Ask "Create a bar chart showing the population of the 5 largest countries"
  5. Error recovery — Disconnect your network briefly and verify the UI handles it
  6. New chat — Click "New Chat" and verify a fresh thread is created

Troubleshooting

"Assistant not found" error Verify your OPENAI_ASSISTANT_ID in .env.local matches an assistant in your OpenAI account. Assistants are scoped to the API key's organization.

Streaming stops mid-response This usually means the run hit a token limit or timed out. Increase max_prompt_tokens or check for long tool execution times.

File search returns no results Ensure the file was uploaded with purpose: "assistants". Files uploaded with other purposes are not indexed for search. Also check that the file format is supported.

Code interpreter times out Complex computations may exceed the execution timeout. Break down complex requests into smaller steps or ask the assistant to simplify its approach.

"Run already active" error Only one run can be active per thread at a time. Wait for the current run to complete before starting another, or cancel the active run with openai.beta.threads.runs.cancel(threadId, runId).


Next Steps

Now that you have a working AI chatbot, consider these enhancements:

  • Authentication — Add user auth with NextAuth.js or Clerk to scope threads per user
  • Vector stores — Create shared vector stores for organization-wide knowledge bases
  • Function calling — Add custom functions so the assistant can interact with your own APIs
  • Markdown rendering — Use react-markdown with syntax highlighting for formatted responses
  • Mobile responsiveness — Optimize the chat layout for smaller screens
  • Analytics — Track token usage, response times, and user satisfaction

Conclusion

You have built a complete AI chatbot using the OpenAI Assistants API and Next.js. The Assistants API eliminates much of the boilerplate that comes with building AI applications — thread management, document indexing, and code execution are all handled by the platform.

The key advantage of this approach is persistence. Unlike stateless chat completions, your threads preserve full conversation history, attached files remain indexed, and users can pick up where they left off. This makes the Assistants API particularly well-suited for customer support, internal knowledge bases, and data analysis tools.

The code from this tutorial is a solid foundation. Extend it with authentication, custom tools, and a polished UI to create an AI assistant tailored to your specific use case.


Want to read more tutorials? Check out our latest tutorial on OpenClaw Telegram Multi-Agent Setup (2026): 3 Bots, 1 Workflow — Full Step-by-Step.
