Building a RAG Chatbot with Supabase pgvector and Next.js

Large Language Models are impressive, but they have a critical limitation: they only know what they were trained on. What if you want an AI assistant that understands your documentation, your products, or your company knowledge?
That's where RAG (Retrieval-Augmented Generation) comes in. In this tutorial, you'll build a chatbot that can answer questions using your own data by combining Supabase's pgvector extension with OpenAI's APIs.
What You'll Build
By the end of this tutorial, you'll have:
- A Next.js application with a chat interface
- A Supabase database storing your documents as vector embeddings
- Semantic search that finds relevant content based on meaning
- A RAG-powered chatbot that answers questions using your data
Prerequisites
Before starting, ensure you have:
- Node.js 18+ installed
- A Supabase account (free tier works)
- An OpenAI API key with access to embeddings and chat models
- Basic knowledge of React and TypeScript
- Familiarity with Next.js App Router
Understanding the Architecture
Before diving into code, let's understand how RAG works:
User Question → Embed Question → Search Similar Docs → Augment Prompt → LLM Response
"What is X?"      [0.1, 0.2...]      Find top 5 docs      Add context      Answer!
1. The user asks a question in natural language
2. The question is embedded into a vector (an array of numbers)
3. Semantic search finds documents with similar vectors
4. The prompt is augmented by adding the retrieved documents as context
5. The LLM generates an answer based on that context
The magic is in step 3: instead of keyword matching, we're finding documents that are semantically similar—even if they don't share the same words.
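Under the hood, that similarity is usually measured with cosine similarity, which compares the direction of two vectors. A minimal TypeScript sketch using toy 3-dimensional vectors (real embeddings from text-embedding-3-small have 1536 dimensions):

```typescript
// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function magnitude(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (magnitude(a) * magnitude(b));
}

// Toy "embeddings" for illustration only.
const cat = [0.9, 0.1, 0.0];
const kitten = [0.85, 0.15, 0.05];
const invoice = [0.0, 0.2, 0.95];

console.log(cosineSimilarity(cat, kitten).toFixed(3)); // close to 1
console.log(cosineSimilarity(cat, invoice).toFixed(3)); // much lower
```

Two texts about related concepts end up with vectors pointing in similar directions, which is exactly what the database-side search in step 3 exploits.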
Step 1: Project Setup
Create a new Next.js project with TypeScript:
npx create-next-app@latest rag-chatbot --typescript --tailwind --app --src-dir
cd rag-chatbot
Install the required dependencies:
npm install @supabase/supabase-js openai ai
Create a .env.local file with your credentials:
NEXT_PUBLIC_SUPABASE_URL=your-supabase-url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
OPENAI_API_KEY=your-openai-api-key
Step 2: Configure Supabase with pgvector
Head to your Supabase dashboard and open the SQL Editor. Run the following to enable the pgvector extension:
-- Enable the pgvector extension
create extension if not exists vector with schema extensions;
Now create a table to store your documents with their embeddings:
-- Create the documents table
create table documents (
id bigint primary key generated always as identity,
content text not null,
metadata jsonb,
embedding extensions.vector(1536) -- OpenAI text-embedding-3-small dimension
);
-- Enable Row Level Security
alter table documents enable row level security;
-- Create policy for reading (adjust as needed)
create policy "Allow public read access"
on documents for select
using (true);
-- Create an index for faster similarity search
create index on documents
using hnsw (embedding vector_cosine_ops);
The vector(1536) matches OpenAI's text-embedding-3-small model output dimensions. The HNSW index enables fast approximate nearest neighbor search.
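One practical gotcha: if you later switch embedding models, the vector length may no longer match the column's declared dimension, and inserts will fail. A small hypothetical guard you could call before inserting (the constant mirrors the vector(1536) schema above; adjust it if you change models):

```typescript
// Guard against dimension mismatches before writing to the
// vector(1536) column defined in the schema above.
const EMBEDDING_DIMENSION = 1536; // must match the table definition

function assertEmbeddingDimension(embedding: number[]): number[] {
  if (embedding.length !== EMBEDDING_DIMENSION) {
    throw new Error(
      `Expected ${EMBEDDING_DIMENSION} dimensions, got ${embedding.length}`
    );
  }
  return embedding;
}
```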
Step 3: Create the Embedding Function
Create a new file src/lib/embeddings.ts:
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
export async function generateEmbedding(text: string): Promise<number[]> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: text,
});
return response.data[0].embedding;
}
export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: texts,
});
return response.data.map((item) => item.embedding);
}
Step 4: Build the Document Ingestion API
Create src/app/api/ingest/route.ts to add documents to your knowledge base:
import { createClient } from '@supabase/supabase-js';
import { generateEmbeddings } from '@/lib/embeddings';
import { NextResponse } from 'next/server';
const supabase = createClient(
process.env.NEXT_PUBLIC_SUPABASE_URL!,
process.env.SUPABASE_SERVICE_ROLE_KEY!
);
interface Document {
content: string;
metadata?: Record<string, unknown>;
}
export async function POST(request: Request) {
try {
const { documents }: { documents: Document[] } = await request.json();
if (!documents || documents.length === 0) {
return NextResponse.json(
{ error: 'No documents provided' },
{ status: 400 }
);
}
// Generate embeddings for all documents
const contents = documents.map((doc) => doc.content);
const embeddings = await generateEmbeddings(contents);
// Prepare data for insertion
const rows = documents.map((doc, index) => ({
content: doc.content,
metadata: doc.metadata || {},
embedding: embeddings[index],
}));
// Insert into Supabase
const { data, error } = await supabase
.from('documents')
.insert(rows)
.select('id');
if (error) {
throw error;
}
return NextResponse.json({
success: true,
inserted: data.length,
});
} catch (error) {
console.error('Ingestion error:', error);
return NextResponse.json(
{ error: 'Failed to ingest documents' },
{ status: 500 }
);
}
}
Step 5: Create the Semantic Search Function
Create src/lib/search.ts for finding relevant documents:
import { createClient } from '@supabase/supabase-js';
import { generateEmbedding } from './embeddings';
const supabase = createClient(
process.env.NEXT_PUBLIC_SUPABASE_URL!,
process.env.SUPABASE_SERVICE_ROLE_KEY!
);
export interface SearchResult {
id: number;
content: string;
metadata: Record<string, unknown>;
similarity: number;
}
export async function semanticSearch(
query: string,
topK: number = 5,
threshold: number = 0.5
): Promise<SearchResult[]> {
// Generate embedding for the query
const queryEmbedding = await generateEmbedding(query);
// Call the similarity search RPC function
const { data, error } = await supabase.rpc('match_documents', {
query_embedding: queryEmbedding,
match_threshold: threshold,
match_count: topK,
});
if (error) {
throw error;
}
return data;
}
Now add the matching function in the Supabase SQL Editor:
-- Create the similarity search function
create or replace function match_documents(
query_embedding vector(1536),
match_threshold float,
match_count int
)
returns table (
id bigint,
content text,
metadata jsonb,
similarity float
)
language sql stable
as $$
select
documents.id,
documents.content,
documents.metadata,
1 - (documents.embedding <=> query_embedding) as similarity
from documents
where 1 - (documents.embedding <=> query_embedding) > match_threshold
order by documents.embedding <=> query_embedding
limit match_count;
$$;
The <=> operator computes cosine distance. We convert it to similarity by subtracting from 1.
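To make the SQL concrete, here is the same retrieval logic restated in plain TypeScript: convert each cosine distance to a similarity, keep rows above the threshold, sort best-first, and take the top match_count. This is purely illustrative; in production the database does this work, and the HNSW index avoids scanning every row.

```typescript
// What match_documents does, expressed in TypeScript for clarity.
interface ScoredDoc {
  id: number;
  content: string;
  distance: number; // cosine distance, as computed by the <=> operator
}

function matchDocuments(
  docs: ScoredDoc[],
  matchThreshold: number,
  matchCount: number
) {
  return docs
    .map((doc) => ({ ...doc, similarity: 1 - doc.distance })) // distance to similarity
    .filter((doc) => doc.similarity > matchThreshold) // where similarity > match_threshold
    .sort((a, b) => b.similarity - a.similarity) // order by distance ascending
    .slice(0, matchCount); // limit match_count
}
```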
Step 6: Build the RAG Chat API
Create src/app/api/chat/route.ts:
import OpenAI from 'openai';
import { semanticSearch } from '@/lib/search';
import { NextResponse } from 'next/server';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
export async function POST(request: Request) {
try {
const { message, conversationHistory = [] } = await request.json();
if (!message) {
return NextResponse.json(
{ error: 'No message provided' },
{ status: 400 }
);
}
// Step 1: Search for relevant documents
const relevantDocs = await semanticSearch(message, 5, 0.5);
// Step 2: Build context from retrieved documents
const context = relevantDocs
.map((doc, i) => `[Document ${i + 1}]\n${doc.content}`)
.join('\n\n');
// Step 3: Create the augmented prompt
const systemPrompt = `You are a helpful assistant that answers questions based on the provided context.
CONTEXT:
${context || 'No relevant documents found.'}
INSTRUCTIONS:
- Answer the user's question based on the context above
- If the context doesn't contain relevant information, say so
- Be concise but thorough
- Cite which document(s) you're referencing when applicable`;
// Step 4: Generate response with GPT-4
const completion = await openai.chat.completions.create({
model: 'gpt-4-turbo-preview',
messages: [
{ role: 'system', content: systemPrompt },
...conversationHistory,
{ role: 'user', content: message },
],
temperature: 0.7,
max_tokens: 1000,
});
const assistantMessage = completion.choices[0].message.content;
return NextResponse.json({
response: assistantMessage,
sources: relevantDocs.map((doc) => ({
id: doc.id,
preview: doc.content.slice(0, 200) + '...',
similarity: doc.similarity,
})),
});
} catch (error) {
console.error('Chat error:', error);
return NextResponse.json(
{ error: 'Failed to process chat' },
{ status: 500 }
);
}
}
Step 7: Create the Chat Interface
Create src/app/page.tsx:
'use client';
import { useState, useRef, useEffect } from 'react';
interface Message {
role: 'user' | 'assistant';
content: string;
sources?: Array<{
id: number;
preview: string;
similarity: number;
}>;
}
export default function ChatPage() {
const [messages, setMessages] = useState<Message[]>([]);
const [input, setInput] = useState('');
const [isLoading, setIsLoading] = useState(false);
const messagesEndRef = useRef<HTMLDivElement>(null);
const scrollToBottom = () => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
};
useEffect(() => {
scrollToBottom();
}, [messages]);
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
if (!input.trim() || isLoading) return;
const userMessage = input.trim();
setInput('');
setMessages((prev) => [...prev, { role: 'user', content: userMessage }]);
setIsLoading(true);
try {
const response = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: userMessage,
conversationHistory: messages.map((m) => ({
role: m.role,
content: m.content,
})),
}),
});
const data = await response.json();
if (data.error) {
throw new Error(data.error);
}
setMessages((prev) => [
...prev,
{
role: 'assistant',
content: data.response,
sources: data.sources,
},
]);
} catch (error) {
console.error('Error:', error);
setMessages((prev) => [
...prev,
{
role: 'assistant',
content: 'Sorry, something went wrong. Please try again.',
},
]);
} finally {
setIsLoading(false);
}
};
return (
<main className="flex min-h-screen flex-col bg-gray-50">
<header className="bg-white border-b p-4">
<h1 className="text-xl font-semibold text-center">
RAG Chatbot
</h1>
</header>
<div className="flex-1 overflow-y-auto p-4 space-y-4">
{messages.length === 0 && (
<div className="text-center text-gray-500 mt-8">
<p>Ask me anything about your documents!</p>
</div>
)}
{messages.map((message, index) => (
<div
key={index}
className={`flex ${
message.role === 'user' ? 'justify-end' : 'justify-start'
}`}
>
<div
className={`max-w-[80%] rounded-lg p-4 ${
message.role === 'user'
? 'bg-blue-600 text-white'
: 'bg-white border shadow-sm'
}`}
>
<p className="whitespace-pre-wrap">{message.content}</p>
{message.sources && message.sources.length > 0 && (
<div className="mt-3 pt-3 border-t border-gray-200">
<p className="text-xs text-gray-500 mb-2">Sources:</p>
{message.sources.map((source, i) => (
<div
key={source.id}
className="text-xs bg-gray-50 p-2 rounded mb-1"
>
<span className="font-medium">
[{i + 1}] {(source.similarity * 100).toFixed(0)}% match
</span>
<p className="text-gray-600 truncate">{source.preview}</p>
</div>
))}
</div>
)}
</div>
</div>
))}
{isLoading && (
<div className="flex justify-start">
<div className="bg-white border rounded-lg p-4 shadow-sm">
<div className="flex space-x-2">
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-100" />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-200" />
</div>
</div>
</div>
)}
<div ref={messagesEndRef} />
</div>
<form onSubmit={handleSubmit} className="p-4 bg-white border-t">
<div className="flex gap-2">
<input
type="text"
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Ask a question..."
className="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
disabled={isLoading}
/>
<button
type="submit"
disabled={isLoading || !input.trim()}
className="bg-blue-600 text-white px-6 py-2 rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed"
>
Send
</button>
</div>
</form>
</main>
);
}
Step 8: Add Sample Documents
Create a script scripts/seed.ts to add sample documents:
const documents = [
{
content: `RAG (Retrieval-Augmented Generation) is a technique that combines
information retrieval with text generation. It first retrieves relevant
documents from a knowledge base, then uses them as context for an LLM
to generate accurate, grounded responses.`,
metadata: { topic: 'RAG', type: 'definition' },
},
{
content: `Vector embeddings are numerical representations of text that
capture semantic meaning. Similar texts have similar embeddings, enabling
semantic search even when exact keywords don't match.`,
metadata: { topic: 'embeddings', type: 'definition' },
},
{
content: `pgvector is a PostgreSQL extension for vector similarity search.
It supports exact and approximate nearest neighbor search, making it
perfect for AI applications requiring semantic search capabilities.`,
metadata: { topic: 'pgvector', type: 'definition' },
},
// Add more documents as needed
];
async function seed() {
const response = await fetch('http://localhost:3000/api/ingest', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ documents }),
});
const result = await response.json();
console.log('Seeding result:', result);
}
seed();
Run it with:
npx tsx scripts/seed.ts
Testing Your Implementation
Start the development server:
npm run dev
Open http://localhost:3000 and try asking questions like:
- "What is RAG?"
- "How do vector embeddings work?"
- "What is pgvector used for?"
The chatbot should respond with accurate information based on your seeded documents, showing which sources it used.
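You can also exercise the chat endpoint without the UI. The sketch below is a hypothetical helper script, not part of the tutorial code; it assumes the dev server is running at http://localhost:3000 and that the response shape matches the route from Step 6.

```typescript
// scripts/ask.ts (hypothetical) — query the chat API from the command line.
// Assumes the dev server is running at http://localhost:3000.

interface ChatTurn {
  role: 'user' | 'assistant';
  content: string;
}

// Build the request body the /api/chat route expects.
function buildChatPayload(message: string, conversationHistory: ChatTurn[] = []) {
  return { message, conversationHistory };
}

async function ask(question: string): Promise<void> {
  const res = await fetch('http://localhost:3000/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatPayload(question)),
  });
  const data = await res.json();
  console.log('Answer:', data.response);
  for (const source of data.sources ?? []) {
    console.log(`  [${source.id}] ${(source.similarity * 100).toFixed(0)}% match`);
  }
}

// Usage: npx tsx scripts/ask.ts "What is RAG?"
ask(process.argv[2] ?? 'What is RAG?').catch((err) =>
  console.error('Request failed:', err)
);
```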
Troubleshooting
"No relevant documents found"
- Check that documents were successfully ingested
- Lower the similarity threshold in semanticSearch()
- Verify embeddings are being generated correctly
Slow search performance
- Ensure the HNSW index is created
- Consider using an ivfflat index for very large datasets
- Adjust the ef_search parameter for the speed/accuracy tradeoff
API errors
- Verify all environment variables are set
- Check Supabase RLS policies allow read access
- Ensure OpenAI API key has embeddings access
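If requests fail immediately, missing configuration is the usual culprit. A hypothetical startup check you could add (the variable names match the .env.local file from Step 1):

```typescript
// Returns the names of any required environment variables that are unset.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((name) => !env[name]);
}

const REQUIRED_VARS = [
  'NEXT_PUBLIC_SUPABASE_URL',
  'NEXT_PUBLIC_SUPABASE_ANON_KEY',
  'SUPABASE_SERVICE_ROLE_KEY',
  'OPENAI_API_KEY',
];

const missing = findMissingEnvVars(process.env, REQUIRED_VARS);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```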
Next Steps
Now that you have a working RAG chatbot, consider:
- Add document chunking - Split large documents into smaller chunks for better retrieval
- Implement hybrid search - Combine semantic search with keyword search
- Add conversation memory - Store chat history for context-aware responses
- Build an admin panel - Create a UI for managing documents
- Add authentication - Protect your API endpoints with Supabase Auth
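Of these, document chunking usually pays off first. As a starting point, here's a minimal sliding-window chunker; the default size and overlap values are arbitrary, and character counts are only a rough proxy for tokens:

```typescript
// Split text into overlapping chunks so each embedding covers a focused
// span while the overlap retains some surrounding context.
function chunkText(
  text: string,
  chunkSize: number = 500,
  overlap: number = 100
): string[] {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}
```

You could run documents through chunkText before sending them to the /api/ingest endpoint, so each stored row stays small enough to retrieve precisely.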
Conclusion
You've built a RAG-powered chatbot that can answer questions using your own data. The combination of Supabase's pgvector extension and OpenAI's APIs provides a powerful, scalable foundation for AI applications.
The key takeaways:
- Vector embeddings enable semantic search beyond keyword matching
- pgvector brings vector search capabilities to PostgreSQL
- RAG grounds LLM responses in your actual data
- Supabase provides a complete backend with minimal setup
This architecture scales well—add more documents, improve chunking strategies, and fine-tune retrieval parameters as your knowledge base grows.