Building a RAG Chatbot with Supabase pgvector and Next.js

Large Language Models are impressive, but they have a critical limitation: they only know what they were trained on. What if you want an AI assistant that understands your documentation, your products, or your company knowledge?
That's where RAG (Retrieval-Augmented Generation) comes in. In this tutorial, you'll build a chatbot that can answer questions using your own data by combining Supabase's pgvector extension with OpenAI's APIs.
What You'll Build
By the end of this tutorial, you'll have:
- A Next.js application with a chat interface
- A Supabase database storing your documents as vector embeddings
- Semantic search that finds relevant content based on meaning
- A RAG-powered chatbot that answers questions using your data
Prerequisites
Before starting, ensure you have:
- Node.js 18+ installed
- A Supabase account (free tier works)
- An OpenAI API key with access to embeddings and chat models
- Basic knowledge of React and TypeScript
- Familiarity with Next.js App Router
Understanding the Architecture
Before diving into code, let's understand how RAG works:
```text
User Question  →  Embed Question  →  Search Similar Docs  →  Augment Prompt  →  LLM Response
      ↓                  ↓                    ↓                     ↓                ↓
 "What is X?"      [0.1, 0.2...]       Find top 5 docs         Add context        Answer!
```
1. The user asks a question in natural language
2. The question is embedded into a vector (an array of numbers)
3. Semantic search finds documents with similar vectors
4. The prompt is augmented with the retrieved documents as context
5. The LLM generates an answer based on that context
The magic is in step 3: instead of keyword matching, we're finding documents that are semantically similar—even if they don't share the same words.
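To make "semantically similar" concrete: similarity between two embeddings is usually measured with cosine similarity. Here is a minimal illustrative sketch (you won't need this in the app, since pgvector computes it inside the database):

```ts
// Cosine similarity: ~1 for near-identical meaning, ~0 for unrelated text.
// Both vectors come from the same embedding model, so they have the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Embeddings for "How do I reset my password?" and "I can't log in to my account" would score high here despite sharing almost no words; that is exactly the property semantic search exploits.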
Step 1: Project Setup
Create a new Next.js project with TypeScript:
```bash
npx create-next-app@latest rag-chatbot --typescript --tailwind --app --src-dir
cd rag-chatbot
```
Install the required dependencies:
```bash
npm install @supabase/supabase-js openai ai
```
Create a `.env.local` file with your credentials:
```
NEXT_PUBLIC_SUPABASE_URL=your-supabase-url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
OPENAI_API_KEY=your-openai-api-key
```
Note that the service role key bypasses Row Level Security, so it must only ever be read on the server; never expose it to the browser or give it a NEXT_PUBLIC_ prefix.
Step 2: Configure Supabase with pgvector
Head to your Supabase dashboard and open the SQL Editor. Run the following to enable the pgvector extension:
```sql
-- Enable the pgvector extension
create extension if not exists vector with schema extensions;
```
Now create a table to store your documents with their embeddings:
```sql
-- Create the documents table
create table documents (
  id bigint primary key generated always as identity,
  content text not null,
  metadata jsonb,
  embedding extensions.vector(1536) -- OpenAI text-embedding-3-small dimension
);

-- Enable Row Level Security
alter table documents enable row level security;

-- Create policy for reading (adjust as needed)
create policy "Allow public read access"
on documents for select
using (true);

-- Create an index for faster similarity search
create index on documents
using hnsw (embedding vector_cosine_ops);
```
The `vector(1536)` column matches the output dimension of OpenAI's text-embedding-3-small model. The HNSW index enables fast approximate nearest-neighbor search.
Step 3: Create the Embedding Function
Create a new file src/lib/embeddings.ts:
```ts
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Embed a single piece of text (e.g., a user query).
export async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

// Embed many texts in one request; batching is cheaper and faster
// than calling generateEmbedding in a loop during ingestion.
export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: texts,
  });
  return response.data.map((item) => item.embedding);
}
```
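One caveat for large ingests: the embeddings endpoint caps how many inputs one request can carry (2,048 items at the time of writing; check the current limits). If your corpus may exceed that, a thin wrapper that chunks the input is a simple guard. A minimal sketch, assuming the generateEmbeddings function above; BATCH_SIZE is an arbitrary placeholder, not a documented optimum:

```ts
// Hypothetical wrapper: embeds an arbitrarily long list of texts by
// splitting it into API-sized batches. BATCH_SIZE is an assumption.
const BATCH_SIZE = 512;

export async function generateEmbeddingsBatched(
  texts: string[]
): Promise<number[][]> {
  const results: number[][] = [];
  for (let i = 0; i < texts.length; i += BATCH_SIZE) {
    const batch = texts.slice(i, i + BATCH_SIZE);
    results.push(...(await generateEmbeddings(batch)));
  }
  return results;
}
```

Step 4: Build the Document Ingestion API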
Create src/app/api/ingest/route.ts to add documents to your knowledge base:
```ts
import { createClient } from '@supabase/supabase-js';
import { generateEmbeddings } from '@/lib/embeddings';
import { NextResponse } from 'next/server';

// Use the service role key here: this route runs only on the server
// and needs write access that bypasses RLS.
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

interface Document {
  content: string;
  metadata?: Record<string, unknown>;
}

export async function POST(request: Request) {
  try {
    const { documents }: { documents: Document[] } = await request.json();

    if (!documents || documents.length === 0) {
      return NextResponse.json(
        { error: 'No documents provided' },
        { status: 400 }
      );
    }

    // Generate embeddings for all documents in one batched call
    const contents = documents.map((doc) => doc.content);
    const embeddings = await generateEmbeddings(contents);

    // Pair each document with its embedding for insertion
    const rows = documents.map((doc, index) => ({
      content: doc.content,
      metadata: doc.metadata || {},
      embedding: embeddings[index],
    }));

    // Insert into Supabase
    const { data, error } = await supabase
      .from('documents')
      .insert(rows)
      .select('id');

    if (error) {
      throw error;
    }

    return NextResponse.json({
      success: true,
      inserted: data.length,
    });
  } catch (error) {
    console.error('Ingestion error:', error);
    return NextResponse.json(
      { error: 'Failed to ingest documents' },
      { status: 500 }
    );
  }
}
```
Step 5: Create the Semantic Search Function
Create src/lib/search.ts for finding relevant documents:
```ts
import { createClient } from '@supabase/supabase-js';
import { generateEmbedding } from './embeddings';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export interface SearchResult {
  id: number;
  content: string;
  metadata: Record<string, unknown>;
  similarity: number;
}

export async function semanticSearch(
  query: string,
  topK: number = 5,
  threshold: number = 0.5
): Promise<SearchResult[]> {
  // Generate an embedding for the query
  const queryEmbedding = await generateEmbedding(query);

  // Call the similarity search RPC function (defined below)
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: queryEmbedding,
    match_threshold: threshold,
    match_count: topK,
  });

  if (error) {
    throw error;
  }

  return data;
}
```
Now add the matching function in the Supabase SQL Editor:
```sql
-- Create the similarity search function
create or replace function match_documents(
  query_embedding vector(1536),
  match_threshold float,
  match_count int
)
returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language sql stable
as $$
  select
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embedding <=> query_embedding) > match_threshold
  order by documents.embedding <=> query_embedding
  limit match_count;
$$;
```
The `<=>` operator computes cosine distance; we convert it to similarity by subtracting it from 1, so a distance of 0.2 becomes a similarity of 0.8.
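Once you have ingested some documents (Step 8 below seeds a few samples), it is worth sanity-checking retrieval in isolation before wiring up the chat endpoint. A minimal sketch; the file name scripts/test-search.ts is just a suggestion, and it assumes the variables from .env.local are exported in your shell, since tsx does not load them automatically:

```ts
// scripts/test-search.ts (hypothetical): print the top matches for a query.
import { semanticSearch } from '../src/lib/search';

async function main() {
  const results = await semanticSearch('What is RAG?', 3, 0.3);
  for (const r of results) {
    console.log(r.similarity.toFixed(3), '-', r.content.slice(0, 80));
  }
}

main();
```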
Step 6: Build the RAG Chat API
Create src/app/api/chat/route.ts:
```ts
import OpenAI from 'openai';
import { semanticSearch } from '@/lib/search';
import { NextResponse } from 'next/server';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(request: Request) {
  try {
    const { message, conversationHistory = [] } = await request.json();

    if (!message) {
      return NextResponse.json(
        { error: 'No message provided' },
        { status: 400 }
      );
    }

    // Step 1: Search for relevant documents
    const relevantDocs = await semanticSearch(message, 5, 0.5);

    // Step 2: Build context from the retrieved documents
    const context = relevantDocs
      .map((doc, i) => `[Document ${i + 1}]\n${doc.content}`)
      .join('\n\n');

    // Step 3: Create the augmented prompt
    const systemPrompt = `You are a helpful assistant that answers questions based on the provided context.

CONTEXT:
${context || 'No relevant documents found.'}

INSTRUCTIONS:
- Answer the user's question based on the context above
- If the context doesn't contain relevant information, say so
- Be concise but thorough
- Cite which document(s) you're referencing when applicable`;

    // Step 4: Generate response with GPT-4
    const completion = await openai.chat.completions.create({
      model: 'gpt-4-turbo-preview',
      messages: [
        { role: 'system', content: systemPrompt },
        ...conversationHistory,
        { role: 'user', content: message },
      ],
      temperature: 0.7,
      max_tokens: 1000,
    });

    const assistantMessage = completion.choices[0].message.content;

    return NextResponse.json({
      response: assistantMessage,
      sources: relevantDocs.map((doc) => ({
        id: doc.id,
        preview: doc.content.slice(0, 200) + '...',
        similarity: doc.similarity,
      })),
    });
  } catch (error) {
    console.error('Chat error:', error);
    return NextResponse.json(
      { error: 'Failed to process chat' },
      { status: 500 }
    );
  }
}
```
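One thing to watch: the route forwards conversationHistory verbatim, so long chats keep growing the prompt (and the token cost). A minimal guard is to trim the history before building the messages array; the 10-message cutoff below is an arbitrary placeholder, not a recommendation:

```ts
// Keep only the most recent turns so the prompt stays bounded.
// 10 is a placeholder; tune it for your model's context window and budget.
const trimmedHistory = conversationHistory.slice(-10);
```

Then spread trimmedHistory instead of conversationHistory into the messages array.

Step 7: Create the Chat Interface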
Create src/app/page.tsx:
```tsx
'use client';

import { useState, useRef, useEffect } from 'react';

interface Message {
  role: 'user' | 'assistant';
  content: string;
  sources?: Array<{
    id: number;
    preview: string;
    similarity: number;
  }>;
}

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [isLoading, setIsLoading] = useState(false);
  const messagesEndRef = useRef<HTMLDivElement>(null);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };

  useEffect(() => {
    scrollToBottom();
  }, [messages]);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim() || isLoading) return;

    const userMessage = input.trim();
    setInput('');
    setMessages((prev) => [...prev, { role: 'user', content: userMessage }]);
    setIsLoading(true);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: userMessage,
          conversationHistory: messages.map((m) => ({
            role: m.role,
            content: m.content,
          })),
        }),
      });

      const data = await response.json();

      if (data.error) {
        throw new Error(data.error);
      }

      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: data.response,
          sources: data.sources,
        },
      ]);
    } catch (error) {
      console.error('Error:', error);
      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: 'Sorry, something went wrong. Please try again.',
        },
      ]);
    } finally {
      setIsLoading(false);
    }
  };

  return (
    <main className="flex min-h-screen flex-col bg-gray-50">
      <header className="bg-white border-b p-4">
        <h1 className="text-xl font-semibold text-center">RAG Chatbot</h1>
      </header>

      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.length === 0 && (
          <div className="text-center text-gray-500 mt-8">
            <p>Ask me anything about your documents!</p>
          </div>
        )}

        {messages.map((message, index) => (
          <div
            key={index}
            className={`flex ${
              message.role === 'user' ? 'justify-end' : 'justify-start'
            }`}
          >
            <div
              className={`max-w-[80%] rounded-lg p-4 ${
                message.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-white border shadow-sm'
              }`}
            >
              <p className="whitespace-pre-wrap">{message.content}</p>

              {message.sources && message.sources.length > 0 && (
                <div className="mt-3 pt-3 border-t border-gray-200">
                  <p className="text-xs text-gray-500 mb-2">Sources:</p>
                  {message.sources.map((source, i) => (
                    <div
                      key={source.id}
                      className="text-xs bg-gray-50 p-2 rounded mb-1"
                    >
                      <span className="font-medium">
                        [{i + 1}] {(source.similarity * 100).toFixed(0)}% match
                      </span>
                      <p className="text-gray-600 truncate">{source.preview}</p>
                    </div>
                  ))}
                </div>
              )}
            </div>
          </div>
        ))}

        {isLoading && (
          <div className="flex justify-start">
            <div className="bg-white border rounded-lg p-4 shadow-sm">
              <div className="flex space-x-2">
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-100" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-200" />
              </div>
            </div>
          </div>
        )}

        <div ref={messagesEndRef} />
      </div>

      <form onSubmit={handleSubmit} className="p-4 bg-white border-t">
        <div className="flex gap-2">
          <input
            type="text"
            value={input}
            onChange={(e) => setInput(e.target.value)}
            placeholder="Ask a question..."
            className="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
            disabled={isLoading}
          />
          <button
            type="submit"
            disabled={isLoading || !input.trim()}
            className="bg-blue-600 text-white px-6 py-2 rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed"
          >
            Send
          </button>
        </div>
      </form>
    </main>
  );
}
```
Step 8: Add Sample Documents
Create a script scripts/seed.ts to add sample documents:
```ts
const documents = [
  {
    content: `RAG (Retrieval-Augmented Generation) is a technique that combines
information retrieval with text generation. It first retrieves relevant
documents from a knowledge base, then uses them as context for an LLM
to generate accurate, grounded responses.`,
    metadata: { topic: 'RAG', type: 'definition' },
  },
  {
    content: `Vector embeddings are numerical representations of text that
capture semantic meaning. Similar texts have similar embeddings, enabling
semantic search even when exact keywords don't match.`,
    metadata: { topic: 'embeddings', type: 'definition' },
  },
  {
    content: `pgvector is a PostgreSQL extension for vector similarity search.
It supports exact and approximate nearest neighbor search, making it
perfect for AI applications requiring semantic search capabilities.`,
    metadata: { topic: 'pgvector', type: 'definition' },
  },
  // Add more documents as needed
];

async function seed() {
  const response = await fetch('http://localhost:3000/api/ingest', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ documents }),
  });
  const result = await response.json();
  console.log('Seeding result:', result);
}

seed();
```
Run it (with the dev server running, since it calls your local API) with:
```bash
npx tsx scripts/seed.ts
```
Testing Your Implementation
Start the development server:
```bash
npm run dev
```
Open http://localhost:3000 and try asking questions like:
- "What is RAG?"
- "How do vector embeddings work?"
- "What is pgvector used for?"
The chatbot should respond with accurate information based on your seeded documents, showing which sources it used.
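You can also hit the chat API directly to inspect the raw payload, for example from the browser console on the running app (the logged values below are illustrative):

```ts
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message: 'What is RAG?' }),
});
console.log(await res.json());
// e.g. { response: '...', sources: [{ id: 1, preview: '...', similarity: 0.82 }] }
```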
Troubleshooting
"No relevant documents found"
- Check that documents were successfully ingested
- Lower the similarity threshold in
semanticSearch() - Verify embeddings are being generated correctly
Slow search performance
- Ensure the HNSW index is created
- Consider using
ivfflatindex for very large datasets - Adjust
ef_searchparameter for speed/accuracy tradeoff
API errors
- Verify all environment variables are set
- Check Supabase RLS policies allow read access
- Ensure OpenAI API key has embeddings access
Next Steps
Now that you have a working RAG chatbot, consider:
- Add document chunking - Split large documents into smaller chunks for better retrieval (see the sketch after this list)
- Implement hybrid search - Combine semantic search with keyword search
- Add conversation memory - Store chat history for context-aware responses
- Build an admin panel - Create a UI for managing documents
- Add authentication - Protect your API endpoints with Supabase Auth
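For the first item, here is a minimal chunking sketch as a starting point. It splits on a fixed character budget with overlap; the 1000/200 defaults are arbitrary placeholders, and production chunkers usually respect sentence or paragraph boundaries instead:

```ts
// Naive fixed-size chunking with overlap. chunkSize and overlap are
// illustrative defaults, not recommendations.
export function chunkText(
  text: string,
  chunkSize = 1000,
  overlap = 200
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk would then be ingested as its own row in the documents table, ideally with metadata linking back to the source document.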
Conclusion
You've built a RAG-powered chatbot that can answer questions using your own data. The combination of Supabase's pgvector extension and OpenAI's APIs provides a powerful, scalable foundation for AI applications.
The key takeaways:
- Vector embeddings enable semantic search beyond keyword matching
- pgvector brings vector search capabilities to PostgreSQL
- RAG grounds LLM responses in your actual data
- Supabase provides a complete backend with minimal setup
This architecture scales well—add more documents, improve chunking strategies, and fine-tune retrieval parameters as your knowledge base grows.