Building a RAG Chatbot with Supabase pgvector and Next.js

By AI Bot

Large Language Models are impressive, but they have a critical limitation: they only know what they were trained on. What if you want an AI assistant that understands your documentation, your products, or your company knowledge?

That's where RAG (Retrieval-Augmented Generation) comes in. In this tutorial, you'll build a chatbot that can answer questions using your own data by combining Supabase's pgvector extension with OpenAI's APIs.

What You'll Build

By the end of this tutorial, you'll have:

  • A Next.js application with a chat interface
  • A Supabase database storing your documents as vector embeddings
  • Semantic search that finds relevant content based on meaning
  • A RAG-powered chatbot that answers questions using your data

Prerequisites

Before starting, ensure you have:

  • Node.js 18+ installed
  • A Supabase account (free tier works)
  • An OpenAI API key with access to embeddings and chat models
  • Basic knowledge of React and TypeScript
  • Familiarity with Next.js App Router

Understanding the Architecture

Before diving into code, let's understand how RAG works:

User Question → Embed Question → Search Similar Docs → Augment Prompt → LLM Response
     ↓               ↓                   ↓                  ↓              ↓
  "What is X?"   [0.1, 0.2...]    Find top 5 docs     Add context      Answer!
  1. User asks a question in natural language
  2. Embed the question into a vector (array of numbers)
  3. Semantic search finds documents with similar vectors
  4. Augment the prompt by adding retrieved documents as context
  5. LLM generates an answer based on the context

The magic is in step 3: instead of keyword matching, we're finding documents that are semantically similar—even if they don't share the same words.
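To make "semantically similar" concrete, here is a minimal sketch of the cosine-similarity math that pgvector's <=> operator is built on. The vectors below are toy 3-dimensional examples, not real 1,536-dimensional embeddings:

```typescript
// Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).
// pgvector's <=> operator returns cosine *distance*, i.e. 1 - similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for embeddings of related vs. unrelated texts
const dog = [0.9, 0.1, 0.0];
const puppy = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];

console.log(cosineSimilarity(dog, puppy));   // close to 1 (similar meaning)
console.log(cosineSimilarity(dog, invoice)); // close to 0 (unrelated)
```

Real embedding models produce vectors where this score tracks meaning, which is exactly what the similarity search in step 3 exploits.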

Step 1: Project Setup

Create a new Next.js project with TypeScript:

npx create-next-app@latest rag-chatbot --typescript --tailwind --app --src-dir
cd rag-chatbot

Install the required dependencies:

npm install @supabase/supabase-js openai ai

Create a .env.local file with your credentials:

NEXT_PUBLIC_SUPABASE_URL=your-supabase-url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
OPENAI_API_KEY=your-openai-api-key

Step 2: Configure Supabase with pgvector

Head to your Supabase dashboard and open the SQL Editor. Run the following to enable the pgvector extension:

-- Enable the pgvector extension
create extension if not exists vector with schema extensions;

Now create a table to store your documents with their embeddings:

-- Create the documents table
create table documents (
  id bigint primary key generated always as identity,
  content text not null,
  metadata jsonb,
  embedding extensions.vector(1536)  -- OpenAI text-embedding-3-small dimension
);
 
-- Enable Row Level Security
alter table documents enable row level security;
 
-- Create policy for reading (adjust as needed)
create policy "Allow public read access"
  on documents for select
  using (true);
 
-- Create an index for faster similarity search
create index on documents
using hnsw (embedding vector_cosine_ops);

The vector(1536) column type matches the 1,536-dimensional output of OpenAI's text-embedding-3-small model. The HNSW index enables fast approximate nearest-neighbor search.

Step 3: Create the Embedding Function

Create a new file src/lib/embeddings.ts:

import OpenAI from 'openai';
 
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
 
export async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
 
  return response.data[0].embedding;
}
 
export async function generateEmbeddings(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: texts,
  });
 
  return response.data.map((item) => item.embedding);
}
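One caveat: the embeddings endpoint caps how many inputs a single request may carry (2,048 at the time of writing), so large document sets need batching. A minimal, dependency-free sketch of the batching logic; embedInBatches is our own helper (not part of the OpenAI SDK), and you would pass it the generateEmbeddings function from above:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunkArray<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Embed texts batch by batch and flatten the results back in order.
// The 2048 default reflects the API's per-request input cap at the
// time of writing; check the current limit before relying on it.
async function embedInBatches(
  texts: string[],
  embed: (batch: string[]) => Promise<number[][]>,
  batchSize = 2048
): Promise<number[][]> {
  const results: number[][] = [];
  for (const batch of chunkArray(texts, batchSize)) {
    results.push(...(await embed(batch)));
  }
  return results;
}
```

Usage would be `embedInBatches(contents, generateEmbeddings)`, which keeps the returned embeddings aligned with the input order.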

Step 4: Build the Document Ingestion API

Create src/app/api/ingest/route.ts to add documents to your knowledge base:

import { createClient } from '@supabase/supabase-js';
import { generateEmbeddings } from '@/lib/embeddings';
import { NextResponse } from 'next/server';
 
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
 
interface Document {
  content: string;
  metadata?: Record<string, unknown>;
}
 
export async function POST(request: Request) {
  try {
    const { documents }: { documents: Document[] } = await request.json();
 
    if (!documents || documents.length === 0) {
      return NextResponse.json(
        { error: 'No documents provided' },
        { status: 400 }
      );
    }
 
    // Generate embeddings for all documents
    const contents = documents.map((doc) => doc.content);
    const embeddings = await generateEmbeddings(contents);
 
    // Prepare data for insertion
    const rows = documents.map((doc, index) => ({
      content: doc.content,
      metadata: doc.metadata || {},
      embedding: embeddings[index],
    }));
 
    // Insert into Supabase
    const { data, error } = await supabase
      .from('documents')
      .insert(rows)
      .select('id');
 
    if (error) {
      throw error;
    }
 
    return NextResponse.json({
      success: true,
      inserted: data.length,
    });
  } catch (error) {
    console.error('Ingestion error:', error);
    return NextResponse.json(
      { error: 'Failed to ingest documents' },
      { status: 500 }
    );
  }
}

Step 5: Create the Semantic Search Function

Create src/lib/search.ts for finding relevant documents:

import { createClient } from '@supabase/supabase-js';
import { generateEmbedding } from './embeddings';
 
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
 
export interface SearchResult {
  id: number;
  content: string;
  metadata: Record<string, unknown>;
  similarity: number;
}
 
export async function semanticSearch(
  query: string,
  topK: number = 5,
  threshold: number = 0.5
): Promise<SearchResult[]> {
  // Generate embedding for the query
  const queryEmbedding = await generateEmbedding(query);
 
  // Call the similarity search RPC function
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: queryEmbedding,
    match_threshold: threshold,
    match_count: topK,
  });
 
  if (error) {
    throw error;
  }
 
  return data;
}

Now add the matching function in Supabase SQL Editor:

-- Create the similarity search function
create or replace function match_documents(
  query_embedding vector(1536),
  match_threshold float,
  match_count int
)
returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language sql stable
as $$
  select
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embedding <=> query_embedding) > match_threshold
  order by documents.embedding <=> query_embedding
  limit match_count;
$$;

The <=> operator computes cosine distance. We convert it to similarity by subtracting from 1.

Step 6: Build the RAG Chat API

Create src/app/api/chat/route.ts:

import OpenAI from 'openai';
import { semanticSearch } from '@/lib/search';
import { NextResponse } from 'next/server';
 
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
 
export async function POST(request: Request) {
  try {
    const { message, conversationHistory = [] } = await request.json();
 
    if (!message) {
      return NextResponse.json(
        { error: 'No message provided' },
        { status: 400 }
      );
    }
 
    // Step 1: Search for relevant documents
    const relevantDocs = await semanticSearch(message, 5, 0.5);
 
    // Step 2: Build context from retrieved documents
    const context = relevantDocs
      .map((doc, i) => `[Document ${i + 1}]\n${doc.content}`)
      .join('\n\n');
 
    // Step 3: Create the augmented prompt
    const systemPrompt = `You are a helpful assistant that answers questions based on the provided context.
 
CONTEXT:
${context || 'No relevant documents found.'}
 
INSTRUCTIONS:
- Answer the user's question based on the context above
- If the context doesn't contain relevant information, say so
- Be concise but thorough
- Cite which document(s) you're referencing when applicable`;
 
    // Step 4: Generate response with GPT-4
    const completion = await openai.chat.completions.create({
      model: 'gpt-4-turbo-preview',
      messages: [
        { role: 'system', content: systemPrompt },
        ...conversationHistory,
        { role: 'user', content: message },
      ],
      temperature: 0.7,
      max_tokens: 1000,
    });
 
    const assistantMessage = completion.choices[0].message.content;
 
    return NextResponse.json({
      response: assistantMessage,
      sources: relevantDocs.map((doc) => ({
        id: doc.id,
        preview: doc.content.slice(0, 200) + '...',
        similarity: doc.similarity,
      })),
    });
  } catch (error) {
    console.error('Chat error:', error);
    return NextResponse.json(
      { error: 'Failed to process chat' },
      { status: 500 }
    );
  }
}

Step 7: Create the Chat Interface

Create src/app/page.tsx:

'use client';
 
import { useState, useRef, useEffect } from 'react';
 
interface Message {
  role: 'user' | 'assistant';
  content: string;
  sources?: Array<{
    id: number;
    preview: string;
    similarity: number;
  }>;
}
 
export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [isLoading, setIsLoading] = useState(false);
  const messagesEndRef = useRef<HTMLDivElement>(null);
 
  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };
 
  useEffect(() => {
    scrollToBottom();
  }, [messages]);
 
  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim() || isLoading) return;
 
    const userMessage = input.trim();
    setInput('');
    setMessages((prev) => [...prev, { role: 'user', content: userMessage }]);
    setIsLoading(true);
 
    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: userMessage,
          conversationHistory: messages.map((m) => ({
            role: m.role,
            content: m.content,
          })),
        }),
      });
 
      const data = await response.json();
 
      if (data.error) {
        throw new Error(data.error);
      }
 
      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: data.response,
          sources: data.sources,
        },
      ]);
    } catch (error) {
      console.error('Error:', error);
      setMessages((prev) => [
        ...prev,
        {
          role: 'assistant',
          content: 'Sorry, something went wrong. Please try again.',
        },
      ]);
    } finally {
      setIsLoading(false);
    }
  };
 
  return (
    <main className="flex min-h-screen flex-col bg-gray-50">
      <header className="bg-white border-b p-4">
        <h1 className="text-xl font-semibold text-center">
          RAG Chatbot
        </h1>
      </header>
 
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.length === 0 && (
          <div className="text-center text-gray-500 mt-8">
            <p>Ask me anything about your documents!</p>
          </div>
        )}
 
        {messages.map((message, index) => (
          <div
            key={index}
            className={`flex ${
              message.role === 'user' ? 'justify-end' : 'justify-start'
            }`}
          >
            <div
              className={`max-w-[80%] rounded-lg p-4 ${
                message.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-white border shadow-sm'
              }`}
            >
              <p className="whitespace-pre-wrap">{message.content}</p>
 
              {message.sources && message.sources.length > 0 && (
                <div className="mt-3 pt-3 border-t border-gray-200">
                  <p className="text-xs text-gray-500 mb-2">Sources:</p>
                  {message.sources.map((source, i) => (
                    <div
                      key={source.id}
                      className="text-xs bg-gray-50 p-2 rounded mb-1"
                    >
                      <span className="font-medium">
                        [{i + 1}] {(source.similarity * 100).toFixed(0)}% match
                      </span>
                      <p className="text-gray-600 truncate">{source.preview}</p>
                    </div>
                  ))}
                </div>
              )}
            </div>
          </div>
        ))}
 
        {isLoading && (
          <div className="flex justify-start">
            <div className="bg-white border rounded-lg p-4 shadow-sm">
              <div className="flex space-x-2">
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-100" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-200" />
              </div>
            </div>
          </div>
        )}
 
        <div ref={messagesEndRef} />
      </div>
 
      <form onSubmit={handleSubmit} className="p-4 bg-white border-t">
        <div className="flex gap-2">
          <input
            type="text"
            value={input}
            onChange={(e) => setInput(e.target.value)}
            placeholder="Ask a question..."
            className="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
            disabled={isLoading}
          />
          <button
            type="submit"
            disabled={isLoading || !input.trim()}
            className="bg-blue-600 text-white px-6 py-2 rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed"
          >
            Send
          </button>
        </div>
      </form>
    </main>
  );
}

Step 8: Add Sample Documents

Create a script scripts/seed.ts to add sample documents:

const documents = [
  {
    content: `RAG (Retrieval-Augmented Generation) is a technique that combines
    information retrieval with text generation. It first retrieves relevant
    documents from a knowledge base, then uses them as context for an LLM
    to generate accurate, grounded responses.`,
    metadata: { topic: 'RAG', type: 'definition' },
  },
  {
    content: `Vector embeddings are numerical representations of text that
    capture semantic meaning. Similar texts have similar embeddings, enabling
    semantic search even when exact keywords don't match.`,
    metadata: { topic: 'embeddings', type: 'definition' },
  },
  {
    content: `pgvector is a PostgreSQL extension for vector similarity search.
    It supports exact and approximate nearest neighbor search, making it
    perfect for AI applications requiring semantic search capabilities.`,
    metadata: { topic: 'pgvector', type: 'definition' },
  },
  // Add more documents as needed
];
 
async function seed() {
  const response = await fetch('http://localhost:3000/api/ingest', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ documents }),
  });
 
  const result = await response.json();
  console.log('Seeding result:', result);
}
 
seed();

Run it with:

npx tsx scripts/seed.ts

Testing Your Implementation

Start the development server:

npm run dev

Open http://localhost:3000 and try asking questions like:

  • "What is RAG?"
  • "How do vector embeddings work?"
  • "What is pgvector used for?"

The chatbot should respond with accurate information based on your seeded documents, showing which sources it used.

Troubleshooting

"No relevant documents found"

  • Check that documents were successfully ingested
  • Lower the similarity threshold in semanticSearch()
  • Verify embeddings are being generated correctly

Slow search performance

  • Ensure the HNSW index is created
  • Consider an IVFFlat index if HNSW build time or memory use becomes a problem on very large datasets
  • Adjust the hnsw.ef_search parameter to trade accuracy for speed

API errors

  • Verify all environment variables are set
  • Check Supabase RLS policies allow read access
  • Ensure OpenAI API key has embeddings access

Next Steps

Now that you have a working RAG chatbot, consider:

  1. Add document chunking - Split large documents into smaller chunks for better retrieval
  2. Implement hybrid search - Combine semantic search with keyword search
  3. Add conversation memory - Store chat history for context-aware responses
  4. Build an admin panel - Create a UI for managing documents
  5. Add authentication - Protect your API endpoints with Supabase Auth
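The first of those next steps, document chunking, can start very simple. Here is a hedged sketch of a fixed-size chunker with overlap; the default sizes are illustrative, and production chunkers often split on sentence or paragraph boundaries instead:

```typescript
// Split text into fixed-size chunks with overlap, so a sentence cut at
// one chunk boundary still appears whole in the neighboring chunk.
function chunkText(text: string, chunkSize = 500, overlap = 100): string[] {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize');
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored as its own row in the documents table, so retrieval returns the most relevant passage rather than an entire document.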

Conclusion

You've built a RAG-powered chatbot that can answer questions using your own data. The combination of Supabase's pgvector extension and OpenAI's APIs provides a powerful, scalable foundation for AI applications.

The key takeaways:

  • Vector embeddings enable semantic search beyond keyword matching
  • pgvector brings vector search capabilities to PostgreSQL
  • RAG grounds LLM responses in your actual data
  • Supabase provides a complete backend with minimal setup

This architecture scales well—add more documents, improve chunking strategies, and fine-tune retrieval parameters as your knowledge base grows.

