Build a Semantic Search Engine with Next.js 15, OpenAI, and Pinecone

By AI Bot


Traditional keyword-based search falls short in the age of AI. When a user searches for "how to speed up my app," a keyword search won't find an article titled "Optimizing React Performance" — even though it's the perfect answer. This is where semantic search comes in, understanding meaning rather than just matching characters.

In this comprehensive tutorial, you'll build a full semantic search engine using:

  • Next.js 15 with App Router and Server Actions
  • OpenAI Embeddings API to convert text into vectors
  • Pinecone as a cloud-native vector database
  • TypeScript for end-to-end type safety

What You'll Build

A web application that lets users search through a collection of articles using semantic search. The app understands user intent and surfaces the most relevant results by meaning, even when the words don't match exactly.

Core features:

  • Interactive search interface with instant results
  • Automatic content indexing via API Route
  • Results ranked by semantic similarity score
  • Multi-language search support

Prerequisites

Before starting, make sure you have:

  • Node.js 18 or later
  • Basic familiarity with React and Next.js
  • An OpenAI account with an API key
  • A Pinecone account (free tier is sufficient)
  • A code editor like VS Code

Step 1: How Semantic Search Works

Vector search relies on three stages:

  1. Embedding: Converting text into a numerical vector (array of numbers) that represents its meaning
  2. Storage: Saving vectors in a specialized database like Pinecone
  3. Querying: Converting the user's question into a vector and comparing it against stored vectors

OpenAI's text-embedding-3-small model produces vectors with 1,536 dimensions. No single dimension maps to a concept a human could name; together they encode the text's meaning, so texts with similar meanings end up with vectors that are close together in this high-dimensional space.

A vector is simply an array of numbers. For example: [0.023, -0.41, 0.87, ...] — each number represents a dimension of meaning. Proximity between two vectors indicates similarity in meaning.
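Proximity is typically measured with cosine similarity, the same metric we'll configure on the Pinecone index later. A minimal sketch of the math, using plain number arrays:

```typescript
// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes. 1 means identical direction (same
// meaning), 0 means unrelated, -1 means opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice you never call this yourself — Pinecone computes it server-side — but it is the entire intuition behind "close vectors mean similar meaning."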

Step 2: Create the Project

Scaffold a new Next.js 15 project:

npx create-next-app@latest semantic-search-app --typescript --tailwind --app --src-dir
cd semantic-search-app

Install the required packages:

npm install openai @pinecone-database/pinecone

Project Structure

src/
├── app/
│   ├── layout.tsx
│   ├── page.tsx
│   ├── api/
│   │   └── index-content/
│   │       └── route.ts
│   └── actions/
│       └── search.ts
├── lib/
│   ├── openai.ts
│   ├── pinecone.ts
│   └── types.ts
└── components/
    ├── SearchBar.tsx
    ├── SearchResults.tsx
    └── ArticleCard.tsx

Step 3: Configure Environment Variables

Create a .env.local file at the project root:

OPENAI_API_KEY=sk-your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_INDEX=semantic-search

Never share your API keys. Add .env.local to .gitignore (Next.js does this automatically).

Step 4: Set Up the OpenAI Client

Create src/lib/openai.ts:

import OpenAI from "openai";
 
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
 
export async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
 
  return response.data[0].embedding;
}
 
export async function generateEmbeddings(
  texts: string[]
): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
 
  return response.data.map((item) => item.embedding);
}
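One caveat these helpers gloss over: text-embedding-3-small accepts roughly 8,191 tokens per input, so very long articles need to be split before embedding. A hypothetical chunkText helper — the ~4-characters-per-token budget is a rough heuristic, not an exact tokenizer:

```typescript
// Split long content at paragraph boundaries so each chunk stays
// within a character budget (~4 chars per token is a rough rule of
// thumb for English; use a real tokenizer for precise limits).
export function chunkText(text: string, maxChars = 8000): string[] {
  const paragraphs = text.split(/\n\n+/);
  const chunks: string[] = [];
  let current = "";

  for (const p of paragraphs) {
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? `${current}\n\n${p}` : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk would then be embedded and upserted as its own vector, with metadata pointing back to the parent article.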

Why text-embedding-3-small?

Model                  | Dimensions | Price per 1M tokens | Performance
text-embedding-3-small | 1,536      | $0.02               | Excellent for general use
text-embedding-3-large | 3,072      | $0.13               | Higher accuracy
text-embedding-ada-002 | 1,536      | $0.10               | Previous generation

The small model offers an ideal balance of performance and cost for most applications.

Step 5: Set Up Pinecone

Create the Pinecone Index

  1. Log in to console.pinecone.io
  2. Create a new index named semantic-search
  3. Set dimensions to 1536 (matching the OpenAI model)
  4. Choose cosine as the similarity metric

Create src/lib/pinecone.ts:

import { Pinecone } from "@pinecone-database/pinecone";
 
const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});
 
export const index = pinecone.index(process.env.PINECONE_INDEX!);
 
export interface ArticleMetadata {
  title: string;
  summary: string;
  url: string;
  category: string;
  language: string;
}
 
export async function upsertVectors(
  vectors: {
    id: string;
    values: number[];
    metadata: ArticleMetadata;
  }[]
) {
  // Upsert in batches — 100 vectors per request is a safe batch size
  const batchSize = 100;
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert(batch);
  }
}
 
export async function queryVectors(
  queryVector: number[],
  topK: number = 5,
  filter?: Record<string, string>
) {
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
    filter,
  });
 
  return results.matches || [];
}

Step 6: Define Types

Create src/lib/types.ts:

export interface Article {
  id: string;
  title: string;
  content: string;
  summary: string;
  url: string;
  category: string;
  language: string;
}
 
export interface SearchResult {
  id: string;
  score: number;
  title: string;
  summary: string;
  url: string;
  category: string;
}
 
export interface SearchState {
  results: SearchResult[];
  query: string;
  isLoading: boolean;
  error: string | null;
}

Step 7: Build the Content Indexing API

Create src/app/api/index-content/route.ts:

import { NextResponse } from "next/server";
import { generateEmbeddings } from "@/lib/openai";
import { upsertVectors, type ArticleMetadata } from "@/lib/pinecone";
import type { Article } from "@/lib/types";
 
// Sample data — in production, fetch from your CMS or database
const articles: Article[] = [
  {
    id: "1",
    title: "Optimizing React Application Performance",
    content:
      "A comprehensive guide to improving React app performance using memo, useMemo, useCallback, and lazy loading components...",
    summary: "Learn React performance optimization techniques",
    url: "/tutorials/react-performance",
    category: "frontend",
    language: "en",
  },
  {
    id: "2",
    title: "Building REST APIs with Node.js and Express",
    content:
      "How to design and build a complete RESTful API with authentication and data validation...",
    summary: "Build robust APIs with Express.js",
    url: "/tutorials/nodejs-rest-api",
    category: "backend",
    language: "en",
  },
  {
    id: "3",
    title: "TypeScript Fundamentals for Developers",
    content:
      "A comprehensive introduction to TypeScript covering basic types, interfaces, and generics...",
    summary: "Start your TypeScript journey",
    url: "/tutorials/typescript-basics",
    category: "language",
    language: "en",
  },
  {
    id: "4",
    title: "Deploying Next.js to Production",
    content:
      "Complete guide to deploying Next.js applications with Docker, CI/CD pipelines, and monitoring...",
    summary: "Production deployment strategies for Next.js",
    url: "/tutorials/nextjs-deployment",
    category: "devops",
    language: "en",
  },
  {
    id: "5",
    title: "Authentication with JWT and Refresh Tokens",
    content:
      "Implementing secure authentication using JSON Web Tokens with refresh token rotation...",
    summary: "Secure auth implementation guide",
    url: "/tutorials/jwt-auth",
    category: "security",
    language: "en",
  },
];
 
export async function POST() {
  try {
    // Prepare texts for embedding: combine title and content
    const textsToEmbed = articles.map(
      (article) => `${article.title}\n\n${article.content}`
    );
 
    // Generate embeddings in batch (more efficient than individual requests)
    const embeddings = await generateEmbeddings(textsToEmbed);
 
    // Prepare vectors for Pinecone
    const vectors = articles.map((article, idx) => ({
      id: article.id,
      values: embeddings[idx],
      metadata: {
        title: article.title,
        summary: article.summary,
        url: article.url,
        category: article.category,
        language: article.language,
      } satisfies ArticleMetadata,
    }));
 
    // Upload vectors to Pinecone
    await upsertVectors(vectors);
 
    return NextResponse.json({
      success: true,
      indexed: articles.length,
    });
  } catch (error) {
    console.error("Indexing error:", error);
    return NextResponse.json(
      { error: "Failed to index content" },
      { status: 500 }
    );
  }
}

Step 8: Build the Search Server Action

Create src/app/actions/search.ts:

"use server";
 
import { generateEmbedding } from "@/lib/openai";
import { queryVectors } from "@/lib/pinecone";
import type { SearchResult } from "@/lib/types";
 
export async function semanticSearch(
  query: string,
  language?: string
): Promise<SearchResult[]> {
  if (!query.trim()) {
    return [];
  }
 
  try {
    // Convert the user's query into a vector
    const queryVector = await generateEmbedding(query);
 
    // Prepare language filter (optional)
    const filter = language ? { language } : undefined;
 
    // Query Pinecone
    const matches = await queryVectors(queryVector, 5, filter);
 
    // Transform results into the expected shape
    const results: SearchResult[] = matches.map((match) => ({
      id: match.id,
      score: match.score || 0,
      title: (match.metadata?.title as string) || "",
      summary: (match.metadata?.summary as string) || "",
      url: (match.metadata?.url as string) || "",
      category: (match.metadata?.category as string) || "",
    }));
 
    return results;
  } catch (error) {
    console.error("Search error:", error);
    throw new Error("Search failed. Please try again.");
  }
}

Using Server Actions means your API keys stay on the server and are never sent to the browser. This is more secure than creating a separate API Route for search.

Step 9: Build the Search Bar Component

Create src/components/SearchBar.tsx:

"use client";
 
import { useState, useTransition, useCallback } from "react";
import { semanticSearch } from "@/app/actions/search";
import type { SearchResult } from "@/lib/types";
 
interface SearchBarProps {
  onResults: (results: SearchResult[]) => void;
  onLoading: (loading: boolean) => void;
}
 
export default function SearchBar({ onResults, onLoading }: SearchBarProps) {
  const [query, setQuery] = useState("");
  const [isPending, startTransition] = useTransition();
 
  const handleSearch = useCallback(() => {
    if (!query.trim()) return;
 
    onLoading(true);
    startTransition(async () => {
      try {
        const results = await semanticSearch(query);
        onResults(results);
      } catch {
        onResults([]);
      } finally {
        onLoading(false);
      }
    });
  }, [query, onResults, onLoading]);
 
  return (
    <div className="w-full max-w-2xl mx-auto">
      <div className="relative">
        <input
          type="text"
          value={query}
          onChange={(e) => setQuery(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && handleSearch()}
          placeholder="Search by meaning... e.g., How do I improve my app's speed?"
          className="w-full px-6 py-4 text-lg border-2 border-gray-200
                     rounded-2xl focus:border-blue-500 focus:outline-none
                     transition-colors duration-200 pl-28"
        />
        <button
          onClick={handleSearch}
          disabled={isPending || !query.trim()}
          className="absolute right-3 top-1/2 -translate-y-1/2
                     bg-blue-600 text-white px-4 py-2 rounded-xl
                     hover:bg-blue-700 disabled:opacity-50
                     disabled:cursor-not-allowed transition-colors"
        >
          {isPending ? "..." : "Search"}
        </button>
      </div>
      <p className="text-sm text-gray-500 mt-2">
        Semantic search understands meaning — try natural questions instead of keywords
      </p>
    </div>
  );
}

Step 10: Build the Results Components

Create src/components/ArticleCard.tsx:

import type { SearchResult } from "@/lib/types";
 
interface ArticleCardProps {
  result: SearchResult;
}
 
export default function ArticleCard({ result }: ArticleCardProps) {
  const relevancePercent = Math.round(result.score * 100);
 
  return (
    <a
      href={result.url}
      className="block p-6 bg-white rounded-2xl border border-gray-100
                 hover:border-blue-200 hover:shadow-lg transition-all
                 duration-200"
    >
      <div className="flex items-start justify-between gap-4">
        <div className="flex-1">
          <h3 className="text-xl font-bold text-gray-900 mb-2">
            {result.title}
          </h3>
          <p className="text-gray-600 leading-relaxed">{result.summary}</p>
          <span className="inline-block mt-3 text-sm text-blue-600
                          bg-blue-50 px-3 py-1 rounded-full">
            {result.category}
          </span>
        </div>
        <div className="flex-shrink-0 text-center">
          <div
            className={`text-2xl font-bold ${
              relevancePercent >= 80
                ? "text-green-600"
                : relevancePercent >= 60
                ? "text-yellow-600"
                : "text-gray-400"
            }`}
          >
            {relevancePercent}%
          </div>
          <div className="text-xs text-gray-400">match</div>
        </div>
      </div>
    </a>
  );
}

Create src/components/SearchResults.tsx:

import type { SearchResult } from "@/lib/types";
import ArticleCard from "./ArticleCard";
 
interface SearchResultsProps {
  results: SearchResult[];
  isLoading: boolean;
}
 
export default function SearchResults({
  results,
  isLoading,
}: SearchResultsProps) {
  if (isLoading) {
    return (
      <div className="space-y-4 mt-8">
        {[1, 2, 3].map((i) => (
          <div
            key={i}
            className="h-32 bg-gray-100 rounded-2xl animate-pulse"
          />
        ))}
      </div>
    );
  }
 
  if (results.length === 0) {
    return null;
  }
 
  return (
    <div className="space-y-4 mt-8">
      <p className="text-sm text-gray-500">
        Found {results.length} results ranked by semantic relevance
      </p>
      {results.map((result) => (
        <ArticleCard key={result.id} result={result} />
      ))}
    </div>
  );
}

Step 11: Assemble the Home Page

Update src/app/page.tsx:

"use client";
 
import { useState } from "react";
import SearchBar from "@/components/SearchBar";
import SearchResults from "@/components/SearchResults";
import type { SearchResult } from "@/lib/types";
 
export default function Home() {
  const [results, setResults] = useState<SearchResult[]>([]);
  const [isLoading, setIsLoading] = useState(false);
 
  return (
    <main className="min-h-screen bg-gradient-to-b from-gray-50 to-white">
      <div className="max-w-4xl mx-auto px-4 py-20">
        <div className="text-center mb-12">
          <h1 className="text-4xl font-bold text-gray-900 mb-4">
            AI Semantic Search
          </h1>
          <p className="text-xl text-gray-600">
            Search by meaning, not just words
          </p>
        </div>
 
        <SearchBar onResults={setResults} onLoading={setIsLoading} />
        <SearchResults results={results} isLoading={isLoading} />
      </div>
    </main>
  );
}

Step 12: Index Your Content

Before search can work, you need to index the articles. Start the dev server and run the indexing request:

npm run dev

In another terminal:

curl -X POST http://localhost:3000/api/index-content

You should get:

{
  "success": true,
  "indexed": 5
}

Open your browser to http://localhost:3000 and try these queries:

Query                               | Expected Result
"how to speed up my app"            | React Performance article
"I want to protect my application"  | JWT Authentication article
"deploying to the cloud"            | Next.js Deployment article
"learning a programming language"   | TypeScript Fundamentals article

Notice how the engine finds relevant articles even when the words don't match exactly.

Step 13: Advanced Improvements

Debounced Search

Improve UX with automatic search-as-you-type by debouncing the query:

// src/hooks/useDebounce.ts
import { useEffect, useState } from "react";
 
export function useDebounce<T>(value: T, delay: number): T {
  const [debouncedValue, setDebouncedValue] = useState<T>(value);
 
  useEffect(() => {
    const handler = setTimeout(() => {
      setDebouncedValue(value);
    }, delay);
 
    return () => clearTimeout(handler);
  }, [value, delay]);
 
  return debouncedValue;
}
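Wired into the SearchBar from Step 9, the hook might be used like this — a sketch, not drop-in code; it reuses the existing query state and the onResults/onLoading props from that component:

```typescript
// Inside SearchBar (sketch): debounce the query, then search
// automatically once the user pauses typing for 300 ms.
const debouncedQuery = useDebounce(query, 300);

useEffect(() => {
  if (!debouncedQuery.trim()) {
    onResults([]);
    return;
  }
  onLoading(true);
  semanticSearch(debouncedQuery)
    .then(onResults)
    .catch(() => onResults([]))
    .finally(() => onLoading(false));
}, [debouncedQuery]);
```

With this in place the Search button becomes optional, though keeping it helps users on slow connections.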

Server-Side Caching

Add server-level caching to reduce OpenAI API calls:

// src/lib/cache.ts
const cache = new Map<string, { data: number[]; timestamp: number }>();
const TTL = 1000 * 60 * 60; // 1 hour
 
export function getCachedEmbedding(text: string): number[] | null {
  const entry = cache.get(text);
  if (!entry) return null;
 
  if (Date.now() - entry.timestamp > TTL) {
    cache.delete(text);
    return null;
  }
 
  return entry.data;
}
 
export function setCachedEmbedding(text: string, embedding: number[]) {
  cache.set(text, { data: embedding, timestamp: Date.now() });
}

Then update the generateEmbedding function. Note that this Map lives in server memory, so it resets on each deploy and is not shared across serverless instances — use Redis if you need a persistent cache:

export async function generateEmbedding(text: string): Promise<number[]> {
  // Check cache first
  const cached = getCachedEmbedding(text);
  if (cached) return cached;
 
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
 
  const embedding = response.data[0].embedding;
  setCachedEmbedding(text, embedding);
 
  return embedding;
}

Metadata Filtering

Pinecone supports metadata filtering that you can combine with vector search:

// Search only frontend articles
const frontendResults = await queryVectors(queryVector, 5, {
  category: "frontend",
});
 
// Search only English articles
const englishResults = await queryVectors(queryVector, 5, {
  language: "en",
});
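If you expose several facets in the UI, a small helper keeps filter construction in one place. This buildFilter function is illustrative (not part of the tutorial's code above); it relies on Pinecone treating multiple top-level keys as an implicit AND:

```typescript
// Hypothetical helper: build a combined Pinecone metadata filter
// from optional facets. Multiple keys act as an implicit AND;
// omitting a facet leaves it unfiltered, and no facets at all
// returns undefined so the query runs unfiltered.
export function buildFilter(
  category?: string,
  language?: string
): Record<string, string> | undefined {
  const filter: Record<string, string> = {};
  if (category) filter.category = category;
  if (language) filter.language = language;
  return Object.keys(filter).length > 0 ? filter : undefined;
}
```

Usage: `queryVectors(queryVector, 5, buildFilter("frontend", "en"))`.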

Step 14: Deploy to Production

Deploy to Vercel

npm install -g vercel
vercel

Add environment variables in the Vercel dashboard:

  • OPENAI_API_KEY
  • PINECONE_API_KEY
  • PINECONE_INDEX

Production Considerations

  1. Rate limiting: Protect your API from abuse
  2. Monitoring: Track OpenAI API calls and costs
  3. Caching: Use Redis for persistent embedding caching
  4. Reindexing: Set up a cron job to periodically reindex content

For rate limiting, a simple in-memory sliding window is a reasonable starting point (like the embedding cache, it resets per server instance):

// Example: Simple rate limiting
const rateLimitMap = new Map<string, number[]>();
 
function isRateLimited(ip: string, maxRequests = 10, windowMs = 60000) {
  const now = Date.now();
  const requests = rateLimitMap.get(ip) || [];
  const recentRequests = requests.filter((t) => now - t < windowMs);
 
  if (recentRequests.length >= maxRequests) {
    return true;
  }
 
  recentRequests.push(now);
  rateLimitMap.set(ip, recentRequests);
  return false;
}

Cost Estimation

Component         | Cost                       | Notes
OpenAI Embeddings | ~$0.02 per 1M tokens       | Very affordable
Pinecone          | Free up to 100,000 vectors | Free tier sufficient for starting
Vercel            | Free for personal projects | Hobby plan

For an app with 1,000 articles and 10,000 monthly searches, the estimated cost is under $1/month.
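That estimate can be sanity-checked with back-of-the-envelope numbers. The per-article and per-query token counts below are assumed averages, not measurements:

```typescript
// Assumptions: ~1,000 tokens per article embedded once,
// ~25 tokens per search query, $0.02 per 1M tokens for
// text-embedding-3-small.
const PRICE_PER_TOKEN = 0.02 / 1_000_000;

const indexingCost = 1_000 * 1_000 * PRICE_PER_TOKEN; // 1M tokens
const searchCost = 10_000 * 25 * PRICE_PER_TOKEN; // 250K tokens
const monthlyTotal = indexingCost + searchCost; // ≈ $0.025 with these assumptions

console.log(monthlyTotal.toFixed(3));
```

Even with much longer articles or heavier traffic, embedding costs stay in the cents; Pinecone and Vercel free tiers dominate the budget decision.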

Troubleshooting

Common Issues

OpenAI API key error: Verify your key is correct and you have sufficient credits. Check your .env.local file.

Pinecone index not responding: Ensure the index dimensions match 1536 (for text-embedding-3-small).

Inaccurate results:

  • Add more content to index — more data improves results
  • Try combining title, content, and tags before embedding
  • Use text-embedding-3-large for higher accuracy

Slow responses:

  • Enable caching for repeated embedding queries
  • Use debounce to reduce queries while typing
  • Choose a Pinecone region closest to your server

Next Steps

After completing this tutorial, you can:

  • Add RAG (Retrieval-Augmented Generation): Combine search results with an LLM to generate custom answers
  • Build multimodal search: Add image search using CLIP models
  • Add search analytics: Track what users search for to improve content
  • Integrate with a CMS: Automatically trigger reindexing when content changes
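The RAG idea in the first bullet comes down to feeding the top matches into an LLM prompt. A sketch of the prompt-building half — the buildRagPrompt name and prompt wording are illustrative, and you would pass the result to openai.chat.completions.create:

```typescript
interface SearchResult {
  id: string;
  score: number;
  title: string;
  summary: string;
  url: string;
  category: string;
}

// Turn the top semantic matches into a grounded prompt: the LLM is
// told to answer only from the numbered sources and cite them.
export function buildRagPrompt(query: string, results: SearchResult[]): string {
  const context = results
    .map((r, i) => `[${i + 1}] ${r.title} — ${r.summary} (${r.url})`)
    .join("\n");

  return [
    "Answer the question using only the sources below.",
    "Cite sources by their [number].",
    "",
    "Sources:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}
```

Grounding the model in retrieved sources like this is what keeps the generated answer tied to your actual content instead of hallucinated material.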

Conclusion

Semantic search fundamentally changes how users interact with content. Instead of guessing the right keywords, they can simply describe what they want in natural language.

In this tutorial, you learned:

  • How vectors and embeddings work
  • Setting up OpenAI and Pinecone for vector search
  • Building secure Server Actions for search
  • Designing an interactive search interface
  • Performance and production optimizations

The technologies we used — OpenAI Embeddings, Pinecone, and Next.js Server Actions — represent the modern approach to building intelligent, scalable search applications.


Want to read more tutorials? Check out our latest one: Build a Real-Time Full-Stack App with Convex and Next.js 15.

Discuss Your Project with Us

We're here to help with your web development needs. Schedule a call to discuss your project and how we can assist you.

Let's find the best solutions for your needs.
