Build a Semantic Search Engine with Next.js 15, OpenAI, and Pinecone

Traditional keyword-based search falls short in the age of AI. When a user searches for "how to speed up my app," a keyword search won't find an article titled "Optimizing React Performance" — even though it's the perfect answer. This is where semantic search comes in, understanding meaning rather than just matching characters.
In this comprehensive tutorial, you'll build a full semantic search engine using:
- Next.js 15 with App Router and Server Actions
- OpenAI Embeddings API to convert text into vectors
- Pinecone as a cloud-native vector database
- TypeScript for end-to-end type safety
What You'll Build
A web application that lets users search through a collection of articles using semantic search. The app understands user intent and surfaces the most relevant results by meaning, even when the words don't match exactly.
Core features:
- Interactive search interface with instant results
- Automatic content indexing via API Route
- Results ranked by semantic similarity score
- Multi-language search support
Prerequisites
Before starting, make sure you have:
- Node.js 18 or later
- Basic familiarity with React and Next.js
- An OpenAI account with an API key
- A Pinecone account (free tier is sufficient)
- A code editor like VS Code
Step 1: Understanding Vector Search
How Semantic Search Works
Vector search relies on three stages:
- Embedding: Converting text into a numerical vector (array of numbers) that represents its meaning
- Storage: Saving vectors in a specialized database like Pinecone
- Querying: Converting the user's question into a vector and comparing it against stored vectors
OpenAI's text-embedding-3-small model produces vectors with 1,536 dimensions. Each dimension captures some aspect of the text's meaning. Texts with similar meanings have vectors that are close together in this high-dimensional space.
A vector is simply an array of numbers. For example: [0.023, -0.41, 0.87, ...] — each number represents a dimension of meaning. Proximity between two vectors indicates similarity in meaning.
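The "closeness" that the cosine metric measures can be written in a few lines. Here is a minimal standalone sketch for intuition — the helper name `cosineSimilarity` is ours, not part of any SDK; Pinecone computes this server-side when the index uses the cosine metric:

```typescript
// Cosine similarity: 1 means identical direction (same meaning),
// 0 means unrelated, -1 means opposite. This is essentially the
// `score` field Pinecone returns for each match on a cosine index.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two embeddings of paraphrased sentences typically score well above 0.5, while unrelated texts land much lower.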
Step 2: Create the Project
Scaffold a new Next.js 15 project:
```bash
npx create-next-app@latest semantic-search-app --typescript --tailwind --app --src-dir
cd semantic-search-app
```

Install the required packages:

```bash
npm install openai @pinecone-database/pinecone
```

Project Structure
```
src/
├── app/
│   ├── layout.tsx
│   ├── page.tsx
│   ├── api/
│   │   └── index-content/
│   │       └── route.ts
│   └── actions/
│       └── search.ts
├── lib/
│   ├── openai.ts
│   ├── pinecone.ts
│   └── types.ts
└── components/
    ├── SearchBar.tsx
    ├── SearchResults.tsx
    └── ArticleCard.tsx
```
Step 3: Configure Environment Variables
Create a .env.local file at the project root:
```bash
OPENAI_API_KEY=sk-your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_INDEX=semantic-search
```

Never share your API keys. Add .env.local to .gitignore (Next.js does this automatically).
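A missing variable otherwise surfaces only at request time as a cryptic SDK error, so it can help to fail fast at startup. A small hypothetical helper (`requireEnv` is our own name, not a Next.js API):

```typescript
// Validates that required environment variables are present and non-empty,
// throwing one clear error listing everything that is missing.
function requireEnv(
  env: Record<string, string | undefined>,
  keys: string[]
): Record<string, string> {
  const missing = keys.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return Object.fromEntries(keys.map((key) => [key, env[key] as string]));
}

// Usage in this tutorial's setup:
// requireEnv(process.env, ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX"]);
```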
Step 4: Set Up the OpenAI Client
Create src/lib/openai.ts:
```typescript
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return response.data[0].embedding;
}

export async function generateEmbeddings(
  texts: string[]
): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  return response.data.map((item) => item.embedding);
}
```

Why text-embedding-3-small?
| Model | Dimensions | Price per 1M tokens | Performance |
|---|---|---|---|
| text-embedding-3-small | 1,536 | $0.02 | Excellent for general use |
| text-embedding-3-large | 3,072 | $0.13 | Higher accuracy |
| text-embedding-ada-002 | 1,536 | $0.10 | Previous generation |
The small model offers an ideal balance of performance and cost for most applications.
Step 5: Set Up Pinecone
Create the Pinecone Index
- Log in to console.pinecone.io
- Create a new index named semantic-search
- Set dimensions to 1536 (matching the OpenAI model)
- Choose cosine as the similarity metric
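If you prefer to keep this in code rather than clicking through the console, the Node SDK can create the index too. A hedged sketch — the serverless cloud and region values below are assumptions; pick whatever matches your Pinecone account:

```typescript
// The index settings must match the embedding model:
// 1536 dimensions and the cosine metric for text-embedding-3-small.
// This object is the argument shape for pinecone.createIndex() in
// @pinecone-database/pinecone; cloud/region here are illustrative.
const indexConfig = {
  name: "semantic-search",
  dimension: 1536, // text-embedding-3-small output size
  metric: "cosine" as const,
  spec: { serverless: { cloud: "aws" as const, region: "us-east-1" } },
};

// Usage (requires PINECONE_API_KEY):
// const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
// await pc.createIndex(indexConfig);
```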
Create src/lib/pinecone.ts:
```typescript
import { Pinecone } from "@pinecone-database/pinecone";

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});

export const index = pinecone.index(process.env.PINECONE_INDEX!);

export interface ArticleMetadata {
  title: string;
  summary: string;
  url: string;
  category: string;
  language: string;
}

export async function upsertVectors(
  vectors: {
    id: string;
    values: number[];
    metadata: ArticleMetadata;
  }[]
) {
  // Upsert in batches of 100 — a safe size within Pinecone's request limits
  const batchSize = 100;
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert(batch);
  }
}

export async function queryVectors(
  queryVector: number[],
  topK: number = 5,
  filter?: Record<string, string>
) {
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
    filter,
  });
  return results.matches || [];
}
```

Step 6: Define Types
Create src/lib/types.ts:
```typescript
export interface Article {
  id: string;
  title: string;
  content: string;
  summary: string;
  url: string;
  category: string;
  language: string;
}

export interface SearchResult {
  id: string;
  score: number;
  title: string;
  summary: string;
  url: string;
  category: string;
}

export interface SearchState {
  results: SearchResult[];
  query: string;
  isLoading: boolean;
  error: string | null;
}
```

Step 7: Build the Content Indexing API
Create src/app/api/index-content/route.ts:
```typescript
import { NextResponse } from "next/server";
import { generateEmbeddings } from "@/lib/openai";
import { upsertVectors, type ArticleMetadata } from "@/lib/pinecone";
import type { Article } from "@/lib/types";

// Sample data — in production, fetch from your CMS or database
const articles: Article[] = [
  {
    id: "1",
    title: "Optimizing React Application Performance",
    content:
      "A comprehensive guide to improving React app performance using memo, useMemo, useCallback, and lazy loading components...",
    summary: "Learn React performance optimization techniques",
    url: "/tutorials/react-performance",
    category: "frontend",
    language: "en",
  },
  {
    id: "2",
    title: "Building REST APIs with Node.js and Express",
    content:
      "How to design and build a complete RESTful API with authentication and data validation...",
    summary: "Build robust APIs with Express.js",
    url: "/tutorials/nodejs-rest-api",
    category: "backend",
    language: "en",
  },
  {
    id: "3",
    title: "TypeScript Fundamentals for Developers",
    content:
      "A comprehensive introduction to TypeScript covering basic types, interfaces, and generics...",
    summary: "Start your TypeScript journey",
    url: "/tutorials/typescript-basics",
    category: "language",
    language: "en",
  },
  {
    id: "4",
    title: "Deploying Next.js to Production",
    content:
      "Complete guide to deploying Next.js applications with Docker, CI/CD pipelines, and monitoring...",
    summary: "Production deployment strategies for Next.js",
    url: "/tutorials/nextjs-deployment",
    category: "devops",
    language: "en",
  },
  {
    id: "5",
    title: "Authentication with JWT and Refresh Tokens",
    content:
      "Implementing secure authentication using JSON Web Tokens with refresh token rotation...",
    summary: "Secure auth implementation guide",
    url: "/tutorials/jwt-auth",
    category: "security",
    language: "en",
  },
];

export async function POST() {
  try {
    // Prepare texts for embedding: combine title and content
    const textsToEmbed = articles.map(
      (article) => `${article.title}\n\n${article.content}`
    );

    // Generate embeddings in batch (more efficient than individual requests)
    const embeddings = await generateEmbeddings(textsToEmbed);

    // Prepare vectors for Pinecone
    const vectors = articles.map((article, idx) => ({
      id: article.id,
      values: embeddings[idx],
      metadata: {
        title: article.title,
        summary: article.summary,
        url: article.url,
        category: article.category,
        language: article.language,
      } satisfies ArticleMetadata,
    }));

    // Upload vectors to Pinecone
    await upsertVectors(vectors);

    return NextResponse.json({
      success: true,
      indexed: articles.length,
    });
  } catch (error) {
    console.error("Indexing error:", error);
    return NextResponse.json(
      { error: "Failed to index content" },
      { status: 500 }
    );
  }
}
```

Step 8: Build the Search Server Action
Create src/app/actions/search.ts:
"use server";
import { generateEmbedding } from "@/lib/openai";
import { queryVectors } from "@/lib/pinecone";
import type { SearchResult } from "@/lib/types";
export async function semanticSearch(
query: string,
language?: string
): Promise<SearchResult[]> {
if (!query.trim()) {
return [];
}
try {
// Convert the user's query into a vector
const queryVector = await generateEmbedding(query);
// Prepare language filter (optional)
const filter = language ? { language } : undefined;
// Query Pinecone
const matches = await queryVectors(queryVector, 5, filter);
// Transform results into the expected shape
const results: SearchResult[] = matches.map((match) => ({
id: match.id,
score: match.score || 0,
title: (match.metadata?.title as string) || "",
summary: (match.metadata?.summary as string) || "",
url: (match.metadata?.url as string) || "",
category: (match.metadata?.category as string) || "",
}));
return results;
} catch (error) {
console.error("Search error:", error);
throw new Error("Search failed. Please try again.");
}
}Using Server Actions means your API keys stay on the server and are never sent to the browser. This is more secure than creating a separate API Route for search.
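Server Actions are still publicly callable endpoints, though, so it is worth normalizing and bounding the query before spending an OpenAI call on it. A hypothetical helper you could invoke at the top of semanticSearch (the name and the 500-character cap are our choices, not part of the tutorial's code):

```typescript
// Trim, collapse whitespace, and cap length so a malicious or accidental
// oversized "query" never reaches the embeddings API.
function normalizeQuery(raw: string, maxLength = 500): string {
  const cleaned = raw.trim().replace(/\s+/g, " ");
  return cleaned.length > maxLength ? cleaned.slice(0, maxLength) : cleaned;
}
```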
Step 9: Build the Search Bar Component
Create src/components/SearchBar.tsx:
"use client";
import { useState, useTransition, useCallback } from "react";
import { semanticSearch } from "@/app/actions/search";
import type { SearchResult } from "@/lib/types";
interface SearchBarProps {
onResults: (results: SearchResult[]) => void;
onLoading: (loading: boolean) => void;
}
export default function SearchBar({ onResults, onLoading }: SearchBarProps) {
const [query, setQuery] = useState("");
const [isPending, startTransition] = useTransition();
const handleSearch = useCallback(() => {
if (!query.trim()) return;
onLoading(true);
startTransition(async () => {
try {
const results = await semanticSearch(query);
onResults(results);
} catch {
onResults([]);
} finally {
onLoading(false);
}
});
}, [query, onResults, onLoading]);
return (
<div className="w-full max-w-2xl mx-auto">
<div className="relative">
<input
type="text"
value={query}
onChange={(e) => setQuery(e.target.value)}
onKeyDown={(e) => e.key === "Enter" && handleSearch()}
placeholder="Search by meaning... e.g., How do I improve my app's speed?"
className="w-full px-6 py-4 text-lg border-2 border-gray-200
rounded-2xl focus:border-blue-500 focus:outline-none
transition-colors duration-200 pl-28"
/>
<button
onClick={handleSearch}
disabled={isPending || !query.trim()}
className="absolute right-3 top-1/2 -translate-y-1/2
bg-blue-600 text-white px-4 py-2 rounded-xl
hover:bg-blue-700 disabled:opacity-50
disabled:cursor-not-allowed transition-colors"
>
{isPending ? "..." : "Search"}
</button>
</div>
<p className="text-sm text-gray-500 mt-2">
Semantic search understands meaning — try natural questions instead of keywords
</p>
</div>
);
}Step 10: Build the Results Components
Create src/components/ArticleCard.tsx:
```tsx
import type { SearchResult } from "@/lib/types";

interface ArticleCardProps {
  result: SearchResult;
}

export default function ArticleCard({ result }: ArticleCardProps) {
  const relevancePercent = Math.round(result.score * 100);

  return (
    <a
      href={result.url}
      className="block p-6 bg-white rounded-2xl border border-gray-100
                 hover:border-blue-200 hover:shadow-lg transition-all
                 duration-200"
    >
      <div className="flex items-start justify-between gap-4">
        <div className="flex-1">
          <h3 className="text-xl font-bold text-gray-900 mb-2">
            {result.title}
          </h3>
          <p className="text-gray-600 leading-relaxed">{result.summary}</p>
          <span className="inline-block mt-3 text-sm text-blue-600
                           bg-blue-50 px-3 py-1 rounded-full">
            {result.category}
          </span>
        </div>
        <div className="flex-shrink-0 text-center">
          <div
            className={`text-2xl font-bold ${
              relevancePercent >= 80
                ? "text-green-600"
                : relevancePercent >= 60
                ? "text-yellow-600"
                : "text-gray-400"
            }`}
          >
            {relevancePercent}%
          </div>
          <div className="text-xs text-gray-400">match</div>
        </div>
      </div>
    </a>
  );
}
```

Create src/components/SearchResults.tsx:
```tsx
import type { SearchResult } from "@/lib/types";
import ArticleCard from "./ArticleCard";

interface SearchResultsProps {
  results: SearchResult[];
  isLoading: boolean;
}

export default function SearchResults({
  results,
  isLoading,
}: SearchResultsProps) {
  if (isLoading) {
    return (
      <div className="space-y-4 mt-8">
        {[1, 2, 3].map((i) => (
          <div
            key={i}
            className="h-32 bg-gray-100 rounded-2xl animate-pulse"
          />
        ))}
      </div>
    );
  }

  if (results.length === 0) {
    return null;
  }

  return (
    <div className="space-y-4 mt-8">
      <p className="text-sm text-gray-500">
        Found {results.length} results ranked by semantic relevance
      </p>
      {results.map((result) => (
        <ArticleCard key={result.id} result={result} />
      ))}
    </div>
  );
}
```

Step 11: Assemble the Home Page
Update src/app/page.tsx:
"use client";
import { useState } from "react";
import SearchBar from "@/components/SearchBar";
import SearchResults from "@/components/SearchResults";
import type { SearchResult } from "@/lib/types";
export default function Home() {
const [results, setResults] = useState<SearchResult[]>([]);
const [isLoading, setIsLoading] = useState(false);
return (
<main className="min-h-screen bg-gradient-to-b from-gray-50 to-white">
<div className="max-w-4xl mx-auto px-4 py-20">
<div className="text-center mb-12">
<h1 className="text-4xl font-bold text-gray-900 mb-4">
AI Semantic Search
</h1>
<p className="text-xl text-gray-600">
Search by meaning, not just words
</p>
</div>
<SearchBar onResults={setResults} onLoading={setIsLoading} />
<SearchResults results={results} isLoading={isLoading} />
</div>
</main>
);
}Step 12: Index Your Content
Before search can work, you need to index the articles. Start the dev server and run the indexing request:
```bash
npm run dev
```

In another terminal:

```bash
curl -X POST http://localhost:3000/api/index-content
```

You should get:

```json
{
  "success": true,
  "indexed": 5
}
```

Step 13: Test Semantic Search
Open your browser to http://localhost:3000 and try these queries:
| Query | Expected Result |
|---|---|
| "how to speed up my app" | React Performance article |
| "I want to protect my application" | JWT Authentication article |
| "deploying to the cloud" | Next.js Deployment article |
| "learning a programming language" | TypeScript Fundamentals article |
Notice how the engine finds relevant articles even when the words don't match exactly.
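To see concretely why keyword search fails here, compare a naive keyword matcher against the first query above. None of the words in "how to speed up my app" appear in "Optimizing React Application Performance", so keyword overlap scores zero, while the embedding-based search still ranks that article first. The function below is an illustrative stand-in, not part of the app:

```typescript
// Naive keyword matching: fraction of query words that appear in the title.
// This is the behavior semantic search replaces.
function keywordOverlap(query: string, title: string): number {
  const queryWords = query.toLowerCase().split(/\W+/).filter(Boolean);
  const titleWords = new Set(title.toLowerCase().split(/\W+/).filter(Boolean));
  const hits = queryWords.filter((w) => titleWords.has(w)).length;
  return queryWords.length === 0 ? 0 : hits / queryWords.length;
}

// keywordOverlap("how to speed up my app",
//                "Optimizing React Application Performance") → 0
```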
Step 14: Advanced Improvements
Add Debounce for Auto-Search
Improve UX with automatic search-as-you-type:
```typescript
// src/hooks/useDebounce.ts
import { useEffect, useState } from "react";

export function useDebounce<T>(value: T, delay: number): T {
  const [debouncedValue, setDebouncedValue] = useState<T>(value);

  useEffect(() => {
    const handler = setTimeout(() => {
      setDebouncedValue(value);
    }, delay);
    return () => clearTimeout(handler);
  }, [value, delay]);

  return debouncedValue;
}
```

Server-Side Caching
Add server-level caching to reduce OpenAI API calls:
```typescript
// src/lib/cache.ts
const cache = new Map<string, { data: number[]; timestamp: number }>();
const TTL = 1000 * 60 * 60; // 1 hour

export function getCachedEmbedding(text: string): number[] | null {
  const entry = cache.get(text);
  if (!entry) return null;
  if (Date.now() - entry.timestamp > TTL) {
    cache.delete(text);
    return null;
  }
  return entry.data;
}

export function setCachedEmbedding(text: string, embedding: number[]) {
  cache.set(text, { data: embedding, timestamp: Date.now() });
}
```

Then update the generateEmbedding function:
```typescript
import { getCachedEmbedding, setCachedEmbedding } from "./cache";

export async function generateEmbedding(text: string): Promise<number[]> {
  // Check cache first
  const cached = getCachedEmbedding(text);
  if (cached) return cached;

  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });

  const embedding = response.data[0].embedding;
  setCachedEmbedding(text, embedding);
  return embedding;
}
```

Metadata Filtering
Pinecone supports metadata filtering that you can combine with vector search:
```typescript
// Search only frontend articles
const frontendResults = await queryVectors(queryVector, 5, {
  category: "frontend",
});

// Search only English articles
const englishResults = await queryVectors(queryVector, 5, {
  language: "en",
});
```

Step 15: Deploy to Production
Deploy to Vercel
```bash
npm install -g vercel
vercel
```

Add environment variables in the Vercel dashboard:

- OPENAI_API_KEY
- PINECONE_API_KEY
- PINECONE_INDEX
Production Considerations
- Rate limiting: Protect your API from abuse
- Monitoring: Track OpenAI API calls and costs
- Caching: Use Redis for persistent embedding caching
- Reindexing: Set up a cron job to periodically reindex content
```typescript
// Example: Simple rate limiting
const rateLimitMap = new Map<string, number[]>();

function isRateLimited(ip: string, maxRequests = 10, windowMs = 60000) {
  const now = Date.now();
  const requests = rateLimitMap.get(ip) || [];
  const recentRequests = requests.filter((t) => now - t < windowMs);

  if (recentRequests.length >= maxRequests) {
    return true;
  }

  recentRequests.push(now);
  rateLimitMap.set(ip, recentRequests);
  return false;
}
```

Cost Estimation
| Component | Cost | Notes |
|---|---|---|
| OpenAI Embeddings | ~$0.02 per 1M tokens | Very affordable |
| Pinecone | Free up to 100,000 vectors | Free tier sufficient for starting |
| Vercel | Free for personal projects | Hobby plan |
For an app with 1,000 articles and 10,000 monthly searches, the estimated cost is under $1/month.
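A back-of-envelope check of that figure, assuming roughly 500 tokens per article and 20 tokens per search query (both assumptions, not measured values), using the $0.02 per 1M tokens price from the table above:

```typescript
// text-embedding-3-small: $0.02 per 1M tokens (see pricing table above).
const PRICE_PER_MILLION_TOKENS = 0.02;

function embeddingCostUSD(totalTokens: number): number {
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS;
}

// One-time indexing: 1,000 articles * ~500 tokens each
const indexingCost = embeddingCostUSD(1_000 * 500); // about $0.01
// Monthly queries: 10,000 searches * ~20 tokens each
const monthlyQueryCost = embeddingCostUSD(10_000 * 20); // about $0.004
```

With Pinecone and Vercel on free tiers, embedding costs dominate, and they stay far below $1/month at this scale.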
Troubleshooting
Common Issues
OpenAI API key error:
Verify your key is correct and you have sufficient credits. Check your .env.local file.
Pinecone index not responding: Ensure the index dimensions match 1536 (for text-embedding-3-small).
Inaccurate results:
- Add more content to the index; more data improves results
- Try combining title, content, and tags before embedding
- Use text-embedding-3-large for higher accuracy
Slow responses:
- Enable caching for repeated embedding queries
- Use debounce to reduce queries while typing
- Choose a Pinecone region closest to your server
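The dimension-mismatch issue above is also easy to catch in code before an upsert ever reaches Pinecone. A tiny hypothetical guard, hard-coded here to the 1,536 dimensions of text-embedding-3-small:

```typescript
const EXPECTED_DIMENSION = 1536; // text-embedding-3-small output size

// Throws a descriptive error instead of letting Pinecone reject the
// request with a less obvious dimension error.
function assertDimension(vector: number[], expected = EXPECTED_DIMENSION): void {
  if (vector.length !== expected) {
    throw new Error(
      `Embedding has ${vector.length} dimensions but the index expects ${expected}. ` +
        "Did the embedding model change without reindexing?"
    );
  }
}
```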
Next Steps
After completing this tutorial, you can:
- Add RAG (Retrieval-Augmented Generation): Combine search results with an LLM to generate custom answers
- Build multimodal search: Add image search using CLIP models
- Add search analytics: Track what users search for to improve content
- Integrate with a CMS: Automatically trigger reindexing when content changes
Conclusion
Semantic search fundamentally changes how users interact with content. Instead of guessing the right keywords, they can simply describe what they want in natural language.
In this tutorial, you learned:
- How vectors and embeddings work
- Setting up OpenAI and Pinecone for vector search
- Building secure Server Actions for search
- Designing an interactive search interface
- Performance and production optimizations
The technologies we used — OpenAI Embeddings, Pinecone, and Next.js Server Actions — represent the modern approach to building intelligent, scalable search applications.
Discuss Your Project with Us
We're here to help with your web development needs. Schedule a call to discuss your project and how we can assist you.
Let's find the best solutions for your needs.
Related Articles

Building a RAG Chatbot with Supabase pgvector and Next.js
Learn to build an AI chatbot that answers questions using your own data. This tutorial covers vector embeddings, semantic search, and RAG with Supabase and Next.js.

AI Chatbot Integration Guide: Build Intelligent Conversational Interfaces
A comprehensive guide to integrating AI chatbots into your applications using OpenAI, Anthropic Claude, and ElevenLabs. Learn to build text and voice-enabled chatbots with Next.js.

Building an Autonomous AI Agent with Agentic RAG and Next.js
Learn how to build an AI agent that autonomously decides when and how to retrieve information from vector databases. A comprehensive hands-on guide using Vercel AI SDK and Next.js with executable examples.