Prerequisites

Before starting, ensure you have:

Node.js 20 or higher installed
A Next.js 15 project (or create one fresh with create-next-app)
A Google AI API key from Google AI Studio
Basic TypeScript knowledge
Familiarity with Next.js App Router

What You'll Build

In this tutorial, you'll create a smart content assistant API powered by Google Genkit 1.0. The application will include:

Typed AI Flows — composable, observable workflows with input/output validation
Custom Tools — functions your AI can call to fetch real-world data
Middleware — retry logic, model fallbacks, and request interception (new in May 2026)
Streaming Responses — real-time output for a richer user experience
Next.js API Routes — production-ready endpoints using @genkit-ai/next

By the end, you'll have a solid understanding of how to structure, test, and deploy Genkit-powered features in a Next.js application.

What is Google Genkit?

Google Genkit is an open-source TypeScript framework from Firebase for building AI-powered applications. It reached its 1.0 stable release for Node.js in February 2025. Its feature set, which expanded rapidly through 2025 and 2026, now includes:

A unified interface for models from Google, OpenAI, Anthropic, Ollama, and more
First-class support for flows, RAG pipelines, and autonomous agents
A composable Middleware system announced in May 2026 for hardening AI pipelines
An interactive Developer UI for testing flows and inspecting traces
Official Next.js, Angular, and mobile SDKs

Unlike heavier orchestration frameworks, Genkit is lightweight and designed to integrate naturally into existing Node.js backends. It brings structure and observability to your AI logic without imposing a new architecture.

Step 1: Project Setup

If you don't have a Next.js project yet, create one:

npx create-next-app@latest genkit-demo --typescript --tailwind --app
cd genkit-demo

Install Genkit and its plugins:

npm install genkit @genkit-ai/google-genai @genkit-ai/next @genkit-ai/middleware
npm install -D genkit-cli

Set your API key in .env.local:

GOOGLE_GENAI_API_KEY=your_api_key_here

Never commit your .env.local file to version control. Add it to .gitignore if it isn't already there.
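A missing key otherwise only surfaces when the first model call fails. As a minimal sketch, a small guard can fail fast at startup instead. The requireEnv helper below is hypothetical (it is not part of Genkit or Next.js) and takes the environment as a parameter so it stays easy to test:

```typescript
// Hypothetical helper (not a Genkit API): read a required variable or fail fast.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage in server-side code such as lib/genkit.ts:
//   const apiKey = requireEnv('GOOGLE_GENAI_API_KEY', process.env);
```

Because the check throws on an empty or whitespace-only value, a blank line in .env.local is caught the same way as a missing one.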

Step 2: Initialize Genkit

Create a shared Genkit instance that will be reused across all your flows. Create lib/genkit.ts:

import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';
 
export const ai = genkit({
  plugins: [
    googleAI({
      apiKey: process.env.GOOGLE_GENAI_API_KEY,
    }),
  ],
  model: 'googleai/gemini-2.5-flash',
});

This creates a singleton ai instance configured with the Google AI plugin. The model property sets the default model for all ai.generate() calls that don't specify one explicitly.

You can swap in any supported model provider — @genkit-ai/openai, @genkit-ai/anthropic, or @genkit-ai/ollama — by changing just the plugin and model string. Your flow code stays identical.

Step 3: Define Your First Flow

Flows are typed functions that wrap AI logic with observability, input/output validation, and streaming support. Create lib/flows/summarize.ts:

import { z } from 'genkit';
import { ai } from '../genkit';
 
export const summarizeFlow = ai.defineFlow(
  {
    name: 'summarize',
    inputSchema: z.object({
      text: z.string().min(1).describe('Text to summarize'),
      maxWords: z.number().optional().default(150),
    }),
    outputSchema: z.object({
      summary: z.string(),
      keyPoints: z.array(z.string()),
    }),
  },
  async (input) => {
    const result = await ai.generate({
      prompt: `Summarize the following text in at most ${input.maxWords} words.
Return a JSON object with two fields:
- summary: a concise paragraph
- keyPoints: an array of 3 to 5 bullet points
 
Text: ${input.text}`,
      output: {
        schema: z.object({
          summary: z.string(),
          keyPoints: z.array(z.string()),
        }),
      },
    });
 
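    // output is null when the model's reply could not be parsed into the schema.
    if (!result.output) {
      throw new Error('Model returned no structured output');
    }
    return result.output;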
  }
);

Key things to notice:

inputSchema and outputSchema use Zod for full TypeScript type inference
The flow is self-contained and trivially testable in isolation
Genkit automatically traces every ai.generate() call inside a flow
The .describe() hint on text is forwarded to the Developer UI for documentation
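Note that maxWords reaches the model only as a prompt instruction, so a reply can still run long. One option is to check the length after generation. The sketch below uses two hypothetical helpers, countWords and enforceWordLimit, which are illustrations and not part of Genkit:

```typescript
// Hypothetical helpers (not Genkit APIs) for checking a summary's length.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Truncate to at most maxWords words, marking truncation with an ellipsis.
function enforceWordLimit(text: string, maxWords: number): string {
  const trimmed = text.trim();
  if (countWords(trimmed) <= maxWords) return trimmed;
  return trimmed.split(/\s+/).slice(0, maxWords).join(' ') + '...';
}
```

Inside the flow, you could apply enforceWordLimit(result.output.summary, input.maxWords) before returning, or log a warning when countWords exceeds the limit and tune the prompt instead.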

Step 4: Create Custom Tools

Tools let your AI call external functions — REST APIs, databases, or any Node.js code. The model decides when to call a tool based on its description, and Genkit handles the execution loop automatically.

Create lib/tools/weather.ts:

import { z } from 'genkit';
import { ai } from '../genkit';
 
export const weatherTool = ai.defineTool(
  {
    name: 'getWeather',
    description: 'Fetches the current weather conditions for a given city name',
    inputSchema: z.object({
      city: z.string().describe('Name of the city'),
    }),
    outputSchema: z.object({
      temperature: z.number().describe('Temperature in Celsius'),
      condition: z.string().describe('Weather condition description'),
      humidity: z.number().describe('Relative humidity percentage'),
    }),
  },
  async ({ city }) => {
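    const response = await fetch(
      `https://wttr.in/${encodeURIComponent(city)}?format=j1`
    );
    // Fail loudly on HTTP errors instead of parsing an error page as JSON.
    if (!response.ok) {
      throw new Error(`Weather lookup failed: HTTP ${response.status}`);
    }
    const data = await response.json();
    const current = data.current_condition[0];
    return {
      // The service returns numeric fields as strings, so parse with an explicit radix.
      temperature: parseInt(current.temp_C, 10),
      condition: current.weatherDesc[0].value,
      humidity: parseInt(current.humidity, 10),
    };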
  }
);
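The handler above trusts the shape of the upstream payload. As a sketch of a more defensive variant, the parsing can move into a pure function that is testable without network access. parseCurrentWeather is a hypothetical helper; the field names (current_condition, temp_C, humidity, weatherDesc) are taken from the handler above, and the 'Unknown' fallback is an assumption:

```typescript
// Fields the tool reads; names mirror the handler above.
interface WttrCurrent {
  temp_C: string;
  humidity: string;
  weatherDesc?: { value: string }[];
}

interface Weather {
  temperature: number;
  condition: string;
  humidity: number;
}

// Hypothetical helper: convert the raw payload or throw a descriptive error.
function parseCurrentWeather(data: unknown): Weather {
  const payload = data as { current_condition?: WttrCurrent[] } | null | undefined;
  const current = payload?.current_condition?.[0];
  if (!current) {
    throw new Error('Weather payload has no current_condition entry');
  }
  const temperature = Number.parseInt(current.temp_C, 10);
  const humidity = Number.parseInt(current.humidity, 10);
  if (Number.isNaN(temperature) || Number.isNaN(humidity)) {
    throw new Error('Weather payload contains non-numeric readings');
  }
  return {
    temperature,
    // Assumed fallback when the description is missing.
    condition: current.weatherDesc?.[0]?.value ?? 'Unknown',
    humidity,
  };
}
```

The tool handler would then reduce to fetching the JSON and returning parseCurrentWeather(data), so a malformed response produces a clear error rather than NaN values passed to the model.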

Now create a flow that uses the tool. Create lib/flows/weather-agent.ts:

import { z } from 'genkit';
import { ai } from '../genkit';
import { weatherTool } from '../tools/weather';
 
export const weatherAgentFlow = ai.defineFlow(
  {
    name: 'weatherAgent',
    inputSchema: z.object({
      query: z.string().describe('Natural language weather question'),
    }),
    outputSchema: z.string(),
  },
  async (input) => {
    const result = await ai.generate({
      prompt: input.query,
      tools: [weatherTool],
    });
    return result.text;
  }
);

When this flow runs, Genkit handles the entire tool-calling loop: the model requests the weather tool, Genkit executes it, and the result is fed back to the model until a final natural-language answer is produced. Your application code never needs to manage this loop manually.
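To make the loop concrete, here is a conceptual sketch in plain TypeScript. It is not Genkit's implementation, and every name in it (runToolLoop, Model, Tools, the transcript format) is hypothetical; it only illustrates the request, execute, feed-back cycle and why a turn cap is useful:

```typescript
// Conceptual sketch only; none of these names are Genkit APIs.
type ToolRequest = { tool: string; input: string };
type ModelReply = { text: string } | { toolRequest: ToolRequest };
type Model = (transcript: string[]) => Promise<ModelReply>;
type Tools = Record<string, (input: string) => Promise<string>>;

async function runToolLoop(
  model: Model,
  tools: Tools,
  prompt: string,
  maxTurns = 5,
): Promise<string> {
  const transcript = [`user: ${prompt}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(transcript);
    // A plain-text reply is the final answer and ends the loop.
    if ('text' in reply) return reply.text;
    const { tool, input } = reply.toolRequest;
    const run = tools[tool];
    if (!run) throw new Error(`Unknown tool: ${tool}`);
    // Feed the tool result back so the model can use it on the next turn.
    transcript.push(`tool(${tool}): ${await run(input)}`);
  }
  // Without a cap, a model that keeps requesting tools would loop forever.
  throw new Error(`No final answer after ${maxTurns} turns`);
}
```

In this sketch the cap turns a runaway conversation into an explicit error, which is the same reason to bound tool-calling iterations in a real flow.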

Step 5: Add Middleware

The @genkit-ai/middleware package (announced May 2026) adds a composable interception layer around your AI pipelines. It supports three hook types:

generate hooks — conversation-level logic applied to every ai.generate() call
model hooks — per-call retries, fallbacks, and cost tracking
tool hooks — approvals and sandboxing before tool execution

Update lib/genkit.ts to add retry logic, automatic model fallback, and request logging:

import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';
import { withRetry, withFallback, withLogging } from '@genkit-ai/middleware';
 
export const ai = genkit({
  plugins: [
    googleAI({ apiKey: process.env.GOOGLE_GENAI_API_KEY }),
  ],
  model: 'googleai/gemini-2.5-flash',
  middleware: [
    withLogging({ logLevel: 'info' }),
    withRetry({
      maxAttempts: 3,
      backoff: 'exponential',
      initialDelayMs: 500,
    }),
    withFallback({
      fallbackModel: 'googleai/gemini-flash-latest',
      onError: (error) => error.code === 'QUOTA_EXCEEDED',
    }),
  ],
});

This single configuration gives you:

Structured logs for every AI request and response
Up to 3 automatic retries with exponential backoff on transient errors
Transparent model switching to gemini-flash-latest when your primary quota is exceeded

Middleware is applied globally to all flows that use this ai instance. For flow-specific middleware, pass a middleware array to ai.defineFlow() directly.
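For intuition about what "3 attempts with exponential backoff starting at 500 ms" means, the following is a conceptual sketch in plain TypeScript. It is not the middleware package's implementation; backoffDelays and retryWithBackoff are hypothetical names, and the doubling factor is an assumption about how "exponential" is interpreted:

```typescript
// Conceptual sketch; not the middleware package's implementation.
function backoffDelays(maxAttempts: number, initialDelayMs: number): number[] {
  // One delay between each pair of attempts, doubling every time.
  return Array.from(
    { length: Math.max(0, maxAttempts - 1) },
    (_, i) => initialDelayMs * 2 ** i,
  );
}

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  initialDelayMs = 500,
  // Injectable so tests do not have to wait for real timers.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const delays = backoffDelays(maxAttempts, initialDelayMs);
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt remains.
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

With the values from the configuration above, the waits between attempts are 500 ms and then 1000 ms, so a call that keeps failing gives up after roughly 1.5 seconds of added delay.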

Step 6: Next.js API Route Integration

The @genkit-ai/next package provides appRoute, a helper that wraps a Genkit flow as a Next.js App Router handler with zero boilerplate.

Create app/api/summarize/route.ts:

import { appRoute } from '@genkit-ai/next';
import { summarizeFlow } from '@/lib/flows/summarize';
 
export const POST = appRoute(summarizeFlow);

Create app/api/weather/route.ts:

import { appRoute } from '@genkit-ai/next';
import { weatherAgentFlow } from '@/lib/flows/weather-agent';
 
export const POST = appRoute(weatherAgentFlow);

The appRoute handler automatically:

Parses and validates the JSON request body against inputSchema
Returns validated JSON matching outputSchema
Streams responses when the flow emits chunks and the client requests a stream
Returns proper HTTP error responses when validation fails

Test the summarize endpoint with curl. appRoute follows Genkit's callable-flow convention: the flow input goes in a data field of the request body, and the flow output comes back under result:

curl -X POST http://localhost:3000/api/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "text": "Artificial intelligence is transforming software development at an unprecedented pace...",
      "maxWords": 80
    }
  }'

Example response (the exact text will vary between runs):

{
  "result": {
    "summary": "AI is rapidly reshaping software development ...",
    "keyPoints": [
      "Developers are using AI for code completion and review",
      "AI reduces time-to-market for complex features",
      "New tooling requires developers to upskill continuously"
    ]
  }
}

Step 7: Streaming Responses

For long-running flows or chat-style interactions, you can stream output to the client in real time. Create lib/flows/stream-chat.ts:

import { z } from 'genkit';
import { ai } from '../genkit';
 
export const streamChatFlow = ai.defineFlow(
  {
    name: 'streamChat',
    inputSchema: z.object({
      message: z.string().describe('User message'),
    }),
    outputSchema: z.string(),
    streamSchema: z.string(),
  },
  async (input, { sendChunk }) => {
    const result = await ai.generate({
      prompt: input.message,
      // Forward each non-empty text chunk to the flow's stream.
      onChunk: (chunk) => {
        if (chunk.text) {
          sendChunk(chunk.text);
        }
      },
    });
    return result.text;
  }
);

Create app/api/chat/route.ts:

import { appRoute } from '@genkit-ai/next';
import { streamChatFlow } from '@/lib/flows/stream-chat';
 
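export const POST = appRoute(streamChatFlow);

appRoute serves both modes from the same handler: it streams when the request asks for text/event-stream and returns plain JSON otherwise, so no extra option is needed. On the client side, the streamFlow helper from @genkit-ai/next/client sends that request and exposes the chunks as an async iterable:

import { streamFlow } from '@genkit-ai/next/client';
// Type-only import, so no server code is bundled into the client.
import type { streamChatFlow } from '@/lib/flows/stream-chat';
 
async function streamChat(message: string, onChunk: (text: string) => void) {
  const result = streamFlow<typeof streamChatFlow>({
    url: '/api/chat',
    input: { message },
  });
 
  // Each chunk is one value emitted by the flow.
  for await (const chunk of result.stream) {
    onChunk(chunk);
  }
 
  // Resolves with the flow's final return value once the stream ends.
  return result.output;
}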

This pattern pairs well with React state: have onChunk append each chunk to a useState string with a functional update, such as setText((prev) => prev + chunk), and the UI updates as the model generates output.
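The functional update matters because chunks can arrive faster than React re-renders, and a callback that closes over an old value would drop text. The sketch below shows the accumulation pattern without React; createTextState is a hypothetical stand-in for the useState setter, written only to keep the example self-contained:

```typescript
// Minimal stand-in for React's useState setter, to show the update pattern.
type Updater = (prev: string) => string;

function createTextState(initial = '') {
  let text = initial;
  return {
    // Mirrors setText((prev) => prev + chunk): each update sees the latest value.
    setText: (update: Updater) => {
      text = update(text);
    },
    getText: () => text,
  };
}

const state = createTextState();
const onChunk = (chunk: string) => state.setText((prev) => prev + chunk);

// Simulate three chunks arriving from the stream.
for (const chunk of ['Hello', ', ', 'world']) {
  onChunk(chunk);
}
console.log(state.getText()); // "Hello, world"
```

In a component, the equivalent would be const [text, setText] = useState('') with the same onChunk callback passed to streamChat.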

Step 8: Developer UI and Tracing

Genkit ships with an interactive Developer UI that lets you test flows, inspect traces, and compare model outputs without touching your Next.js frontend.

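Open a second terminal alongside your npm run dev process. The Developer UI lists only the flows and tools registered by the process it launches. If your flows do not appear, point the command below at a small dev-only entry file that imports your flow modules instead of lib/genkit.ts: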

npx genkit start -- npx tsx --watch lib/genkit.ts

The Developer UI opens at http://localhost:4000 and provides:

Flow Runner — execute any registered flow with custom inputs and view structured output
Model Playground — send prompts directly to any configured model
Trace Explorer — inspect every generate() call with full input/output, latency breakdown, and token counts
Tool Inspector — test tool schemas and stub tool responses for faster iteration

Use the Trace Explorer during development to understand exactly how many tokens each flow consumes. This helps you optimize prompts before deploying to production where costs are real.
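Once you have token counts from a trace, a rough cost estimate is simple arithmetic. The sketch below is hypothetical: estimateCostUsd is not a Genkit function, and the prices are placeholders rather than real rates, so substitute the current pricing for your model:

```typescript
// Prices are placeholders; look up current rates for your model.
interface Pricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing,
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerMillionUsd;
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerMillionUsd;
  return inputCost + outputCost;
}

// Hypothetical rates: $1 per million input tokens, $4 per million output tokens.
const placeholderPricing: Pricing = { inputPerMillionUsd: 1, outputPerMillionUsd: 4 };

// 2M input tokens and 1M output tokens at those rates.
console.log(estimateCostUsd(2_000_000, 1_000_000, placeholderPricing)); // 6
```

Multiplying a per-request estimate by expected traffic gives a quick sense of whether a prompt is worth trimming before launch.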

Step 9: Production Deployment

Vercel

Add the environment variable in your Vercel Project Settings:

GOOGLE_GENAI_API_KEY=your_production_key

Then deploy normally:

npx vercel --prod

Firebase App Hosting or Cloud Run

For deeper monitoring integration, enable Firebase telemetry from the @genkit-ai/firebase package (install it with npm install @genkit-ai/firebase):

import { enableFirebaseTelemetry } from '@genkit-ai/firebase';
 
// Call once at startup, before any flow runs.
enableFirebaseTelemetry();
 
export const ai = genkit({
  plugins: [
    googleAI({ apiKey: process.env.GOOGLE_GENAI_API_KEY }),
  ],
  // ...
});

Firebase telemetry exports traces, metrics, and logs to Google Cloud's observability tools and enables the production monitoring dashboard in the Firebase console, giving you request volumes, latency percentiles, and error rates for every flow.

Environment variables checklist before shipping:

GOOGLE_GENAI_API_KEY — set and scoped to production only
NODE_ENV=production — disables the Developer UI in production builds
Rate limiting — consider wrapping your API routes with arcjet or upstash/ratelimit to prevent abuse
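To illustrate what a rate limiter does before you adopt a library, here is a minimal fixed-window sketch. createRateLimiter is hypothetical and keeps its state in process memory, so it does not coordinate across serverless instances; treat it as an explanation of the idea, not a production replacement for the libraries named above:

```typescript
// Hypothetical in-memory limiter; state is per process only.
function createRateLimiter(
  limit: number,
  windowMs: number,
  // Injectable clock so the behavior can be tested without waiting.
  now: () => number = Date.now,
) {
  const windows = new Map<string, { start: number; count: number }>();

  return function allow(key: string): boolean {
    const t = now();
    let current = windows.get(key);
    // Start a fresh window when none exists or the old one has expired.
    if (!current || t - current.start >= windowMs) {
      current = { start: t, count: 0 };
      windows.set(key, current);
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  };
}
```

A route handler could call allow(clientIp) before invoking the flow and respond with HTTP 429 when it returns false.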

Testing Your Implementation

Genkit flows are plain async functions, which makes them easy to unit test without HTTP mocking. Create __tests__/summarize.test.ts:

import { describe, it, expect } from 'vitest';
import { summarizeFlow } from '@/lib/flows/summarize';
 
describe('summarizeFlow', () => {
  it('returns a summary and key points', async () => {
    const result = await summarizeFlow({
      text: 'AI is transforming every industry from healthcare to finance by automating complex tasks.',
      maxWords: 50,
    });
 
    expect(result.summary).toBeTruthy();
    expect(result.summary.length).toBeGreaterThan(10);
    expect(result.keyPoints).toBeInstanceOf(Array);
    expect(result.keyPoints.length).toBeGreaterThanOrEqual(1);
  });
});

Run tests with:

npx vitest run

Tests that call a real model consume tokens, can be slow, and are not deterministic. Guard them with an environment check (for example, Vitest's it.skipIf(process.env.CI)) so they are skipped in CI, and cover the same logic there with a mocked ai.generate() call.
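One way to make the model call replaceable in tests is to inject it. The sketch below is a hypothetical structure, not how the flows above are written: makeSummarizer and the Generate type are illustrative names, and in real code the injected function would wrap ai.generate():

```typescript
// Hypothetical structure: the model call is injected so tests can replace it.
interface Summary {
  summary: string;
  keyPoints: string[];
}
type Generate = (prompt: string) => Promise<Summary | null>;

function makeSummarizer(generate: Generate) {
  return async (text: string, maxWords = 150): Promise<Summary> => {
    const output = await generate(
      `Summarize in at most ${maxWords} words:\n${text}`,
    );
    if (!output) throw new Error('Model returned no structured output');
    return output;
  };
}

// In a test, pass a fake instead of a function that calls the real model.
const fakeGenerate: Generate = async () => ({
  summary: 'Short summary.',
  keyPoints: ['Point one'],
});
const summarize = makeSummarizer(fakeGenerate);
```

The test then asserts on the wrapper's behavior (prompt construction, null handling) without spending tokens, while a separate guarded test exercises the live model.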

Troubleshooting

GOOGLE_GENAI_API_KEY not found at runtime

Make sure the key is set in .env.local for local development. On Vercel, check the Environment Variables section in Project Settings. The key must be available at both build time and runtime.

Flow output does not match schema

Enable output.strict: true on ai.generate() to throw when the model returns malformed JSON. Add .describe() hints to your Zod fields to help the model understand what each field should contain.

Quota exceeded errors not triggering fallback

Ensure the withFallback middleware is ordered after withRetry in the middleware array, and that the fallback model is enabled for your Google AI account.

Streaming not working on Vercel

Set export const dynamic = 'force-dynamic' in your route file to prevent the route from being statically optimized. Streaming itself is available on Vercel, but long responses can hit the function's maximum duration, so check the limit that applies to your plan.
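As a sketch, the route segment options sit next to the existing imports and POST export in the route file. Both dynamic and maxDuration are standard Next.js route segment config exports; the value 60 is only an example, and the allowed maximum depends on your hosting plan:

```typescript
// Added to app/api/chat/route.ts; the existing imports and POST export stay unchanged.

// Opt out of static optimization so the handler runs on every request.
export const dynamic = 'force-dynamic';

// Optional: raise the function timeout, in seconds (example value).
export const maxDuration = 60;
```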

Tool calling loop runs indefinitely

Set maxTurns in the ai.generate() call to cap the number of tool-calling iterations:

const result = await ai.generate({
  prompt: input.query,
  tools: [weatherTool],
  maxTurns: 5,
});

Next Steps

With a working Genkit setup in place, you can extend it in several directions:

RAG Pipelines — use ai.defineRetriever with Pinecone or Cloud Firestore to add document-grounded search to your flows
Multi-Step Agents — chain flows together, passing the output of one as input to the next for complex reasoning tasks
Evaluation — use @genkit-ai/evaluation to automatically grade flow outputs against test datasets
Genkit Extension for Gemini CLI — debug and run flows directly from your terminal using the official Gemini CLI extension
Prompt Management — store versioned prompts in Firestore using ai.definePrompt for safe prompt iteration without redeploying

Conclusion

Google Genkit 1.0 provides a solid, production-ready foundation for building AI-powered applications in TypeScript. Its combination of typed flows, the new composable Middleware system, and native Next.js integration makes it one of the most practical AI frameworks available to JavaScript developers today. Whether you're building a simple summarization endpoint or a complex multi-step agent with tool use, Genkit gives you the observability and structure you need to ship confidently and iterate quickly.