Deploy Your First Serverless AI Function with Cloudflare Workers and Claude

Serverless functions are revolutionizing how we build and deploy applications. When combined with AI capabilities like Claude, they become incredibly powerful tools for creating intelligent, scalable services that run at the edge of the network.
In this comprehensive tutorial, you'll learn how to build a production-ready AI-powered serverless function using Cloudflare Workers and Anthropic's Claude API. By the end, you'll have a deployed function that can handle AI requests with minimal latency, auto-scaling, and global distribution.
Why Cloudflare Workers + Claude?
Cloudflare Workers run your code on Cloudflare's global network of data centers, meaning your function executes close to your users anywhere in the world. Benefits include:
- Zero cold starts – instant execution
- Global distribution – 300+ data centers worldwide
- Cost-effective – generous free tier (100,000 requests/day)
- Low latency – edge computing reduces response times
- Auto-scaling – handles traffic spikes automatically
Claude API by Anthropic provides state-of-the-art language model capabilities:
- Advanced reasoning and analysis
- Large context windows (up to 200K tokens)
- Strong safety guardrails
- Excellent at following instructions
- Fast response times
Together, they create a powerful stack for building intelligent applications that scale globally.
What We're Building
We'll create a Smart Content Analyzer – an API endpoint that accepts text and returns:
- Sentiment analysis
- Key topics extraction
- Summary generation
- Content recommendations
This pattern can be adapted for chatbots, content moderation, document processing, and more.
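As a sketch of the contract we're aiming for, here is the response shape the analyzer will return. The field names match the prompt we build in Step 4; isValidAnalysis is a hypothetical helper, useful later when testing the endpoint:

```javascript
// Example of the JSON the Smart Content Analyzer returns.
const exampleAnalysis = {
  sentiment: { label: 'positive', confidence: 0.85 },
  topics: ['product launch', 'customer feedback', 'productivity'],
  summary: 'Customers responded very positively to the new AI features.',
  recommendation: 'Highlight the productivity gains in upcoming marketing.',
};

// Hypothetical shape check for responses from the endpoint.
function isValidAnalysis(a) {
  return (
    a !== null &&
    typeof a === 'object' &&
    ['positive', 'negative', 'neutral'].includes(a.sentiment?.label) &&
    typeof a.sentiment?.confidence === 'number' &&
    Array.isArray(a.topics) &&
    typeof a.summary === 'string' &&
    typeof a.recommendation === 'string'
  );
}
```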
Prerequisites
Before starting, ensure you have:
- Node.js 18+ installed (download from nodejs.org)
- A Cloudflare account (free tier works perfectly)
- An Anthropic API key (create one in the Anthropic Console)
- Basic JavaScript/TypeScript knowledge
- Familiarity with REST APIs
You'll also need about 30 minutes to complete this tutorial.
Step 1: Set Up Your Development Environment
First, install Wrangler, Cloudflare's CLI tool for Workers:
```shell
npm install -g wrangler
```

Verify the installation:

```shell
wrangler --version
```

Log in to your Cloudflare account:

```shell
wrangler login
```

This opens a browser window for authentication. Once logged in, you're ready to create your first Worker.
Step 2: Create a New Worker Project
Create a new directory and initialize the project:
```shell
mkdir smart-analyzer-worker
cd smart-analyzer-worker
npm init -y
```

Create the Worker configuration file, wrangler.toml:

```toml
name = "smart-analyzer"
main = "src/index.js"
compatibility_date = "2026-02-28"

[vars]
ENVIRONMENT = "production"
```

Create the source directory:

```shell
mkdir src
```

Step 3: Write Your First Worker
Create src/index.js with a basic Worker structure:
```javascript
export default {
  async fetch(request, env, ctx) {
    // Handle CORS preflight requests from browsers
    if (request.method === 'OPTIONS') {
      return new Response(null, {
        headers: {
          'Access-Control-Allow-Origin': '*',
          'Access-Control-Allow-Methods': 'POST, OPTIONS',
          'Access-Control-Allow-Headers': 'Content-Type',
        },
      });
    }

    // Only accept POST requests
    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    try {
      const { text } = await request.json();

      if (!text || text.trim().length === 0) {
        return new Response(
          JSON.stringify({ error: 'Text is required' }),
          { status: 400, headers: { 'Content-Type': 'application/json' } }
        );
      }

      return new Response(
        JSON.stringify({
          message: 'Worker is running!',
          textLength: text.length,
        }),
        {
          headers: {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*',
          },
        }
      );
    } catch (error) {
      return new Response(
        JSON.stringify({ error: 'Invalid JSON' }),
        { status: 400, headers: { 'Content-Type': 'application/json' } }
      );
    }
  },
};
```

Test locally:

```shell
wrangler dev
```

Your Worker is now running at http://localhost:8787. Test it:

```shell
curl -X POST http://localhost:8787 \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello, World!"}'
```

You should see: {"message":"Worker is running!","textLength":13}
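The request validation in the handler can also be factored into a pure function, which makes it easy to unit-test outside the Worker runtime (validateBody is an illustrative helper, not part of the tutorial code):

```javascript
// Mirrors the handler's checks: body must carry a non-empty text string.
function validateBody(body) {
  if (!body || typeof body.text !== 'string' || body.text.trim().length === 0) {
    return { ok: false, error: 'Text is required' };
  }
  return { ok: true, textLength: body.text.length };
}
```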
Step 4: Integrate Claude API
Now let's add the AI magic. First, store your Anthropic API key as a secret:
```shell
wrangler secret put ANTHROPIC_API_KEY
```

Paste your API key when prompted. Note that secrets set this way apply to the deployed Worker; for local development with wrangler dev, also put the key in a .dev.vars file in your project root (ANTHROPIC_API_KEY=your-key-here).
Update src/index.js to call Claude:
```javascript
async function analyzeWithClaude(text, apiKey) {
  const response = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': apiKey,
      'anthropic-version': '2023-06-01',
    },
    body: JSON.stringify({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 1024,
      messages: [
        {
          role: 'user',
          content: `Analyze the following text and provide:
1. Sentiment (positive/negative/neutral with confidence score)
2. Top 3 key topics
3. A concise summary (max 2 sentences)
4. Content recommendation (what action should be taken)

Text: "${text}"

Return your response as valid JSON with this structure:
{
  "sentiment": {"label": "positive", "confidence": 0.85},
  "topics": ["topic1", "topic2", "topic3"],
  "summary": "Your summary here",
  "recommendation": "Your recommendation here"
}`,
        },
      ],
    }),
  });

  if (!response.ok) {
    throw new Error(`Claude API error: ${response.status}`);
  }

  const data = await response.json();
  const content = data.content[0].text;

  // Extract the JSON object from Claude's response
  const jsonMatch = content.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse Claude response');
  }
  return JSON.parse(jsonMatch[0]);
}

export default {
  async fetch(request, env, ctx) {
    if (request.method === 'OPTIONS') {
      return new Response(null, {
        headers: {
          'Access-Control-Allow-Origin': '*',
          'Access-Control-Allow-Methods': 'POST, OPTIONS',
          'Access-Control-Allow-Headers': 'Content-Type',
        },
      });
    }

    if (request.method !== 'POST') {
      return new Response('Method not allowed', { status: 405 });
    }

    try {
      const { text } = await request.json();

      if (!text || text.trim().length === 0) {
        return new Response(
          JSON.stringify({ error: 'Text is required' }),
          { status: 400, headers: { 'Content-Type': 'application/json' } }
        );
      }

      // Limit text length to prevent abuse
      if (text.length > 10000) {
        return new Response(
          JSON.stringify({ error: 'Text too long (max 10,000 characters)' }),
          { status: 400, headers: { 'Content-Type': 'application/json' } }
        );
      }

      const analysis = await analyzeWithClaude(text, env.ANTHROPIC_API_KEY);

      return new Response(
        JSON.stringify({
          success: true,
          analysis,
          processedAt: new Date().toISOString(),
        }),
        {
          headers: {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*',
          },
        }
      );
    } catch (error) {
      console.error('Error:', error);
      return new Response(
        JSON.stringify({
          error: 'Analysis failed',
          message: error.message,
        }),
        {
          status: 500,
          headers: { 'Content-Type': 'application/json' },
        }
      );
    }
  },
};
```

Step 5: Test Locally
Run your Worker locally:

```shell
wrangler dev
```

Test with a sample text:

```shell
curl -X POST http://localhost:8787 \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The new AI features in our product have been incredibly well-received. Customers are reporting 40% productivity gains and the feedback has been overwhelmingly positive. This validates our strategic direction."
  }'
```

You should receive a structured analysis with sentiment, topics, summary, and recommendations.
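The regex-based extraction inside analyzeWithClaude can be exercised on its own; this standalone copy shows why it tolerates Claude wrapping the JSON in prose:

```javascript
// Same extraction logic as in analyzeWithClaude: grab the first-to-last brace span.
function extractJson(content) {
  const jsonMatch = content.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse Claude response');
  }
  return JSON.parse(jsonMatch[0]);
}

// A reply with surrounding prose, as Claude sometimes produces.
const reply =
  'Here is the analysis you asked for:\n' +
  '{"sentiment":{"label":"neutral","confidence":0.6},"topics":["a","b","c"],"summary":"s","recommendation":"r"}';
const parsed = extractJson(reply);
```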
Step 6: Deploy to Production
Once testing looks good, deploy to Cloudflare's global network:
```shell
wrangler deploy
```

You'll get a URL like https://smart-analyzer.YOUR-SUBDOMAIN.workers.dev.
Your serverless AI function is now live and distributed across 300+ data centers worldwide!
Step 7: Add Rate Limiting (Optional but Recommended)
To prevent abuse, add simple rate limiting:
```javascript
// At the top of your Worker
const RATE_LIMIT = 10; // requests per minute
const rateLimitMap = new Map();

function checkRateLimit(ip) {
  const now = Date.now();
  const windowStart = now - 60000; // 1 minute window

  if (!rateLimitMap.has(ip)) {
    rateLimitMap.set(ip, [now]);
    return true;
  }

  const requests = rateLimitMap.get(ip).filter(time => time > windowStart);
  requests.push(now);
  rateLimitMap.set(ip, requests);
  return requests.length <= RATE_LIMIT;
}

// In your fetch handler, before processing:
const clientIP = request.headers.get('CF-Connecting-IP') || 'unknown';
if (!checkRateLimit(clientIP)) {
  return new Response(
    JSON.stringify({ error: 'Rate limit exceeded. Try again later.' }),
    { status: 429, headers: { 'Content-Type': 'application/json' } }
  );
}
```

Note that this in-memory map lives inside a single Worker isolate, so counts are approximate across Cloudflare's network; for strict global limits, consider Durable Objects or Cloudflare's built-in rate limiting. Redeploy with wrangler deploy.
Step 8: Monitor and Optimize
View Analytics
Check your Worker's performance in the Cloudflare dashboard:
- Go to Workers & Pages
- Select your Worker
- View metrics: requests, errors, CPU time, bandwidth
Optimize Performance
1. Use streaming for large responses:

```javascript
// For longer analysis tasks
return new Response(stream, {
  headers: { 'Content-Type': 'text/event-stream' }
});
```

2. Cache frequent requests:

```javascript
// The Cache API only stores GET requests, so derive a GET key URL
// from the input text (e.g., a hash of it) rather than reusing the POST request.
const cache = caches.default;
const cacheKey = new Request(cacheKeyUrl); // a GET URL built from the text
let response = await cache.match(cacheKey);

if (!response) {
  const analysis = await analyzeWithClaude(text, env.ANTHROPIC_API_KEY);
  response = new Response(JSON.stringify(analysis), {
    headers: { 'Content-Type': 'application/json' },
  });
  ctx.waitUntil(cache.put(cacheKey, response.clone()));
}

return response;
```

3. Use Claude's faster models for simple tasks:

```javascript
model: 'claude-3-5-haiku-20241022' // Faster and cheaper
```

Best Practices
1. Security
- Never expose your API key in client-side code
- Use Cloudflare Access for additional authentication
- Implement request signing for sensitive operations
- Validate and sanitize all inputs
2. Error Handling
- Always return structured error responses
- Log errors for debugging (use console.error)
- Provide helpful error messages to users
- Set appropriate HTTP status codes
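These conventions can be wrapped in a small helper (errorResponse is an illustrative name) so every failure path returns the same structured shape:

```javascript
// Build a structured JSON error Response with a proper status code.
function errorResponse(status, message) {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: {
      'Content-Type': 'application/json',
      'Access-Control-Allow-Origin': '*',
    },
  });
}

const resp = errorResponse(429, 'Rate limit exceeded. Try again later.');
```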
3. Cost Optimization
- Set reasonable text length limits
- Implement caching for identical requests
- Use appropriate Claude models (Haiku for simple tasks, Sonnet for complex)
- Monitor usage through Cloudflare and Anthropic dashboards
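Caching identical requests needs a deterministic key. One sketch derives a GET key URL from a simple FNV-1a hash of the input text (non-cryptographic; in a Worker you could instead use crypto.subtle.digest). The helper names and host are illustrative:

```javascript
// FNV-1a 32-bit hash: the same text always yields the same key.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// The Cache API stores GET requests only, so encode the hash in a URL.
function cacheKeyUrl(text) {
  return `https://cache.example.internal/analysis/${fnv1a(text)}`;
}
```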
4. Performance
- Keep Workers lightweight (< 1MB compressed)
- Minimize external API calls
- Use ctx.waitUntil() for non-blocking operations
- Take advantage of Cloudflare's global network
Real-World Use Cases
This pattern can power:
- Content Moderation – Automatically flag inappropriate content
- Customer Support – Analyze support tickets for routing and priority
- Document Processing – Extract insights from uploaded documents
- Chatbots – Build conversational interfaces with global low latency
- Email Analysis – Categorize and summarize incoming emails
- Social Media Monitoring – Analyze brand mentions and sentiment
- Code Review – Automated code quality and security checks
Troubleshooting
Error: "Invalid API key"
- Verify your secret exists: wrangler secret list
- Recreate it if needed: wrangler secret put ANTHROPIC_API_KEY
Error: "Worker exceeded CPU time limit"
- Optimize Claude prompts to be more concise
- Use faster Claude models
- Implement timeout handling
High latency
- Check your plan's CPU limits (the free tier allows 10 ms of CPU time per request)
- Optimize your Worker code
- Consider implementing caching
Rate limiting from Anthropic
- Implement request queuing
- Use exponential backoff
- Upgrade your Anthropic plan if needed
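The backoff suggestion can be made concrete with a small delay schedule and a retry wrapper (the base and cap values, and the withRetries helper, are illustrative):

```javascript
// Delay before retry attempt n (0-based): base * 2^n, capped.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Sketch of a retry loop: fetchFn is any function returning a Response.
async function withRetries(fetchFn, maxAttempts = 4) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchFn();
    if (response.status !== 429) return response;
    await new Promise(r => setTimeout(r, backoffDelayMs(attempt)));
  }
  throw new Error('Rate limited after retries');
}
```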
Next Steps
Now that you have a working serverless AI function, consider:
- Add authentication using Cloudflare Access or JWT tokens
- Implement a frontend to interact with your API
- Store results in Cloudflare D1 (serverless SQL database)
- Add webhooks to notify external systems of analysis results
- Build a dashboard to visualize analytics
- Create specialized analyzers for different content types
- Integrate with other APIs (translation, image analysis, etc.)
Conclusion
You've successfully built and deployed a serverless AI-powered function that runs globally with minimal latency. This architecture is production-ready and can scale to millions of requests without infrastructure management.
The combination of Cloudflare Workers and Claude API opens up endless possibilities for building intelligent applications. Whether you're processing user-generated content, building chatbots, or creating analysis tools, this stack provides the performance, scalability, and intelligence modern applications demand.
Ready to take your skills further? Explore our other tutorials on AI integration, serverless architecture, and production deployment strategies.
Need help building production AI systems? Contact Noqta for expert consulting on AI integration, serverless architecture, and scalable application development.