Building AI-Powered Features

Adding LLM capabilities to your app — API integration, streaming responses, prompt management, and cost control.

When to Add AI Features

AI features make sense when your product needs:

Text generation — Writing, summarization, translation
Analysis — Categorization, sentiment, extraction
Search — Semantic search, Q&A over documents
Conversation — Chatbots, assistants, support agents
Code — Code generation, review, debugging

If the feature can be solved with simple rules or a database query, skip the AI. LLMs add latency, cost, and unpredictability.

Choosing a Model

Provider	Best Models	Best For
Anthropic	Claude Sonnet 4, Claude Haiku	Complex reasoning, long context, structured output
OpenAI	GPT-4o, GPT-4o mini	General purpose, image understanding
Google	Gemini 2.5 Pro, Gemini 2.5 Flash	Multimodal, long context

Recommendation: Start with Claude Haiku or GPT-4o mini for cost-sensitive features. Upgrade to Claude Sonnet 4 or GPT-4o for complex tasks.

Basic API Integration

Anthropic (Claude)

npm install @anthropic-ai/sdk

// lib/ai.ts
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
})

export async function generateSummary(text: string): Promise<string> {
  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      {
        role: 'user',
        content: `Summarize the following text in 2-3 sentences:\n\n${text}`,
      },
    ],
  })

  return (message.content[0] as { type: 'text'; text: string }).text
}

API Route

// app/api/summarize/route.ts
import { generateSummary } from '@/lib/ai'
import { createClient } from '@/lib/supabase/server'
import { NextResponse } from 'next/server'

export async function POST(request: Request) {
  // Authenticate the request
  const supabase = await createClient()
  const { data: { user } } = await supabase.auth.getUser()
  if (!user) return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })

  const { text } = await request.json()
  if (!text || text.length > 10000) {
    return NextResponse.json({ error: 'Invalid input' }, { status: 400 })
  }

  const summary = await generateSummary(text)
  return NextResponse.json({ summary })
}

Streaming Responses

For chat or long-form generation, stream the response so users see text as it's generated:

Server-Side Streaming

// app/api/chat/route.ts
import Anthropic from '@anthropic-ai/sdk'
import { createClient } from '@/lib/supabase/server'

const anthropic = new Anthropic()

export async function POST(request: Request) {
  const supabase = await createClient()
  const { data: { user } } = await supabase.auth.getUser()
  if (!user) return new Response('Unauthorized', { status: 401 })

  const { messages } = await request.json()

  const stream = anthropic.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2048,
    messages,
  })

  return new Response(stream.toReadableStream(), {
    headers: { 'Content-Type': 'text/event-stream' },
  })
}

Client-Side Streaming

async function sendMessage(messages: Message[]) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  })

  const reader = response.body!.getReader()
  const decoder = new TextDecoder()

  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    const chunk = decoder.decode(value)
    // Append chunk to your UI state
    setResponse((prev) => prev + chunk)
  }
}

Structured Output

When you need the AI to return data in a specific format:

Using Tool Use (Function Calling)

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  tools: [
    {
      name: 'categorize_feedback',
      description: 'Categorize user feedback into structured data',
      input_schema: {
        type: 'object',
        properties: {
          category: {
            type: 'string',
            enum: ['bug', 'feature_request', 'ux_issue', 'praise', 'other'],
          },
          priority: {
            type: 'string',
            enum: ['low', 'medium', 'high', 'critical'],
          },
          summary: {
            type: 'string',
            description: 'One-sentence summary',
          },
        },
        required: ['category', 'priority', 'summary'],
      },
    },
  ],
  tool_choice: { type: 'tool', name: 'categorize_feedback' },
  messages: [
    {
      role: 'user',
      content: `Categorize this feedback: "${feedbackText}"`,
    },
  ],
})

Prompt Management

Keep prompts in separate files

// prompts/summarize.ts
export const SUMMARIZE_PROMPT = `You are a helpful assistant that creates concise summaries.

Rules:
- Summarize in 2-3 sentences
- Focus on key facts and conclusions
- Use clear, simple language
- Don't start with "This text..." or "The article..."

Text to summarize:
{{text}}`

export function buildSummarizePrompt(text: string): string {
  return SUMMARIZE_PROMPT.replace('{{text}}', text)
}

Use system prompts for behavior

const message = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: `You are a customer support assistant for MyApp.
You help users with their accounts, billing, and features.
Be concise, friendly, and helpful.
If you don't know something, say so rather than guessing.`,
  messages: conversationHistory,
})

Cost Control

Strategies to manage costs

Choose the right model — Use smaller models (Haiku, GPT-4o mini) for simple tasks
Set max_tokens — Don't request more tokens than you need
Cache responses — Store results for identical or similar queries
Rate limit per user — Prevent abuse

Simple rate limiting

// lib/rate-limit.ts
const userRequestCounts = new Map<string, { count: number; resetAt: number }>()

export function checkRateLimit(userId: string, limit = 20): boolean {
  const now = Date.now()
  const record = userRequestCounts.get(userId)

  if (!record || now > record.resetAt) {
    userRequestCounts.set(userId, { count: 1, resetAt: now + 60 * 60 * 1000 })
    return true
  }

  if (record.count >= limit) return false
  record.count++
  return true
}

Tracking usage

Log every API call to monitor costs:

// After each API call
await supabase.from('ai_usage').insert({
  user_id: user.id,
  model: 'claude-sonnet-4-20250514',
  input_tokens: message.usage.input_tokens,
  output_tokens: message.usage.output_tokens,
  feature: 'summarize',
})

Cost estimation

Model	Input (per 1M tokens)	Output (per 1M tokens)
Claude Haiku	$0.25	$1.25
Claude Sonnet 4	$3.00	$15.00
GPT-4o mini	$0.15	$0.60
GPT-4o	$2.50	$10.00

A typical user message is ~100 tokens. A typical response is ~200-500 tokens.

Security Considerations

Never expose API keys client-side — Always call LLM APIs from your server
Validate and sanitize input — Set max length, strip HTML/scripts
Don't pass raw user input as system prompts — This enables prompt injection
Log but don't store sensitive data — Be careful with PII in AI conversations
Implement content filtering — Check AI outputs for inappropriate content before displaying

Building AI-Powered Features

Building AI-Powered Features

When to Add AI Features

Choosing a Model

Basic API Integration

Anthropic (Claude)

API Route

Streaming Responses

Server-Side Streaming

Client-Side Streaming

Structured Output

Using Tool Use (Function Calling)

Prompt Management

Keep prompts in separate files

Use system prompts for behavior

Cost Control

Strategies to manage costs

Simple rate limiting

Tracking usage

Cost estimation

Security Considerations

Related Topics in Guides