The Complete Monthly Cost Breakdown
Before I explain the architecture, here's the actual bill. These are March 2026 numbers from my Cloudflare dashboard and Anthropic billing page:
| Component | Service | Usage | Monthly Cost |
|---|---|---|---|
| Compute | Cloudflare Workers | 85K requests (free: 100K/day) | $0.00 |
| Storage | Cloudflare R2 | 47GB (free: 10GB, then $0.015/GB) | $0.55 |
| KV State | Cloudflare KV | 420K reads (free: 1M/day) | $0.00 |
| Database | Supabase Free Tier | 180MB (free: 500MB) | $0.00 |
| AI API | Anthropic Claude Haiku | ~12K requests | $4.28 |
| Email | Cloudflare Email Routing | Unlimited | $0.00 |
| DNS + CDN | Cloudflare Free | Full CDN, DDoS protection | $0.00 |
| Hosting | Cloudflare Pages | Unlimited builds | $0.00 |
| Total | | | $4.83 – $6.83 |
That's the real number. Compare it to the $275/month I was spending 18 months ago on Zapier ($49), Notion Team ($16), Airtable ($24), ConvertKit ($79), AWS S3 + EC2 ($87), and Vercel Pro ($20). Same capabilities, roughly 97.5% cheaper.
Why Cloudflare Is the Core of Everything
The entire architecture pivots on one insight: Cloudflare's free tier is absurdly generous for solopreneurs. Most infrastructure pricing is designed for enterprises moving terabytes of data and running millions of concurrent connections. For a solo founder shipping AI products to a few thousand customers, you'll stay in the free tier almost indefinitely on compute.
Here's what the Cloudflare free tier actually gives you in 2026:
- Workers: 100,000 requests/day (3M/month) — enough to serve 30,000 monthly active users doing 100 requests each
- KV: 100,000 writes/day, 1,000,000 reads/day — sufficient for user session management and feature flags at serious scale
- R2: 10GB storage free, then $0.015/GB with zero egress fees (vs S3's $0.09/GB egress)
- Pages: Unlimited deployments, global CDN, custom domains — better than Vercel's free tier by a wide margin
- Email Routing: Unlimited email forwarding with programmable routing rules
The only thing Cloudflare doesn't give you for free is the AI API calls themselves. And that's the right model: pay for the intelligence, not the infrastructure.
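To sanity-check the cost table above, here's a back-of-envelope estimator for this stack's only two variable costs: R2 storage past the free 10GB and the AI calls themselves. The rates are the ones quoted in this article (assumptions, not official pricing), and the function name is my own.

```javascript
// Rough monthly cost model for this stack. Everything else (Workers, KV,
// Pages, DNS, Email Routing) stays inside Cloudflare's free tier.
function estimateMonthlyCost({ r2Gb, aiRequests, avgCostPerAiRequest }) {
  const R2_FREE_GB = 10;        // free storage tier quoted in this article
  const R2_RATE_PER_GB = 0.015; // $/GB-month after the free tier
  const storage = Math.max(0, r2Gb - R2_FREE_GB) * R2_RATE_PER_GB;
  const ai = aiRequests * avgCostPerAiRequest;
  return +(storage + ai).toFixed(2);
}
```

Plugging in this article's numbers (47GB of R2, ~12K Haiku calls averaging a few hundredths of a cent each) lands within pennies of the bill in the table.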
The Architecture: How the Components Connect
Here's the production architecture. Every one of my 5 digital products runs on some variation of this pattern:
```
Browser/Client
      ↓
Cloudflare Pages (static assets, CDN)
      ↓
Cloudflare Workers (API layer, auth, routing)
      ↓
┌─────────────────────────────┐
│ Cloudflare KV               │ ← session state, feature flags, rate limits
│ Cloudflare R2               │ ← file storage, AI outputs, product assets
│ Supabase PostgreSQL         │ ← structured data, orders, user records
└─────────────────────────────┘
      ↓
Anthropic Claude API (AI calls, on demand)
      ↓
Cloudflare Email Routing (transactional email triggers)
```
The key design principle: Workers are stateless compute, storage is separated. Each Worker reads context from KV or Supabase, calls the Claude API if needed, writes results back to R2 or Supabase, and returns a response. No servers, no containers, no infrastructure to manage.
Setting Up the Workers Layer
This is where most people get stuck. Here's the minimal setup that actually works in production:
```sh
# Install Wrangler CLI (use v4, not v3)
npm install -g wrangler@latest

# Create a new Worker project
# (the old --type flag is gone; wrangler scaffolds a JS project by default)
wrangler init my-ai-worker
```

```toml
# wrangler.toml — the complete config I use
name = "my-ai-worker"
main = "src/index.js"
compatibility_date = "2026-01-01"

# KV namespace binding
[[kv_namespaces]]
binding = "SESSIONS"
id = "your_kv_namespace_id"

# R2 bucket binding
[[r2_buckets]]
binding = "ASSETS"
bucket_name = "my-ai-assets"

# Secrets are set via `wrangler secret put`, never committed:
# wrangler secret put ANTHROPIC_API_KEY
```
The Worker itself follows a simple pattern for AI-powered endpoints:
```javascript
// src/index.js — Production AI Worker pattern
export default {
  async fetch(request, env, ctx) {
    // Auth check via KV
    const sessionToken = request.headers.get('Authorization')?.split(' ')[1];
    const session = await env.SESSIONS.get(`session:${sessionToken}`, 'json');
    if (!session) return new Response('Unauthorized', { status: 401 });

    // Rate limiting via KV: one counter per user per minute window
    const rateLimitKey = `rl:${session.userId}:${Date.now() / 60000 | 0}`;
    const requestCount = parseInt(await env.SESSIONS.get(rateLimitKey) || '0', 10);
    if (requestCount >= 20) return new Response('Rate limited', { status: 429 });
    await env.SESSIONS.put(rateLimitKey, String(requestCount + 1), { expirationTtl: 120 });

    // Parse request
    const { prompt } = await request.json();

    // Call Claude API
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: 'claude-haiku-4-5', // cheapest, fastest
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      }),
    });
    if (!response.ok) return new Response('Upstream AI error', { status: 502 });
    const result = await response.json();
    const aiOutput = result.content[0].text;

    // Log the result to R2 without blocking the response
    const cacheKey = `outputs/${session.userId}/${Date.now()}.json`;
    ctx.waitUntil(env.ASSETS.put(cacheKey, JSON.stringify({ prompt, output: aiOutput })));

    return Response.json({ output: aiOutput });
  },
};
```
The Claude API Cost Model: Why Haiku Changes Everything
I use Claude Haiku for 90% of my AI calls. Here's why the math works:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Typical request cost |
|---|---|---|---|
| Claude Haiku 4.5 | $0.80 | $4.00 | ~$0.0004 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | ~$0.0015 |
| GPT-4o mini | $0.60 | $2.40 | ~$0.0003 |
| GPT-4o | $5.00 | $15.00 | ~$0.003 |
For a typical solopreneur product, cost is dominated by output tokens. A full 300-word (~400-token) response from a 200-token prompt costs about $0.0018 on Haiku; short classification and routing calls run closer to $0.0004. Because most of my calls are short, my blended average sits near $0.0004 per use. At 10,000 monthly uses, that's roughly $4/month. At 100,000 uses you're around $40/month, and at that volume you're generating serious revenue to cover it.
The smart play: use Haiku for initial generation and classification tasks, route complex reasoning to Sonnet only when quality demands it. My 90/10 split (Haiku/Sonnet) keeps costs predictable while maintaining output quality.
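The 90/10 routing above can be as dumb as a lookup table. A sketch, with hypothetical task names and model IDs following the naming pattern used in this article's code:

```javascript
// Expensive-reasoning tasks get Sonnet; everything else defaults to Haiku.
// Task names here are illustrative, not a fixed taxonomy.
const SONNET_TASKS = new Set(['long_form_reasoning', 'code_review', 'legal_summary']);

function pickModel(taskType) {
  return SONNET_TASKS.has(taskType)
    ? 'claude-sonnet-4-6'  // quality path, ~4x the cost
    : 'claude-haiku-4-5';  // default: cheap and fast
}
```

Defaulting to the cheap model means a new, unclassified task type can never silently blow up the bill.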
R2 + Supabase: The Storage Layer
I use R2 and Supabase for different purposes, and the distinction matters:
R2 stores blobs: AI-generated content, PDFs, images, audio files, cached API responses. It's an S3-compatible object store with zero egress fees. I store everything from product delivery files to AI output logs. At $0.015/GB after the free 10GB tier, 50GB costs $0.60/month.
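R2 also makes a cheap cache in front of the Claude API: if you've already paid to generate an output, read it back instead of generating it again. A sketch, with the function name and wiring my own; `bucket` is anything with R2's `get`/`put` shape (the binding in production, a stub in tests), and `generate` is whatever calls the model.

```javascript
// Check R2 for a cached AI output before paying for a fresh API call.
async function getCachedOrGenerate(bucket, key, generate) {
  const cached = await bucket.get(key);          // null if not present
  if (cached) return JSON.parse(await cached.text());
  const fresh = await generate();                // the expensive path
  await bucket.put(key, JSON.stringify(fresh));  // cache for next time
  return fresh;
}
```

Taking the bucket as a parameter rather than reaching for `env.ASSETS` directly keeps the function unit-testable outside a Worker.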
Supabase handles structured data: orders, users, email sequences, product metadata. The free tier gives you 500MB PostgreSQL storage, which is sufficient for a product doing under $50K/year in revenue. The REST API (PostgREST) works out of the box with Workers via plain HTTP, so you get database access with no connection pooling or WebSocket complexity.
```javascript
// Supabase from a Worker — simple HTTP pattern (PostgREST over fetch)
async function getUser(userId, env) {
  const response = await fetch(
    `${env.SUPABASE_URL}/rest/v1/users?id=eq.${userId}&select=*`,
    {
      headers: {
        'apikey': env.SUPABASE_ANON_KEY,
        // Service-role key bypasses row-level security (server-side only)
        'Authorization': `Bearer ${env.SUPABASE_SERVICE_KEY}`,
      },
    }
  );
  const [user] = await response.json();
  return user;
}
```
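The write side is the same PostgREST-over-HTTP pattern. A sketch of the insert case, splitting "build the request" from "send it" so the interesting part is unit-testable; the function name and table/row shapes are my own, and the `Prefer: return=representation` header is PostgREST's way of echoing the inserted row back.

```javascript
// Build a PostgREST insert request for a Supabase table.
function buildInsertRequest(supabaseUrl, serviceKey, table, row) {
  return {
    url: `${supabaseUrl}/rest/v1/${table}`,
    options: {
      method: 'POST',
      headers: {
        'apikey': serviceKey,
        'Authorization': `Bearer ${serviceKey}`,
        'Content-Type': 'application/json',
        'Prefer': 'return=representation', // echo the inserted row back
      },
      body: JSON.stringify(row),
    },
  };
}

// In a Worker you'd then do:
//   const { url, options } = buildInsertRequest(env.SUPABASE_URL, env.SUPABASE_SERVICE_KEY, 'orders', order);
//   const [inserted] = await (await fetch(url, options)).json();
```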
The 9 Workers Running My Business
Every business function runs as a dedicated Worker. This is the production list:
- browning-sales-engine — PayPal webhook handler, order processing, license generation
- browning-delivery — Secure product download URLs with expiry, R2 signed URLs
- browning-email-worker — Transactional email sequences triggered by Supabase events
- browning-inbox-agent — AI-powered email triage, Claude classifies and routes support tickets
- browning-supervisor — Health monitoring, uptime checks, error rate alerts
- browning-ops-api — Internal dashboard API, revenue data, metrics aggregation
- content-queue-poster — Automated content publishing to social channels
- seo-worker — Keyword tracking, AI citation monitoring, content performance
- ai-relay — Unified AI routing layer, model selection, cost tracking
Each Worker is under 200 lines of code. Each does exactly one thing. The result is a system that's trivial to debug, easy to deploy, and completely observable.
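The "signed URLs with expiry" in the delivery Worker deserve a note: R2 bindings don't hand you presigned URLs the way the S3 SDK does, so one approach is to sign the object key and an expiry timestamp yourself with HMAC and verify it on download. A sketch using Web Crypto (available in both Workers and modern Node); the function names are hypothetical, not the actual delivery Worker's API.

```javascript
// Mint and verify expiring download tokens for R2 objects.
const te = new TextEncoder();

async function hmacKey(secret) {
  return crypto.subtle.importKey(
    'raw', te.encode(secret),
    { name: 'HMAC', hash: 'SHA-256' },
    false, ['sign']
  );
}

// Hex HMAC over `${objectKey}:${expiresAt}` (expiresAt in unix seconds).
async function signDownloadToken(objectKey, expiresAt, secret) {
  const key = await hmacKey(secret);
  const sig = await crypto.subtle.sign('HMAC', key, te.encode(`${objectKey}:${expiresAt}`));
  return [...new Uint8Array(sig)].map(b => b.toString(16).padStart(2, '0')).join('');
}

// Valid only if the link has not expired and the signature matches.
async function verifyDownloadToken(objectKey, expiresAt, token, secret, nowSeconds) {
  if (nowSeconds > expiresAt) return false;
  const expected = await signDownloadToken(objectKey, expiresAt, secret);
  return token === expected;
}
```

In production you'd want a constant-time comparison instead of `===` to avoid timing side channels, but the structure is the same: the download URL carries the key, expiry, and token, and the Worker only streams the R2 object if verification passes.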
What This Stack Can't Do (Be Honest About Limits)
This architecture has real limits. Workers have a 10ms CPU time limit on the free tier (upgradeable to 30 seconds on $5/month paid tier). If you need long-running AI tasks — fine-tuning, batch processing, video generation — you'll need a different compute layer. I use Cloudflare Workers for real-time request handling and offload anything requiring more than 5 seconds of processing to a background queue.
Supabase free tier also has connection limits (20 concurrent connections) and goes to sleep after a week of inactivity. For a product in active use, this isn't a problem. For something dormant for weeks at a time, you'll need to handle cold start gracefully.
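"Handle cold start gracefully" mostly means: don't fail the user's first request while the database wakes up. A generic retry-with-backoff wrapper covers it; this is a sketch with names of my own choosing, not a library API.

```javascript
// Retry an async operation with exponential backoff.
// Useful around Supabase reads when a paused project is waking up.
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Back off between tries: 500ms, 1000ms, 2000ms...
      if (i < attempts - 1) {
        await new Promise(r => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError; // all attempts exhausted
}

// Usage in a Worker:
//   const user = await withRetry(() => getUser(userId, env));
```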
How to Start: The 30-Minute Setup
If you want to replicate this stack, here's the exact sequence:
- Create a Cloudflare account (free) and grab your account ID from the dashboard
- Install Wrangler 4: `npm install -g wrangler@latest`
- Authenticate: `wrangler login`
- Create your first Worker: `wrangler init my-worker`
- Create an R2 bucket: `wrangler r2 bucket create my-bucket`
- Create a KV namespace: `wrangler kv namespace create MY_KV` (the old `kv:namespace` colon syntax is Wrangler v3; v4 uses a space)
- Get a Claude API key from console.anthropic.com
- Set your secret: `wrangler secret put ANTHROPIC_API_KEY`
- Deploy: `wrangler deploy`
That's the entire infrastructure footprint. No servers, no DevOps, no load balancers. The first request hits a Worker running in the datacenter closest to your user within 50ms, globally.
Get the Complete Kit — Pre-Built and Ready to Deploy
I've packaged the exact Worker templates, wrangler configs, Supabase schemas, and Claude API integration patterns from this guide into the Zero-Cost AI Kit. Instead of spending 6 hours setting this up from scratch, you get the production-ready boilerplate and have your first AI Worker live in under an hour.
Get the Zero-Cost AI Kit — $47