Building a Rate Limiter in TypeScript

A practical walkthrough of implementing a sliding window rate limiter using Cloudflare Workers KV.

3 min read
typescript · cloudflare · workers

Rate limiting is one of those things that sounds simple until you actually implement it. A naive counter resets at fixed intervals, which means a burst at the end of one window plus a burst at the start of the next can double your intended limit. A sliding window fixes that.
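To make the boundary problem concrete, here's a minimal fixed-window counter — a strawman for illustration, not the implementation we'll build. Two five-request bursts straddling the window boundary all get through:

```typescript
// Naive fixed-window counter (the flawed approach described above).
// makeFixedWindow is a hypothetical helper for this demonstration.
function makeFixedWindow(limit: number, windowMs: number) {
  let count = 0;
  let windowStart = 0;
  return (now: number): boolean => {
    if (now - windowStart >= windowMs) {
      // Counter resets at fixed boundaries — this is the weakness
      windowStart = now - (now % windowMs);
      count = 0;
    }
    if (count < limit) {
      count++;
      return true;
    }
    return false;
  };
}

// Limit of 5 per 1000 ms; burst at t=900..904, burst again at t=1000..1004
const check = makeFixedWindow(5, 1000);
let allowed = 0;
for (const t of [900, 901, 902, 903, 904, 1000, 1001, 1002, 1003, 1004]) {
  if (check(t)) allowed++;
}
console.log(allowed); // 10 requests allowed within ~104 ms — double the intended limit
```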

The sliding window algorithm

Instead of resetting a counter every minute, we track the timestamp of each request and count only those that fall within the last 60 seconds.

type RateLimitResult = {
  allowed: boolean;
  remaining: number;
  resetIn: number; // ms until oldest request falls outside the window
};

function slidingWindow(
  timestamps: number[],
  windowMs: number,
  limit: number,
): RateLimitResult {
  const now = Date.now();
  const windowStart = now - windowMs;
  // Drop timestamps outside the window
  const recent = timestamps.filter((t) => t > windowStart);
  return {
    allowed: recent.length < limit,
    remaining: Math.max(0, limit - recent.length),
    resetIn: recent.length > 0 ? recent[0]! - windowStart : 0,
  };
}
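A quick sanity check (the function is repeated here so the snippet runs on its own): with a 60-second window and a limit of 3, a two-minute-old timestamp is ignored, but three recent requests exhaust the budget.

```typescript
type RateLimitResult = { allowed: boolean; remaining: number; resetIn: number };

function slidingWindow(
  timestamps: number[],
  windowMs: number,
  limit: number,
): RateLimitResult {
  const now = Date.now();
  const windowStart = now - windowMs;
  const recent = timestamps.filter((t) => t > windowStart);
  return {
    allowed: recent.length < limit,
    remaining: Math.max(0, limit - recent.length),
    resetIn: recent.length > 0 ? recent[0]! - windowStart : 0,
  };
}

const now = Date.now();
// One stale request from 2 minutes ago, three within the last second
const timestamps = [now - 120_000, now - 900, now - 500, now - 100];
const result = slidingWindow(timestamps, 60_000, 3);
console.log(result.allowed);   // false — the stale timestamp is ignored, but 3 of 3 slots are used
console.log(result.remaining); // 0
```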

Persisting state with Cloudflare KV

For a distributed system you need shared state. KV is eventually consistent, but for rate limiting that’s an acceptable trade-off — a few extra requests slipping through beats adding a Redis dependency.

async function checkRateLimit(
  kv: KVNamespace,
  key: string,
  limit: number,
  windowMs: number,
): Promise<RateLimitResult> {
  const stored = (await kv.get(key, "json")) as number[] | null;
  const timestamps = stored ?? [];
  const result = slidingWindow(timestamps, windowMs, limit);
  if (result.allowed) {
    const now = Date.now();
    const windowStart = now - windowMs;
    // Persist only the timestamps still in the window, plus this request
    const updated = [...timestamps.filter((t) => t > windowStart), now];
    await kv.put(key, JSON.stringify(updated), {
      // KV enforces a minimum expirationTtl of 60 seconds
      expirationTtl: Math.max(60, Math.ceil(windowMs / 1000)),
    });
  }
  return result;
}
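You can exercise this logic locally with an in-memory stand-in for KV. The `KVLike` interface and `memoryKV` below are sketches for this post, not part of the Workers API — just enough surface to show the fourth request getting rejected:

```typescript
type RateLimitResult = { allowed: boolean; remaining: number; resetIn: number };

function slidingWindow(
  timestamps: number[],
  windowMs: number,
  limit: number,
): RateLimitResult {
  const windowStart = Date.now() - windowMs;
  const recent = timestamps.filter((t) => t > windowStart);
  return {
    allowed: recent.length < limit,
    remaining: Math.max(0, limit - recent.length),
    resetIn: recent.length > 0 ? recent[0]! - windowStart : 0,
  };
}

// Minimal slice of the KV surface that checkRateLimit actually uses
interface KVLike {
  get(key: string, type: "json"): Promise<unknown>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

function memoryKV(): KVLike {
  const store = new Map<string, string>();
  return {
    async get(key) {
      const raw = store.get(key);
      return raw === undefined ? null : JSON.parse(raw);
    },
    async put(key, value) {
      store.set(key, value); // TTL is ignored in this stub
    },
  };
}

async function checkRateLimit(
  kv: KVLike,
  key: string,
  limit: number,
  windowMs: number,
): Promise<RateLimitResult> {
  const stored = (await kv.get(key, "json")) as number[] | null;
  const timestamps = stored ?? [];
  const result = slidingWindow(timestamps, windowMs, limit);
  if (result.allowed) {
    const windowStart = Date.now() - windowMs;
    const updated = [...timestamps.filter((t) => t > windowStart), Date.now()];
    await kv.put(key, JSON.stringify(updated));
  }
  return result;
}

const kv = memoryKV();
const results: boolean[] = [];
for (let i = 0; i < 4; i++) {
  results.push((await checkRateLimit(kv, "rate:1.2.3.4", 3, 60_000)).allowed);
}
console.log(results); // [true, true, true, false]
```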

Wiring it up in a Worker

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const ip = request.headers.get("cf-connecting-ip") ?? "unknown";
    const key = `rate:${ip}`;
    const { allowed, remaining, resetIn } = await checkRateLimit(
      env.RATE_LIMIT_KV,
      key,
      100, // 100 requests
      60 * 1000, // per 60 seconds
    );
    if (!allowed) {
      return new Response("Too Many Requests", {
        status: 429,
        headers: {
          "Retry-After": String(Math.ceil(resetIn / 1000)),
          "X-RateLimit-Remaining": "0",
        },
      });
    }
    return new Response("OK", {
      headers: {
        "X-RateLimit-Remaining": String(remaining),
      },
    });
  },
};

A note on atomicity

This implementation has a race condition: two requests can read the same stale list simultaneously and both pass the limit check. For most APIs this is fine — you’re adding friction, not building a vault. If you need hard guarantees, use Durable Objects instead, which give you single-threaded execution per key.

// Durable Object approach — no race condition
export class RateLimiter implements DurableObject {
  private timestamps: number[] = [];

  async fetch(request: Request): Promise<Response> {
    // Prune expired entries so the array doesn't grow without bound
    const windowStart = Date.now() - 60_000;
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    const { allowed, remaining } = slidingWindow(this.timestamps, 60_000, 100);
    if (allowed) {
      this.timestamps.push(Date.now());
    }
    return Response.json({ allowed, remaining });
  }
}
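On the Worker side you'd route each client to its own instance with env.RATE_LIMITER.idFromName(ip) and env.RATE_LIMITER.get(id). The key property is that the same name always maps to the same single-threaded object. This local sketch emulates that one-instance-per-key routing with a plain Map, so you can see the behavior without deploying — RateLimiterSketch and getLimiter are illustrative names, not Workers APIs:

```typescript
// Stripped-down limiter mirroring the Durable Object above
class RateLimiterSketch {
  private timestamps: number[] = [];

  check(windowMs: number, limit: number): boolean {
    const windowStart = Date.now() - windowMs;
    // Prune as we go, as the Durable Object does
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    if (this.timestamps.length >= limit) return false;
    this.timestamps.push(Date.now());
    return true;
  }
}

// Stands in for idFromName + get: the same key always yields the same instance
const instances = new Map<string, RateLimiterSketch>();
function getLimiter(key: string): RateLimiterSketch {
  let inst = instances.get(key);
  if (!inst) {
    inst = new RateLimiterSketch();
    instances.set(key, inst);
  }
  return inst;
}

// 100 requests from one IP pass, the 101st is rejected;
// a different IP gets its own fresh limiter
let allowed = 0;
for (let i = 0; i < 101; i++) {
  if (getLimiter("1.2.3.4").check(60_000, 100)) allowed++;
}
console.log(allowed); // 100
console.log(getLimiter("5.6.7.8").check(60_000, 100)); // true
```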

The tradeoff is cost and latency — each Durable Object request routes to a specific edge location, which can add a round-trip. For most use cases, the KV approach is good enough.
