
LoadZen
Self-hosted load testing with real-time metrics and AI-powered diagnosis. Enter a URL, set your load parameters, and get actionable insights — not just raw numbers.
LoadZen is a self-hosted load testing platform built for developers who want production-quality performance insights without enterprise tooling overhead. Enter a target URL, configure concurrent users and test duration, and watch live metrics stream in while k6, running as a managed subprocess, generates the actual load. When the test completes, an AI diagnosis layer interprets the results, surfacing bottlenecks, error-rate patterns, and specific recommendations rather than leaving raw p95 latency numbers for you to interpret. The architecture is a pnpm monorepo: a Next.js frontend with TanStack Query for server state, a Fastify API, BullMQ and Redis for the job queue, k6 spawned as a subprocess, and PostgreSQL with Drizzle ORM for test history and result persistence.
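The live-metrics path can be sketched as a small parser between k6's stdout and the frontend. k6's JSON output mode (`--out json`) emits newline-delimited JSON, where each `"Point"` object carries one sample for a metric such as `http_req_duration`; everything here besides that format (the `LiveMetric` shape and `parseK6Line` name) is illustrative, not LoadZen's actual API:

```typescript
// One streamed metric sample, ready to push to the frontend
// over WebSocket/SSE. Shape is illustrative.
interface LiveMetric {
  metric: string; // e.g. "http_req_duration"
  value: number;  // sample value (ms for duration metrics)
  time: string;   // timestamp as reported by k6
}

// Parse one NDJSON line from k6's JSON output. Returns null for
// anything that isn't a metric sample: "Metric" definition lines,
// blank lines, or a partial line split across stdout chunks.
function parseK6Line(line: string): LiveMetric | null {
  if (!line.trim()) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // incomplete JSON; wait for the rest of the chunk
  }
  const o = parsed as {
    type?: string;
    metric?: string;
    data?: { value?: number; time?: string };
  };
  if (o.type !== "Point" || !o.metric || !o.data || typeof o.data.value !== "number") {
    return null;
  }
  return { metric: o.metric, value: o.data.value, time: o.data.time ?? "" };
}
```

In the worker, stdout chunks would be split on newlines, fed through a function like this, and pushed to subscribed clients, with the raw results persisted to PostgreSQL at the end of the run.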
Why I built this
Every load testing tool I tried was either too heavy (enterprise software requiring days of setup) or too light (a shell script that hits a URL and returns average latency). I wanted something you can self-host in minutes that gives you real, production-quality signal about what breaks under load and why.
Use case
Developers and small engineering teams testing endpoints before a launch, after a significant deploy, or when diagnosing a performance regression in production. The AI diagnosis layer is the differentiator — instead of staring at a p95 latency chart, you get a written interpretation of what the results mean and what to investigate next.
What I learned
Subprocess orchestration has failure modes you don't think about until they happen in production. What happens when k6 crashes mid-test? When Redis goes down during a run? When the job queue restarts while a test is active? Building resilient error handling around a child process required thinking carefully about partial failure states and what the user should see in each scenario.
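One way to keep those partial-failure states manageable is to reduce the messy signals a child process can produce (exit code, termination signal, whether any metrics ever arrived) into a single user-facing status. The names below (`RunOutcome`, `classifyRun`, the status strings) are a hypothetical sketch, not LoadZen's actual types:

```typescript
type RunStatus = "completed" | "failed" | "crashed_partial" | "aborted";

interface RunOutcome {
  exitCode: number | null;  // null when the process was killed by a signal
  signal: string | null;    // e.g. "SIGKILL" from the OOM killer
  metricsReceived: boolean; // did any k6 samples arrive before exit?
}

function classifyRun(o: RunOutcome): RunStatus {
  if (o.signal !== null) {
    // Killed externally (OOM, queue shutdown, manual cancel). If partial
    // data exists, surface it rather than silently discarding the run.
    return o.metricsReceived ? "crashed_partial" : "aborted";
  }
  if (o.exitCode === 0) return "completed";
  // Non-zero exit with some metrics is still worth showing the user.
  return o.metricsReceived ? "crashed_partial" : "failed";
}
```

Making the classification a pure function keeps it unit-testable without spawning anything, which matters when the failure modes in question are hard to reproduce on demand.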
Where I got stuck
Cloudflare Workers seemed like the ideal deployment target: cheap, globally distributed, easy to manage. But launching k6 means calling child_process.spawn, a Node.js primitive that doesn't exist in the Workers runtime. That constraint ruled out the serverless deployment model entirely and forced a traditional server architecture, which shaped every infrastructure decision that followed.