Skip to content

Inside Gati's Production Runtime: Achieving 172K RPS

How Gati's queue-based architecture and zero-copy design delivers exceptional performance.

The Challenge

Modern backend frameworks struggle with high-throughput scenarios. Express.js tops out around 50K RPS, while Fastify reaches 80K RPS. We needed better.

Queue Fabric Design

Gati uses a multi-stage queue pipeline:

Request → Parse → Route → Execute → Serialize → Response
  6μs     0.5ms    0.5ms   Variable    1ms

Why Queues?

  1. Backpressure handling - Automatic flow control
  2. Worker isolation - Failures don't cascade
  3. Observability - Queue depth = system health
  4. Scalability - Add workers without code changes

Zero-Copy Optimization

Traditional frameworks copy request data multiple times:

typescript
// ❌ Multiple copies
const body = Buffer.from(req.body); // Copy 1
const parsed = JSON.parse(body.toString()); // Copy 2
const validated = schema.parse(parsed); // Copy 3

Gati eliminates copies:

typescript
// ✅ Zero-copy
const body = req.rawBody; // Direct buffer reference
const parsed = fastJsonParse(body); // In-place parsing

Result: 3x faster request processing

Worker Pool Management

Dynamic worker scaling based on load:

typescript
if (queueDepth > threshold) {
  spawnWorker();
} else if (idleWorkers > minWorkers) {
  terminateWorker();
}

Metrics:

  • Spawn time: <10ms
  • Idle timeout: 30s
  • Max workers: 16 per core

Benchmarks

FrameworkRPSp95 LatencyMemory
Express50K45ms250MB
Fastify80K28ms180MB
Gati172K<10ms120MB

Key Takeaways

  1. Queue-based architecture enables backpressure
  2. Zero-copy reduces memory overhead
  3. Worker pools provide isolation
  4. Careful optimization yields 3x performance

Try it: npx gatic create my-app

Released under the MIT License.