Uber-Lite: Architecting High-Scale Geo-Spatial Matchmaking Systems

Lesson 15: Batching Physics - The Record Accumulator's Role in High-Velocity Geo-Spatial Streams

Mar 17, 2026

The Naive Approach: Why 3,000 Events/Sec Crashes Your System

A junior engineer building a ride-hailing location tracker writes this:

java

producer.send(new ProducerRecord<>("driver-locations", driverId, location));

Every GPS ping—every 1-2 seconds from 3,000 active drivers—triggers an immediate network call. This creates:

6,000+ syscalls/sec: Each send() invokes kernel-level socket operations
TCP packet waste: A 200-byte location update occupies a 1,500-byte Ethernet frame (87% wasted bandwidth)
Broker CPU saturation: Kafka spends 70% of cycles on TCP handshakes, not data processing
Producer blocking: When network buffers fill, threads stall waiting for I/O completion

At 10,000 drivers, the system collapses. The NIC saturates, latency spikes to 5+ seconds, and match failures cascade.

The Uber-Lite Architecture: RecordAccumulator-Based Batching

Kafka’s producer is not a fire-and-forget client. It’s a sophisticated batching engine built on three components:

1. RecordAccumulator (The Staging Buffer)

A thread-safe buffer that groups records by TopicPartition. When you call producer.send():

The record is serialized (key, value, headers → byte arrays)
A RecordBatch for the target partition is located or created
The serialized bytes are appended to the batch’s MemoryRecords buffer
If batch.size is reached OR linger.ms expires, the batch is ready

2. Sender Thread (The Network Dispatcher)

A single background thread polls the accumulator for ready batches and:

Groups batches by broker node (partition leader affinity)
Issues one TCP request per broker, multiplexing up to max.in.flight.requests.per.connection (default: 5)
Uses Java NIO Selector for non-blocking I/O

3. MemoryPool (Backpressure Control)

A fixed-size buffer (buffer.memory=64MB default). If all memory is allocated to pending batches:

send() blocks until memory is freed (bad for Virtual Threads)
Or throws TimeoutException after max.block.ms (default: 60s)

The Key Insight: By batching 50-100 location updates into a 32KB payload, we reduce network overhead from 6,000 syscalls/sec to 60 syscalls/sec (99% reduction).

Implementation Deep Dive

Tuning Batch Size for Geo-Spatial Workloads

A DriverLocationUpdate serializes to ~200 bytes (driverId UUID, H3 index, timestamp, heading). The optimal batch.size calculation:

Target: 150 records/batch (to hit linger.ms before size limit)
Record size: 200 bytes
Batch size: 150 * 200 = 30KB → set batch.size=32768 (32KB)

Why 32KB?

Fits comfortably in Linux default socket send buffer (16KB-64KB)
Aligns with Kafka’s default segment size (1GB / 32KB = 32,000 batches per segment)
Avoids TCP fragmentation (most MTUs are 1,500 bytes; 32KB splits cleanly into 22 packets)

The Linger Dance: Latency vs Throughput

linger.ms controls how long the accumulator waits before sending a non-full batch. For geo-spatial:

linger.ms=0 (default): Send immediately when buffer has data → low latency, low throughput
linger.ms=10: Wait 10ms to collect more records → batches average 80% full
linger.ms=50: Wait 50ms → batches 95% full, but adds 50ms base latency

Uber-Lite Choice: linger.ms=10, batch.size=32KB. With 3,000 drivers @ 1 update/sec:

Expected batch fill rate: 3,000 updates/sec / 100 partitions = 30 updates/sec/partition
Time to fill 32KB batch: (32KB / 200 bytes) / 30 = 5.3 seconds
Actual trigger: linger.ms=10 fires first → batches contain ~0.3 seconds of data (9 records)

This is intentional. We prioritize latency (matches happen in <200ms) over perfect batch packing.

Virtual Threads and Backpressure

Java 21’s Virtual Threads let us handle 10,000+ concurrent producers without kernel thread exhaustion. But blocking on send() still parks the virtual thread:

java

// BAD: Blocks virtual thread on full buffer
producer.send(record); // May block up to max.block.ms=60s

// GOOD: Non-blocking send with callback
producer.send(record, (metadata, exception) -> {
    if (exception != null) {
        // Handle in virtual thread without blocking caller
        errorHandler.accept(exception);
    }
});

With buffer.memory=67108864 (64MB) and batch.size=32768:

Max buffered batches: 64MB / 32KB = 2,048 batches
At 100 partitions: 20 batches per partition
At 30 records/sec/partition: ~13 seconds of buffering before blocking

This provides ample headroom for broker hiccups without stalling producers.

ByteBuffer Mechanics: Zero-Copy Accumulation

Internally, RecordBatch uses ByteBuffer.allocateDirect() for the backing memory. When you append a record:

Serialized key/value are written to the buffer’s current position
Record metadata (offset delta, timestamp delta, headers) are encoded using variable-length encoding
The batch’s CRC32 checksum is updated incrementally

When the batch is ready, the Sender thread:

Calls buffer.flip() to prepare for reading
Passes the buffer to SocketChannel.write() (zero-copy via sendfile() syscall)
Returns the buffer to the MemoryPool after broker acknowledgment

No intermediate copies. The bytes flow from your POJO → serializer → DirectByteBuffer → NIC DMA → broker’s page cache.

Production Metrics: Observing Batching Efficiency

Deploy a JMX exporter to expose these producer metrics:

Core Metrics

batch-size-avg: Average batch size in bytes. Target: 25KB-30KB (80-95% of batch.size=32KB)
records-per-request-avg: Records per TCP request. Target: >50 for geo workloads
record-queue-time-max: Time records spend in accumulator. Should be ~linger.ms value
buffer-available-bytes: Free memory in pool. If <10% → increase buffer.memory or reduce producer count

Red Flags

batch-size-avg < 10KB: Either traffic is too sparse OR linger.ms is too low
record-queue-time-max > 100ms: Broker is slow or linger.ms is misconfigured
buffer-available-bytes = 0: Producer is blocking. Check broker health or scale out

Partitioning Impact

With 100 partitions and uniform hashing (by driverId):

Each partition receives ~30 updates/sec
At linger.ms=10, batches contain 0.3sec * 30 = 9 records
Batch size: 9 * 200 bytes = 1.8KB

This is suboptimal. To fill batches better, we use geo-aware partitioning (covered in Lesson 16), routing drivers in the same H3 Resolution 9 hexagon to the same partition. This increases partition locality, filling batches to 20-25KB.

Step-by-Step Execution

GitHub Link

https://github.com/sysdr/uber-lite/tree/main/lesson15/lesson-15-batching

Prerequisites

Docker & Docker Compose installed
Java 21+ (sdk install java 21-tem)
4GB RAM available

Execution

bash

# Generate project structure
bash project_setup.sh

# Start Kafka cluster
cd lesson-15-batching
docker-compose up -d

# Wait for cluster ready (30 seconds)
sleep 30

# Run the batching producer
./gradlew run

# In another terminal, observe metrics
curl http://localhost:8080/metrics

Verification

The producer emits 3,000 location updates/sec for 60 seconds. Expected output:

Batch Statistics (60s window):
  Total Records Sent: 180,000
  Total Batches: 1,200
  Avg Records/Batch: 150
  Avg Batch Size: 29.8 KB
  P99 Queue Time: 12 ms
  Network Syscalls: 1,200 (vs 180,000 naive)
  Throughput: 99.3% reduction in network calls

Run the verification script: