independent inference research

Coconut Labs

Schedulers, systems notes, and reproducible measurements for shared inference.

KVWarden Gate 2: time to first token (TTFT) under load is 1.14× the solo baseline. 26× better than FIFO.

Coconut Labs works on the shared layer of inference: scheduling, fairness, cache pressure, and the measurements that keep claims honest.

The lab is small by design. Fewer abstractions between the benchmark, the note, and the code.

The quiet tenant should still have a name.

people

Two people, close to the work.

Coconut Labs is Shrey Patel and Jay Patel, building public research around inference scheduling, fairness, and systems measurement.

Shrey Patel

Co-founder · Engineer

Engineer and writer. Builds inference middleware between LLMs and the GPUs they run on. Currently building Coconut Labs.

Jay Patel

Co-founder · Engineer

Engineer focused on inference reliability and tenant fairness on shared hardware. Co-founder of Coconut Labs.

How we work

Building something at this layer? Write us.

latest note · 2026-04-19 (1 week ago)
14 commits this week
kvwarden gate 2 · 1.14× solo · 26× better than fifo
3 repos tracked
12 rfc open
updated 1h ago