Eric Cox
I build the infrastructure that makes systems faster and models smarter.
< 2μs p99 latency
4.2M/sec peak throughput
12 RL environments shipped
4 domains scaled
Currently building RL training environments and high-performance infrastructure for AI systems.
Interested in hard problems at the intersection of model intelligence, systems performance, and scale.
Case Studies
Problem
Legacy Java matching engine. 340μs mean latency, 1.2ms p99. Losing execution priority on co-located venues. Estimated cost: $2.1M/year in missed fills on a single strategy.
Approach
Ground-up C++23 rewrite. Kernel-bypass networking via custom DPDK integration. Lock-free order book with cache-line-aligned price levels. Arena allocator eliminating all hot-path allocations. CPU-pinned threads with isolated NUMA nodes. 14 months.
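The cache-line-aligned price levels and hot-path arena allocation described above can be sketched roughly like this. All names, fields, and the 64-byte line size are illustrative assumptions, not the production layout:

```cpp
#include <cstddef>
#include <cstdint>

// A price level padded out to one (assumed 64-byte) cache line, so adjacent
// levels touched by different threads never share a line (no false sharing).
struct alignas(64) PriceLevel {
    int64_t  price_ticks;  // price in integer ticks, never floating point
    int64_t  total_qty;    // aggregate resting quantity at this level
    uint32_t order_count;  // number of orders queued at this level
};
static_assert(sizeof(PriceLevel) == 64, "exactly one level per cache line");

// Bump-pointer arena: O(1) allocation, no locks, no syscalls, and nothing
// to free on the hot path -- the whole arena is reset or discarded at once.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : buf_(new std::byte[bytes]), cap_(bytes), used_(0) {}
    ~Arena() { delete[] buf_; }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Returns nullptr on exhaustion; callers pre-size for the worst case.
    void* alloc(std::size_t n, std::size_t align) {
        auto base = reinterpret_cast<std::uintptr_t>(buf_);
        std::uintptr_t p = (base + used_ + align - 1) & ~std::uintptr_t(align - 1);
        std::size_t off = p - base;
        if (off + n > cap_) return nullptr;
        used_ = off + n;
        return buf_ + off;
    }

private:
    std::byte*  buf_;
    std::size_t cap_;
    std::size_t used_;
};
```

The point of the pairing: once every order lives in arena memory and every level owns its own line, the book's hot path does no heap work and takes no contended cache misses.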
Result
0.8μs mean, 1.8μs p99. 4.2M events/sec throughput. Captured 23% more spread opportunities. System paid for itself in 4 months.
Problem
Checkout system couldn't survive peak traffic. Black Friday: 12% transaction failure rate above 50K concurrent users. Cart abandonment hit 41%. Direct revenue loss: $3.8M in a single weekend.
Approach
Rebuilt the checkout pipeline end-to-end. Replaced synchronous microservices with event-driven architecture using a custom LMAX Disruptor pattern. Pre-computed inventory reservations. Payment tokenization at the edge. Reduced the critical path from 14 service hops to 3.
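The core of a Disruptor-style pipeline is a lock-free ring buffer with sequence counters in place of locks. A minimal single-producer/single-consumer sketch, assuming a power-of-two capacity (the real Disruptor pattern generalizes this to multiple consumers with dependency barriers):

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// SPSC ring in the spirit of the LMAX Disruptor: monotonically increasing
// head/tail sequences, masked into a power-of-two buffer. The counters sit
// on separate (assumed 64-byte) cache lines to avoid false sharing.
template <typename T, std::size_t N>
class SpscRing {
    static_assert((N & (N - 1)) == 0, "capacity must be a power of two");
public:
    bool push(const T& v) {
        auto head = head_.load(std::memory_order_relaxed);
        auto tail = tail_.load(std::memory_order_acquire);
        if (head - tail == N) return false;          // ring is full
        buf_[head & (N - 1)] = v;
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    std::optional<T> pop() {
        auto tail = tail_.load(std::memory_order_relaxed);
        auto head = head_.load(std::memory_order_acquire);
        if (tail == head) return std::nullopt;       // ring is empty
        T v = buf_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);
        return v;
    }

private:
    std::array<T, N> buf_{};
    alignas(64) std::atomic<std::size_t> head_{0};   // producer cursor
    alignas(64) std::atomic<std::size_t> tail_{0};   // consumer cursor
};
```

Events flow through structures like this instead of synchronous RPC hops, which is what collapses a 14-hop critical path into a handful of in-process stages.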
Result
500K concurrent users, zero failed transactions. Checkout p50: 47ms, p99: 89ms. Next Black Friday: 99.99% success rate, $28M processed in 4 hours.
Problem
Quant research team running batch jobs overnight. 8-hour delay between market close and signal generation. Intraday opportunities invisible. Models always looking at yesterday.
Approach
Built a streaming pipeline: market data → feature extraction → model inference → signal output. Custom binary protocol for IPC with zero-copy shared memory. GPU-accelerated inference for ensemble models. Automatic failover with sub-second recovery.
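Zero-copy IPC of this kind hinges on a fixed, trivially copyable wire layout that both sides of the shared-memory segment agree on byte-for-byte. A sketch with hypothetical field names and sizes (not the actual protocol):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Fixed-layout feature record: packed, explicit-width fields, no pointers,
// so it can live directly in a shared-memory ring without serialization.
#pragma pack(push, 1)
struct FeatureMsg {
    uint64_t instrument_id;  // which of the ~4,000 instruments
    uint64_t timestamp_ns;   // event time, nanoseconds since epoch
    uint32_t feature_id;     // which extracted feature
    float    value;          // the feature value itself
};
#pragma pack(pop)
static_assert(std::is_trivially_copyable_v<FeatureMsg>);
static_assert(sizeof(FeatureMsg) == 24, "layout must match on both sides");

// Producer drops a record into slot `slot` of the mapped region; the
// consumer reads it back without any encode/decode step in between.
inline void write_msg(std::byte* shm, std::size_t slot, const FeatureMsg& m) {
    std::memcpy(shm + slot * sizeof(FeatureMsg), &m, sizeof m);
}

inline FeatureMsg read_msg(const std::byte* shm, std::size_t slot) {
    FeatureMsg m;
    std::memcpy(&m, shm + slot * sizeof(FeatureMsg), sizeof m);
    return m;
}
```

With records this shape, throughput is bounded by memory bandwidth rather than serialization cost, which is what makes 850K features/sec per process plausible.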
Result
8 hours → 12ms end-to-end signal latency. Enabled 3 new intraday strategies. 850K features/sec across 4,000 instruments. Sharpe ratio improvement of 0.4 across the portfolio.
Problem
LLM fine-tuning across specialized domains was bottlenecked by environment quality. Reward models were brittle — high scores didn't correlate with actual task performance. Data vendors delivered inconsistent quality, and the team spent more time debugging reward hacking than improving capabilities. New domain onboarding took 6+ weeks.
Approach
Designed and built a scalable RL environment framework. Standardized the domain onboarding pipeline: task specification → data collection → reward calibration → QA → training integration. Built automated reward hacking detection (distributional analysis of reward signals, adversarial probing for reward shortcuts). Established vendor evaluation rubrics with rapid iteration cycles. Created domain-specific evaluation suites that measured real-world task completion, not proxy metrics.
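One concrete form the reward-hacking detection can take: if the reward model's scores stop correlating with ground-truth task completion on a held-out probe set, the policy has likely found a shortcut. A minimal sketch using Pearson correlation (the threshold and function names are illustrative, not the production detector):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation between two equal-length series.
double pearson(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = x.size();
    double mx = 0, my = 0;
    for (std::size_t i = 0; i < n; ++i) { mx += x[i]; my += y[i]; }
    mx /= n; my /= n;
    double sxy = 0, sxx = 0, syy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
        syy += (y[i] - my) * (y[i] - my);
    }
    return sxy / std::sqrt(sxx * syy);
}

// Flag a domain when the correlation between model-assigned rewards and
// ground-truth task scores drops below a calibrated floor: high rewards
// that no longer predict real task completion are the hacking signature.
bool reward_hacking_suspected(const std::vector<double>& rewards,
                              const std::vector<double>& task_scores,
                              double floor = 0.5) {
    return pearson(rewards, task_scores) < floor;
}
```

The distributional and adversarial probes layer on top of a check like this; the correlation floor is what turns "reward went up" from a success signal into a question.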
Result
Domain onboarding: 6 weeks → 8 days. Shipped 12 RL environments across finance, legal, medical, and code domains. Reward-task correlation improved from 0.31 to 0.89. Model win rate on human eval increased 34% across targeted capabilities. Managed 4 vendor partnerships with weekly delivery cadence.
Perspective
On Performance
The difference between 10 microseconds and 1 microsecond isn't 9 microseconds. It's the difference between seeing an opportunity and watching someone else take it. Performance isn't an optimization phase — it's an architectural decision that compounds at every layer. You can't bolt it on later. Either it's in the foundation, or it's not in the building.
On Systems
A system isn't fast because of any single optimization. It's fast because every layer was designed by someone who understood the layer below it. Memory layout informs data structure choice. Data structure choice informs algorithm design. Algorithm design informs system architecture. There are no shortcuts. Only understanding, or the absence of it.
On Training AI
A model is only as good as the environment it learned in. The hardest part of RL isn't the algorithm — it's building reward functions that actually measure what you care about, catching the shortcuts before the model finds them, and knowing when your evaluation suite is lying to you. Getting this right requires domain depth, engineering rigor, and a paranoid attention to data quality. Most people underestimate how much infrastructure sits between "we have a model" and "the model works."
On Value
I don't write code. I build competitive advantages that happen to be implemented in code. The best systems I've built aren't impressive because of their technology — they're impressive because they made something possible that wasn't possible before. The compiler doesn't care about your business model. But I do.