Build notes, what broke, and the route ahead — written from inside the work, sent to your inbox. Algorithm-proof, once a week.
2,140 engineers on the expedition · no hype · unsubscribe anytime
Intuition said the retrieval was fine. The harness disagreed — and it was right. A walkthrough of the exact metrics that caught a silent regression before it shipped.
Most RAG pipelines stop at retrieval. One cross-encoder re-ranking pass, under 100 lines of code, cut our irrelevance rate by a third. Here's the implementation and why it's almost always worth the inference cost.
Not frameworks, not APIs. The six conceptual shifts — eval-first thinking, token economics, retrieval intuition, latency tradeoffs, failure-mode reasoning, and observability — that define the role in production.