RAG-over-Docs, re-ranked — inferLearn Field Log

Log Entry #3 · Nov 28

Hybrid retrieval degraded recall on long-tail queries. Time to fix it.

Problem: When the BM25 and vector rankings were fused with RRF (Reciprocal Rank Fusion), the relevant documents for long-tail semantic queries dropped out of the top-K.
What we tried: Added a cross-encoder reranker at the final step (Cohere Rerank API).
Result: +18% retrieval accuracy and bounded latency.
Next: Cache results for identical queries to cut latency further.
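
To make the failure mode concrete, here is a minimal sketch of RRF over two ranked lists. It is not our production code: the function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration. RRF works on rank positions rather than raw scores, so a document that only one retriever finds, even at rank 1, can land below documents that both retrievers rank moderately well.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is an assumed default, not a value taken from the log.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


# Hypothetical rankings: doc_d is the vector retriever's best hit,
# but BM25 never returns it (a typical long-tail semantic match).
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_d", "doc_a", "doc_b"]

fused_top3 = rrf_fuse([bm25_ranking, vector_ranking], top_k=3)
fused_top2 = rrf_fuse([bm25_ranking, vector_ranking], top_k=2)
print(fused_top3)
print(fused_top2)
```

With a small K, `doc_d` falls out of the fused list even though the vector retriever ranked it first. A reranker that scores each query-document pair directly does not depend on agreement between the two retrievers, which is why adding it as the final step can help here.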

Log Entry #2 · Nov 20

Attempted semantic chunking using spaCy boundaries.

Problem: Chunking by raw character count was splitting tables and code blocks down the middle.
What we tried: Implemented a semantic router that detects headers and keeps blocks intact.
Result: Context window density improved significantly.
Next: Implement BM25 for keyword lookup.
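
A rough sketch of the idea behind header-aware splitting, assuming markdown-style input. This is a simplified stand-in, not the spaCy-based code from this entry: `chunk_by_headers` and the sample document are hypothetical. It starts a new chunk only at a header line outside a fenced code block, so code blocks and tables stay in one piece.

```python
import re

FENCE = "`" * 3  # code-fence marker, built this way to keep the sketch readable
HEADER = re.compile(r"^#{1,6}\s")


def chunk_by_headers(text):
    """Split markdown-like text at headers, never inside a fenced code block."""
    chunks, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Start a new chunk only at a header that sits outside a code fence.
        if HEADER.match(line) and not in_fence and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


# Hypothetical sample: a code block containing a '#' comment, then a table.
sample = "\n".join([
    "# Install",
    "Run the installer.",
    FENCE + "bash",
    "# this comment is not a header",
    "pip install example",
    FENCE,
    "# Usage",
    "| flag | meaning |",
    "| --- | --- |",
    "| -v | verbose |",
])

chunks = chunk_by_headers(sample)
print(len(chunks))
```

The sketch has no size cap, so a very long section would still need a secondary split (for example at sentence boundaries) before embedding.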

Log Entry #1 · Nov 15

Baseline created.

Goal: Set up a simple LangChain pipeline to load the documentation.
Result: Pipeline functional. Latency is currently 4.1s (too slow).
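
For tracking that latency figure across later entries, a small timing harness along these lines may help. It is a sketch under assumptions: `measure_latency` and `fake_pipeline` are hypothetical names, and the stand-in pipeline just sleeps instead of calling the real LangChain chain.

```python
import statistics
import time


def measure_latency(fn, queries, repeats=3):
    """Return the median wall-clock seconds per call of fn over the queries."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            fn(query)
            samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_pipeline(query):
    # Placeholder for the real pipeline call; sleeps briefly to simulate work.
    time.sleep(0.01)
    return f"answer to {query}"


median_s = measure_latency(fake_pipeline, ["how do I install?", "what is top-K?"])
print(f"median latency: {median_s:.3f}s")
```

Using the median over several repeats keeps one slow cold-start call from dominating the number.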