Hybrid RAG Pipeline for Enterprise Docs | Ridhwan Amin

Problem

A leading energy company needed a Q&A system over 50,000+ internal technical documents — drilling reports, well completion records, and regulatory filings. Off-the-shelf dense retrieval systems failed on exact technical terminology (e.g. well IDs, formation names, equipment codes) where BM25-style exact match is critical.

Approach

The pipeline uses a two-stage hybrid retrieval strategy:

Stage 1 — parallel retrieval: FAISS HNSW index for dense semantic search + BM25 Elasticsearch index for keyword match. Top-k from each (k=20) are merged and deduplicated.
Stage 2 — reranking: A cross-encoder (ms-marco-MiniLM-L-6-v2 fine-tuned on domain data) scores all 40 candidates and returns the top 5.
Stage 3 — generation: GPT-4o produces the final answer with retrieved context, citing source document IDs.

Results

Evaluated on a manually curated 200-question benchmark:

Answer faithfulness (human-rated): 78% (hybrid) vs 66% (dense-only) — +12 pp
Retrieval recall@5: 0.84 vs 0.71
Latency (p95): 1.8s end-to-end (including reranker)

Tech stack

LangChain, FAISS, Elasticsearch 8, FastAPI, OpenAI API, HuggingFace Transformers, Docker, Redis (answer cache).