Build a Retrieval-Augmented Generation pipeline with hybrid search (vector + keyword) and a reranking step for higher precision answers.
## Task

RAG pipeline with hybrid search and reranking for high-precision Q&A.

## Requirements

- Vector DB: pgvector, Pinecone, or Qdrant
- Embeddings: OpenAI `text-embedding-3-small` or Cohere `embed-v3`
- Reranker: Cohere Rerank or a cross-encoder model
- Language: Python or TypeScript

## Pipeline

```
Query → [Hybrid Search] → [Rerank] → [LLM Generate]
```

1. Hybrid Search (run both in parallel):
   - Vector search: embed the query → top 20 by cosine similarity
   - Keyword search: BM25/FTS on the same corpus → top 20
   - Merge the two result lists with Reciprocal Rank Fusion (RRF)
2. Rerank:
   - Take the merged top 30 results
   - Score (query, document) pairs with a cross-encoder
   - Keep the top 5
3. Generate:
   - Inject the top 5 chunks as context
   - System prompt: "Answer based only on provided context"
   - Include source citations

## Implementation Notes

1. Chunk documents at 512 tokens with a 50-token overlap
2. Store metadata per chunk: source URL, title, chunk index
3. Cache embeddings; don't re-embed the corpus on every query
4. Return "I don't have enough information" when the context is insufficient
5. Return a confidence score derived from the reranker scores
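The chunking rule in note 1 (512-token windows, 50-token overlap) can be sketched as a sliding window. This is a minimal illustration that operates on a pre-tokenized list; a real implementation would tokenize with the embedding model's own tokenizer (e.g. `tiktoken` for OpenAI models), and `chunk_tokens` is a hypothetical helper name.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into fixed-size windows that overlap by `overlap` tokens."""
    step = size - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # last window already covers the tail
        start += step
    return chunks
```

Each chunk's index in the returned list is the `chunk index` metadata from note 2, so a citation can point back to the exact window in the source document.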
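The RRF merge in step 1c can be written without any external dependency: each document scores `1 / (k + rank)` in every list it appears in, and the sums are sorted. `k = 60` is the constant from the original RRF paper; `rrf_merge` is an illustrative name, not a library function.

```python
def rrf_merge(rankings, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank).

    `rankings` is a list of ranked ID lists (e.g. vector hits, keyword hits).
    Returns doc IDs sorted by fused score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked moderately well in both lists (like `d2` below) beats one that is top-ranked in only a single list, which is exactly why RRF is a good fit for fusing vector and BM25 results.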
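For note 5, one simple way to turn raw cross-encoder scores into a 0–1 confidence is a softmax over the kept chunks: if the top chunk's score dominates, confidence is high; if all chunks score similarly, it is low. This is one possible heuristic, not the only one, and the function name is hypothetical.

```python
import math

def confidence_from_rerank(scores):
    """Map cross-encoder logits for the kept chunks to a 0-1 confidence.

    Softmax the scores and return the top chunk's probability mass.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return max(exps) / sum(exps)
```

A threshold on this value can also drive the "I don't have enough information" response in note 4, e.g. by refusing to answer below 0.3 (the cutoff would need tuning on real queries).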
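The generation step (inject top-5 chunks, restrict the model to the context, cite sources) amounts to prompt assembly. A minimal sketch, assuming each chunk arrives as a `(text, source_url)` pair built from the metadata in note 2; `build_prompt` and the bracket-number citation format are illustrative choices:

```python
def build_prompt(question, chunks):
    """Assemble (system, user) messages from retrieved chunks.

    `chunks` is a list of (text, source_url) pairs, best-ranked first.
    Each chunk is numbered so the model can cite it as [n].
    """
    context = "\n\n".join(
        f"[{i}] {text}\n(Source: {url})"
        for i, (text, url) in enumerate(chunks, start=1)
    )
    system = (
        "Answer based only on the provided context. Cite sources as [n]. "
        "If the context is insufficient, reply: I don't have enough information."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return system, user
```

The two returned strings map directly onto the system/user messages of a chat-completion API call; answers can then be post-processed to replace `[n]` markers with the stored source URLs.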