Building AI Search with Hybrid Retrieval Strategies

By Mohd Baquir Qureshi
Pure vector similarity search works well for finding semantically related content, but it can miss results that match specific keywords or technical terms. Pure keyword search finds exact matches but misses synonyms and related concepts. Hybrid search combines both approaches so that each covers the other's blind spots.

How Hybrid Search Works

Hybrid search runs two parallel searches for every query: a vector similarity search using embeddings and a keyword search, typically ranked with BM25. The two ranked lists are then merged into a single ranking using reciprocal rank fusion (RRF).

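RRF scores each result from its rank in each individual search rather than from the raw similarity or relevance scores, which are not comparable across the two searches. Each list contributes 1 / (k + rank) to a result's total, where k is a smoothing constant (60 is a common default). Results that rank highly in both searches get the highest combined score. Results that appear in only one search still make it into the final list but with a lower score.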
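
The scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the document IDs, and the choice of k = 60 are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each list contributes 1 / (k + rank) for every ID it contains,
    with ranks starting at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from the two searches, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here doc_b and doc_a appear in both lists and so outrank doc_d and doc_c, which each appear in only one.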

Implementation with PostgreSQL

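PostgreSQL is well suited for hybrid search because it supports both vector search (via the pgvector extension) and full-text search (via the built-in tsvector type) in one database. You can run both searches in a single database query, which simplifies your architecture and avoids the extra network round trips of separate search services. One caveat: PostgreSQL's built-in ranking functions, ts_rank and ts_rank_cd, are not BM25, so they stand in for the keyword ranker unless you add an extension that provides BM25 scoring.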

I create a table with both a vector column for embeddings and a tsvector column for full-text search, then use common table expressions (CTEs) to run both searches and merge their results with RRF scoring in one query.
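
A sketch of what that schema and query might look like, held as SQL strings in Python. The table and column names (documents, body, embedding, tsv), the 1536-dimension embedding size, the cosine distance operator, the HNSW index, and the helper build_params are all assumptions for illustration, not the author's actual schema. The named placeholders follow the style used by drivers such as psycopg; the block only builds the statements and parameters and does not connect to a database.

```python
# Assumed schema: names and the 1536 dimension are hypothetical.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
    id        bigserial PRIMARY KEY,
    body      text NOT NULL,
    embedding vector(1536),
    tsv       tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
);

CREATE INDEX IF NOT EXISTS documents_tsv_idx
    ON documents USING gin (tsv);
CREATE INDEX IF NOT EXISTS documents_embedding_idx
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# Two CTEs produce ranked candidate lists; the outer query fuses them with RRF.
HYBRID_QUERY = """
WITH vector_search AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY embedding <=> %(query_vec)s::vector) AS rnk
    FROM documents
    ORDER BY embedding <=> %(query_vec)s::vector
    LIMIT %(candidates)s
),
keyword_search AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY ts_rank_cd(tsv, q) DESC) AS rnk
    FROM documents, plainto_tsquery('english', %(query_text)s) AS q
    WHERE tsv @@ q
    ORDER BY ts_rank_cd(tsv, q) DESC
    LIMIT %(candidates)s
)
SELECT COALESCE(v.id, k.id) AS id,
       COALESCE(1.0 / (%(rrf_k)s + v.rnk), 0.0)
     + COALESCE(1.0 / (%(rrf_k)s + k.rnk), 0.0) AS rrf_score
FROM vector_search v
FULL OUTER JOIN keyword_search k ON v.id = k.id
ORDER BY rrf_score DESC
LIMIT %(limit)s;
"""


def build_params(query_text, query_embedding, limit=10, candidates=50, rrf_k=60):
    """Build the parameter dict for HYBRID_QUERY (hypothetical helper).

    The embedding is sent as a pgvector text literal such as "[0.1,0.2]".
    """
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    return {
        "query_text": query_text,
        "query_vec": vector_literal,
        "limit": limit,
        "candidates": candidates,
        "rrf_k": rrf_k,
    }


params = build_params("reset api key", [0.1, 0.2])
print(params["query_vec"])
```

The FULL OUTER JOIN is what lets a document found by only one search stay in the final list: its missing rank becomes NULL, and COALESCE turns that side's contribution into zero. With a driver that accepts named placeholders, the query would be run as something like cursor.execute(HYBRID_QUERY, params), using an embedding whose length matches the column dimension.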

When to Use Hybrid Search

Hybrid search is particularly valuable for technical documentation, product catalogs, and knowledge bases where users may search using both natural language questions and specific technical terms. For purely conversational search (like chatbot queries), vector-only search is often sufficient.
