Semantic search
Query Embedding -> Nearest Neighbor Search
Nearest Neighbor queries: naive method
*full scan
*sort
*top-k
*Inefficient
Nearest Neighbor queries: Index-based method
definition of RAG
Retrieval Augmented Generation
what does Retrieval Augmented Generation do when given a query
*Create a query embedding
*Run an approximate nearest neighbor (ANN) query
to find the most relevant documents
*With an appropriate prompt, feed the documents
through a generative AI model to get the final
answer
Vector Indexing
Locality Sensitive Hashing (LSH)
Product Quantization (PQ) Operations
*Break down a big vector into subvectors
*Quantize the value of each subvector - Reduce the size of each vector
*Store a list of vector IDs (record IDs) and their corresponding quantized vector
*For a query vector (q), do the same quantization
*Compute the distances to the quantized vectors and the find the nearest one
Navigable Small World (NSW)
basic concept
*Connect each data point to its
nearest neighbors
*Connect each data point to a few
faraway neighbors
*Search: Follow the edges to the
target point
Hierarchical Navigable Small World (HNSW)
basic concept
*Create a hierarchy of graphs
*Each vertex is elevated to the higher level with a probability
*Search starts at the highest level and goes to the next level when the search converges
Key indexing tools:
LSH, HNSW, Product Quantization (PQ)
Vector DBs integrate what
embedding storage, ANN search, and RAG