Vector Databases
Embeddings, indexes, filters, and performance trade-offs.
Core Decisions
- Index: HNSW, IVF-PQ, Flat (exact), DiskANN
- Filters: metadata pre-filter or post-filter, hybrid keyword + vector search
- Rerank: cross-encoder or lightweight MMR (maximal marginal relevance)
- Freshness: upserts, TTL, delta indexes
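The lightweight MMR option listed under Rerank can be illustrated with a small sketch. This is a minimal pure-Python version under assumptions, not any particular database's API: the names `cosine`, `mmr_rerank`, and `lam` are hypothetical, and `candidates` stands in for the shortlist an approximate index would return.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(query, candidates, k, lam=0.5):
    """Pick up to k candidate ids, trading relevance against redundancy.

    candidates: dict of id -> embedding (hypothetical ANN shortlist).
    lam: 1.0 ranks purely by relevance; lower values favor diversity.
    """
    relevance = {cid: cosine(query, vec) for cid, vec in candidates.items()}
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(cid):
            # Penalty: similarity to the closest already-selected item.
            redundancy = max(
                (cosine(candidates[cid], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[cid] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical shortlist: "a" and "b" are near-duplicates, "c" is distinct.
shortlist = {"a": [1.0, 0.0], "b": [0.99, 0.1], "c": [0.6, 0.8]}
print(mmr_rerank([1.0, 0.2], shortlist, k=2, lam=0.5))
```

With `lam=1.0` the two near-duplicates are returned together; lowering `lam` lets the distinct item displace one of them. A cross-encoder would typically give better relevance scores but needs a model call per candidate, which is why MMR is the lighter choice.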
Latency vs Cost
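- Smaller embeddings (fewer dimensions or lower precision) → less RAM per vector, faster distance computations
- Quantization (e.g. PQ) or IVF partitioning → faster search, at some cost in recall
- Batch queries and cache popular items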
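The memory side of these trade-offs is simple arithmetic, sketched below. The corpus size, dimensions, and PQ settings are assumed values chosen for illustration, and the helper names `flat_index_bytes` and `pq_index_bytes` are hypothetical. The estimate covers vector storage only and ignores codebooks, graph links, and metadata.

```python
def flat_index_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw vector storage for an uncompressed index (float32 by default)."""
    return num_vectors * dim * bytes_per_value


def pq_index_bytes(num_vectors, num_subquantizers, bits_per_code=8):
    """Code storage for product quantization; codebooks and overhead are ignored."""
    return num_vectors * num_subquantizers * bits_per_code // 8


n = 10_000_000  # hypothetical corpus size
for dim in (1536, 768, 384):
    gib = flat_index_bytes(n, dim) / 2**30
    print(f"dim={dim}: {gib:.1f} GiB as float32")

pq_gib = pq_index_bytes(n, num_subquantizers=64) / 2**30
print(f"PQ with 64 one-byte codes per vector: {pq_gib:.2f} GiB")
```

Under these assumptions, 10 million float32 vectors need roughly 57 GiB at 1536 dimensions and about 14 GiB at 384, while 64-byte PQ codes fit in well under 1 GiB. The recall lost to compression has to be measured on real queries.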
Metadata Strategy
- doc_id, section, product, audience, region
- date, version, security_tag
- Pre-filter on metadata to narrow the candidate set before vector search where possible
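Filtering before the vector search can be sketched as follows. This is a minimal in-memory illustration, assuming exact-match filters and a brute-force cosine scan over the survivors. The `Record` class, the `search` function, and the sample documents are hypothetical and do not reflect a specific database's API.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    doc_id: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(records, query, k, filters=None):
    """Pre-filter on exact-match metadata, then rank survivors by similarity."""
    filters = filters or {}
    survivors = [
        r for r in records
        if all(r.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        survivors, key=lambda r: cosine(query, r.embedding), reverse=True
    )
    return [r.doc_id for r in ranked[:k]]


# Hypothetical documents tagged with fields from the list above.
records = [
    Record("kb-1", [0.9, 0.1], {"product": "billing", "region": "eu"}),
    Record("kb-2", [0.8, 0.2], {"product": "billing", "region": "us"}),
    Record("kb-3", [0.1, 0.9], {"product": "search", "region": "eu"}),
]
print(search(records, [1.0, 0.0], k=2, filters={"region": "eu"}))
```

Post-filtering would instead take the top k by similarity first and drop non-matching records afterwards, which can return fewer than k results when the filter is selective.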