Lesson 3 of 3
Building a Vector Search Pipeline
45 min
End-to-End Vector Search
Now let's put it all together: document loading, chunking, embedding, and searching. This is the foundation of every RAG system.
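Chunking is the one stage the pipeline example in this lesson does not show in code, so here is a minimal sketch of it. It assumes the simplest strategy, fixed-size character windows with overlap; the function name `chunk_text` and its default sizes are made up for illustration, and real pipelines often split on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

The overlap means text cut at a chunk boundary still appears whole in a neighboring chunk, at the cost of storing some text twice.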
The Complete Pipeline
A minimal end-to-end vector search pipeline

```python
# End-to-end vector search with ChromaDB (the retrieval step of RAG)
import chromadb

# In-memory client: the data is gone when the process exits
chroma = chromadb.Client()

# 1. Create a collection (like a table for vectors)
collection = chroma.create_collection(name="my_docs")

# 2. Add documents (ChromaDB embeds them with its default embedding function)
documents = [
    "The quick brown fox jumps over the lazy dog",
    "Machine learning is a subset of artificial intelligence",
    "Python is a popular programming language for data science",
    "Neural networks are inspired by the human brain",
]
collection.add(
    documents=documents,
    ids=[f"doc_{i}" for i in range(len(documents))],
)

# 3. Query the collection (the query text is embedded the same way)
results = collection.query(
    query_texts=["What is AI?"],
    n_results=2,
)

# results["documents"] holds one list of matches per query text
print("Most relevant documents:")
for doc in results["documents"][0]:
    print(f"  - {doc}")
```

⚠️ `chromadb.Client()` keeps everything in memory, which is great for prototyping. In production, use a persistent vector store such as Pinecone, Weaviate, or Qdrant.
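To make the query step less of a black box, here is a rough sketch of what a nearest-neighbor lookup does conceptually: score every stored vector against the query vector and keep the top results. It assumes cosine similarity as the measure, and the 3-dimensional vectors below are invented for illustration (real embeddings have hundreds of dimensions). The helper names are hypothetical, and a real vector store may use a different distance metric and an approximate index rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2):
    """Brute-force search: score every document, return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented toy "embeddings", not the output of any real model
toy_vectors = {
    "doc_0": [0.9, 0.1, 0.0],
    "doc_1": [0.1, 0.9, 0.2],
    "doc_2": [0.0, 0.3, 0.9],
    "doc_3": [0.2, 0.8, 0.3],
}
query_vector = [0.0, 1.0, 0.1]

for doc_id, score in top_k(query_vector, toy_vectors, k=2):
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, `doc_1` and `doc_3` point in nearly the same direction as the query, so they come back first.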
Improving Retrieval Quality
**Techniques for better retrieval:**
1. **Hybrid Search:** Combine vector + keyword search
2. **Re-ranking:** Use a second model to score relevance
3. **Query Expansion:** Rephrase queries multiple ways
4. **Metadata Filtering:** Filter by date, source, category
5. **Parent-Child Chunking:** Retrieve chunks, return parent docs
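As a sketch of the first technique, hybrid search needs some way to merge a vector ranking with a keyword ranking. One common option (an assumption here, the lesson does not name a method) is reciprocal rank fusion, where each document earns `1 / (k + rank)` from every list it appears in. The function name, the constant `k=60`, and the two example rankings below are illustrative only.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one fused ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        # rank starts at 1 for the best hit in each list
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from two separate retrievers for the same query
vector_hits = ["doc_1", "doc_3", "doc_2"]
keyword_hits = ["doc_3", "doc_0", "doc_1"]

print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['doc_3', 'doc_1', 'doc_0', 'doc_2']
```

Because it only looks at ranks, this approach sidesteps the problem that vector distances and keyword scores live on different scales.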
Vector Search Check
Question 1 of 2: What is cosine similarity used for in RAG?