# RAG Pipeline
Index your codebase so the AI has relevant context during every review.
## How it works
At index time, Merlin AI Code Review chunks your source files and embeds them using OpenAI or Ollama. At review time, each diff chunk is embedded and the nearest stored documents are retrieved and prepended to the AI prompt — giving the model context about your conventions, related code, and past issues.
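The retrieve step can be sketched in a few lines of Python. This is a toy illustration with made-up helper names and 2-d vectors, not Merlin's actual API: score every stored chunk against the query embedding by cosine similarity, then keep the `top_k` chunks above `min_score`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, top_k=5, min_score=0.70):
    """Return the text of the top_k most similar chunks above min_score."""
    scored = [(cosine(query_vec, doc_vec), text) for doc_vec, text in store]
    scored.sort(reverse=True)
    return [text for score, text in scored[:top_k] if score >= min_score]

# Toy 2-d "embeddings": the auth chunk points the same way as the query.
store = [([1.0, 0.0], "fn check_auth() { ... }"),
         ([0.0, 1.0], "README intro")]
print(retrieve([0.9, 0.1], store, top_k=1))
# -> ['fn check_auth() { ... }']
```

The retrieved texts are then prepended to the review prompt, so the model sees related code alongside the diff.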
## Quick start (zero infra, OpenAI embedder)

```toml
[rag]
enabled = true
embedder = "openai"                    # "openai" | "ollama"
embed_model = "text-embedding-3-small"
store = "local"                        # JSONL flat file, zero setup
```

```sh
$ OPENAI_API_KEY=sk-... merlin rag index .   # index current directory
$ merlin review                              # review now has RAG context
```
## Quick start (Ollama, local embedder)

```toml
[rag]
enabled = true
embedder = "ollama"
embed_model = "nomic-embed-text"
store = "local"
```

```sh
$ ollama pull nomic-embed-text   # one-time
$ merlin rag index .             # index current directory
$ merlin review                  # review now has RAG context
```
## Vector store backends
| Store | `store` value | Setup | Best for |
|---|---|---|---|
| Local JSONL | `local` | None | Small repos, dev/CI |
| Memory | `memory` | None | Testing |
| Qdrant | `qdrant` | Docker or Qdrant Cloud | Production self-hosted |
| ChromaDB | `chroma` | Docker or `pip install` | Open-source alternative |
| Pinecone | `pinecone` | cloud.pinecone.io account | Managed cloud |
### Qdrant setup

```sh
# Start Qdrant
$ docker run -p 6333:6333 qdrant/qdrant
```

```toml
[rag]
enabled = true
store = "qdrant"
qdrant_url = "http://localhost:6333"
# qdrant_api_key = ""   # for Qdrant Cloud
```
### Pinecone setup

```toml
[rag]
enabled = true
store = "pinecone"
pinecone_host = "https://my-index-xyz.svc.us-east1.pinecone.io"
# pinecone_api_key = ""   # or set PINECONE_API_KEY env var
```
#### Pinecone index setup

Create your Pinecone index manually in the console first. Choose the cosine metric and match the dimension of your embedding model (768 for `nomic-embed-text`, 1536 for `text-embedding-3-small`).
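A cheap way to catch a dimension mismatch before indexing anything is to embed a probe string and compare the vector length to the index dimension. This is a hedged sketch, not part of Merlin: `embed` stands in for whatever embedding client you use, and `check_dims` is a hypothetical helper.

```python
# Known dimensions for the embedding models mentioned above.
MODEL_DIMS = {"nomic-embed-text": 768, "text-embedding-3-small": 1536}

def check_dims(embed, model, index_dim):
    """Embed a probe string and verify it matches the index dimension."""
    vec = embed("dimension probe")
    if len(vec) != index_dim:
        raise ValueError(
            f"{model} produces {len(vec)}-d vectors but the index expects {index_dim}"
        )
    return True

# Fake embedder standing in for a real OpenAI/Ollama client.
fake_embed = lambda text: [0.0] * 1536
print(check_dims(fake_embed, "text-embedding-3-small",
                 MODEL_DIMS["text-embedding-3-small"]))
# -> True
```

If the check raises, fix the index dimension (or the `embed_model`) before running `merlin rag index`; mismatched dimensions are rejected at upsert time.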
## CLI commands

```sh
# Index a directory (walks all files matching index_extensions)
$ merlin rag index .
$ merlin rag index src/

# Search the index
$ merlin rag search "authentication bypass"
$ merlin rag search "SQL injection" -k 10

# Manage the index
$ merlin rag count   # how many chunks are indexed
$ merlin rag clear   # delete all data in the collection
```
## Caching the RAG index in CI
The local JSONL index can be cached between CI runs so you only pay the embedding cost on the first run or when source files change.
```yaml
- name: Cache RAG index
  uses: actions/cache@v4
  with:
    path: merlin-rag.jsonl
    key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}
    restore-keys: merlin-rag-

- name: Build RAG index (first run only)
  run: test -f merlin-rag.jsonl || merlin rag index .
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```
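`hashFiles` makes the cache key change whenever any matched source file changes, which is exactly what invalidates a stale index. The same idea in plain Python, as an illustrative sketch rather than what `actions/cache` actually runs:

```python
import hashlib

def cache_key(prefix, files):
    """Derive a cache key from file contents, like hashFiles('src/**', 'lib/**')."""
    h = hashlib.sha256()
    for name, content in sorted(files.items()):  # stable order -> stable key
        h.update(name.encode())
        h.update(content)
    return f"{prefix}-{h.hexdigest()[:12]}"

files = {"src/main.rs": b"fn main() {}", "lib/util.rs": b"pub fn util() {}"}
key1 = cache_key("merlin-rag", files)
files["src/main.rs"] = b"fn main() { println!(); }"  # edit one file
key2 = cache_key("merlin-rag", files)
print(key1 != key2)
# -> True: any source change yields a new key, forcing a re-index
```

The `restore-keys` prefix lets a run fall back to the most recent index when the exact key misses, so only changed content pays the embedding cost again.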
## Configuration reference

```toml
[rag]
enabled = true
embedder = "openai"                    # "openai" | "ollama"
embed_model = "text-embedding-3-small"
store = "local"
collection = "merlin"                  # collection/namespace name
top_k = 5                              # documents retrieved per query
min_score = 0.70                       # minimum cosine similarity
chunk_lines = 80                       # lines per indexed chunk
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]
local_path = "merlin-rag.jsonl"

# Ollama embedder
# embedder = "ollama"
# embed_model = "nomic-embed-text"
# ollama_base_url = "http://localhost:11434"
```
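To see how `chunk_lines` and `index_extensions` interact during indexing, here is a hedged sketch of the walk. `chunk_file` is a hypothetical helper, not Merlin's actual chunker: files with unlisted extensions are skipped, and the rest are split into fixed-size line windows.

```python
def chunk_file(path, text, chunk_lines=80, index_extensions=(".rs", ".py")):
    """Split a file into fixed-size line windows, skipping unindexed extensions."""
    if not any(path.endswith(ext) for ext in index_extensions):
        return []
    lines = text.splitlines()
    return ["\n".join(lines[i:i + chunk_lines])
            for i in range(0, len(lines), chunk_lines)]

source = "\n".join(f"line {n}" for n in range(200))
print(len(chunk_file("src/main.rs", source, chunk_lines=80)))
# -> 3: 200 lines split into windows of 80, 80, and 40
print(len(chunk_file("image.png", source)))
# -> 0: extension not in index_extensions
```

Smaller `chunk_lines` values give more precise retrieval hits at the cost of more embedding calls and stored vectors.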
## Indexing past review comments

Merlin AI Code Review can also index past AI review comments so future reviews learn from them. The `comment_to_doc()` function in `src/rag/indexer.rs` converts a past comment into a `Document` that can be upserted via `pipeline.index_documents()`. This is a programmatic API; CI integration is planned for a future release.