RAG Pipeline

Index your codebase so the AI has relevant context during every review.

How it works

At index time, Merlin AI Code Review chunks your source files and embeds them using OpenAI or Ollama. At review time, each diff chunk is embedded and the nearest stored documents are retrieved and prepended to the AI prompt — giving the model context about your conventions, related code, and past issues.
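The chunk/embed/retrieve cycle can be sketched as follows. This is a minimal, self-contained illustration, not Merlin's actual implementation — `chunk`, `cosine`, and `retrieve` are hypothetical helpers, and the store is an in-memory list standing in for a real vector backend:

```rust
// Illustrative sketch of the RAG mechanics; all names are hypothetical.

/// Split a source file into fixed-size line chunks (cf. `chunk_lines = 80`).
fn chunk(source: &str, lines_per_chunk: usize) -> Vec<String> {
    source
        .lines()
        .collect::<Vec<_>>()
        .chunks(lines_per_chunk)
        .map(|c| c.join("\n"))
        .collect()
}

/// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Return the top-k stored documents most similar to the query embedding,
/// dropping anything below `min_score` (cf. `top_k = 5`, `min_score = 0.70`).
fn retrieve(
    query: &[f32],
    store: &[(String, Vec<f32>)],
    top_k: usize,
    min_score: f32,
) -> Vec<String> {
    let mut scored: Vec<(f32, &String)> = store
        .iter()
        .map(|(doc, emb)| (cosine(query, emb), doc))
        .filter(|(s, _)| *s >= min_score)
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().take(top_k).map(|(_, d)| d.clone()).collect()
}
```

In the real pipeline the retrieved chunks are prepended to the review prompt; here they would simply be returned as strings.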

Quick start (zero infra, OpenAI embedder)

merlin.toml
```toml
[rag]
enabled = true
embedder = "openai"                    # "openai" | "ollama"
embed_model = "text-embedding-3-small"
store = "local"                        # JSONL flat file, zero setup
```

```shell
$ OPENAI_API_KEY=sk-... merlin rag index .  # index current directory
$ merlin review                             # review now has RAG context
```

Quick start (Ollama, local embedder)

merlin.toml
```toml
[rag]
enabled = true
embedder = "ollama"
embed_model = "nomic-embed-text"
store = "local"
```

```shell
$ ollama pull nomic-embed-text  # one-time
$ merlin rag index .            # index current directory
$ merlin review                 # review now has RAG context
```

Vector store backends

| Store       | Config key | Setup                       | Best for               |
|-------------|------------|-----------------------------|------------------------|
| Local JSONL | `local`    | None                        | Small repos, dev/CI    |
| Memory      | `memory`   | None                        | Testing                |
| Qdrant      | `qdrant`   | Docker or Qdrant Cloud      | Production self-hosted |
| ChromaDB    | `chroma`   | Docker or pip install       | Open-source alternative |
| Pinecone    | `pinecone` | cloud.pinecone.io account   | Managed cloud          |

Qdrant setup

```shell
# Start Qdrant
$ docker run -p 6333:6333 qdrant/qdrant
```

merlin.toml
```toml
[rag]
enabled = true
store = "qdrant"
qdrant_url = "http://localhost:6333"
# qdrant_api_key = ""  # for Qdrant Cloud
```

Pinecone setup

merlin.toml
```toml
[rag]
enabled = true
store = "pinecone"
pinecone_host = "https://my-index-xyz.svc.us-east1.pinecone.io"
# pinecone_api_key = ""  # or set PINECONE_API_KEY env var
```

Pinecone index setup

Create your Pinecone index manually in the console first. Choose cosine metric and match the dimension of your embedding model (768 for nomic-embed-text, 1536 for text-embedding-3-small).
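Since a Pinecone index has a fixed dimension, a mismatched embedding model will cause upserts to fail. A hedged sketch of a pre-flight check (both helpers are hypothetical, using only the two models and dimensions mentioned above):

```rust
// Hypothetical pre-flight check: the embedder's output dimension must
// match the Pinecone index dimension, or upserts will be rejected.

/// Known dimensions for the embedding models mentioned in this doc.
fn embedding_dim(model: &str) -> Option<usize> {
    match model {
        "nomic-embed-text" => Some(768),
        "text-embedding-3-small" => Some(1536),
        _ => None,
    }
}

/// True when the model's vectors fit the index.
fn dims_match(model: &str, index_dim: usize) -> bool {
    embedding_dim(model) == Some(index_dim)
}
```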

CLI commands

```shell
# Index a directory (walks all files matching index_extensions)
$ merlin rag index .
$ merlin rag index src/

# Search the index
$ merlin rag search "authentication bypass"
$ merlin rag search "SQL injection" -k 10

# Manage the index
$ merlin rag count  # how many chunks are indexed
$ merlin rag clear  # delete all data in the collection
```

Caching the RAG index in CI

The local JSONL index can be cached between CI runs so you only pay the embedding cost on the first run or when source files change.

.github/workflows/merlin-review.yml
```yaml
- name: Cache RAG index
  uses: actions/cache@v4
  with:
    path: merlin-rag.jsonl
    key: merlin-rag-${{ hashFiles('src/**', 'lib/**') }}
    restore-keys: merlin-rag-

- name: Build RAG index (first run only)
  run: test -f merlin-rag.jsonl || merlin rag index .
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

Configuration reference

merlin.toml
```toml
[rag]
enabled = true
embedder = "openai"                    # "openai" | "ollama"
embed_model = "text-embedding-3-small"
store = "local"
collection = "merlin"                  # collection/namespace name
top_k = 5                              # documents retrieved per query
min_score = 0.70                       # minimum cosine similarity
chunk_lines = 80                       # lines per indexed chunk
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]
local_path = "merlin-rag.jsonl"

# Ollama embedder
# embedder = "ollama"
# embed_model = "nomic-embed-text"
# ollama_base_url = "http://localhost:11434"
```
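
The `index_extensions` option gates which files are walked at index time. A minimal sketch of that filter — `should_index` is a hypothetical helper, not Merlin's code:

```rust
use std::path::Path;

/// Hypothetical filter: only files whose extension appears in
/// `index_extensions` are chunked and embedded; everything else is skipped.
fn should_index(path: &str, index_extensions: &[&str]) -> bool {
    Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|ext| {
            index_extensions
                .iter()
                .any(|allowed| allowed.trim_start_matches('.') == ext)
        })
        .unwrap_or(false) // extensionless files (e.g. Makefile) are skipped
}
```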

Indexing past review comments

Merlin AI Code Review can also index past AI review comments so future reviews learn from them. The comment_to_doc() function in src/rag/indexer.rs converts a past comment into a Document that can be upserted via pipeline.index_documents(). This is a programmatic API — CI integration is planned for a future release.
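
The programmatic flow might look like the sketch below. The `Document` struct and `comment_to_doc` signature here are stand-ins mimicking the names above — the real definitions live in src/rag/indexer.rs and may differ:

```rust
// Stand-in types mirroring the names mentioned above; hypothetical shapes.

#[derive(Debug, PartialEq)]
struct Document {
    id: String,
    text: String,
}

/// Convert a past review comment into an indexable Document
/// (hypothetical signature for `comment_to_doc()`).
fn comment_to_doc(file: &str, line: u32, body: &str) -> Document {
    Document {
        id: format!("{file}:{line}"),
        text: format!("Past review comment on {file}:{line}: {body}"),
    }
}
```

Per the source, the resulting documents would then be upserted through `pipeline.index_documents()`.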