RAG Pipeline

Index your codebase so the AI has relevant context during every review.

How it works

At index time, Ferret chunks your source files and embeds them with Ollama. At review time, each diff chunk is embedded and the nearest stored documents are retrieved and prepended to the AI prompt — giving the model context about your conventions, related code, and past issues.
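To make the retrieval step concrete, here is a minimal, self-contained sketch: score stored chunks against the embedding of a diff chunk with cosine similarity, drop anything below a threshold, and keep the top k. Every type and function name below is invented for illustration, and the tiny vectors stand in for the real embeddings Ollama produces; this is not Ferret's internal code.

```rust
// Conceptual sketch of the retrieval step, not Ferret's internal code.
// Stored chunks are ranked by cosine similarity against the embedding of a
// diff chunk; results below `min_score` are dropped and the best `top_k` kept.

struct Chunk {
    path: String,
    text: String,
    embedding: Vec<f32>, // produced by the embedding model at index time
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Return up to `top_k` chunks scoring at least `min_score` against the query.
fn retrieve<'a>(
    query_embedding: &[f32],
    index: &'a [Chunk],
    top_k: usize,
    min_score: f32,
) -> Vec<(&'a Chunk, f32)> {
    let mut scored: Vec<(&Chunk, f32)> = index
        .iter()
        .map(|c| (c, cosine(query_embedding, &c.embedding)))
        .filter(|(_, score)| *score >= min_score)
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(top_k);
    scored
}

fn main() {
    // Toy 3-dimensional embeddings; real models emit hundreds of dimensions.
    let index = vec![
        Chunk {
            path: "src/auth.rs".into(),
            text: "fn verify_token(token: &str) -> bool { /* ... */ }".into(),
            embedding: vec![0.9, 0.1, 0.0],
        },
        Chunk {
            path: "docs/style.md".into(),
            text: "Error handling conventions for this repo".into(),
            embedding: vec![0.1, 0.9, 0.2],
        },
    ];
    let diff_embedding = vec![0.8, 0.2, 0.1]; // embedding of the chunk under review
    for (chunk, score) in retrieve(&diff_embedding, &index, 5, 0.70) {
        println!("{score:.2}  {}  {}", chunk.path, chunk.text);
    }
}
```

The top_k and min_score settings in the configuration reference below correspond directly to the two filter parameters in this sketch.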

Quick start (zero infra)

ferret.toml

```toml
[rag]
enabled = true
store = "local"                  # JSONL flat file, zero setup
embed_model = "nomic-embed-text"
```

```shell
$ ollama pull nomic-embed-text   # one-time
$ ferret rag index .             # index current directory
$ ferret review                  # review now has RAG context
```

Vector store backends

| Store | Config key | Setup | Best for |
| --- | --- | --- | --- |
| Local JSONL | `local` | None | Small repos, dev/CI |
| Memory | `memory` | None | Testing |
| Qdrant | `qdrant` | Docker or Qdrant Cloud | Production self-hosted |
| ChromaDB | `chroma` | Docker or pip install | Open-source alternative |
| Pinecone | `pinecone` | cloud.pinecone.io account | Managed cloud |
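Conceptually, every backend in the table can sit behind one small storage interface, which is what lets a single `store` key switch between them. The sketch below shows one way such an abstraction could look; the trait name and method signatures are assumptions for illustration, not Ferret's actual API.

```rust
use std::collections::HashMap;

/// Illustrative backend abstraction; not Ferret's real trait.
trait VectorStore {
    fn upsert(&mut self, id: String, embedding: Vec<f32>, text: String);
    fn count(&self) -> usize;
    fn clear(&mut self);
}

/// The simplest possible backend, comparable in spirit to `store = "memory"`.
#[derive(Default)]
struct MemoryStore {
    docs: HashMap<String, (Vec<f32>, String)>,
}

impl VectorStore for MemoryStore {
    fn upsert(&mut self, id: String, embedding: Vec<f32>, text: String) {
        self.docs.insert(id, (embedding, text));
    }
    fn count(&self) -> usize {
        self.docs.len()
    }
    fn clear(&mut self) {
        self.docs.clear();
    }
}

fn main() {
    // The rest of the pipeline only sees the trait, so swapping backends
    // is a configuration change rather than a code change.
    let mut store: Box<dyn VectorStore> = Box::new(MemoryStore::default());
    store.upsert("src/auth.rs:0".into(), vec![0.9, 0.1], "fn verify_token(...)".into());
    println!("{} chunk(s) indexed", store.count()); // 1
    store.clear();
    println!("{} after clear", store.count());      // 0
}
```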

Qdrant setup

```shell
# Start Qdrant
$ docker run -p 6333:6333 qdrant/qdrant
```

ferret.toml

```toml
[rag]
enabled = true
store = "qdrant"
qdrant_url = "http://localhost:6333"
# qdrant_api_key = ""            # for Qdrant Cloud
```

Pinecone setup

ferret.toml

```toml
[rag]
enabled = true
store = "pinecone"
pinecone_host = "https://my-index-xyz.svc.us-east1.pinecone.io"
# pinecone_api_key = ""          # or set PINECONE_API_KEY env var
```

Pinecone index setup

Create your Pinecone index manually in the console first: use the cosine metric and set the dimension to match your embedding model (768 for nomic-embed-text, 1536 for text-embedding-3-small).

CLI commands

```shell
# Index a directory (walks all files matching index_extensions)
$ ferret rag index .
$ ferret rag index src/

# Search the index
$ ferret rag search "authentication bypass"
$ ferret rag search "SQL injection" -k 10

# Manage the index
$ ferret rag count               # how many chunks are indexed
$ ferret rag clear               # delete all data in the collection
```

Configuration reference

ferret.toml

```toml
[rag]
enabled = true
store = "local"
collection = "ferret"            # collection/namespace name
embed_model = "nomic-embed-text"
ollama_base_url = "http://localhost:11434"
top_k = 5                        # documents retrieved per query
min_score = 0.70                 # minimum cosine similarity
chunk_lines = 80                 # lines per indexed chunk
index_extensions = [".rs", ".ts", ".py", ".go", ".java", ".md"]
local_path = "ferret-rag.jsonl"
```
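As a rough illustration of how `chunk_lines` and `index_extensions` interact at index time, the sketch below filters files by extension and splits each one into fixed-size line windows. The helper names and the exact splitting strategy (plain, non-overlapping line windows) are assumptions; Ferret's real indexer may differ.

```rust
use std::path::Path;

/// Keep only files whose extension appears in `index_extensions`.
/// Illustrative helper, not Ferret's actual indexer code.
fn wants_file(path: &Path, index_extensions: &[&str]) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| index_extensions.iter().any(|e| e.trim_start_matches('.') == ext))
        .unwrap_or(false)
}

/// Split a file into windows of at most `chunk_lines` lines each.
fn chunk_source(source: &str, chunk_lines: usize) -> Vec<String> {
    source
        .lines()
        .collect::<Vec<_>>()
        .chunks(chunk_lines)
        .map(|window| window.join("\n"))
        .collect()
}

fn main() {
    let index_extensions = [".rs", ".md"];
    assert!(wants_file(Path::new("src/rag/indexer.rs"), &index_extensions));
    assert!(!wants_file(Path::new("logo.png"), &index_extensions));

    // A 200-line file split with chunk_lines = 80 yields chunks of 80, 80, 40 lines.
    let file: String = (1..=200).map(|i| format!("line {i}\n")).collect();
    let chunks = chunk_source(&file, 80);
    assert_eq!(chunks.len(), 3);
    println!("{} chunks, last one has {} lines", chunks.len(), chunks[2].lines().count());
}
```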

Indexing past review comments

Ferret can also index past AI review comments so future reviews learn from them. The comment_to_doc() function in src/rag/indexer.rs converts a past comment into a Document that can be upserted via pipeline.index_documents(). This is a programmatic API — CI integration is planned for a future release.
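As a rough sketch of what that conversion can look like, the example below turns a review comment into an embeddable document in the spirit of `comment_to_doc()`. The `ReviewComment` and `Document` shapes and every field name here are assumptions made for illustration; check `src/rag/indexer.rs` for the actual types before building on them.

```rust
use std::collections::BTreeMap;

/// Illustrative shapes only: Ferret's real document and comment types live
/// in src/rag/indexer.rs and may differ from these.
struct ReviewComment {
    file: String,
    line: u32,
    severity: String,
    body: String,
}

struct Document {
    id: String,
    text: String,                       // the text that gets embedded
    metadata: BTreeMap<String, String>, // filterable fields stored alongside it
}

/// Convert a past review comment into an indexable document,
/// in the spirit of `comment_to_doc()`.
fn comment_to_doc(comment: &ReviewComment) -> Document {
    let mut metadata = BTreeMap::new();
    metadata.insert("kind".to_string(), "review_comment".to_string());
    metadata.insert("file".to_string(), comment.file.clone());
    metadata.insert("line".to_string(), comment.line.to_string());
    metadata.insert("severity".to_string(), comment.severity.clone());
    Document {
        id: format!("comment:{}:{}", comment.file, comment.line),
        // Embed the location and severity with the body so that similar
        // diffs retrieve the earlier finding.
        text: format!(
            "[{}] {}:{} {}",
            comment.severity, comment.file, comment.line, comment.body
        ),
        metadata,
    }
}

fn main() {
    let doc = comment_to_doc(&ReviewComment {
        file: "src/db.rs".to_string(),
        line: 42,
        severity: "high".to_string(),
        body: "User input is interpolated into the SQL string; use bind parameters.".to_string(),
    });
    println!("{}\n{}", doc.id, doc.text);
    println!("metadata: {:?}", doc.metadata);
}
```

In practice you would batch documents like this and hand them to `pipeline.index_documents()`, so that later reviews touching the same files retrieve the earlier findings.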