Introduction
ReasonDB is an AI-native document database built in Rust. Rather than splitting documents into flat chunks and searching by vector similarity, it uses Hierarchical Reasoning Retrieval (HRR) — an LLM-guided pipeline that navigates document trees the same way a human expert would.
Documents have structure. That structure has meaning. ReasonDB preserves it.
When to use ReasonDB
- Complex, structured documents — policies, contracts, technical manuals — with cross-references and nested sections.
- Auditable, deterministic answers that can be replayed and cited.
- Domain-specific vocabulary that end users won't phrase correctly (e.g. "big C" → critical illness).
- Regulated industries (finance, insurance, legal, healthcare) where accuracy and traceability are non-negotiable.
Quick Start
Get a document ingested and your first AI-powered query running in under 5 minutes.
Prerequisites
- Docker (recommended) or Rust toolchain for source builds
- An LLM API key — Anthropic, OpenAI, Gemini, Cohere, or a local Ollama instance
1. Start the server
services:
  reasondb:
    image: ghcr.io/brainfish-ai/reasondb:latest
    ports:
      - "4444:4444"
    volumes:
      - ./data:/data
      - ./plugins:/plugins
    environment:
      REASONDB_LLM_PROVIDER: anthropic
      REASONDB_LLM_API_KEY: ${ANTHROPIC_API_KEY}
      REASONDB_MODEL: claude-opus-4-6

docker compose up -d   # Server available at http://localhost:4444
2. Create a table
curl -X POST http://localhost:4444/v1/tables \
-H 'Content-Type: application/json' \
-d '{"name": "policies", "description": "Insurance policy documents"}'

3. Ingest a document
# Ingest a PDF
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
-F 'file=@my_policy.pdf' \
-F 'title=Home Insurance PDS 2024' \
-F 'metadata={"cohort":"home_2024","year":"2024"}'
# Or ingest plain text / Markdown
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
-H 'Content-Type: application/json' \
-d '{
"title": "Disability Benefits Guide",
"content": "# Section 1\nContent here...",
"tags": ["disability", "benefits"]
}'

4. Query with REASON
curl -X POST http://localhost:4444/v1/tables/policies/query \
-H 'Content-Type: application/json' \
-d '{
"query": "SELECT answer, confidence, trace FROM docs REASON \'What disability benefits apply to partial inability to work?\' LIMIT 3"
}'

5. Check the result
{
  "results": [
    {
      "answer": "Partial disability benefit pays 50% of the total disability...",
      "confidence": 0.94,
      "trace": {
        "phases": ["candidate_selection", "structural_filter", "llm_rank", "beam_search"],
        "nodes_visited": 18,
        "sections": ["§4.2 Disability Definitions", "§4.3 Partial Disability Benefit"]
      }
    }
  ]
}

Installation
Docker (recommended)
docker run -d \
  -p 4444:4444 \
  -e REASONDB_LLM_PROVIDER=anthropic \
  -e REASONDB_LLM_API_KEY=your_key \
  -v $(pwd)/data:/data \
  ghcr.io/brainfish-ai/reasondb:latest
Homebrew (macOS / Linux)
brew tap brainfish-ai/reasondb
brew install reasondb
reasondb config init   # interactive setup wizard
reasondb serve
From source
git clone https://github.com/brainfish-ai/reasondb
cd reasondb
cargo build --release
./target/release/reasondb serve
Configuration
Interactive wizard
reasondb config init
CLI commands
reasondb config set llm.provider anthropic
reasondb config set llm.api_key sk-ant-...
reasondb config set server.port 4444
reasondb config list   # show all values
reasondb config path   # show config file location
Config file locations
- macOS: ~/Library/Application Support/reasondb/config.toml
- Linux: ~/.config/reasondb/config.toml
- Windows: %APPDATA%\reasondb\config.toml
Environment variables
Environment variables override config file values. Priority: CLI args → env vars → config file → defaults.
REASONDB_LLM_PROVIDER=anthropic
REASONDB_LLM_API_KEY=sk-ant-...
REASONDB_MODEL=claude-opus-4-6
REASONDB_PORT=4444
REASONDB_HOST=127.0.0.1   # use 0.0.0.0 in production
REASONDB_PATH=data/reasondb.redb
REASONDB_AUTH_ENABLED=false
RUST_LOG=info
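The precedence chain can be illustrated with a small resolution function. This is a sketch of the documented rule, not ReasonDB's own code, and the key-to-env-var mapping shown is an assumption:

```python
import os

def resolve(key, cli_args=None, config_file=None, defaults=None):
    """Resolve one setting with the documented precedence:
    CLI args -> environment variables -> config file -> defaults."""
    # Assumed mapping: "llm.provider" -> "REASONDB_LLM_PROVIDER".
    env_key = "REASONDB_" + key.upper().replace(".", "_")
    for value in (
        (cli_args or {}).get(key),      # 1. CLI args win
        os.environ.get(env_key),        # 2. then environment
        (config_file or {}).get(key),   # 3. then config.toml
        (defaults or {}).get(key),      # 4. then built-in defaults
    ):
        if value is not None:
            return value
    return None

# The env var beats the config file; a CLI arg beats both.
os.environ["REASONDB_LLM_PROVIDER"] = "anthropic"
print(resolve("llm.provider", config_file={"llm.provider": "openai"}))
# -> anthropic
print(resolve("llm.provider", cli_args={"llm.provider": "ollama"},
              config_file={"llm.provider": "openai"}))
# -> ollama
```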
HRR Pipeline
Every REASON query runs through four phases: candidate selection, structural filtering, LLM ranking, and beam search. No phase is skipped; each prunes the candidate set before passing it to the next.
By pruning at each level, ReasonDB visits only 20–50 nodes even across databases with 50 million nodes, a tiny fraction of what a flat vector search must score over the same corpus.
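The pruning idea can be sketched as a beam search over a document tree. In the real pipeline the ranking step is LLM-guided; here `score` is a stand-in heuristic, and the node layout is hypothetical:

```python
def beam_search(root, score, beam_width=3, max_depth=10):
    """Level-by-level traversal that keeps only the `beam_width`
    best-scoring children at each depth, so visits grow with
    depth * beam_width instead of with total tree size."""
    frontier, leaves, visited = [root], [], 0
    for _ in range(max_depth):
        children = []
        for node in frontier:
            visited += 1
            if not node["children"]:
                leaves.append(node)   # answers are extracted from leaves
            children.extend(node["children"])
        if not children:
            break
        # Keep only the best few candidates at this level.
        frontier = sorted(children, key=score, reverse=True)[:beam_width]
    return leaves, visited
```

On a balanced 111-node tree (1 root, 10 branches, 100 leaves) with `beam_width=2`, this visits only 5 nodes, which is the effect the pipeline relies on at scale.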
Document trees
Every ingested document is converted into a hierarchical tree. Markdown headings define the structure; each heading becomes a node with an LLM-generated summary.
- Branch nodes — contain overviews and guide navigation decisions.
- Leaf nodes — hold detailed content where answers are extracted.
- Summaries — generated at ingestion for every node. Enable intelligent routing without scanning raw text.
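The heading-to-tree conversion can be sketched in a few lines. This illustrates the idea only; ReasonDB's ingestion also generates LLM summaries per node, which this sketch omits:

```python
import re

def build_tree(markdown):
    """Nest each Markdown heading under the nearest shallower
    heading; body text attaches to the current node."""
    root = {"title": "ROOT", "level": 0, "content": [], "children": []}
    stack = [root]
    for line in markdown.splitlines():
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if m:
            level = len(m.group(1))
            node = {"title": m.group(2), "level": level,
                    "content": [], "children": []}
            # Pop back up until we find this heading's parent.
            while stack[-1]["level"] >= level:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line.strip():
            stack[-1]["content"].append(line)
    return root
```

Branch nodes end up with children to route through; leaf nodes end up with the detailed content.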
Setting generate_summaries: false speeds up ingestion but significantly reduces search quality. Only disable it for keyword-only SEARCH workloads.

Tables
Tables group related documents and scope queries. Table names must be snake_case, start with a letter, and contain only letters, numbers, and underscores.
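The naming rules can be checked client-side before calling the API. A convenience sketch, not part of any ReasonDB client; it assumes snake_case means lowercase, which the rules imply but do not state outright:

```python
import re

# Starts with a lowercase letter, then lowercase letters, digits,
# or underscores (lowercase-only is an assumption from "snake_case").
TABLE_NAME = re.compile(r"^[a-z][a-z0-9_]*$")

def is_valid_table_name(name: str) -> bool:
    """Return True if `name` satisfies the documented table rules."""
    return bool(TABLE_NAME.fullmatch(name))
```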
# Create
curl -X POST http://localhost:4444/v1/tables \
-H 'Content-Type: application/json' \
-d '{"name": "contracts", "description": "Legal contracts"}'
# List
curl http://localhost:4444/v1/tables
# Get details
curl http://localhost:4444/v1/tables/contracts
# Update
curl -X PATCH http://localhost:4444/v1/tables/contracts \
-H 'Content-Type: application/json' \
-d '{"description": "Updated description"}'
# Delete (documents moved to default table)
curl -X DELETE http://localhost:4444/v1/tables/contracts
# Delete + cascade (removes all documents)
curl -X DELETE http://localhost:4444/v1/tables/contracts \
-d '{"cascade": true}'

Ingestion
Supported formats
The built-in markitdown plugin handles: PDF, Word (.docx/.doc), Excel (.xlsx), PowerPoint (.pptx), HTML, images with OCR, audio transcription, Markdown, and archive formats.
File ingestion
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
-F 'file=@policy.pdf' \
-F 'title=Home Insurance PDS' \
-F 'generate_summaries=true' \
-F 'tags=["home","2024"]' \
-F 'metadata={"cohort":"home_2024","region":"AU"}'

Text / Markdown ingestion
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
-H 'Content-Type: application/json' \
-d '{
"title": "Disability Benefits Guide",
"content": "# Section 1\n\nContent here...\n\n## 1.1 Subsection",
"generate_summaries": true,
"tags": ["disability"],
"metadata": {"cohort": "disability_2023"}
}'

URL ingestion
curl -X POST http://localhost:4444/v1/tables/research/ingest/url \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/report.html",
"generate_summaries": true
}'

RQL — Reasoning Query Language
RQL extends SQL with two new clauses: SEARCH for BM25 keyword matching and REASON for LLM-guided hierarchical retrieval.
| Clause | Method | Latency |
|---|---|---|
| SEARCH | BM25 keyword match · titles boosted 3× | ~50ms |
| REASON | BM25 + tree-grep + LLM traversal | ~2–5s |
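The SEARCH side of the table is plain BM25 ranking. A minimal version of the scoring looks like this, using the common k1/b defaults, which may differ from ReasonDB's; the 3× title boost can be approximated by repeating title tokens three times in a document's token list:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each doc (a list of tokens) against the query terms
    with the standard Okapi BM25 formula."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```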
SEARCH clause
-- Fast keyword search (BM25)
SELECT * FROM contracts
WHERE metadata.status = 'active'
SEARCH 'penalty clause termination'
ORDER BY score DESC
LIMIT 10;

-- Combine SEARCH with metadata filters
SELECT title, score FROM policies
WHERE metadata.cohort = 'home_2024' AND tags CONTAINS 'flood'
SEARCH 'water damage exclusions'
LIMIT 5;
REASON clause
-- LLM-guided hierarchical retrieval
SELECT answer, confidence, trace FROM policies
WHERE metadata.cohort = 'disability_2023'
REASON 'What benefits apply to partial inability to work after injury?'
LIMIT 3;

-- Combine SEARCH and REASON
SELECT answer, confidence FROM contracts
WHERE metadata.year = '2024'
SEARCH 'limitation period'
REASON 'What is the deadline for submitting a claim?'
LIMIT 5;
SEARCH and REASON can be combined in the same query: SEARCH pre-filters candidates with keywords before REASON runs LLM traversal — useful for large corpora.

Relationships & filtering
-- Filter by relationship type
SELECT * FROM policies
RELATED TO 'policy_001' WITH RELATIONSHIP SUPERSEDES;

-- Supported relationship types:
-- REFERENCES, SUPERSEDES, PARENT_OF, CHILD_OF

-- UPDATE documents
UPDATE policies SET metadata.status = 'archived'
WHERE metadata.year < '2020';

-- Aggregate queries
SELECT COUNT(*) as total, metadata.cohort
FROM policies
GROUP BY metadata.cohort;

-- EXPLAIN shows execution plan
EXPLAIN SELECT * FROM policies
REASON 'What is the excess for flood damage?';
Authentication
Authentication is disabled by default. Enable it for any internet-facing deployment.
# Enable authentication
REASONDB_AUTH_ENABLED=true \
REASONDB_MASTER_KEY="your-secret-master-key" \
reasondb serve
# Create an API key
curl -X POST http://localhost:4444/v1/auth/keys \
-H 'Authorization: Bearer your-secret-master-key' \
-H 'Content-Type: application/json' \
-d '{
"name": "production-reader",
"permissions": ["read", "query"],
"environment": "live"
}'

API keys follow the format rdb_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (production) or rdb_test_... (development). The secret is returned once — store it immediately.
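A client can sanity-check a key's shape before use. A sketch only; the 32-character alphanumeric secret is inferred from the example above, so verify the pattern against keys your server actually issues:

```python
import re

# Assumed shape: rdb_live_<32 alphanumeric chars> or rdb_test_<same>.
API_KEY = re.compile(r"^rdb_(live|test)_[A-Za-z0-9]{32}$")

def key_environment(key: str):
    """Return 'live' or 'test' for a well-formed key, else None."""
    m = API_KEY.fullmatch(key)
    return m.group(1) if m else None
```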
Permissions
| Permission | Grants |
|---|---|
| read | List tables, list and fetch documents |
| write | Create/update/delete documents |
| ingest | Submit ingestion jobs |
| query | Run SEARCH and REASON queries |
| relations | Create and query document relationships |
| tables | Create, update, and delete tables |
| admin | Full access including key management |
Plugin API
Plugins are external processes that communicate via JSON over stdin/stdout. Each plugin runs independently in its own directory under $REASONDB_PLUGINS_DIR and follows a one-shot model (one request in, one response out).
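The one-shot model can be illustrated with a minimal Python plugin: read one JSON request from stdin, write one JSON response to stdout, exit. Field names here are illustrative, not the real plugin schema; consult the plugin.toml manifest docs for that:

```python
import json
import sys

def handle(request: dict) -> dict:
    """One request in, one response out. This toy post-processor
    upper-cases Markdown headings as a stand-in for real work."""
    content = request.get("content", "")
    out = [line.upper() if line.startswith("#") else line
           for line in content.splitlines()]
    return {"ok": True, "content": "\n".join(out)}

if __name__ == "__main__":
    # One-shot model: a single JSON request on stdin,
    # a single JSON response on stdout, then exit.
    json.dump(handle(json.load(sys.stdin)), sys.stdout)
```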
Pipeline stages
- Extractors — convert files or URLs to Markdown
- Post-processors — transform Markdown before chunking
- Chunkers — split Markdown into discrete chunks
- Summarizers — generate node summaries
- Formatters — shape the final output nodes
Supported runtimes
Python 3, Node.js, Bash/sh, and any compiled binary (Rust, Go, C, etc.). Declare capabilities in a plugin.toml manifest.
Configuration
REASONDB_PLUGINS_DIR=./plugins    # default
REASONDB_PLUGINS_ENABLED=true     # default
REASONDB_PLUGIN_TIMEOUT=120       # seconds

# Plugin discovery endpoints
GET  /v1/plugins
GET  /v1/plugins/:name
POST /v1/plugins/:name/test
LLM Providers
ReasonDB supports nine providers out of the box. Swap providers without re-ingesting documents.
| Provider | REASONDB_LLM_PROVIDER | Notes |
|---|---|---|
| Anthropic Claude | anthropic | claude-opus-4-6 recommended |
| OpenAI GPT-4 | openai | gpt-4o recommended |
| Google Gemini | gemini | gemini-2.0-flash recommended |
| Cohere | cohere | command-r-plus recommended |
| Ollama (local) | ollama | No API key required |
| Google Vertex AI | vertex | Requires GCP credentials |
| AWS Bedrock | bedrock | Requires AWS credentials |
| GLM | glm | |
| Kimi | kimi | |
Full API reference
The complete interactive API reference — including all endpoints, request/response schemas, and a built-in playground — is at reason-db.devdoc.sh.