alpha · work in progress · open source

Introduction

ReasonDB is an AI-native document database built in Rust. Rather than splitting documents into flat chunks and searching by vector similarity, it uses Hierarchical Reasoning Retrieval (HRR) — an LLM-guided pipeline that navigates document trees the same way a human expert would.

Documents have structure. That structure has meaning. ReasonDB preserves it.

When to use ReasonDB

  • Complex, structured documents — policies, contracts, technical manuals — with cross-references and nested sections.
  • Auditable, deterministic answers that can be replayed and cited.
  • Domain-specific vocabulary that end users won't phrase correctly (e.g. "big C" → critical illness).
  • Regulated industries (finance, insurance, legal, healthcare) where accuracy and traceability are non-negotiable.
Note
ReasonDB is open source under the ReasonDB License v1.0, which permits use, modification, and distribution while restricting DBaaS offerings. Read the license.

Quick Start

Get a document ingested and your first AI-powered query running in under 5 minutes.

Prerequisites

  • Docker (recommended) or Rust toolchain for source builds
  • An LLM API key — Anthropic, OpenAI, Gemini, Cohere, or a local Ollama instance

1. Start the server

docker-compose.yml
services:
  reasondb:
    image: ghcr.io/brainfish-ai/reasondb:latest
    ports:
      - "4444:4444"
    volumes:
      - ./data:/data
      - ./plugins:/plugins
    environment:
      REASONDB_LLM_PROVIDER: anthropic
      REASONDB_LLM_API_KEY: ${ANTHROPIC_API_KEY}
      REASONDB_MODEL: claude-opus-4-6
terminal
docker compose up -d
# Server available at http://localhost:4444

2. Create a table

terminal
curl -X POST http://localhost:4444/v1/tables \
  -H 'Content-Type: application/json' \
  -d '{"name": "policies", "description": "Insurance policy documents"}'

3. Ingest a document

terminal
# Ingest a PDF
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
  -F 'file=@my_policy.pdf' \
  -F 'title=Home Insurance PDS 2024' \
  -F 'metadata={"cohort":"home_2024","year":"2024"}'

# Or ingest plain text / Markdown
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "Disability Benefits Guide",
    "content": "# Section 1\nContent here...",
    "tags": ["disability", "benefits"]
  }'
Tip
Ingestion is async. The server returns a job ID immediately (HTTP 202). Documents are queryable once processing completes — typically a few seconds for small files.

4. Query with REASON

terminal
curl -X POST http://localhost:4444/v1/tables/policies/query \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "SELECT answer, confidence, trace FROM docs REASON \'What disability benefits apply to partial inability to work?\' LIMIT 3"
  }'

5. Check the result

response
{
  "results": [
    {
      "answer": "Partial disability benefit pays 50% of the total disability...",
      "confidence": 0.94,
      "trace": {
        "phases": ["candidate_selection", "structural_filter", "llm_rank", "beam_search"],
        "nodes_visited": 18,
        "sections": ["§4.2 Disability Definitions", "§4.3 Partial Disability Benefit"]
      }
    }
  ]
}
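Client code typically gates on the confidence field before trusting an answer. A minimal sketch, assuming the response shape shown above (the 0.8 threshold is an arbitrary choice for illustration):

```python
import json

# A response in the shape returned by /v1/tables/:table/query (abridged).
response = '''
{
  "results": [
    {
      "answer": "Partial disability benefit pays 50% of the total disability...",
      "confidence": 0.94,
      "trace": {"nodes_visited": 18}
    }
  ]
}
'''

data = json.loads(response)
# Keep only answers the pipeline is reasonably confident about.
confident = [r for r in data["results"] if r["confidence"] >= 0.8]
for r in confident:
    print(r["answer"], "(visited", r["trace"]["nodes_visited"], "nodes)")
```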

Installation

Docker (recommended)

terminal
docker run -d \
  -p 4444:4444 \
  -e REASONDB_LLM_PROVIDER=anthropic \
  -e REASONDB_LLM_API_KEY=your_key \
  -v $(pwd)/data:/data \
  ghcr.io/brainfish-ai/reasondb:latest

Homebrew (macOS / Linux)

terminal
brew tap brainfish-ai/reasondb
brew install reasondb
reasondb config init   # interactive setup wizard
reasondb serve

From source

terminal
git clone https://github.com/brainfish-ai/reasondb
cd reasondb
cargo build --release
./target/release/reasondb serve

Configuration

Interactive wizard

terminal
reasondb config init

CLI commands

terminal
reasondb config set llm.provider anthropic
reasondb config set llm.api_key sk-ant-...
reasondb config set server.port 4444
reasondb config list      # show all values
reasondb config path      # show config file location

Config file locations

  • macOS: ~/Library/Application Support/reasondb/config.toml
  • Linux: ~/.config/reasondb/config.toml
  • Windows: %APPDATA%\reasondb\config.toml

Environment variables

Environment variables override config file values. Priority: CLI args → env vars → config file → defaults.

terminal
REASONDB_LLM_PROVIDER=anthropic
REASONDB_LLM_API_KEY=sk-ant-...
REASONDB_MODEL=claude-opus-4-6
REASONDB_PORT=4444
REASONDB_HOST=127.0.0.1       # use 0.0.0.0 to accept external connections
REASONDB_PATH=data/reasondb.redb
REASONDB_AUTH_ENABLED=false
RUST_LOG=info
Warning
The database file uses exclusive locking — only one server instance can access it at a time. For HA deployments, use the clustering guide.
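The precedence chain above (CLI args → env vars → config file → defaults) can be sketched as a small resolver. This is an illustrative model of the rule, not ReasonDB's actual implementation; the mapping from dotted keys to env names is an assumption based on the variables listed above:

```python
import os

def resolve(key, cli_args=None, config_file=None, defaults=None):
    """Resolve a config value: CLI args > env vars > config file > defaults."""
    cli_args = cli_args or {}
    config_file = config_file or {}
    defaults = defaults or {}
    # Assumed convention: llm.provider -> REASONDB_LLM_PROVIDER
    env_key = "REASONDB_" + key.upper().replace(".", "_")
    if key in cli_args:
        return cli_args[key]
    if env_key in os.environ:
        return os.environ[env_key]
    if key in config_file:
        return config_file[key]
    return defaults.get(key)

os.environ["REASONDB_LLM_PROVIDER"] = "anthropic"
# The env var wins over the config file value:
print(resolve("llm.provider", config_file={"llm.provider": "openai"}))  # anthropic
```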

HRR Pipeline

Every REASON query runs through four phases. No phase is skipped — each prunes the candidate set before passing it to the next.

Phase 1 · Candidate Selection
BM25 keyword search (titles boosted 3×) over the entire corpus narrows millions of documents to ~100 candidates. 0 LLM calls · ~50ms.

Phase 2 · Structural Filtering
tree_grep walks each candidate document's node hierarchy, scoring title matches 3×, summary matches 1.5×, and content matches 1×, eliminating structurally irrelevant documents. 0 LLM calls · ~200ms.

Phase 3 · LLM Ranking
The LLM evaluates the top ~20 survivors using their summaries and match signals, returning the top N documents most likely to contain the answer. 1 LLM call · ~1–2s.

Phase 4 · Parallel Beam Search
The LLM traverses each winner's document tree (beam_width=3, max_depth=10) and extracts an answer with a confidence score at the leaf nodes. Cross-references are resolved automatically. N LLM calls · ~2–4s.

By pruning at each level, ReasonDB visits only 20–50 nodes even in databases with 50 million nodes, orders of magnitude fewer than a flat vector search that scores every chunk.
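The node budget can be made concrete. With beam_width=3 and max_depth=10, a traversal touches at most beam_width × depth nodes per document, independent of corpus size. This is a rough upper bound under the simplifying assumption that the beam expands once per level:

```python
def max_nodes_visited(n_docs, beam_width=3, max_depth=10):
    """Rough upper bound on nodes touched by beam search: at each depth
    level the beam keeps beam_width nodes, per winning document."""
    return n_docs * beam_width * max_depth

# Phase 3 hands ~3 winning documents to beam search (e.g. LIMIT 3):
print(max_nodes_visited(3))  # 90-node budget, regardless of corpus size

# Versus scanning every node of a 50M-node corpus:
print(f"{50_000_000 // max_nodes_visited(3):,}x fewer nodes examined")
```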


Document trees

Every ingested document is converted into a hierarchical tree. Markdown headings define the structure; each heading becomes a node with an LLM-generated summary.

  • Branch nodes — contain overviews and guide navigation decisions.
  • Leaf nodes — hold detailed content where answers are extracted.
  • Summaries — generated at ingestion for every node. Enable intelligent routing without scanning raw text.
Tip
Setting generate_summaries: false speeds up ingestion but significantly reduces search quality. Only disable for keyword-only SEARCH workloads.
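How heading levels define a tree can be sketched with a few lines of Python: each heading becomes a node nested under the nearest shallower heading, and the text below it becomes that node's content. This is a simplified model of the ingestion step, not ReasonDB's actual parser (which also generates per-node summaries):

```python
import re

def build_tree(markdown):
    """Build a nested node tree from Markdown headings.

    Headings become branch nodes; text under a heading becomes that
    node's content (the leaf material in this simplified model).
    """
    root = {"title": "(root)", "content": "", "children": []}
    stack = [(0, root)]  # (heading level, node)
    for line in markdown.splitlines():
        m = re.match(r"^(#+)\s+(.*)", line)
        if m:
            level, title = len(m.group(1)), m.group(2)
            node = {"title": title, "content": "", "children": []}
            while stack[-1][0] >= level:  # pop back up to the parent level
                stack.pop()
            stack[-1][1]["children"].append(node)
            stack.append((level, node))
        else:
            stack[-1][1]["content"] += line + "\n"
    return root

doc = "# Section 1\nIntro text.\n## 1.1 Subsection\nDetails here."
tree = build_tree(doc)
print(tree["children"][0]["title"])                 # Section 1
print(tree["children"][0]["children"][0]["title"])  # 1.1 Subsection
```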

Tables

Tables group related documents and scope queries. Table names must be snake_case, start with a letter, and contain only letters, numbers, and underscores.

terminal
# Create
curl -X POST http://localhost:4444/v1/tables \
  -H 'Content-Type: application/json' \
  -d '{"name": "contracts", "description": "Legal contracts"}'

# List
curl http://localhost:4444/v1/tables

# Get details
curl http://localhost:4444/v1/tables/contracts

# Update
curl -X PATCH http://localhost:4444/v1/tables/contracts \
  -H 'Content-Type: application/json' \
  -d '{"description": "Updated description"}'

# Delete (documents moved to default table)
curl -X DELETE http://localhost:4444/v1/tables/contracts

# Delete + cascade (removes all documents)
curl -X DELETE http://localhost:4444/v1/tables/contracts \
  -H 'Content-Type: application/json' \
  -d '{"cascade": true}'
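The naming rule above (snake_case, starts with a letter, only letters, numbers, and underscores) can be checked client-side with a regex before calling the API. A sketch, assuming snake_case means lowercase; it is not the server's actual validation:

```python
import re

# snake_case: starts with a letter; only lowercase letters, digits, underscores.
TABLE_NAME = re.compile(r"[a-z][a-z0-9_]*")

def is_valid_table_name(name):
    """Return True if the name satisfies the documented table-naming rule."""
    return bool(TABLE_NAME.fullmatch(name))

print(is_valid_table_name("policies"))    # True
print(is_valid_table_name("home_2024"))   # True
print(is_valid_table_name("2024_docs"))   # False: must start with a letter
print(is_valid_table_name("Legal-Docs"))  # False: no uppercase or hyphens
```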

Ingestion

Supported formats

The built-in markitdown plugin handles: PDF, Word (.docx/.doc), Excel (.xlsx), PowerPoint (.pptx), HTML, images with OCR, audio transcription, Markdown, and archive formats.

File ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
  -F 'file=@policy.pdf' \
  -F 'title=Home Insurance PDS' \
  -F 'generate_summaries=true' \
  -F 'tags=["home","2024"]' \
  -F 'metadata={"cohort":"home_2024","region":"AU"}'

Text / Markdown ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "Disability Benefits Guide",
    "content": "# Section 1\n\nContent here...\n\n## 1.1 Subsection",
    "generate_summaries": true,
    "tags": ["disability"],
    "metadata": {"cohort": "disability_2023"}
  }'

URL ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/research/ingest/url \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/report.html",
    "generate_summaries": true
  }'
Note
URL ingestion supports web pages, YouTube videos, and GitHub repositories. Metadata and tags are not supported via URL ingestion — add them with a follow-up PATCH.

RQL — Reasoning Query Language

RQL extends SQL with two new clauses: SEARCH for BM25 keyword matching and REASON for LLM-guided hierarchical retrieval.

  • SEARCH: BM25 keyword match, titles boosted 3× · ~50ms
  • REASON: BM25 + tree-grep + LLM traversal · ~2–5s

REASON clause

example.rql
-- LLM-guided hierarchical retrieval
SELECT answer, confidence, trace
FROM policies
WHERE metadata.cohort = 'disability_2023'
REASON 'What benefits apply to partial inability to work after injury?'
LIMIT 3;

-- Combine SEARCH and REASON
SELECT answer, confidence FROM contracts
WHERE metadata.year = '2024'
SEARCH 'limitation period'
REASON 'What is the deadline for submitting a claim?'
LIMIT 5;
Tip
You can combine SEARCH and REASON in the same query. SEARCH pre-filters candidates with keywords before REASON runs LLM traversal — useful for large corpora.

Relationships & filtering

example.rql
-- Filter by relationship type
SELECT * FROM policies
RELATED TO 'policy_001' WITH RELATIONSHIP SUPERSEDES;

-- Supported relationship types:
-- REFERENCES, SUPERSEDES, PARENT_OF, CHILD_OF

-- UPDATE documents
UPDATE policies
SET metadata.status = 'archived'
WHERE metadata.year < '2020';

-- Aggregate queries
SELECT COUNT(*) as total, metadata.cohort
FROM policies
GROUP BY metadata.cohort;

-- EXPLAIN shows execution plan
EXPLAIN SELECT * FROM policies
REASON 'What is the excess for flood damage?';

Authentication

Authentication is disabled by default. Enable it for any internet-facing deployment.

terminal
# Enable authentication
REASONDB_AUTH_ENABLED=true \
REASONDB_MASTER_KEY="your-secret-master-key" \
reasondb serve

# Create an API key
curl -X POST http://localhost:4444/v1/auth/keys \
  -H 'Authorization: Bearer your-secret-master-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "production-reader",
    "permissions": ["read", "query"],
    "environment": "live"
  }'

API keys follow the format rdb_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (production) or rdb_test_... (development). The secret is returned once — store it immediately.
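The key format can be checked client-side before a key is stored or used. A hedged sketch: the 32-character alphanumeric suffix is inferred from the placeholder above, so treat the exact charset and length as assumptions:

```python
import re

# rdb_live_<32 chars> or rdb_test_<32 chars>; suffix charset is assumed.
KEY_PATTERN = re.compile(r"rdb_(live|test)_[A-Za-z0-9]{32}")

def key_environment(key):
    """Return 'live' or 'test' for a well-formed key, else None."""
    m = KEY_PATTERN.fullmatch(key)
    return m.group(1) if m else None

print(key_environment("rdb_live_" + "a" * 32))  # live
print(key_environment("rdb_test_" + "b" * 32))  # test
print(key_environment("sk-ant-something"))      # None
```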

Permissions

  • read: list tables, list and fetch documents
  • write: create, update, and delete documents
  • ingest: submit ingestion jobs
  • query: run SEARCH and REASON queries
  • relations: create and query document relationships
  • tables: create, update, and delete tables
  • admin: full access, including key management

Plugin API

Plugins are external processes that communicate via JSON over stdin/stdout. Each plugin runs independently in its own directory under $REASONDB_PLUGINS_DIR and follows a one-shot model (one request in, one response out).

Pipeline stages

  • Extractors — convert files or URLs to Markdown
  • Post-processors — transform Markdown before chunking
  • Chunkers — split Markdown into discrete chunks
  • Summarizers — generate node summaries
  • Formatters — shape the final output nodes

Supported runtimes

Python 3, Node.js, Bash/sh, and any compiled binary (Rust, Go, C, etc.). Declare capabilities in a plugin.toml manifest.
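The one-shot model is easy to sketch in Python: read exactly one JSON request on stdin, write exactly one JSON response to stdout, exit. The request and response field names here ("content", "markdown", "ok") are illustrative only, not ReasonDB's actual plugin schema:

```python
import json
import sys

def handle(request):
    """Hypothetical extractor stage: wrap the input text as Markdown.
    Field names are illustrative, not ReasonDB's real plugin contract."""
    text = request.get("content", "")
    return {"ok": True, "markdown": f"# Extracted\n\n{text}"}

def main():
    # One-shot protocol: one request in, one response out, then exit.
    request = json.load(sys.stdin)
    json.dump(handle(request), sys.stdout)

# Simulated round trip (no server or stdin needed):
print(handle({"content": "Policy text here."})["markdown"])
```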

Configuration

terminal
REASONDB_PLUGINS_DIR=./plugins      # default
REASONDB_PLUGINS_ENABLED=true       # default
REASONDB_PLUGIN_TIMEOUT=120         # seconds

# Plugin discovery endpoints
GET /v1/plugins
GET /v1/plugins/:name
POST /v1/plugins/:name/test
Note
The built-in markitdown plugin ships pre-installed and handles PDF, Word, PowerPoint, Excel, images (OCR), audio transcription, and HTML. It runs at priority 200, so custom extractors can override it for specific formats by setting a higher priority.

LLM Providers

ReasonDB supports nine providers out of the box. Swap providers without re-ingesting documents.

Provider · REASONDB_LLM_PROVIDER value · Notes

  • Anthropic Claude · anthropic · claude-opus-4-6 recommended
  • OpenAI GPT-4 · openai · gpt-4o recommended
  • Google Gemini · gemini · gemini-2.0-flash recommended
  • Cohere · cohere · command-r-plus recommended
  • Ollama (local) · ollama · no API key required
  • Google Vertex AI · vertex · requires GCP credentials
  • AWS Bedrock · bedrock · requires AWS credentials
  • GLM · glm
  • Kimi · kimi

Full API reference

The complete interactive API reference — including all endpoints, request/response schemas, and a built-in playground — is at reason-db.devdoc.sh.

Note
The GitHub repo also contains interactive tutorials for common use cases including insurance policy querying, legal contract analysis, and research knowledge management.