alpha · work in progress · open source

Introduction

ReasonDB is an AI-native document database built in Rust. Rather than splitting documents into flat chunks and searching by vector similarity, it uses Hierarchical Reasoning Retrieval (HRR) — an LLM-guided pipeline that navigates document trees the same way a human expert would.

Documents have structure. That structure has meaning. ReasonDB preserves it.

When to use ReasonDB

  • Complex, structured documents — policies, contracts, technical manuals — with cross-references and nested sections.
  • Auditable, deterministic answers that can be replayed and cited.
  • Domain-specific vocabulary that end users won't phrase correctly (e.g. "big C" → critical illness).
  • Regulated industries (finance, insurance, legal, healthcare) where accuracy and traceability are non-negotiable.
Note
ReasonDB is open source under the ReasonDB License v1.0, which permits use, modification, and distribution while restricting DBaaS offerings. Read the license.

Quick Start

Get a document ingested and your first AI-powered query running in under 5 minutes.

Prerequisites

  • Docker (recommended) or Rust toolchain for source builds
  • An LLM API key — Anthropic, OpenAI, Gemini, Cohere, or a local Ollama instance

1. Start the server

docker-compose.yml
services:
  reasondb:
    image: ghcr.io/brainfish-ai/reasondb:latest
    ports:
      - "4444:4444"
    volumes:
      - ./data:/data
      - ./plugins:/plugins
    environment:
      REASONDB_LLM_PROVIDER: anthropic
      REASONDB_LLM_API_KEY: ${ANTHROPIC_API_KEY}
      REASONDB_MODEL: claude-opus-4-6
terminal
docker compose up -d
# Server available at http://localhost:4444

2. Create a table

terminal
curl -X POST http://localhost:4444/v1/tables \
  -H 'Content-Type: application/json' \
  -d '{"name": "policies", "description": "Insurance policy documents"}'

3. Ingest a document

terminal
# Ingest a PDF
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
  -F 'file=@my_policy.pdf' \
  -F 'title=Home Insurance PDS 2024' \
  -F 'metadata={"cohort":"home_2024","year":"2024"}'

# Or ingest plain text / Markdown
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "Disability Benefits Guide",
    "content": "# Section 1\nContent here...",
    "tags": ["disability", "benefits"]
  }'
Tip
Ingestion is async. The server returns a job ID immediately (HTTP 202). Documents are queryable once processing completes — typically a few seconds for small files.

4. Query with REASON

terminal
curl -X POST http://localhost:4444/v1/tables/policies/query \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "SELECT answer, confidence, trace FROM docs REASON \'What disability benefits apply to partial inability to work?\' LIMIT 3"
  }'

5. Check the result

response
{
  "results": [
    {
      "answer": "Partial disability benefit pays 50% of the total disability...",
      "confidence": 0.94,
      "trace": {
        "phases": ["candidate_selection", "structural_filter", "llm_rank", "beam_search"],
        "nodes_visited": 18,
        "sections": ["§4.2 Disability Definitions", "§4.3 Partial Disability Benefit"]
      }
    }
  ]
}
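Client code typically gates on the confidence field before trusting an answer. A minimal sketch, assuming the response shape shown above (the 0.8 threshold is an arbitrary choice for illustration):

```python
import json

# A response in the shape returned by /v1/tables/:table/query (abridged).
response = '''
{
  "results": [
    {
      "answer": "Partial disability benefit pays 50% of the total disability...",
      "confidence": 0.94,
      "trace": {"nodes_visited": 18}
    }
  ]
}
'''

data = json.loads(response)
# Keep only answers the pipeline is reasonably confident about.
confident = [r for r in data["results"] if r["confidence"] >= 0.8]
for r in confident:
    print(r["answer"], "(visited", r["trace"]["nodes_visited"], "nodes)")
```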

Installation

Docker (recommended)

terminal
docker run -d \
  -p 4444:4444 \
  -e REASONDB_LLM_PROVIDER=anthropic \
  -e REASONDB_LLM_API_KEY=your_key \
  -v $(pwd)/data:/data \
  ghcr.io/brainfish-ai/reasondb:latest

Homebrew (macOS / Linux)

terminal
brew tap brainfish-ai/reasondb
brew install reasondb
reasondb config init   # interactive setup wizard
reasondb serve

From source

terminal
git clone https://github.com/brainfish-ai/reasondb
cd reasondb
cargo build --release
./target/release/reasondb serve

Configuration

Interactive wizard

terminal
reasondb config init

CLI commands

terminal
reasondb config set llm.provider anthropic
reasondb config set llm.api_key sk-ant-...
reasondb config set server.port 4444
reasondb config list      # show all values
reasondb config path      # show config file location

Config file locations

  • macOS: ~/Library/Application Support/reasondb/config.toml
  • Linux: ~/.config/reasondb/config.toml
  • Windows: %APPDATA%\reasondb\config.toml

Environment variables

Environment variables override config file values. Priority: CLI args → env vars → config file → defaults.

terminal
REASONDB_LLM_PROVIDER=anthropic
REASONDB_LLM_API_KEY=sk-ant-...
REASONDB_MODEL=claude-opus-4-6
REASONDB_PORT=4444
REASONDB_HOST=127.0.0.1       # use 0.0.0.0 to accept external connections
REASONDB_PATH=data/reasondb.redb
REASONDB_AUTH_ENABLED=false
RUST_LOG=info
Warning
The database file uses exclusive locking — only one server instance can access it at a time. For HA deployments, use the clustering guide.
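The precedence chain above (CLI args → env vars → config file → defaults) can be sketched as a small resolver. This is an illustrative model of the rule, not ReasonDB's actual implementation; the mapping from dotted keys to env names is an assumption based on the variables listed above:

```python
import os

def resolve(key, cli_args=None, config_file=None, defaults=None):
    """Resolve a config value: CLI args > env vars > config file > defaults."""
    cli_args = cli_args or {}
    config_file = config_file or {}
    defaults = defaults or {}
    # Assumed convention: llm.provider -> REASONDB_LLM_PROVIDER
    env_key = "REASONDB_" + key.upper().replace(".", "_")
    if key in cli_args:
        return cli_args[key]
    if env_key in os.environ:
        return os.environ[env_key]
    if key in config_file:
        return config_file[key]
    return defaults.get(key)

os.environ["REASONDB_LLM_PROVIDER"] = "anthropic"
# The env var wins over the config file value:
print(resolve("llm.provider", config_file={"llm.provider": "openai"}))  # anthropic
```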

HRR Pipeline

Every REASON query runs through four phases. No phase is skipped — each prunes the candidate set before passing it to the next.

Phase 1 · Candidate Selection
BM25 keyword search (titles boosted 3×) over the entire corpus narrows millions of documents to ~100 candidates. 0 LLM calls · ~50ms.

Phase 2 · Structural Filtering
tree_grep walks each candidate document's node hierarchy, scoring title matches 3×, summary matches 1.5×, and content matches 1×, eliminating structurally irrelevant documents. 0 LLM calls · ~200ms.

Phase 3 · LLM Ranking
The LLM evaluates the top ~20 survivors using their summaries and match signals, returning the top N documents most likely to contain the answer. 1 LLM call · ~1–2s.

Phase 4 · Parallel Beam Search
The LLM traverses each winner's document tree (beam_width=3, max_depth=10) and extracts an answer with a confidence score at the leaf nodes. Cross-references are resolved automatically. N LLM calls · ~2–4s.

By pruning at each level, ReasonDB visits only 20–50 nodes even in databases with 50 million nodes, orders of magnitude fewer than a flat vector search that scores every chunk.
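The node budget can be made concrete. With beam_width=3 and max_depth=10, a traversal touches at most beam_width × depth nodes per document, independent of corpus size. This is a rough upper bound under the simplifying assumption that the beam expands once per level:

```python
def max_nodes_visited(n_docs, beam_width=3, max_depth=10):
    """Rough upper bound on nodes touched by beam search: at each depth
    level the beam keeps beam_width nodes, per winning document."""
    return n_docs * beam_width * max_depth

# Phase 3 hands ~3 winning documents to beam search (e.g. LIMIT 3):
print(max_nodes_visited(3))  # 90-node budget, regardless of corpus size

# Versus scanning every node of a 50M-node corpus:
print(f"{50_000_000 // max_nodes_visited(3):,}x fewer nodes examined")
```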


Document trees

Every ingested document is converted into a hierarchical tree. Markdown headings define the structure; each heading becomes a node with an LLM-generated summary.

  • Branch nodes — contain overviews and guide navigation decisions.
  • Leaf nodes — hold detailed content where answers are extracted.
  • Summaries — generated at ingestion for every node. Enable intelligent routing without scanning raw text.
Tip
Setting generate_summaries: false speeds up ingestion but significantly reduces search quality. Only disable for keyword-only SEARCH workloads.
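How heading levels define a tree can be sketched with a few lines of Python: each heading becomes a node nested under the nearest shallower heading, and the text below it becomes that node's content. This is a simplified model of the ingestion step, not ReasonDB's actual parser (which also generates per-node summaries):

```python
import re

def build_tree(markdown):
    """Build a nested node tree from Markdown headings.

    Headings become branch nodes; text under a heading becomes that
    node's content (the leaf material in this simplified model).
    """
    root = {"title": "(root)", "content": "", "children": []}
    stack = [(0, root)]  # (heading level, node)
    for line in markdown.splitlines():
        m = re.match(r"^(#+)\s+(.*)", line)
        if m:
            level, title = len(m.group(1)), m.group(2)
            node = {"title": title, "content": "", "children": []}
            while stack[-1][0] >= level:  # pop back up to the parent level
                stack.pop()
            stack[-1][1]["children"].append(node)
            stack.append((level, node))
        else:
            stack[-1][1]["content"] += line + "\n"
    return root

doc = "# Section 1\nIntro text.\n## 1.1 Subsection\nDetails here."
tree = build_tree(doc)
print(tree["children"][0]["title"])                 # Section 1
print(tree["children"][0]["children"][0]["title"])  # 1.1 Subsection
```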

Tables

Tables group related documents and scope queries. Table names must be snake_case, start with a letter, and contain only letters, numbers, and underscores.

terminal
# Create
curl -X POST http://localhost:4444/v1/tables \
  -H 'Content-Type: application/json' \
  -d '{"name": "contracts", "description": "Legal contracts"}'

# List
curl http://localhost:4444/v1/tables

# Get details
curl http://localhost:4444/v1/tables/contracts

# Update
curl -X PATCH http://localhost:4444/v1/tables/contracts \
  -H 'Content-Type: application/json' \
  -d '{"description": "Updated description"}'

# Delete (documents moved to default table)
curl -X DELETE http://localhost:4444/v1/tables/contracts

# Delete + cascade (removes all documents)
curl -X DELETE http://localhost:4444/v1/tables/contracts \
  -H 'Content-Type: application/json' \
  -d '{"cascade": true}'
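The naming rule above (snake_case, starts with a letter, only letters, numbers, and underscores) can be checked client-side with a regex before calling the API. A sketch, assuming snake_case means lowercase; it is not the server's actual validation:

```python
import re

# snake_case: starts with a letter; only lowercase letters, digits, underscores.
TABLE_NAME = re.compile(r"[a-z][a-z0-9_]*")

def is_valid_table_name(name):
    """Return True if the name satisfies the documented table-naming rule."""
    return bool(TABLE_NAME.fullmatch(name))

print(is_valid_table_name("policies"))    # True
print(is_valid_table_name("home_2024"))   # True
print(is_valid_table_name("2024_docs"))   # False: must start with a letter
print(is_valid_table_name("Legal-Docs"))  # False: no uppercase or hyphens
```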

Ingestion

Supported formats

The built-in markitdown plugin handles: PDF, Word (.docx/.doc), Excel (.xlsx), PowerPoint (.pptx), HTML, images with OCR, audio transcription, Markdown, and archive formats.

File ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/policies/ingest/file \
  -F 'file=@policy.pdf' \
  -F 'title=Home Insurance PDS' \
  -F 'generate_summaries=true' \
  -F 'tags=["home","2024"]' \
  -F 'metadata={"cohort":"home_2024","region":"AU"}'

Text / Markdown ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/policies/ingest/text \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "Disability Benefits Guide",
    "content": "# Section 1\n\nContent here...\n\n## 1.1 Subsection",
    "generate_summaries": true,
    "tags": ["disability"],
    "metadata": {"cohort": "disability_2023"}
  }'

URL ingestion

terminal
curl -X POST http://localhost:4444/v1/tables/research/ingest/url \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/report.html",
    "generate_summaries": true
  }'
Note
URL ingestion supports web pages, YouTube videos, and GitHub repositories. Metadata and tags are not supported via URL ingestion — add them with a follow-up PATCH.

RQL — Reasoning Query Language

RQL extends SQL with two new clauses: SEARCH for BM25 keyword matching and REASON for LLM-guided hierarchical retrieval.

  • SEARCH: BM25 keyword match, titles boosted 3× · ~50ms
  • REASON: BM25 + tree-grep + LLM traversal · ~2–5s

REASON clause

example.rql
-- LLM-guided hierarchical retrieval
SELECT answer, confidence, trace
FROM policies
WHERE metadata.cohort = 'disability_2023'
REASON 'What benefits apply to partial inability to work after injury?'
LIMIT 3;

-- Combine SEARCH and REASON
SELECT answer, confidence FROM contracts
WHERE metadata.year = '2024'
SEARCH 'limitation period'
REASON 'What is the deadline for submitting a claim?'
LIMIT 5;
Tip
You can combine SEARCH and REASON in the same query. SEARCH pre-filters candidates with keywords before REASON runs LLM traversal — useful for large corpora.

Relationships & filtering

example.rql
-- Filter by relationship type
SELECT * FROM policies
RELATED TO 'policy_001' WITH RELATIONSHIP SUPERSEDES;

-- Supported relationship types:
-- REFERENCES, SUPERSEDES, PARENT_OF, CHILD_OF

-- UPDATE documents
UPDATE policies
SET metadata.status = 'archived'
WHERE metadata.year < '2020';

-- Aggregate queries
SELECT COUNT(*) as total, metadata.cohort
FROM policies
GROUP BY metadata.cohort;

-- EXPLAIN shows execution plan
EXPLAIN SELECT * FROM policies
REASON 'What is the excess for flood damage?';

Authentication

Authentication is disabled by default. Enable it for any internet-facing deployment.

terminal
# Enable authentication
REASONDB_AUTH_ENABLED=true \
REASONDB_MASTER_KEY="your-secret-master-key" \
reasondb serve

# Create an API key
curl -X POST http://localhost:4444/v1/auth/keys \
  -H 'Authorization: Bearer your-secret-master-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "production-reader",
    "permissions": ["read", "query"],
    "environment": "live"
  }'

API keys follow the format rdb_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (production) or rdb_test_... (development). The secret is returned once — store it immediately.
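The key format can be checked client-side before a key is stored or used. A hedged sketch: the 32-character alphanumeric suffix is inferred from the placeholder above, so treat the exact charset and length as assumptions:

```python
import re

# rdb_live_<32 chars> or rdb_test_<32 chars>; suffix charset is assumed.
KEY_PATTERN = re.compile(r"rdb_(live|test)_[A-Za-z0-9]{32}")

def key_environment(key):
    """Return 'live' or 'test' for a well-formed key, else None."""
    m = KEY_PATTERN.fullmatch(key)
    return m.group(1) if m else None

print(key_environment("rdb_live_" + "a" * 32))  # live
print(key_environment("rdb_test_" + "b" * 32))  # test
print(key_environment("sk-ant-something"))      # None
```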

Permissions

  • read: list tables, list and fetch documents
  • write: create, update, and delete documents
  • ingest: submit ingestion jobs
  • query: run SEARCH and REASON queries
  • relations: create and query document relationships
  • tables: create, update, and delete tables
  • admin: full access, including key management

Plugin API

Plugins are external processes that communicate via JSON over stdin/stdout. Each plugin runs independently in its own directory under $REASONDB_PLUGINS_DIR and follows a one-shot model (one request in, one response out).

Pipeline stages

  • Extractors — convert files or URLs to Markdown
  • Post-processors — transform Markdown before chunking
  • Chunkers — split Markdown into discrete chunks
  • Summarizers — generate node summaries
  • Formatters — shape the final output nodes

Supported runtimes

Python 3, Node.js, Bash/sh, and any compiled binary (Rust, Go, C, etc.). Declare capabilities in a plugin.toml manifest.
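The one-shot model is easy to sketch in Python: read exactly one JSON request on stdin, write exactly one JSON response to stdout, exit. The request and response field names here ("content", "markdown", "ok") are illustrative only, not ReasonDB's actual plugin schema:

```python
import json
import sys

def handle(request):
    """Hypothetical extractor stage: wrap the input text as Markdown.
    Field names are illustrative, not ReasonDB's real plugin contract."""
    text = request.get("content", "")
    return {"ok": True, "markdown": f"# Extracted\n\n{text}"}

def main():
    # One-shot protocol: one request in, one response out, then exit.
    request = json.load(sys.stdin)
    json.dump(handle(request), sys.stdout)

# Simulated round trip (no server or stdin needed):
print(handle({"content": "Policy text here."})["markdown"])
```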

Configuration

terminal
REASONDB_PLUGINS_DIR=./plugins      # default
REASONDB_PLUGINS_ENABLED=true       # default
REASONDB_PLUGIN_TIMEOUT=120         # seconds

# Plugin discovery endpoints
GET /v1/plugins
GET /v1/plugins/:name
POST /v1/plugins/:name/test
Note
The built-in markitdown plugin ships pre-installed and handles PDF, Word, PowerPoint, Excel, images (OCR), audio transcription, and HTML. It runs at priority 200, so custom extractors can override it for specific formats by setting a higher priority.

LLM Providers

ReasonDB supports nine providers out of the box. Swap providers without re-ingesting documents.

Provider · REASONDB_LLM_PROVIDER value · Notes

  • Anthropic Claude · anthropic · claude-opus-4-6 recommended
  • OpenAI GPT-4 · openai · gpt-4o recommended
  • Google Gemini · gemini · gemini-2.0-flash recommended
  • Cohere · cohere · command-r-plus recommended
  • Ollama (local) · ollama · no API key required
  • Google Vertex AI · vertex · requires GCP credentials
  • AWS Bedrock · bedrock · requires AWS credentials
  • GLM · glm
  • Kimi · kimi

Full API reference

The complete interactive API reference — including all endpoints, request/response schemas, and a built-in playground — is at reason-db.devdoc.sh.

Note
The GitHub repo also contains interactive tutorials for common use cases including insurance policy querying, legal contract analysis, and research knowledge management.