Benchmarks
ReasonDB was benchmarked against a production insurance document corpus — the same workload that drives demand for explainable RAG in regulated industries.
// Methodology
The benchmark was run against a corpus of real-world insurance Product Disclosure Statements (PDS).

Corpus characteristics
- Scanned legacy documents
- Multi-section contracts with cross-references
- Policies using evolved terminology (e.g. "death cover" → "life cover" across different cohorts)
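Terminology drift like this is easy to mishandle when scoring answers. Below is a minimal sketch of one way an evaluation harness could canonicalize evolved terms before matching; the mapping and function names are illustrative assumptions, not ReasonDB's API, and only the "death cover" → "life cover" pair comes from the corpus description itself.

```python
import re

# Hypothetical terminology map (older cohort wording -> current wording).
# Only "death cover" -> "life cover" appears in the corpus description;
# the second pair is purely illustrative.
TERMINOLOGY_MAP = {
    "death cover": "life cover",
    "total and permanent disablement": "TPD",  # illustrative
}

def canonicalize(text: str) -> str:
    """Rewrite evolved terms to their current form before answer matching."""
    for old, new in TERMINOLOGY_MAP.items():
        text = re.sub(re.escape(old), new, text, flags=re.IGNORECASE)
    return text

print(canonicalize("Your Death Cover ends at age 70"))
# -> "Your life cover ends at age 70"
```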
Evaluation criteria
Each query was evaluated across three dimensions.
// Query-level results
12 queries across 3 complexity tiers were run against the insurance corpus.
// System comparison
Same corpus, same queries, same evaluation criteria.
// Latency breakdown
End-to-end latency by phase for a typical complex query (6.1s median).
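Per-phase figures like these can be collected with a simple timing wrapper around each pipeline stage. The sketch below assumes a retrieve/reason/generate pipeline shape; the phase names and the stubbed calls are assumptions for illustration, not ReasonDB's documented internals.

```python
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def phase(name: str):
    """Record wall-clock seconds for one pipeline phase."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start

# Stubs standing in for the real pipeline; each sleep models a phase's cost.
def retrieve(query):       time.sleep(0.05); return ["chunk-1", "chunk-2"]
def reason(query, chunks): time.sleep(0.05); return "plan"
def generate(plan):        time.sleep(0.05); return "answer"

def answer(query: str) -> str:
    with phase("retrieve"):
        chunks = retrieve(query)
    with phase("reason"):
        plan = reason(query, chunks)
    with phase("generate"):
        return generate(plan)

answer("What does my policy pay on death?")
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
```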
Run the benchmark yourself
The benchmark scripts are open source. Reproduce the results against your own document corpus.
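As a starting point, a harness loop over your own corpus might look like the sketch below. The paths, file formats, and run_query stub are placeholders, not the actual scripts or CLI; consult the repository README for the real invocation.

```python
import json
import time
from pathlib import Path

# Placeholder paths and formats: the real benchmark scripts define their
# own layout, so check the repository README for the actual invocation.
CORPUS_DIR = Path("./my_corpus")      # directory of your own documents
QUERIES_FILE = Path("queries.json")   # e.g. [{"id": 1, "tier": "simple", "text": "..."}]

def run_query(text: str, documents: list[str]) -> str:
    """Stub for the system under test; swap in your RAG pipeline call here."""
    time.sleep(0.1)
    return "answer"

documents = [p.read_text() for p in sorted(CORPUS_DIR.glob("*")) if p.is_file()]
queries = json.loads(QUERIES_FILE.read_text())

results = []
for q in queries:
    start = time.perf_counter()
    run_query(q["text"], documents)
    results.append({"id": q["id"], "tier": q["tier"],
                    "latency_s": round(time.perf_counter() - start, 3)})

print(json.dumps(results, indent=2))
```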