Primary source data, made explorable
I wanted to search what the founders actually wrote. Not someone's interpretation. Not a curated excerpt in a textbook. The actual text, alongside everything that came after, in one index. That was the first project. Then it kept going.
Public data should be publicly explorable. Government agencies publish datasets that are technically available but practically inaccessible: bulk downloads, PDFs, paywalled FOIA responses, formats that require a data pipeline to read. Verbatim takes those datasets and makes them searchable, filterable, and visible.
What’s here
Thirteen datasets and one curated feature. Each dataset is a public record that was already available in theory, now available in practice.
- The Corpus — 43,990 records of political speech and writing, 1776 to 2025. Supreme Court opinions, presidential debates, floor speeches, executive orders, tweets, founding documents. Full-text search across 91 voices and all three branches of government.
- COMPAS — 7,214 criminal defendants scored by an algorithm that predicts recidivism. The data ProPublica used to show the algorithm was biased against Black defendants.
- Senate Stock Trades — 7,740 stock transactions disclosed by U.S. senators under the STOCK Act. Every purchase, every sale, every ticker.
- Pentagon to Police — 83,499 transfers of military surplus equipment from the Department of Defense to local law enforcement. Rifles, armored vehicles, night vision gear.
- Fatal Police Shootings — The Washington Post's Fatal Force database. 10,430 people shot and killed by on-duty police officers since 2015.
- PPP Loans — 950,000+ Paycheck Protection Program loans over $150,000. Who got the money, how much, whether it was forgiven.
- Campaign Finance — 3,457 FEC-registered candidates for the 2024 cycle. Who raised what, from where.
- Qualified Immunity — 5,500+ federal appellate qualified immunity cases, 2010–2020. The Institute for Justice's comprehensive dataset of how courts apply the doctrine.
- Civil Asset Forfeiture — 342,982 property-level forfeiture records from 24 states, 1986–2019. Currency, vehicles, and real property seized by law enforcement.
- Workplace Injuries — 396,263 OSHA injury and illness reports for 2024. Every employer filing under the Injury Tracking Application, searchable by company, industry, and state.
- Animal Welfare — 107,767 USDA APHIS animal welfare inspections, 2015–2026. Breeders, exhibitors, research facilities, carriers. Critical and non-critical violations.
- Toxic Releases — 78,647 EPA Toxics Release Inventory reports from 2023. Industrial facilities reporting chemical releases to air, water, and land.
- Nursing Homes — 419,452 CMS health deficiency citations from federal nursing home inspections. Every Medicare- or Medicaid-certified facility is covered, and each citation is rated for scope and severity, from potential for minimal harm to immediate jeopardy.
- Vs. — Curated debates using real quotes. Churchill vs. Twain, Lincoln vs. Jefferson. Two voices, one topic, side by side.
The approach
Each dataset gets the same treatment: search, filter, paginate, visualize. No editorializing, no analysis, no conclusions. The data speaks or it doesn't. Verbatim provides the microphone, not the script.
The name is the tell. Not "interpreted." Not "analyzed." Not "summarized." Verbatim.
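For the curious, here is a minimal sketch of what "search, filter, paginate" could look like over a single dataset, assuming the records sit in memory as plain dictionaries. It is an illustration, not the site's actual implementation: the `explore` function, the `Page` type, and the sample field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    items: list   # records on the requested page
    total: int    # number of matching records before pagination
    page: int     # 1-based page number actually returned
    pages: int    # total number of pages (at least 1)


def explore(records, query="", filters=None, page=1, per_page=25):
    """Search, then filter, then paginate a list of dict records.

    Hypothetical helper: assumes each record is a flat dict.
    """
    filters = filters or {}
    needle = query.strip().lower()

    def matches(record):
        # Search: case-insensitive substring match across every field value.
        if needle and not any(needle in str(v).lower() for v in record.values()):
            return False
        # Filter: each requested field must equal the requested value exactly.
        return all(record.get(key) == value for key, value in filters.items())

    hits = [r for r in records if matches(r)]
    pages = max(1, -(-len(hits) // per_page))  # ceiling division
    page = min(max(page, 1), pages)            # clamp out-of-range page numbers
    start = (page - 1) * per_page
    return Page(hits[start:start + per_page], len(hits), page, pages)


# Made-up rows for illustration only; not taken from any dataset.
sample = [
    {"ticker": "AAA", "type": "purchase", "state": "OH"},
    {"ticker": "AAB", "type": "sale", "state": "OH"},
    {"ticker": "BBB", "type": "purchase", "state": "TX"},
]
first = explore(sample, query="aa", filters={"type": "purchase"})
```

Nothing in the function ranks, scores, or summarizes. It only narrows the records and slices them, which is the point of the approach.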
Built by ksmith.
No ads. No sponsored content. No affiliate links.