Datasets and tools built alongside the writing on LegalRealist AI. Everything is free to use, with sources cited and methods documented. Code and data are hosted on legalhack.io.
Data
| Title | Description | References |
|---|---|---|
| AI Court Orders Explorer | Search court orders on AI in legal proceedings by judge, state, or order type — no Westlaw or Lexis required. | Charts · GitHub · Post |
| Medicare Fraud Backtest | Backtest excluded Medicare providers against pre-exclusion billing data. 289 providers, 3.39M peers, 15 features, AUC 0.79. DOJ prosecution cross-reference and out-of-sample validation. | GitHub · Post · Walkthrough |
| FinCEN SARs + FCPA | Synthetic SARs joined with Stanford FCPA enforcement data for classification research. Coming soon. | Post |
Code
| Title | Description | References |
|---|---|---|
| Lying Spreadsheets | Parser differential attack PoC: Excel number formats that make LLMs read different financials than humans see. Includes SheetGuard detection tool. | GitHub · Post |
| eDiscovery Cost Calculator | Compare traditional vs AI-enhanced eDiscovery workflows. Adjust staffing, risk profiles, and AI efficiency. | GitHub · Post |
| Law School LLM Wiki | AI-maintained knowledge base powered by Claude Code. Drop in your documents and get a searchable, interlinked wiki. | GitHub |
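To make the Lying Spreadsheets idea concrete, here is a minimal sketch of one way a displayed number can diverge from the stored one: in Excel number formats, each trailing comma divides the displayed value by 1,000. Whether the PoC relies on this particular format is an assumption, and `displayed_text` is a hypothetical helper written for illustration. It is not part of SheetGuard and handles only a tiny subset of format codes.

```python
from decimal import Decimal, ROUND_HALF_UP


def displayed_text(stored, fmt):
    """Approximate what a spreadsheet shows for `stored` under `fmt`.

    Hypothetical helper: supports only digit placeholders, an optional
    decimal part, thousands grouping, and trailing scaling commas.
    """
    # Each trailing comma scales the displayed value down by 1,000.
    body = fmt.rstrip(",")
    scale = len(fmt) - len(body)
    value = Decimal(str(stored)) / (Decimal(1000) ** scale)

    # Placeholders after the decimal point set the displayed precision.
    decimals = len(body.split(".")[1]) if "." in body else 0
    quantum = Decimal(1).scaleb(-decimals)
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)

    # A comma between digit placeholders turns on thousands grouping.
    return f"{rounded:,}" if "," in body else str(rounded)


stored = 4200000  # the value a parser reading the raw cell would see
for fmt in ("#,##0", "#,##0,", "0.0,,"):
    print(f"{fmt!r:10} stored={stored} displayed={displayed_text(stored, fmt)}")
```

A reader looking at the sheet sees the displayed text, while a tool that extracts raw cell values sees `stored`. Comparing the two is one simple way to flag cells where the views disagree.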
Suggestions welcome — get in touch.
