
TL;DR
- None of these teams used raw LLMs for their core work. Quinn Emanuel used Claude for cognitive brainstorming; structured document review ran on a purpose-built litigation platform. General-purpose AI and domain-specific tools play different roles.
- AI compresses timelines that human staffing cannot match. Quinn Emanuel compressed its trial prep into about eight weeks; Syllo ran a first-level review of Mayer Brown’s 400,000-document collection in under a week. The pattern repeats across all five cases.
- The AI cross-check is one of the strongest use cases. Mayer Brown spent $300,000 on managed review and still missed key documents. AI found them. The question is no longer whether AI can replace human review — it’s whether you can afford not to run an AI cross-check on human work.
- All five case studies are vendor-sourced or self-reported. Three come from a single Syllo white paper; Lighthouse and EvenUp figures are from company marketing. None has been independently verified or published with its underlying data.
- Ask who captures the efficiency gains before signing. The hourly billing model creates structural tension with AI productivity that no case study here resolves. When AI compresses six months of work into six weeks, the economics of who benefits remain unsettled.
Corrections & Updates
- June 14, 2026: Expanded the Desktop Metal section with details from the Law, disrupted podcast that aren’t in Syllo’s white paper — two capabilities (production-deficiency analysis and privilege-log scoring) built mid-case in ~48 hours, the legal-theory prompting style, Quinn Emanuel’s broader use of Claude (drafting, deposition sequencing, multi-model cross-checks) with a human-in-the-loop caveat, the engagement’s remote staffing and usage-based billing, and a cross-examination anecdote involving Google Gemini.
- June 13, 2026: Added market context to the Desktop Metal section comparing Syllo’s timing to competing tools — Relativity’s aiR for Review reached general availability in September 2024 with customer-reported validation, and the Relativity–Redgrave Beyond the Bar paper (2024, cited in Syllo’s own white paper) predated Syllo’s March 2025 white paper. Noted that the first rigorous independent head-to-head benchmark of aiR (88% recall vs. 64% for active learning) did not publish until 2026.
- June 7, 2026: Updated EvenUp’s adoption figure from “more than 1,500 PI firms” to “more than 2,000,” per its 2026 Series E announcement.
The first four posts in this series mapped the legal AI ecosystem layer by layer: the foundation models, their structural limits, the tools built on top of them, and the managed services providers who deploy those tools inside human-led workflows. What that map doesn’t tell you is whether any of it actually works when the deadline is real and the stakes are high. What follows are five case studies — the most specific accounts we could find of legal teams using AI on real work.
Most discussion of AI in legal practice stays abstract: vendors promising efficiency, consultants projecting cost savings, bar associations debating ethics. The five deployments here are concrete, with real deadlines and real consequences, and they span BigLaw merger trials, antitrust second requests, plaintiffs’ employment litigation, and high-volume personal injury practice. A disclosure upfront: three of the five are sourced from a single vendor’s white paper (Syllo),¹ and the remaining two come from eDiscovery and legal tech vendor case studies. We couldn’t find five independently verified case studies because the industry hasn’t developed a culture of publishing them. We flag sourcing throughout.
One distinction matters before the case studies: none of these teams used ChatGPT or a raw LLM for their core work. Quinn Emanuel used Claude for brainstorming legal theories — cognitive work. For structured, high-volume document review, they used a purpose-built litigation platform. For demand letter generation, EvenUp trained specialized models on hundreds of thousands of injury cases. The line between general-purpose LLMs and domain-specific legal tools runs through every case that follows.
The Merger Breach: Desktop Metal v. Nano Dimension
Firm: Quinn Emanuel Urquhart & Sullivan
Matter: Desktop Metal, Inc. v. Nano Dimension Ltd., Consolidated C.A. No. 2024-1303-KSJM (Del. Ch.)
AI Tools: Syllo AI (agentic document review), Claude (Anthropic’s LLM)
Documents: 50,000+ produced; 70,000+ reviewed
Timeline: ~8–9 weeks from Quinn Emanuel’s engagement to trial
Desktop Metal, a 3D printing company facing potential bankruptcy, needed to compel Nano Dimension to complete their merger agreement after Nano’s new board allegedly slow-walked CFIUS regulatory approvals to run out the clock. Desktop Metal filed suit in the Delaware Court of Chancery on December 16, 2024 using its deal counsel, and the court granted an expedited trial on December 30. Quinn Emanuel came in as fresh litigation counsel in early January 2025, then ran roughly eight weeks of expedited fact and expert discovery before the March 11–12, 2025 trial.
Case chronology:
- July 2, 2024 — Merger Agreement signed
- December 16, 2024 — Desktop Metal files its Chancery complaint, using deal counsel
- December 30, 2024 — Court grants Desktop Metal’s motion for an expedited trial
- Early January 2025 — Quinn Emanuel retained as fresh litigation counsel; files an Amended Complaint
- ~6-week sprint (January–early March 2025) — per the white paper: responsiveness review of 30,000+ of the client’s own documents, review of 40,000+ opposing-party documents, depositions, and pre-trial submissions
- March 11–12, 2025 — Two-day trial
- March 24, 2025 — Post-trial opinion and Order/Partial Final Judgment (the 48-hour CFIUS/NSA order)
- April 2, 2025 — Merger closes
The team used Syllo’s agentic document review platform to review and organize the full document universe through natural language prompts, building timelines, tagging material by issue, and identifying patterns across the production. According to the Law, disrupted podcast featuring Quinn Emanuel partner Christopher Kercher and Syllo co-founder Jeffrey Chivers, the AI system performed a first-level document review achieving an estimated recall of 98% (it found 98% of all relevant documents) and precision of 74% (74% of the documents it flagged were actually relevant). For context, the Syllo white paper benchmarks against human review recall of roughly 60-80% — a baseline drawn from Grossman and Cormack’s widely cited 2011 study of technology-assisted review. Syllo’s 98% recall is the vendor’s own self-reported result, unvalidated by any outside source. On the podcast, Kercher described prompting the platform at the level of a legal theory rather than keywords — asking, in effect, for “all the documents suggesting the buyer was hesitating about the deal or slow-walking CFIUS” and getting back the documents that fit the idea, a step beyond the Boolean searches traditional e-discovery relies on.
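For readers less familiar with the two metrics, here is a minimal sketch of how recall and precision are computed. The document counts are hypothetical, chosen only so the arithmetic reproduces the reported 98% and 74%; neither the podcast nor the white paper, as summarized here, gives the underlying counts.

```python
def recall(true_pos: int, false_neg: int) -> float:
    """Share of all relevant documents that the review found."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Share of flagged documents that were actually relevant."""
    return true_pos / (true_pos + false_pos)


# HYPOTHETICAL counts, picked only to match the reported rates.
relevant_found = 980       # relevant documents the system flagged
relevant_missed = 20       # relevant documents it failed to flag
irrelevant_flagged = 344   # non-relevant documents it flagged anyway

r = recall(relevant_found, relevant_missed)
p = precision(relevant_found, irrelevant_flagged)

# At a given precision, attorneys read 1/p flagged documents per relevant one.
flagged_per_relevant = 1 / p

print(f"recall={r:.2f}  precision={p:.2f}  "
      f"flagged docs per relevant doc={flagged_per_relevant:.2f}")
```

The practical reading: at 74% precision, second-level reviewers look at roughly four flagged documents to find three relevant ones, which is the price paid for missing very little.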
The team also ran Claude — on an enterprise license, to avoid confidentiality and data-leakage concerns — as a daily strategy partner: brainstorming legal theories, arguing against its own suggestions, drafting and sharpening post-trial brief passages, adjusting tone, and even structuring arguments “like Kathleen Sullivan,” the firm’s retired appellate eminence, as shorthand for high-quality advocacy. Kercher described deposition sequencing decisions too — whether to lead with a document or hold it for recross — and a multi-model habit of pitching an idea to one model and having another revise it. His candor is worth keeping: he estimated a “decent number” of the AI’s suggestions were implausible or unhelpful, with a human always in the loop. The value was instantaneous volume to choose from, not finished work.
As the matter moved at speed, the Quinn Emanuel and Syllo teams built two bespoke capabilities mid-case, in roughly 48 hours each, per the podcast. The first was a deficiency analysis that flagged which of about 35 requested document categories the other side had answered with zero or near-zero documents, fueling fast follow-up demands. The second scored the opponent’s privilege log for weak entries, narrowing a long log to a few hundred entries the team could plausibly challenge; Quinn Emanuel filed four motions to compel, one the day after receiving the log. Both pushed the agentic system past responsiveness review into offensive discovery work, and both led to supplemental productions before deposition deadlines. In its March 24, 2025 post-trial opinion, the Court found “damning” evidence of Nano’s breach and ordered Nano to sign a national security agreement within 48 hours and close the merger. The $300 million deal closed on April 2, 2025.
One courtroom moment captured the asymmetry. Nano Dimension’s chairman testified on cross that he had asked Google Gemini whether to hire Quinn Emanuel or Paul Weiss for the case. It first recommended Paul Weiss — whom Nano hired — and, during witness prep weeks later, flipped to Quinn Emanuel.
Claimed advantage: The firm credits AI with making an aggressively compressed schedule viable. The chronology above runs roughly eight weeks from Quinn Emanuel’s engagement to a two-day trial. The Quinn Emanuel team was named American Lawyer Litigators of the Week in March 2025. How much of the win is attributable to AI rather than aggressive lawyering is impossible to disaggregate. On the economics, Chivers said about four Syllo staff supported the trial remotely, billed on usage plus hours, though he noted he generally prefers a flat, outcome-based fee, a structure the hourly model still resists.
Market context: GenAI document review wasn’t novel by the time of the March 2025 trial. Relativity’s aiR for Review had reached general availability in September 2024, about eight months after its January 2024 limited-availability debut, and the incumbent later folded GenAI review into standard RelativityOne pricing at Relativity Fest 2025. Syllo’s deployment was a strong showing, but it ran alongside a maturing market. Relativity had also already put numbers on the record: customer-reported recall above 95% and precision above 70% at aiR’s September 2024 GA, plus a Relativity–Redgrave Data research paper, Beyond the Bar (2024), that Syllo’s own white paper cites. But like Syllo’s, those early figures were self- or customer-reported; the first rigorous independent head-to-head benchmark of aiR (88% recall versus 64% for active learning) didn’t publish until 2026.
The footnote nobody expected: Quinn Emanuel subsequently sued for $30 million in unpaid fees after Nano Dimension, having gained control of Desktop Metal, allegedly stripped assets and steered the company into bankruptcy. The lawyers who used AI to win the case may be the ones left unpaid.
The Cross-Check: Mayer Brown
Firm: Mayer Brown LLP
Matter: Not publicly identified — the white paper anonymizes this construction and contracting dispute, so no docket or filing date is available
AI Tool: Syllo AI
Documents: 400,000+ documents (8,000,000+ pages)
Issues coded: 15 primary deposition and trial issues
Prior spend: $300,000+ on managed document review
Timeline: Review completed in less than one week
More than two years into a high-value construction and contracting litigation, the Mayer Brown team wanted assurance, as it prepared for depositions, that its prior managed review hadn’t missed critical documents. The team articulated 15 primary issues and had Syllo run an automated first-level review across the full 400,000-document universe.
Syllo completed the review in under a week. The result: the AI found key documents that had been missed in the earlier human-led review. Partner Brandon Renken stated that the system “more than proved its value by finding key documents that had been missed in the previously conducted managed review.”
Claimed advantage: Quality assurance at scale. The Mayer Brown team concluded that running an AI cross-check on a $300,000 managed review was worth the incremental expense. The open question: would Syllo have found the same documents if it had run first, without the managed review as a baseline? The white paper doesn’t say.
HSR Second Requests
AI Tool: Lighthouse AI (proprietary LLMs for relevance review, privilege review, privilege log automation, key document identification)
Matters: Two Hart-Scott-Rodino Second Requests — one unnamed ($20M+ claimed savings), one involving Cleary Gottlieb ($4M claimed savings)
When the FTC or DOJ issues an HSR Second Request, the responding company typically has weeks to collect, process, review, and produce massive volumes of documents — with multibillion-dollar deals hanging on compliance.
The largest public dollar figure comes from an anonymized case study involving a “global company’s high-profile acquisition.” Lighthouse doesn’t name the client, but the timing (generative AI privilege tools launched January 2024), deal profile, and concurrent private antitrust class actions point toward Exxon’s $64.5 billion acquisition of Pioneer Natural Resources (FTC Second Request December 2023, deal closed May 2024) — with the overlapping litigation likely being In re Shale Oil Antitrust Litigation. We could be wrong, but the shoe fits.
Whatever the company, Lighthouse describes the workflow in detail. It deployed its proprietary LLMs to handle the full review without traditional linear review: AI-driven relevance review eliminated the need for first-pass human coding, AI privilege review substantially reduced the privilege population, and generative AI automated privilege log drafting and names-list assembly. Simultaneously, Lighthouse ramped a 300-person managed review team, processed and produced 10TB+ of data and 20M+ images in three weeks, and built a secure repository to reuse work product across the related antitrust matter, saving an additional $3.5M across 680,000 documents. Total claimed savings: $20M+, with a production Lighthouse describes as 100% error-free.
A more granular breakdown comes from Cleary Gottlieb’s collaboration with Lighthouse on a DOJ Second Request. The dataset: 3.3 million documents, a significant subset in CJK (Chinese, Japanese, Korean) languages requiring expensive translation, and DOJ scope negotiations that kept adding data mid-project. Cleary’s eDiscovery head, CJ Mahoney, recognized that conventional TAR couldn’t handle a dataset with constantly changing review parameters and turned to Lighthouse’s AI. The results: Lighthouse removed 200,000 documents from privilege review beyond what conventional methods could achieve, an estimated $1.2 million and 8,000 review hours saved. The AI also reduced the responsive foreign-language document set by 120,000 documents compared to legacy TAR tools, cutting translation costs by approximately $1 million. Total claimed savings: $4 million versus what the team estimated it would have spent using prior-generation analytics. That is a comparison against a counterfactual, not a reduction from an actual invoice, and the two itemized figures account for about $2.2 million of it.
Claimed advantage: AI applied across the entire Second Request workflow — not just document review but privilege logging, names lists, key document identification, and cross-matter reuse. The $20M figure is the largest dollar claim in any case study in this post, but also the least verifiable: no named firm, no named individual, and Lighthouse is the sole source. The Cleary matter is smaller but more credible — a named firm, a named eDiscovery lead, and a specific breakdown of where the savings came from.
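A quick way to sanity-check the Cleary figures is to work out the unit costs they imply. The sketch below uses only the numbers reported in the case study; the variable names are ours, and the reported dollar amounts are themselves estimates, so the outputs are rough.

```python
# Figures as reported in the Lighthouse / Cleary Gottlieb case study
# (vendor-sourced estimates, not invoice data).
privilege_docs_removed = 200_000
privilege_savings_usd = 1_200_000
privilege_hours_saved = 8_000
translation_docs_removed = 120_000
translation_savings_usd = 1_000_000
total_claimed_usd = 4_000_000

# Implied blended review rate and review pace for privilege review.
implied_hourly_rate = privilege_savings_usd / privilege_hours_saved
implied_docs_per_hour = privilege_docs_removed / privilege_hours_saved

# Implied translation cost per foreign-language document.
implied_translation_per_doc = translation_savings_usd / translation_docs_removed

# How much of the headline number the two itemized lines explain.
itemized_usd = privilege_savings_usd + translation_savings_usd
unitemized_usd = total_claimed_usd - itemized_usd

print(f"implied privilege review rate: ${implied_hourly_rate:.0f}/hour "
      f"at {implied_docs_per_hour:.0f} docs/hour")
print(f"implied translation cost: ${implied_translation_per_doc:.2f}/doc")
print(f"itemized: ${itemized_usd:,}  not itemized: ${unitemized_usd:,}")
```

The implied rates look plausible for privilege review and translation, which supports the itemized lines. The remaining portion of the $4 million is not broken out in the figures quoted here.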
Outten & Golden: Plaintiffs’ Employment Litigation
Firm: Outten & Golden LLP
Matter: Not publicly identified — anonymized in the white paper as an employment matter, so no docket or filing date is available
AI Tool: Syllo AI
Documents: 12,543
Issue codes: 28 (mapped directly to requests for production)
Precision: 84.09% (confirmed by associate second-level review)
Recall: Elusion testing found zero missed documents in the null set
Outten & Golden is a plaintiffs’ employment firm — a practice model where staffing is lean, budgets are tight, and the economics of a $300,000 managed review don’t work.
In an employment matter, Outten & Golden used Syllo to identify documents for production from a collection of 12,543 documents. The team mapped their review instructions directly to the requests for production served on their client, defining 28 issue tags. One code was flagged as overbroad during the review and was re-drafted — a real-time correction that illustrates how the agentic system interacts with attorney oversight.
Syllo tagged 484 documents as responsive to one or more requests for production. An associate then conducted a second-level review and confirmed a precision rate of 84.09%. Elusion testing on the non-responsive set found no missed documents, suggesting recall at or near 100%.
Claimed advantage: Speed at a price point that works for a plaintiffs’ firm. Whether these results would hold on a 500,000-document dataset with more complex privilege issues is untested — this was a relatively small collection. But for the scope of work involved, the numbers are strong.
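The Outten & Golden numbers can be checked with a short calculation. The sketch below recovers the confirmed-responsive count implied by the reported precision, then shows how much a clean elusion test supports a recall claim. The elusion sample size is not reported in the material summarized here, so the sample sizes below are hypothetical, and the bound assumes a simple random sample of the null set.

```python
collection = 12_543          # documents in the collection (reported)
flagged = 484                # tagged responsive by the system (reported)
reported_precision = 0.8409  # confirmed by second-level review (reported)

confirmed = round(flagged * reported_precision)  # implied true positives
null_set = collection - flagged                  # documents tagged non-responsive


def elusion_upper_bound(sample_size: int, confidence: float = 0.95) -> float:
    """Upper bound on the null-set miss rate when a random sample of
    `sample_size` documents contains zero responsive documents."""
    return 1 - (1 - confidence) ** (1 / sample_size)


def recall_lower_bound(sample_size: int) -> float:
    """Lower bound on recall implied by a clean elusion sample."""
    max_missed = elusion_upper_bound(sample_size) * null_set
    return confirmed / (confirmed + max_missed)


print(f"confirmed responsive: {confirmed} of {flagged}; null set: {null_set:,}")
for n in (500, 1_500, 3_000):  # HYPOTHETICAL elusion sample sizes
    print(f"sample of {n:>5}: recall is at least {recall_lower_bound(n):.1%} "
          f"(95% confidence)")
```

The point of the exercise: "zero missed documents" is only as strong as the sample behind it. A larger clean sample supports a tighter recall claim, so the sample size is worth asking for when a vendor reports elusion results.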
Personal Injury Demand Letters: AI as Revenue Engine
Firms: Jeffcoat Injury Lawyers (SC); Anderson Injury Lawyers (TX); McCready Law (IL); 2,000+ PI firms total
AI Tool: EvenUp (AI demand letter generation and claims intelligence)
Claims processed: $7 billion+
Data: 250,000+ verdict and settlement data points
Settlement impact: 69% higher likelihood of policy-limit settlements (EvenUp’s internal data)
Plaintiffs’ attorneys work on contingency, carry cases for months before seeing revenue, and compete on volume and settlement speed. AI’s impact here isn’t measured in recall rates — it’s measured in demand letters per month and days to settlement.
EvenUp, an AI platform purpose-built for personal injury law, processes medical records, generates demand letter packages, and provides settlement valuation benchmarks drawn from over 250,000 verdicts and settlements. The platform is now used by more than 2,000 PI firms.
The most specific public numbers come from three firms. South Carolina-based Jeffcoat Injury Lawyers reported generating 3x more demand letters and settling cases 30 days faster after adopting EvenUp’s Express Demands product. COO Dwuan Hammond stated that the firm grew its top line by approximately 300% while adding only 30% more staff — though attributing that growth to a demand-letter tool alone ignores case acquisition, marketing, and market conditions. Dallas-Fort Worth firm Anderson Injury Lawyers reported a 4x return on investment based on time savings, found missing documentation, lower case-carrying costs, and increased policy-limit settlements. Managing Partner Mark Anderson noted that in certain case types, the firm now mandates EvenUp-generated demands because the results exceed their human-drafted versions. McCready Law in Chicago reported that AI-generated demand packages organized information in a way that reduced adjuster pushback during negotiations.
Claimed advantage: Throughput. When a demand package that took 6-8 hours of paralegal time can be generated in minutes, the constraint on firm growth shifts from labor to case acquisition.
The caveat: EvenUp’s 69% policy-limit settlement figure and the individual firm results come from the company’s own data and marketing materials. Independent validation hasn’t been published.
What These Cases Actually Tell Us
The same technology works differently across practice areas. In litigation, the metrics are recall and precision — Syllo’s self-reported average recall of 97.8% across its last ten reviews, if accurate, substantially exceeds the human baseline. In HSR second requests, the metric is dollars saved on a compressed timeline. In personal injury, the metric is demands per month. Same underlying technology, completely different value propositions.
But a common pattern runs through all five: AI trades compute for human hours. The Desktop Metal case is the starkest example, though how much of that is AI and how much is Quinn Emanuel’s willingness to take on a six-week sprint is hard to separate. The pattern repeats in HSR matters.
The underlying economics are straightforward. A contract reviewer at a managed review provider doing first-pass coding performs the same cognitive operation thousands of times: read, classify, tag. That operation costs roughly $50-75/hour. An LLM performs a functionally similar operation — reading text, matching it against criteria, producing a classification — for pennies per document. At the extreme end, a Harvard CLP study of AmLaw 100 firms (Couture, 2025) found that complaint response systems reduced associate time from 16 hours to 3-4 minutes — a ratio that makes sense only when you recognize the task as high-volume text processing, not legal reasoning.
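The comparison above can be put into rough numbers. In the sketch below, the hourly rates and the 16-hour and 3-4 minute figures come from the text; the reviewer throughput and the per-document LLM cost are our assumptions for illustration only, since the text says just "pennies per document".

```python
# From the text: contract reviewer cost and the Harvard CLP time figures.
hourly_rate_range_usd = (50, 75)
associate_minutes = 16 * 60
ai_minutes_range = (4, 3)      # slowest and fastest reported AI time

# ASSUMPTIONS (not in the source): first-pass review pace and LLM cost.
docs_per_hour = 50             # hypothetical reviewer throughput
llm_cost_per_doc_usd = 0.02    # hypothetical stand-in for "pennies"

# Human cost per document at the assumed pace.
human_cost_per_doc = tuple(rate / docs_per_hour for rate in hourly_rate_range_usd)

# Cost ratio of human to LLM classification under these assumptions.
cost_ratio = tuple(cost / llm_cost_per_doc_usd for cost in human_cost_per_doc)

# Speedup implied by 16 hours versus 3-4 minutes (no assumptions needed).
speedup = tuple(associate_minutes / m for m in ai_minutes_range)

print(f"human cost per doc: ${human_cost_per_doc[0]:.2f} to ${human_cost_per_doc[1]:.2f}")
print(f"human-to-LLM cost ratio: {cost_ratio[0]:.0f}x to {cost_ratio[1]:.0f}x")
print(f"complaint-response speedup: {speedup[0]:.0f}x to {speedup[1]:.0f}x")
```

Only the speedup line is independent of our assumptions. The cost ratio moves linearly with both assumed inputs, so a reader with real throughput and pricing data should substitute them before drawing conclusions.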
OpenAI’s GDPval study (2025) tested this at industry scale: 1,320 real-world professional tasks across 44 occupations, blind-evaluated by experts averaging 14 years of experience. On legal tasks, AI output was judged equal or superior to human expert output 46% of the time — approaching parity on well-defined deliverables, still losing the majority on tasks requiring judgment. The speed gains in these case studies were unambiguous. The quality gains were not — and the case studies that claim quality improvements (EvenUp’s “results exceed human-drafted versions,” Syllo’s 98% recall) are all self-reported by the tool’s maker or its customer.
The cross-check may be the most important use case. Mayer Brown spent $300,000 on managed review and still missed key documents. The AI found them. As document volumes grow into the millions, the question isn’t whether AI can replace human review — it’s whether any team can afford not to run an AI cross-check on human work.
AI works on both sides of the v. Outten & Golden’s use breaks the assumption that legal AI is a BigLaw luxury — though the 12,543-document collection is orders of magnitude smaller than the datasets in the Lighthouse and Quinn Emanuel cases. Whether plaintiff-side firms will use these tools on larger, more complex matters remains to be seen.
The defensibility question is open. Courts have accepted technology-assisted review since Da Silva Moore v. Publicis Groupe (2012) and Rio Tinto PLC v. Vale S.A. (2015). But those opinions addressed predictive coding: supervised machine learning trained on attorney seed sets. No published opinion has specifically addressed LLM-powered classification as a substitute for human first-pass review. When an AI misclassifies a privileged document and it gets produced, the consequences may be categorically different from a human reviewer’s mistake. Courts understand human error; the defensibility of an AI-driven privilege workflow has yet to be tested in a published opinion.
The unresolved question: who captures the efficiency? When AI compresses billable work, does the client’s bill shrink, or does the firm redeploy those hours elsewhere? The Desktop Metal fee dispute arose from the merger and takeover situation rather than from AI-driven efficiency, but the broader tension is real. The Harvard CLP study found that none of the ten AmLaw 100 firms interviewed anticipates reducing attorney headcount. Yet the billable-hour model, still governing 80%+ of fee arrangements, creates a structural tension with productivity gains that no case study in this article addresses.
No firm has published a case study titled “We Used AI and It Missed the Smoking Gun.” Three of the five case studies above come from a single Syllo white paper co-authored with practitioners. Lighthouse’s case studies come from its sales materials. EvenUp’s numbers are drawn from internal data. That doesn’t make them meaningless — it means they should be read as evidence of what’s possible under favorable conditions, not a guarantee of what you’ll get on your next matter.
Further Reading
- Syllo White Paper: Agentic AI Document Review Is Transformative for Complex Litigation (March 2025). The primary source for the Quinn Emanuel, Mayer Brown, and Outten & Golden case studies.
- Law, disrupted — “Winning at Trial With AI”. Quinn Emanuel’s John B. Quinn interviews Christopher Kercher and Jeffrey Chivers on the Desktop Metal case.
- Desktop Metal v. Nano Dimension: Quinn Emanuel’s Case Summary. The firm’s description of the legal victory.
- Desktop Metal, Inc. v. Nano Dimension Ltd., Order and Partial Final Judgment (Del. Ch. Mar. 24, 2025), Consolidated C.A. No. 2024-1303-KSJM. The primary-source post-trial order, with the full case chronology.
- Lighthouse: $20M in Savings in a High-Stakes Second Request. The largest dollar-figure case study, with full AI workflow details.
- Lighthouse Antitrust Case Studies. The Cleary Gottlieb DOJ Second Request collaboration.
- EvenUp: Anderson Injury Lawyers Case Study. 4x ROI on AI-generated demand letters.
- EvenUp: McCready Law Case Study. AI integration across the PI case lifecycle.
- Mayer Brown: Eight Practical Ways to Leverage Generative AI in Litigation. A practitioner’s framework for AI adoption.
- The Impact of Artificial Intelligence on Law Firms’ Business Models (Couture, Harvard CLP, 2025). Interviews with COOs at ten AmLaw 100 firms on AI’s impact on revenue models and staffing.
- GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks (OpenAI, 2025). Benchmark of AI vs. human expert output across 44 occupations, including legal tasks.
This post is part of the Legal AI Landscape series on LegalRealist AI. It is intended for informational and educational purposes only and does not constitute legal advice. AI capabilities and vendor claims described here reflect publicly available information as of the publication date and should be independently verified. The author has no commercial relationship with any vendor mentioned.
¹ A quick aside on who’s behind Syllo. It was founded in 2019 by two litigators, Jeffrey Chivers and Theodore Rostow (COO/CSO), who met while clerking in federal court and concluded that the legal technology on the market wasn’t built for litigators: it didn’t aggregate case information, let attorneys effectively search the case file, or help devise case strategy (LawSites; Syllo). Rostow came from the litigation side: five-plus years in practice (Simpson Thacher, then a small tech-enabled litigation boutique) and a clerkship with Third Circuit Judge Thomas L. Ambro (ZoomInfo). They didn’t build the AI alone: the technical foundation came together with Carnegie Mellon engineers, including Jamie Callan of CMU’s Language Technologies Institute, a well-known figure in information retrieval, which lends some real research credibility to the document-review claims. Clark Slater rounds out the founding team as lead engineer. The name nods to the syllogism: on the podcast Chivers said the team mapped the formal types of legal reasoning into the scaffolding the system wraps around the language models.



