
The Legal AI Landscape - This article is part of a series.
Part 5: This Article

Five Case Studies from the Firms Actually Using AI

Most discussion of AI in legal practice stays abstract: vendors promising efficiency, consultants projecting cost savings, bar associations debating ethics. What follows is more specific: five deployments where legal teams used AI on real work, with real deadlines and real consequences — spanning BigLaw merger trials, antitrust second requests, plaintiffs’ employment litigation, and high-volume personal injury practice. A disclosure upfront: three of the five are sourced from a single vendor’s white paper (Syllo), and the remaining two come from eDiscovery and legal tech vendor case studies. We couldn’t find five independently verified case studies because the industry hasn’t developed a culture of publishing them. We flag sourcing throughout.

One distinction matters before the case studies: none of these teams used ChatGPT or a raw LLM for their core work. Quinn Emanuel used Claude for brainstorming legal theories — cognitive work. For structured, high-volume document review, they used a purpose-built litigation platform. For demand letter generation, EvenUp trained specialized models on hundreds of thousands of injury cases. The line between general-purpose LLMs and domain-specific legal tools runs through every case that follows.

Comparison matrix showing all five case studies side by side — practice area, AI tool, document volume, timeline, key metric, and source for each

The Merger Breach: Desktop Metal v. Nano Dimension

Firm: Quinn Emanuel Urquhart & Sullivan
Matter: Desktop Metal, Inc. v. Nano Dimension Ltd., Delaware Court of Chancery
AI Tools: Syllo AI (agentic document review), Claude (Anthropic’s LLM)
Documents: 50,000+ produced; 70,000+ reviewed
Timeline: Six weeks from engagement to trial

Desktop Metal, a 3D printing company facing potential bankruptcy, needed to compel Nano Dimension to complete their merger agreement after Nano’s new board allegedly slow-walked CFIUS regulatory approvals to run out the clock. Quinn Emanuel was retained in early January 2025 with trial set for March 11 — roughly six weeks to do what would typically take six months.

The team used Syllo’s agentic document review platform to review and organize the full document universe through natural language prompts, building timelines, tagging material by issue, and identifying patterns across the production. According to the “Law, disrupted” podcast episode featuring Quinn Emanuel partner Christopher Kercher and Syllo co-founder Jeffrey Chivers, the AI system performed first-level document review with an estimated recall of 98% (it found 98% of all relevant documents) and precision of 74% (74% of the documents it flagged were actually relevant). For context, the Syllo white paper cites studies placing human review recall at roughly 60-80%, though both the AI performance claim and the human baseline come from the same vendor-authored source.
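For readers who don’t live in review metrics, precision and recall are simple ratios over a validation sample. The sketch below uses illustrative counts chosen to land near the reported figures, not the actual Desktop Metal validation numbers.

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Precision: share of AI-flagged documents that are actually relevant.
    Recall: share of all relevant documents that the AI flagged."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Illustrative counts only, not the actual Desktop Metal validation numbers:
# 740 relevant documents flagged, 260 irrelevant documents flagged, 15 relevant missed.
p, r = precision_recall(true_positives=740, false_positives=260, false_negatives=15)
print(f"precision {p:.0%}, recall {r:.0%}")  # precision 74%, recall 98%
```

The asymmetry is deliberate: in litigation review, a recall miss (a relevant document never seen) is usually costlier than a precision miss (an irrelevant document a human has to discard), which is why high recall with middling precision is a defensible trade.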

The team also used Claude to brainstorm legal theories, test arguments, and develop lines of questioning for depositions. As Kercher described it, Claude served as a “cognitive tool” that amplified the attorneys’ capabilities while lawyers maintained full responsibility for all work product.

During trial, Syllo provided real-time analysis to identify gaps in the opposing party’s discovery production, leading to supplemental productions before deposition deadlines. The Court found “damning” evidence of Nano’s breach, ordering Nano to sign a national security agreement within 48 hours and close the merger. The $300 million deal closed on April 2, 2025.

Claimed advantage: The firm says it compressed six months of trial preparation into six weeks. The Quinn Emanuel team was named American Lawyer Litigators of the Week in March 2025. How much of the win is attributable to AI versus aggressive lawyering on a compressed schedule is impossible to disaggregate — but the team credits the tools with making the timeline viable.

The footnote nobody expected: Quinn Emanuel subsequently sued for $30 million in unpaid fees after Nano Dimension, having gained control of Desktop Metal, allegedly stripped assets and steered the company into bankruptcy. The lawyers who used AI to win the case may be the ones left unpaid.

The Cross-Check: Mayer Brown

Firm: Mayer Brown LLP
AI Tool: Syllo AI
Documents: 400,000+ documents (8,000,000+ pages)
Issues coded: 15 primary deposition and trial issues
Prior spend: $300,000+ on managed document review
Timeline: Review completed in less than one week

More than two years into a high-value construction and contracting litigation, the Mayer Brown team wanted assurance that its prior managed review hadn’t missed critical documents as they prepared for depositions. They articulated 15 primary issues and had Syllo run an automated first-level review across the full 400,000-document universe.

Syllo completed the review in under a week and surfaced key documents that the earlier human-led review had missed. Partner Brandon Renken stated that the system “more than proved its value by finding key documents that had been missed in the previously conducted managed review.”

Claimed advantage: Quality assurance at scale. The Mayer Brown team concluded that running an AI cross-check on a $300,000 managed review was worth the incremental expense. The open question: would Syllo have found the same documents if it had run first, without the managed review as a baseline? The white paper doesn’t say.

HSR Second Requests

AI Tool: Lighthouse AI (proprietary LLMs for relevance review, privilege review, privilege log automation, key document identification)
Matters: Two Hart-Scott-Rodino Second Requests — one unnamed ($20M+ claimed savings), one involving Cleary Gottlieb ($4M claimed savings)

When the FTC or DOJ issues an HSR Second Request, the responding company typically has weeks to collect, process, review, and produce massive volumes of documents — with multibillion-dollar deals hanging on compliance.

The largest public dollar figure comes from an anonymized case study involving a “global company’s high-profile acquisition.” Lighthouse doesn’t name the client, but the timing (generative AI privilege tools launched January 2024), deal profile, and concurrent private antitrust class actions point toward Exxon’s $64.5 billion acquisition of Pioneer Natural Resources (FTC Second Request December 2023, deal closed May 2024) — with the overlapping litigation likely being In re Shale Oil Antitrust Litigation. We could be wrong, but the shoe fits.

Whatever the company, the workflow is well-documented. Lighthouse deployed its proprietary LLMs to handle the full review without traditional linear review: AI-driven relevance review eliminated the need for first-pass human coding, AI privilege review substantially reduced the privilege population, and generative AI automated privilege log drafting and names-list assembly. Simultaneously, Lighthouse ramped a 300-person managed review team, processed and produced 10TB+ of data and 20M+ images in three weeks, and built a secure repository to reuse work product across the related antitrust matter — saving an additional $3.5M across 680,000 documents. Total claimed savings: $20M+, with a 100% error-free production.
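Vendors rarely publish their pipelines, but the basic shape of an LLM first-pass relevance review is not mysterious. The sketch below is a generic illustration, not Lighthouse’s implementation; `call_llm` is a hypothetical stand-in for whatever model API a given platform wraps, and the protocol text is invented for the example.

```python
import json

REVIEW_PROTOCOL = """You are performing first-pass relevance review for an HSR Second Request.
A document is RESPONSIVE if it discusses (1) competition between the merging parties,
(2) pricing, output, or market-share decisions in the relevant market, or (3) the
rationale for the transaction. Reply with JSON only:
{"responsive": true|false, "issues": ["..."], "rationale": "..."}"""

def call_llm(system_prompt: str, document_text: str) -> str:
    """Hypothetical stand-in for whatever model API a review platform uses;
    replace with your vendor's or model provider's client library."""
    raise NotImplementedError

def first_pass_code(document_text: str) -> dict:
    # Ask the model for a structured relevance call. Second-level review,
    # privilege judgment calls, and QC sampling remain human tasks.
    raw = call_llm(REVIEW_PROTOCOL, document_text[:50_000])  # truncate very long documents
    return json.loads(raw)
```

The production systems add layers this sketch omits: validation sampling, issue tagging, privilege screening, and audit trails, which is where the defensibility work actually lives.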

A more granular breakdown comes from Cleary Gottlieb’s collaboration with Lighthouse on a DOJ Second Request. The dataset: 3.3 million documents, a significant subset in CJK (Chinese, Japanese, Korean) languages requiring expensive translation, and DOJ scope negotiations that kept adding data mid-project. Cleary’s eDiscovery head, CJ Mahoney, recognized that conventional TAR couldn’t handle a dataset with constantly changing review parameters and turned to Lighthouse’s AI. The results: Lighthouse removed 200,000 documents from privilege review beyond what conventional methods could achieve — an estimated $1.2 million and 8,000 review hours saved. The AI also reduced the responsive foreign-language document set by 120,000 documents compared to legacy TAR tools, cutting translation costs by approximately $1 million. Total claimed savings: $4 million versus what the team estimated they would have spent using prior-generation analytics — again, a comparison against a counterfactual, not a reduction from an actual invoice.

Claimed advantage: AI applied across the entire Second Request workflow — not just document review but privilege logging, names lists, key document identification, and cross-matter reuse. The $20M figure is the largest dollar claim in any case study in this post, but also the least verifiable: no named firm, no named individual, and Lighthouse is the sole source. The Cleary matter is smaller but more credible — a named firm, a named eDiscovery lead, and a specific breakdown of where the savings came from.

Outten & Golden: Plaintiffs’ Employment Litigation

Firm: Outten & Golden LLP
AI Tool: Syllo AI
Documents: 12,543
Issue codes: 28 (mapped directly to requests for production)
Precision: 84.09% (confirmed by associate second-level review)
Recall: Elusion testing found zero missed documents in the null set

Outten & Golden is a plaintiffs’ employment firm — a practice model where staffing is lean, budgets are tight, and the economics of a $300,000 managed review don’t work.

In an employment matter, Outten & Golden used Syllo to identify documents for production from a collection of 12,543 documents. The team mapped their review instructions directly to the requests for production served on their client, defining 28 issue tags. One code was flagged as overbroad during the review and was re-drafted — a real-time correction that illustrates how the agentic system interacts with attorney oversight.

Syllo tagged 484 documents as responsive to one or more requests for production. An associate then conducted a second-level review and confirmed a precision rate of 84.09%. Elusion testing on the non-responsive set found no missed documents, suggesting recall at or near 100%.
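The white paper does not report the elusion sample size, so “zero missed documents” is a point estimate rather than proof of perfect recall. The sketch below, with an assumed sample size of 500, shows how the rule of three converts a zero-hit elusion sample into a conservative recall floor.

```python
def recall_floor(responsive_found: int, null_set_size: int,
                 sample_size: int, hits_in_sample: int = 0) -> float:
    """Turn an elusion sample into a conservative recall floor.

    With zero responsive documents found in a random sample of the null set,
    the rule of three gives roughly 3/sample_size as a 95% upper bound on the
    elusion rate (the share of the null set that is actually responsive).
    """
    elusion_upper = (3 / sample_size) if hits_in_sample == 0 else (hits_in_sample / sample_size)
    missed_upper = elusion_upper * null_set_size
    return responsive_found / (responsive_found + missed_upper)

# Collection figures from the case study; the sample size is assumed for illustration.
null_set = 12_543 - 484
print(f"{recall_floor(484, null_set, sample_size=500):.1%}")  # ~87% recall floor at 95% confidence
```

In other words, the point estimate is 100% recall, but the statistical floor depends entirely on how many null-set documents were actually sampled, a detail the source doesn’t disclose.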

Claimed advantage: Speed at a price point that works for a plaintiffs’ firm. Whether these results would hold on a 500,000-document dataset with more complex privilege issues is untested — this was a relatively small collection. But for the scope of work involved, the numbers are strong.

Personal Injury Demand Letters: AI as Revenue Engine

Firms: Jeffcoat Injury Lawyers (SC); Anderson Injury Lawyers (TX); McCready Law (IL); 1,500+ PI firms total
AI Tool: EvenUp (AI demand letter generation and claims intelligence)
Claims processed: $7 billion+
Data: 250,000+ verdict and settlement data points
Settlement impact: 69% higher likelihood of policy-limit settlements (EvenUp’s internal data)

Plaintiffs’ attorneys work on contingency, carry cases for months before seeing revenue, and compete on volume and settlement speed. AI’s impact here isn’t measured in recall rates — it’s measured in demand letters per month and days to settlement.

EvenUp, an AI platform purpose-built for personal injury law, processes medical records, generates demand letter packages, and provides settlement valuation benchmarks drawn from over 250,000 verdicts and settlements. The platform is now used by more than 1,500 PI firms.

The most specific public numbers come from three firms. South Carolina-based Jeffcoat Injury Lawyers reported generating 3x more demand letters and settling cases 30 days faster after adopting EvenUp’s Express Demands product. COO Dwuan Hammond stated that the firm grew its top line by approximately 300% while adding only 30% more staff — though attributing that growth to a demand-letter tool alone ignores case acquisition, marketing, and market conditions. Dallas-Fort Worth firm Anderson Injury Lawyers reported a 4x return on investment based on time savings, found missing documentation, lower case-carrying costs, and increased policy-limit settlements. Managing Partner Mark Anderson noted that in certain case types, the firm now mandates EvenUp-generated demands because the results exceed their human-drafted versions. McCready Law in Chicago reported that AI-generated demand packages organized information in a way that reduced adjuster pushback during negotiations.

Claimed advantage: Throughput. When a demand package that took 6-8 hours of paralegal time can be generated in minutes, the constraint on firm growth shifts from labor to case acquisition.

The caveat: EvenUp’s 69% policy-limit settlement figure and the individual firm results come from the company’s own data and marketing materials. Independent validation hasn’t been published.

What These Cases Actually Tell Us

The same technology works differently across practice areas. In litigation, the metrics are recall and precision — Syllo’s self-reported average recall of 97.8% across its last ten reviews, if accurate, substantially exceeds the human baseline. In HSR second requests, the metric is dollars saved on a compressed timeline. In personal injury, the metric is demands per month. Same underlying technology, completely different value propositions.

But a common pattern runs through all five: AI trades compute for human hours. The Desktop Metal case is the starkest example — though how much of that is AI and how much is Quinn Emanuel’s willingness to take a six-week sprint is hard to separate. The pattern repeats in HSR matters.

The underlying economics are straightforward. A contract reviewer at a managed review provider doing first-pass coding performs the same cognitive operation thousands of times: read, classify, tag. That operation costs roughly $50-75/hour. An LLM performs a functionally similar operation — reading text, matching it against criteria, producing a classification — for pennies per document. At the extreme end, a Harvard CLP study of AmLaw 100 firms (Couture, 2025) found that complaint response systems reduced associate time from 16 hours to 3-4 minutes — a ratio that makes sense only when you recognize the task as high-volume text processing, not legal reasoning.
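A back-of-envelope comparison makes the trade concrete. Every figure below is an assumption chosen for illustration (reviewer rate, coding pace, token pricing), not a number taken from any of the case studies.

```python
# Back-of-envelope comparison; all figures are assumptions for illustration only.
docs = 400_000

reviewer_rate_per_hour = 60.0      # contract reviewer, first-pass coding
docs_per_reviewer_hour = 50        # typical first-pass pace
human_cost = docs / docs_per_reviewer_hour * reviewer_rate_per_hour

tokens_per_doc = 3_000             # roughly 2,000 words plus prompt overhead
price_per_million_tokens = 3.0     # order-of-magnitude LLM pricing
llm_cost = docs * tokens_per_doc / 1_000_000 * price_per_million_tokens

print(f"human first pass: ${human_cost:,.0f}")  # $480,000
print(f"LLM first pass:   ${llm_cost:,.0f}")    # $3,600
```

The LLM figure excludes validation sampling, second-level review, privilege work, and platform fees, which is where most of the real spend remains; the point is only that the marginal cost of the first pass collapses by two orders of magnitude.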

Diagram showing the split between the judgment layer (trial strategy, privilege calls, client counseling — attorneys required) and the volume layer (classification, extraction, screening — where AI trades compute for hours)

OpenAI’s GDPval study (2025) tested this at industry scale: 1,320 real-world professional tasks across 44 occupations, blind-evaluated by experts averaging 14 years of experience. On legal tasks, AI output was judged equal or superior to human expert output 46% of the time — approaching parity on well-defined deliverables, still losing the majority on tasks requiring judgment. The speed gains in these case studies were unambiguous. The quality gains were not — and the case studies that claim quality improvements (EvenUp’s “results exceed human-drafted versions,” Syllo’s 98% recall) are all self-reported by the tool’s maker or its customer.

The cross-check may be the most important use case. Mayer Brown spent $300,000 on managed review and still missed key documents. The AI found them. As document volumes grow into the millions, the question isn’t whether AI can replace human review — it’s whether any team can afford not to run an AI cross-check on human work.

AI works on both sides of the v. Outten & Golden’s use breaks the assumption that legal AI is a BigLaw luxury — though the 12,543-document collection is orders of magnitude smaller than the datasets in the Lighthouse and Quinn Emanuel cases. Whether plaintiff-side firms will use these tools on larger, more complex matters remains to be seen.

The defensibility question is open. Courts have accepted technology-assisted review since Da Silva Moore v. Publicis Groupe (2012) and Rio Tinto PLC v. Vale S.A. (2015). But those opinions addressed predictive coding — supervised machine learning trained on attorney seed sets. No published opinion has specifically addressed LLM-powered classification as a substitute for human first-pass review. When an AI misclassifies a privileged document and it gets produced, the consequences may be categorically different from a human reviewer’s mistake. Courts understand human error; the defensibility of an AI-driven privilege workflow remains untested at the appellate level.

Timeline showing court acceptance of predictive coding in 2012 and 2015, the emergence of LLM-powered review tools in 2019-2024, and the open question of when a court will first rule on LLM-driven privilege review

The unresolved question: who captures the efficiency? When AI compresses billable work, does the client’s bill shrink, or does the firm redeploy those hours elsewhere? The Desktop Metal fee dispute hints at a darker possibility — a case won in six weeks instead of six months generates a fraction of the billable revenue, and the client may refuse to pay even that. The Harvard CLP study found that none of the ten AmLaw 100 firms interviewed anticipate reducing attorney headcount — but the billable hour model, still governing 80%+ of fee arrangements, creates a structural tension with productivity gains that no case study in this article addresses.

No firm has published a case study titled “We Used AI and It Missed the Smoking Gun.” Three of the five case studies above come from a single Syllo white paper co-authored with practitioners. Lighthouse’s case studies come from its sales materials. EvenUp’s numbers are drawn from internal data. That doesn’t make them meaningless — it means they should be read as evidence of what’s possible under favorable conditions, not a guarantee of what you’ll get on your next matter.



This post is part of the Legal AI Landscape series on LegalAI Insights. It is intended for informational and educational purposes only and does not constitute legal advice. AI capabilities and vendor claims described here reflect publicly available information as of the publication date and should be independently verified. The author has no commercial relationship with any vendor mentioned.
