You Can Agent the Work, Not the Walls

Trust Boundaries - This article is part of a series.

Part 1: This Article

TL;DR
Authentication is built to be impossible to talk your way past — and Meta put a chatbot in that seat. A half-century of cryptography makes access depend on possessing a secret, not on persuading someone.
The attack never touched the victim’s email. The bot sent a reset code to an address the attacker supplied, then accepted the attacker reading it back as proof of ownership.
Legal practice runs on the same walls. Conflicts checks, ethical screens, and privilege boundaries exist to refuse the person who most wants past them.
Authorized insiders already walk straight through these walls. A decade-long insider trading ring pulled M&A files off six top firms using legitimate credentials. An agent at a firm endpoint is the same insider with fewer deterrents.
You can agent the work, not the walls. The test isn’t how risky a task feels — it’s whether the seat’s job is to gate.

Over the last weekend in May, Instagram accounts began getting hijacked, and a video on X laid out the method. The attacker used a VPN to spoof the victim’s location and slip past Instagram’s automated protections, opened the Meta AI Support Assistant, and asked it to add a new email to the target’s account. The bot sent a verification code to that attacker-supplied address; the attacker pasted it back, and the bot surfaced a “Reset Password” button. New password, new owner.

Victims ranged from the dormant Obama-era White House handle to U.S. Space Force chief master sergeant John Bentivegna. The intruder never needed the victim’s real email: TechCrunch confirmed the code landed in the attacker’s own inbox. Instagram said Monday the hole was fixed; the number of compromised accounts is unknown.

It reads like a clever trick. It is closer to a category error — an agent placed where the job is gating, not helping. The same mistake is available to any law firm handing agents more authority.

Authentication Is a Wall, Not a Conversation

Account recovery exists to prove one thing: that you control a credential or channel already on file. A half-century of work since Diffie and Hellman’s 1976 “New Directions in Cryptography” converges on a single design goal — make access depend on possession, never on persuasion. The verification code is a small cryptographic primitive; its only meaning is “the holder controls the channel it was sent to.” That is the reason authentication works at all. A wall you can argue with is not a wall.
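To make "possession, never persuasion" concrete, here is a minimal sketch of a one-time code that is bound to the channel on file. Everything in it is an assumption for illustration: the `ON_FILE` and `PENDING` stores, the function names, the six-digit format, and the ten-minute expiry are hypothetical, not a description of any real system.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical in-memory store: account id -> email already on file.
ON_FILE = {"alice": "alice@example.com"}
# Pending challenges: account id -> (sha256 of code, channel, expiry timestamp).
PENDING = {}


def issue_code(account_id, now=None):
    """Create a one-time code bound to the on-file channel.

    The caller supplies only the account id; it has no way to choose
    where the code is sent.
    """
    now = time.time() if now is None else now
    channel = ON_FILE[account_id]
    code = f"{secrets.randbelow(10**6):06d}"
    digest = hashlib.sha256(code.encode()).hexdigest()
    PENDING[account_id] = (digest, channel, now + 600)  # assumed 10-minute expiry
    return channel, code  # a real system would deliver the code only to `channel`


def verify_code(account_id, submitted, now=None):
    """Return True only if the code matches, is unexpired, and is unused."""
    now = time.time() if now is None else now
    entry = PENDING.pop(account_id, None)  # single use: consumed on first attempt
    if entry is None:
        return False
    digest, _channel, expires = entry
    if now > expires:
        return False
    candidate = hashlib.sha256(submitted.encode()).hexdigest()
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```

Nothing in `verify_code` can be argued with: it takes a string and returns a boolean. The only way to get `True` is to have read the code from the channel on file.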

The Agent Turned the Wall Into a Conversation

Meta seated a persuadable system at exactly the control point engineered to be unpersuadable. The bot let the attacker pick the channel the code went to, then accepted possession of that code as proof of ownership, though it proved only that the attacker controlled their own inbox. The cryptographic primitive worked perfectly; what failed was the wrapper that could be talked into pointing it at the wrong target.

This was not prompt injection — no smuggled instructions, no jailbreak. The attacker simply asked, and a large language model tuned to resolve requests did what it was built to do. The helpfulness wasn’t a bug; it was the feature the bot was deployed for. An assistant optimized to say yes is the wrong occupant for a seat whose function is to gate, verify, and authenticate.

Figure: Two account-recovery flows side by side. The legitimate path sends a verification code to the email already on file, so possessing the code proves ownership. The exploited path lets the attacker supply the destination email, so the same code proves only that the attacker controls their own inbox: the verification gate is severed from the thing it is supposed to prove.
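The two flows can be sketched side by side in a few lines. This is a toy simulation under stated assumptions, not a reconstruction of Meta's system: the addresses, the `ON_FILE` record, and every function name are hypothetical, and inboxes are modeled as a plain dictionary.

```python
import secrets

ON_FILE = {"victim_account": "victim@example.com"}  # hypothetical account record
ATTACKER = "attacker@example.com"                    # hypothetical attacker inbox


def deliver_code(inboxes, address):
    """Simulate sending a one-time code to an address; returns the code."""
    code = secrets.token_hex(4)
    inboxes[address] = code
    return code


def read_as(requester_controls, inboxes, address):
    """A requester can read a code only from an inbox they control."""
    return inboxes.get(address) if address in requester_controls else None


def exploited_flow(account_id, requested_address, requester_controls):
    """Destination is chosen by the requester; account_id never enters the check."""
    inboxes = {}
    destination = requested_address
    expected = deliver_code(inboxes, destination)
    submitted = read_as(requester_controls, inboxes, destination)
    return submitted is not None and submitted == expected


def legitimate_flow(account_id, requested_address, requester_controls):
    """Destination is chosen by the account record; the request is ignored."""
    inboxes = {}
    destination = ON_FILE[account_id]
    expected = deliver_code(inboxes, destination)
    submitted = read_as(requester_controls, inboxes, destination)
    return submitted is not None and submitted == expected
```

With these definitions, `exploited_flow("victim_account", ATTACKER, {ATTACKER})` succeeds, while `legitimate_flow` with the same arguments fails and succeeds only for a requester who controls `victim@example.com`. The code check is identical in both functions; the single line that differs is who picks `destination`.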

The old defense against this kind of attack, talking your way past the help desk, was a trained, skeptical human who escalated when a request felt wrong. Replace that desk with a friendly bot and you delete the skeptic but keep the door.

Legal Work Runs on the Same Walls

A law firm runs on the same kind of wall — gates built to refuse the person who most wants past them. The ethical screen (Model Rule 1.10), so a conflicted lawyer’s knowledge can’t reach the opposing matter team. The conflicts check that refuses an engagement before it opens. The privilege boundary, already live in United States v. Heppner, where a court held a defendant’s exchanges with a consumer AI assistant weren’t privileged — which we covered. Access controls and trust accounting guard the rest.

Drop a helpfulness-optimized agent into one of these and you have rebuilt the Instagram failure with a malpractice tail: an agent with cross-matter memory that “helpfully” surfaces a screened file is the wall talking back. The risk isn’t hypothetical — research we’ve cited before found 94.4% of tested LLM agents vulnerable to prompt injection.
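One way to keep a screen from "talking back" is to enforce it in plain code before any document reaches the agent. The sketch below assumes a hypothetical `SCREENS` table and a toy keyword search standing in for agent retrieval; the names and data are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    matter_id: str
    text: str


# Hypothetical screen table: lawyer id -> matters that lawyer is screened from.
SCREENS = {"lawyer_17": {"matter_acme_v_beta"}}


def visible_documents(lawyer_id, documents):
    """Deterministic filter applied before any document reaches the agent."""
    blocked = SCREENS.get(lawyer_id, set())
    return [d for d in documents if d.matter_id not in blocked]


def agent_search(lawyer_id, query, documents):
    """Stand-in for agent retrieval: it can only search what the filter let through."""
    allowed = visible_documents(lawyer_id, documents)
    return [d.doc_id for d in allowed if query.lower() in d.text.lower()]
```

The design choice is that the agent never sees the screened file at all, so no phrasing of the request can surface it. The filter has no notion of a persuasive query; it only compares identifiers.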

The Walls Assume a Human

In May 2026, federal prosecutors indicted thirty attorneys and financial professionals in a decade-long insider trading ring that pulled confidential M&A files off six top firms’ systems. The ringleader used his own credentials to reach the deal rooms. The access controls worked exactly as designed — they just assumed the keyholder was loyal. The scheme, said U.S. Attorney Leah Foley, exploited “the special access and ethical duties that come with a law license.”

Every wall assumes the insider behind it is a human with a conscience and a career to lose. An agentic AI tool wired into a firm endpoint is a new kind of insider: full credentials, no loyalty, and a documented tendency to do what it’s asked. The ring needed a decade and a network of trusted classmates; an over-permissioned agent collapses that to a single endpoint that anyone who reaches it can try to talk past.
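The opposite of an over-permissioned agent is one that never holds a person's full credentials. As a rough sketch, assume a hypothetical per-task grant limited to one matter, a named set of actions, and an expiry time; the `AgentGrant` type and `is_permitted` check below are illustrative, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentGrant:
    """Hypothetical per-task grant: one matter, named actions, an expiry time."""
    matter_id: str
    actions: frozenset
    expires_at: float


def is_permitted(grant, matter_id, action, now):
    """Deny by default; allow only the exact matter and action before expiry."""
    return (
        now < grant.expires_at
        and matter_id == grant.matter_id
        and action in grant.actions
    )
```

For example, a grant of `AgentGrant("matter_123", frozenset({"read", "summarize"}), expires_at=2000.0)` permits reading `matter_123` at time `1000.0` and refuses every other matter, any `"export"` action, and anything after expiry. Whoever reaches the endpoint inherits only that narrow grant, not a keyholder's access.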

You Can Agent the Work, Not the Walls

The line is not “AI is too risky for law.” Most legal work is delegable — drafting, summarizing, indexing depositions, the volume work where an error costs time, not access.

What can’t be handed to a persuadable agent are the gatekeeping functions: authentication, authorization, conflicts, ethical screens, privilege routing, the movement of client funds. Automating those is fine — a deterministic rule can’t be argued with; a system optimized to be agreeable can. The boundary isn’t how sensitive the task feels. It is a binary question: is this seat a gate? If it is, a persuadable agent doesn’t belong in it — because being convinced is the precise vulnerability the seat was built to eliminate.

Figure: A decision diagram dividing legal tasks into two columns. The delegable column (drafting, summarizing, document indexing, deadline tracking, research assembly) is marked as appropriate for agents; the walled-off column (authentication, access control, conflicts checks, ethical screens, privilege decisions, trust-account transfers) is marked off-limits. The dividing test is labeled: is this seat a gate?
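The binary test translates directly into a routing rule. In the sketch below the seat names come from the two columns described above; the routing targets and the fallback to human review for unlisted seats are assumptions added for illustration.

```python
# Seat names follow the two columns above; the routing itself is hypothetical.
DELEGABLE = {
    "drafting",
    "summarizing",
    "document_indexing",
    "deadline_tracking",
    "research_assembly",
}
GATES = {
    "authentication",
    "access_control",
    "conflicts_check",
    "ethical_screen",
    "privilege_decision",
    "trust_account_transfer",
}


def route(seat):
    """Apply the single test: is this seat a gate?"""
    if seat in GATES:
        return "deterministic_rule"  # never handed to a persuadable agent
    if seat in DELEGABLE:
        return "agent"
    return "human_review"  # assumed default: unlisted seats stay away from the agent
```

Note that `route` asks nothing about how sensitive a task feels. Membership in `GATES` is the whole decision, which is what makes it hard to argue with.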

What This Means in Practice

The defense is not a better-trained bot; it is a permissions decision — identify which seats are gates and keep agents out of them. A more capable agent at the wall is only a more capable attacker once someone reaches it. Every helpful capability granted at a gate is an instruction set for whoever gets there first.
