
You Can Agent the Work, Not the Walls

Author
LegalRealist AI
Trust Boundaries - This article is part of a series.
Part 1: This Article

TL;DR

Over the last weekend in May, Instagram accounts began getting hijacked, and a video on X laid out the method. The attacker used a VPN to spoof the victim’s location and slip past Instagram’s automated protections, opened the Meta AI Support Assistant, and asked it to add a new email to the target’s account. The bot sent a verification code to that attacker-supplied address; the attacker pasted it back, and the bot surfaced a “Reset Password” button. New password, new owner.

Victims ranged from the dormant Obama-era White House handle to U.S. Space Force chief master sergeant John Bentivegna. The intruder never needed the victim’s real email: TechCrunch confirmed the code landed in the attacker’s own inbox. Instagram said Monday the hole was fixed; the number of accounts compromised is unknown.

It reads like a clever trick. It is closer to a category error — an agent placed where the job is gating, not helping. The same mistake is available to any law firm handing agents more authority.

Authentication Is a Wall, Not a Conversation

Account recovery exists to prove one thing: that you control a credential or channel already on file. A half-century of work since Diffie and Hellman’s 1976 “New Directions in Cryptography” converges on a single design goal — make access depend on possession, never on persuasion. The verification code is a small cryptographic primitive; its only meaning is “the holder controls the channel it was sent to.” That is the reason authentication works at all. A wall you can argue with is not a wall.

The Agent Turned the Wall Into a Conversation

Meta seated a persuadable system at exactly the control point engineered to be unpersuadable. The bot let the attacker pick the channel the code went to, then accepted possession of that code as proof of ownership, though it proved only that the attacker controlled their own inbox. The cryptographic primitive worked perfectly; what failed was the wrapper that could be talked into pointing it at the wrong target.

This was not prompt injection — no smuggled instructions, no jailbreak. The attacker simply asked, and a large language model tuned to resolve requests did what it was built to do. The helpfulness wasn’t a bug; it was the feature the bot was deployed for. An assistant optimized to say yes is the wrong occupant for a seat whose function is to gate, verify, and authenticate.

[Figure: Two account-recovery flows side by side. The legitimate path sends a verification code to the email already on file, so possessing the code proves ownership. The exploited path lets the attacker supply the destination email, so the same code proves only that the attacker controls their own inbox, and the verification gate is severed from the thing it is supposed to prove.]
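The difference between the two flows can be sketched in a few lines of Python. This is a minimal illustration, not a description of how Instagram or Meta implement recovery: `Account`, `RecoveryService`, and the in-memory `outbox` are hypothetical names invented for this example. The only point it makes is structural: the destination of the code is looked up from the record on file, and the request has no parameter through which a caller could supply a different one.

```python
import hmac
import secrets
from dataclasses import dataclass, field


@dataclass
class Account:
    username: str
    email_on_file: str


@dataclass
class RecoveryService:
    """Hypothetical recovery flow; names and storage are illustrative only."""

    accounts: dict[str, Account]
    # Stand-in for a mail system: address -> last code delivered there.
    outbox: dict[str, str] = field(default_factory=dict)
    # Codes awaiting confirmation, keyed by username.
    pending: dict[str, str] = field(default_factory=dict)

    def start_recovery(self, username: str) -> None:
        # The destination comes from the stored record, never from the caller.
        account = self.accounts[username]
        code = f"{secrets.randbelow(10**6):06d}"
        self.pending[username] = code
        self.outbox[account.email_on_file] = code

    def finish_recovery(self, username: str, submitted_code: str) -> bool:
        # Each code is single-use: it is removed on the first attempt.
        expected = self.pending.pop(username, None)
        if expected is None:
            return False
        # Constant-time comparison of the two codes.
        return hmac.compare_digest(expected.encode(), submitted_code.encode())
```

Because `start_recovery` takes only a username, there is nothing for a conversational front end to be talked into: an assistant layered on top could explain the process, but it could not redirect the code.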

The old defense against this kind of attack, talking your way past the help desk, was a trained, skeptical human who escalated when a request felt wrong. Replace that desk with a friendly bot and you delete the skeptic but keep the door.

Legal Work Runs on the Same Walls

A law firm runs on the same kind of wall — gates built to refuse the person who most wants past them. The ethical screen (Model Rule 1.10), so a conflicted lawyer’s knowledge can’t reach the opposing matter team. The conflicts check that refuses an engagement before it opens. The privilege boundary, already live in United States v. Heppner, where a court held a defendant’s exchanges with a consumer AI assistant weren’t privileged — which we covered. Access controls and trust accounting guard the rest.

Drop a helpfulness-optimized agent into one of these and you have rebuilt the Instagram failure with a malpractice tail: an agent with cross-matter memory that “helpfully” surfaces a screened file is the wall talking back. The risk isn’t hypothetical — research we’ve cited before found 94.4% of tested LLM agents vulnerable to prompt injection.
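One way to keep an ethical screen from “talking back” is to enforce it in code that sits beneath the agent and never sees the conversation. The sketch below assumes a hypothetical document store and hypothetical identifiers (`EthicalScreen`, `fetch_for_agent`, `ScreenViolation`); it is an illustration of the pattern, not any firm’s or vendor’s actual system.

```python
from dataclasses import dataclass


class ScreenViolation(PermissionError):
    """Raised when a request would cross an ethical screen."""


@dataclass(frozen=True)
class EthicalScreen:
    # (lawyer_id, matter_id) pairs that are screened off from each other.
    screened: frozenset[tuple[str, str]]

    def check(self, lawyer_id: str, matter_id: str) -> None:
        if (lawyer_id, matter_id) in self.screened:
            raise ScreenViolation(f"{lawyer_id} is screened from {matter_id}")


def fetch_for_agent(
    screen: EthicalScreen,
    documents: dict[str, list[str]],
    lawyer_id: str,
    matter_id: str,
    justification: str = "",
) -> list[str]:
    # The justification is accepted but deliberately never consulted:
    # no wording of the request can change the outcome of the check.
    screen.check(lawyer_id, matter_id)
    return documents.get(matter_id, [])
```

The agent can still summarize or index whatever `fetch_for_agent` returns. What it cannot do is decide, or be persuaded, that a screened lawyer should see a screened matter.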

The Walls Assume a Human

In May 2026, federal prosecutors indicted thirty attorneys and financial professionals in a decade-long insider trading ring that pulled confidential M&A files off six top firms’ systems. The ringleader used his own credentials to reach the deal rooms. The access controls worked exactly as designed — they just assumed the keyholder was loyal. The scheme, said U.S. Attorney Leah Foley, exploited “the special access and ethical duties that come with a law license.”

Every wall assumes the insider behind it is a human with a conscience and a career to lose. An agentic AI tool wired into a firm endpoint is a new kind of insider: full credentials, no loyalty, and a documented tendency to do what it’s asked. The ring needed a decade and a network of trusted classmates; an over-permissioned agent collapses that to a single endpoint that anyone who reaches it can try to talk past.

You Can Agent the Work, Not the Walls

The line is not “AI is too risky for law.” Most legal work is delegable — drafting, summarizing, indexing depositions, the volume work where an error costs time, not access.

What can’t be handed to a persuadable agent are the gatekeeping functions: authentication, authorization, conflicts, ethical screens, privilege routing, the movement of client funds. Automating those is fine — a deterministic rule can’t be argued with; a system optimized to be agreeable can. The boundary isn’t how sensitive the task feels. It is a binary question: is this seat a gate? If it is, a persuadable agent doesn’t belong in it — because being convinced is the precise vulnerability the seat was built to eliminate.

[Figure: A decision diagram dividing legal tasks into two columns. The delegable column (drafting, summarizing, document indexing, deadline tracking, research assembly) is marked as appropriate for agents. The walled-off column (authentication, access control, conflicts checks, ethical screens, privilege decisions, trust-account transfers) is marked off-limits. The dividing test is labeled: is this seat a gate?]
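The “is this seat a gate?” test can be written down as a fixed lookup rather than left to an agent’s judgment. The sketch below uses the task names from the two columns above; the `Seat` enum, the `SEATS` table, and the choice to treat any unlisted task as a gate are assumptions made for illustration.

```python
from enum import Enum


class Seat(Enum):
    WORK = "work"  # delegable to an agent
    GATE = "gate"  # handled only by deterministic rules or a human


# Classification taken from the two columns described in the article.
SEATS: dict[str, Seat] = {
    "drafting": Seat.WORK,
    "summarizing": Seat.WORK,
    "document indexing": Seat.WORK,
    "deadline tracking": Seat.WORK,
    "research assembly": Seat.WORK,
    "authentication": Seat.GATE,
    "access control": Seat.GATE,
    "conflicts checks": Seat.GATE,
    "ethical screens": Seat.GATE,
    "privilege decisions": Seat.GATE,
    "trust-account transfers": Seat.GATE,
}


def agent_may_perform(task: str) -> bool:
    # Assumption: anything not explicitly classified is treated as a gate,
    # so a new capability is denied until someone classifies it.
    return SEATS.get(task, Seat.GATE) is Seat.WORK
```

The design choice worth noting is the default: an allow-list of delegable work fails closed, so a capability added later stays behind the wall until a person classifies it.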

What This Means in Practice

The defense is not a better-trained bot; it is a permissions decision — identify which seats are gates and keep agents out of them. A more capable agent at the wall is only a more capable attacker once someone reaches it. Every helpful capability granted at a gate is an instruction set for whoever gets there first.


This post is part of the Trust Boundaries series on LegalRealist AI. It is intended for informational and educational purposes only and does not constitute legal advice. AI capabilities, security incidents, and product features described here reflect publicly available information as of the publication date and are subject to rapid change. The ethics rules referenced are the ABA Model Rules; adopted rules and their interpretation vary by jurisdiction.
