Glossary

Definitions of AI and legal technology terms used throughout LegalRealist. Terms are linked from posts at first mention. The glossary updates automatically as new posts introduce new concepts.

A

Application Layer

The custom software a vendor builds on top of a foundation model — including prompts, retrieval pipelines, fine-tuning, user interface, and workflow logic. Most “proprietary AI” in legal tech is application-layer work; the foundation model itself is licensed from OpenAI, Anthropic, Google, or another lab.

First introduced in The Foundation.

C

Context Window

The maximum number of tokens a model can process in a single prompt — its working memory. A 200K-token window holds a lengthy contract and exhibits; 1M–2M token windows can ingest entire deal rooms. If a document exceeds the window, the model can’t reference earlier content when analyzing later content.

First introduced in The Foundation.
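
A rough sketch of the arithmetic, for the curious. The four-characters-per-token ratio is an approximation (see Token, below), and the window sizes and the stand-in contract are illustrative, not any particular model's real limits.

```python
# Rough sketch: will this document fit in the model's context window?
# The 4-characters-per-token ratio is an approximation; real tokenizers vary.

def estimate_tokens(text: str) -> int:
    return len(text) // 4  # ~4 English characters per token, on average

def fits_in_window(text: str, window: int = 200_000, output_budget: int = 4_000) -> bool:
    # Leave room for the model's own response, which shares the same window.
    return estimate_tokens(text) + output_budget <= window

contract = "x" * 750_000  # stand-in for a very long contract, roughly 187,500 tokens
print(fits_in_window(contract))                  # True: fits a 200K-token window
print(fits_in_window(contract, window=128_000))  # False: would need chunking or retrieval
```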

F

Fine-Tuning

The process of further training a pre-trained foundation model on a specific dataset to specialize it for a task or domain. Distinct from prompt engineering (which changes inputs) and retrieval (which changes what context the model sees). Most legal AI tools rely more heavily on retrieval than fine-tuning because legal content changes faster than fine-tuning cycles.

First introduced in The Foundation.
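
To make the distinction concrete: fine-tuning means supplying many labeled examples that adjust the model's weights, while prompting and retrieval only change the text the model sees at request time. The sketch below shows what such training data might look like in the chat-style JSONL format common among API providers; the schema and the clause example are illustrative, not any specific lab's requirements.

```python
import json

# Fine-tuning data: many input/output pairs used to adjust the model's weights.
# The chat-message schema here is illustrative; exact formats vary by provider.
training_examples = [
    {"messages": [
        {"role": "user", "content": "Classify this clause: 'Either party may terminate on 30 days' notice.'"},
        {"role": "assistant", "content": "Termination for convenience"},
    ]},
    # ...thousands more examples in the same shape...
]

with open("clause_classifier_training.jsonl", "w") as f:
    for example in training_examples:
        f.write(json.dumps(example) + "\n")

# Prompt engineering and retrieval, by contrast, leave the weights untouched
# and only change the text sent with each individual request.
```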

Foundation Model

A large, general-purpose model trained on broad data that can be adapted to many downstream tasks through fine-tuning, prompting, or retrieval. Examples: OpenAI’s GPT family, Anthropic’s Claude family, Google’s Gemini family, Meta’s Llama family.

First introduced in The Foundation.

Frontier Lab

A company building the most capable foundation models. As of 2026, the major frontier labs are OpenAI, Anthropic, Google DeepMind, Meta, xAI, and DeepSeek. Training a frontier model costs $100M+ and requires thousands of specialized processors.

First introduced in The Foundation.

G

Generative AI (GenAI)

AI systems that produce new content — text, images, audio, or code — rather than just classifying or predicting. The “G” in GPT.

GPT (Generative Pre-trained Transformer)

OpenAI’s family of foundation models, named for the architecture (transformer) and training approach (generative pre-training). The term is sometimes used loosely to refer to any large language model, but technically refers only to OpenAI’s models.

H

Hallucination

When a language model generates content that is fluent and plausible but factually wrong — citing a nonexistent case, misstating a holding, fabricating a statute. Not a bug; a structural feature of probabilistic text generation. Can be reduced through retrieval and verification but not eliminated.

First introduced in The Fundamental Limits.

L

LLM (Large Language Model)

An AI system trained on large volumes of text to predict and generate language. Modern LLMs are built on transformer architectures and trained on hundreds of billions to trillions of words. The technology underneath nearly every legal AI tool.

First introduced in The Foundation.

O

Open-Weight Model

A foundation model whose parameters (weights) are publicly released, allowing self-hosting and fine-tuning. Examples: Meta’s Llama, DeepSeek’s R1, Mistral’s models. Distinct from “open-source” in the strict sense, which would require training data and code to also be released. Allows firms to process documents without sending them to a third-party API.

First introduced in The Foundation.

P

Prompt Engineering

The practice of designing inputs to a language model to produce useful outputs. Includes structuring instructions, providing examples, and chaining multiple prompts together. The least technical layer of LLM application development; often the highest-leverage one.
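
A minimal sketch of what those practices look like in code. complete() is a placeholder for whatever model API a tool actually calls, and the clause and instructions are invented for illustration.

```python
# Sketch of three prompt-engineering moves: structured instructions, a worked
# example (few-shot prompting), and chaining one prompt's output into the next.

def complete(prompt: str) -> str:
    # Placeholder for a real model API call; returns a canned answer so the
    # sketch runs end to end.
    return "Limitation of liability"

clause = "Vendor's liability shall not exceed fees paid in the prior 12 months."

# 1. Structured instructions plus an example of the desired output format.
classify_prompt = f"""You are reviewing a commercial contract.
Classify the clause below, answering in the format shown.

Example:
Clause: "This Agreement is governed by the laws of Delaware."
Category: Governing law

Clause: "{clause}"
Category:"""

category = complete(classify_prompt)

# 2. Chaining: feed the first prompt's answer into a follow-up prompt.
explain_prompt = f"In two sentences, explain why this is a {category} clause: {clause}"
explanation = complete(explain_prompt)
```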

R

RAG (Retrieval-Augmented Generation)

A technique where a system first retrieves relevant documents from a verified database, then provides them to the language model as context for generation. Used by virtually every serious legal AI tool to ground outputs in real sources rather than relying on the model’s training data alone. Reduces hallucination but does not eliminate it.

First introduced in The Fundamental Limits.
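
In outline, the retrieve-then-generate loop looks like the sketch below. The search function, the passages, and the model call are all placeholders, not any particular vendor's implementation.

```python
# Sketch of retrieval-augmented generation: search a trusted source first, then
# hand the retrieved passages to the model as the context for its answer.
# search_case_law() and complete() are placeholders for real components.

def search_case_law(query: str, top_k: int = 5) -> list[str]:
    # Placeholder for a search over a verified database of cases and statutes.
    return ["(retrieved passage 1)", "(retrieved passage 2)"][:top_k]

def complete(prompt: str) -> str:
    # Placeholder for a language model API call.
    return "(answer grounded in the sources above)"

def answer_with_rag(question: str) -> str:
    passages = search_case_law(question)
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the sources below. "
        "If the sources do not answer it, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return complete(prompt)

print(answer_with_rag("Does the limitation of liability survive termination?"))
```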

T

Token

A subword unit that language models process — roughly equal to four English characters or three-quarters of a word. APIs charge separately for input tokens (what you send) and output tokens (what the model generates), with output typically costing 3–10x more.

First introduced in The Foundation.
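
Back-of-the-envelope arithmetic built on those rules of thumb. The per-token prices below are placeholders, since actual rates vary by model and change often.

```python
# Back-of-the-envelope token arithmetic. The 4-characters-per-token estimate is
# a rule of thumb, and the prices below are placeholders, not current rates.

def estimate_tokens(text: str) -> int:
    return len(text) // 4   # ~4 English characters per token

input_price_per_1k = 0.003    # placeholder: dollars per 1,000 input tokens
output_price_per_1k = 0.015   # placeholder: output commonly costs several times more

brief = "x" * 150_000                  # stand-in for a ~150,000-character brief
input_tokens = estimate_tokens(brief)  # ~37,500 tokens
output_tokens = 2_000                  # a summary of roughly 1,500 words back

cost = (input_tokens / 1_000) * input_price_per_1k + (output_tokens / 1_000) * output_price_per_1k
print(f"{input_tokens:,} input tokens, {output_tokens:,} output tokens, about ${cost:.2f} at placeholder rates")
```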

Transformer

The neural network architecture introduced in the 2017 Google paper “Attention Is All You Need,” underlying nearly every modern language model. Built around “self-attention,” which lets the model weigh how every word in a passage relates to every other word, regardless of distance.

First introduced in The Foundation.
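
For readers who want to see the mechanism, scaled dot-product self-attention fits in a few lines of numpy. This is a bare-bones version of the operation described in the 2017 paper, without the learned projections and multiple attention heads real models add.

```python
import numpy as np

# Scaled dot-product self-attention, the core operation of the transformer.
# Each row of x represents one token; the output row for a token is a mix of
# every token's information, weighted by how strongly the tokens relate.

def self_attention(x: np.ndarray) -> np.ndarray:
    # Real models derive queries, keys, and values from learned projections of x
    # and run many attention "heads" in parallel; this sketch skips both.
    q, k, v = x, x, x
    d_k = x.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)            # pairwise relatedness of all tokens
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ v                         # each output row attends to every input row

tokens = np.random.randn(6, 8)                 # 6 tokens, each an 8-dimensional vector
print(self_attention(tokens).shape)            # (6, 8): same shape, now context-aware
```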

Z

Zero-Retention Policy

A commitment from an AI provider that customer inputs are not stored after processing and not used to train future models. Standard for paid API access at major frontier labs; sometimes available only at an enterprise tier. Distinct from the default terms of consumer chat products, which often retain data.

First introduced in The Foundation.