Chapter 14 · The Policy Stack — The Agentic Enterprise

A governance charter names who is responsible. A policy stack defines what is allowed. The distinction matters because the same agent can be deployed in dozens of different contexts across a large enterprise, and each context carries its own risk profile, its own regulatory obligations, and its own tolerance for autonomous action. The policy stack is the body of living documents that governs all of those contexts simultaneously — not by anticipating every scenario, but by establishing principles and decision rules that teams can apply when novel situations arise. Getting the stack right is less a legal exercise than an operational engineering problem.

Acceptable Use Policy

The foundation of the stack is an acceptable use policy (AUP) for agentic systems. It differs from a conventional AUP in two important ways. First, it must address the actions of the agent as well as the actions of the user: a policy that only constrains what humans type into a chat interface provides no guidance on what the agent is permitted to do with the tools it has been given. Second, it must address probabilistic outputs: unlike a database query that returns a definitive result, an agent's output is a best-effort inference, and the AUP must establish how that uncertainty should be disclosed to downstream consumers of the agent's work.

A well-drafted AUP specifies approved use cases (the classes of task for which the agent is sanctioned), prohibited use cases (tasks that are explicitly off-limits regardless of apparent capability), handling rules for sensitive data categories encountered unexpectedly, and disclosure requirements when an agent's output is incorporated into a communication, a decision, or a deliverable that reaches an external party. The AUP should be reviewed annually and whenever a significant new capability — a new tool integration, a new model version, or a new deployment context — is introduced.

Model Risk Management

Financial services regulators have required model risk management (MRM) for algorithmic decision systems since the Federal Reserve's SR 11-7 guidance in 2011. That framework — model inventory, validation, monitoring, and governance — applies to agentic AI with some important extensions. The model in an agentic system is often a third-party foundation model that the organization did not train and cannot fully audit. MRM for agents therefore requires a vendor risk component that tracks the model provider's change management practices, their evaluation methodology, and their contractual commitments around performance stability.

The validation challenge is particularly acute. Traditional MRM validates a model on a held-out test set and declares it fit for purpose within a defined input distribution. An agent operates across a far wider and less predictable input distribution — including adversarially crafted inputs from external actors — and its outputs include not just predictions but actions with real-world consequences. Model risk teams adapting to this environment are increasingly building behavioral test suites: structured collections of scenarios designed to probe the agent's behavior at the edges of its approved use cases, including scenarios that involve conflicting instructions, ambiguous data, and attempts at prompt injection.

Data Classification for Agents

Most enterprise data classification schemes were designed for static documents and database records. They answer the question: how sensitive is this data at rest? For agents, the more important question is: how sensitive is this data in motion through an inference pipeline? An agent that retrieves a customer record (classified as Confidential), summarizes it (output now contains Confidential data), and sends the summary to an external vendor has moved data across a classification boundary in a way that no DLP rule written for email or file transfer will catch.

The policy response is to extend classification labels downstream: the output of any operation on Confidential data inherits a Confidential label unless an explicit declassification rule applies. This sounds straightforward but requires tooling — specifically, a way to attach classification metadata to the agent's working context and propagate it through tool calls, so that the agent's output-handling logic can check whether it is permitted to pass that output to its next destination. Several enterprises are implementing this as a context tagging system: a lightweight key-value store attached to the agent's session that accumulates classification labels as the agent retrieves data and is consulted before any outbound action.

"The data classification problem for agents is not a new problem; it is an old problem running at a speed that existing controls were not designed to match."

Third-Party AI Policy

Most agentic deployments rely on third-party components at multiple layers: a foundation model from a hyperscaler or frontier lab, an orchestration framework from an open-source project or commercial vendor, and various tool integrations that reach into external services. Each of these introduces a supply-chain risk that a third-party AI policy must address. The policy should require a minimum security baseline for any AI component integrated into a production agent — covering at minimum: data processing agreements, incident notification timelines, model change notification, and access to evaluation documentation sufficient to meet the organization's MRM requirements.

The emergence of the Model Context Protocol and similar standards is changing the third-party AI landscape: tool integrations that once required bespoke development can now be sourced from a growing ecosystem of pre-built MCP servers. This lowers integration cost dramatically, but it also means that the policy must address a new class of third-party component — the MCP server — whose security properties and data handling practices vary enormously. A third-party AI policy that was written before MCP existed will almost certainly need to be updated.

Incident Response for Agents

The incident response plan for an agentic system shares the general structure of a cybersecurity incident response plan — detect, contain, eradicate, recover, post-incident review — but requires agent-specific playbooks for the failure modes that are unique to autonomous systems. Three of these are particularly important. First, the runaway loop: an agent enters a state where it repeatedly calls a tool or produces a sequence of actions that is not converging. The playbook must include a mechanism for hard-stopping the agent without losing the state needed for post-incident analysis. Second, the unauthorized data exfiltration: an agent passes data to an unintended destination, either through a misconfigured tool or through a prompt injection attack. The playbook must specify who is notified, in what timeframe, and what forensic preservation steps are required. Third, the model substitution: a model provider silently updates the underlying model, and the agent's behavior changes in ways that violate its decision authority matrix. The playbook must specify how behavioral regression testing is triggered and what the threshold is for rolling back to a prior model version.