Agentic AI does not face a new kind of adversary. It faces familiar adversaries — financially motivated criminals, nation-state actors, opportunistic hackers — who have learned to exploit a new attack surface with significant speed. The tools of classical cybersecurity apply: threat modeling, layered defense, incident response. What is new is the specific texture of the vulnerabilities that agents introduce, and the speed at which adversaries have developed techniques to exploit them. Several documented attack campaigns against production agentic systems predate the governance frameworks designed to address them. Understanding the threat landscape, in its current specificity, is the prerequisite for building a defense that is more than decorative.
The OWASP Taxonomy
The most comprehensive published taxonomy of agentic AI threats is the OWASP Agentic Security Initiative's Threats and Mitigations document, version 1.0, published April 2025. It identifies fifteen threat categories purpose-built for autonomous agent systems, organized by attack surface and impact type. The OWASP LLM Top 10 v2025 provides the broader LLM application context; the ASI taxonomy provides the agentic-specific extension.
The most operationally significant categories for enterprise deployments are: Memory Poisoning — injecting false data into an agent's persistent memory store to influence future behavior across sessions; Tool Misuse — manipulating an agent through its tool inputs to abuse tool access for unauthorized purposes; Privilege Compromise — exploiting weaknesses in the agent's permission management to acquire capabilities beyond its intended scope; Cascading Hallucinations — exploiting error propagation in multi-agent pipelines to amplify a small initial manipulation into a large downstream impact; and Agent Communication Poisoning — manipulating the messages passed between agents in a multi-agent system.
Less discussed but increasingly relevant is Human Manipulation: using an agent that has established implicit trust with a user to coerce or deceive that user in ways that the agent's operators did not intend and cannot easily detect. An agent that has managed a user's calendar and email for several months has accumulated significant implicit trust. An adversary who can influence that agent's behavior — through memory poisoning or prompt injection — can leverage that trust for social engineering at scale.
MITRE ATLAS
The MITRE ATLAS knowledge base — Adversarial Threat Landscape for AI Systems — provides the ATT&CK-style structured taxonomy that security teams need to build threat models and red team exercises. As of 2025, ATLAS documents 16 tactics, 167 techniques, and 57 case studies of real-world AI attacks. The case studies are particularly valuable: they ground abstract techniques in documented incidents, making the threat model actionable rather than theoretical.
ATLAS coverage of agentic AI attack surfaces is actively expanding. The most relevant technique groups for enterprise agentic deployments include: LLM Prompt Injection (AML.T0051) and its sub-techniques covering direct, indirect, and stored injection; Backdoor ML Model (AML.T0020), relevant for supply chain attacks against third-party model providers; and Context Manipulation techniques that exploit the agent's trust in retrieved content. ATLAS integrates with OWASP LLM Top 10 via cross-referencing technique identifiers, enabling security teams that are familiar with one framework to orient in the other.
The MITRE AI Incident Sharing Initiative, launched alongside ATLAS, allows organizations to report and share AI security incidents in a standardized format. Early-stage enterprise AI programs have a strong interest in reading these reports — they represent some of the most specific threat intelligence available on real-world agentic attacks — and a long-term interest in contributing, both to build collective defense and to receive curated intelligence in return.
Supply Chain Threats
An agent is only as trustworthy as every component in its supply chain. The supply chain for a modern enterprise agent includes: the foundation model (from a vendor), the orchestration framework (typically open-source), the MCP servers providing tool access (from a mix of vendors and internal developers), the vector store and retrieval system (from a vendor), the evaluation framework (open-source or proprietary), and the underlying cloud infrastructure. Each component is a potential attack surface, and the attack surface of the assembled system is substantially larger than the sum of its parts.
Malicious MCP servers — packages that present as legitimate tool providers but exfiltrate data, inject instructions into tool responses, or manipulate the agent's tool call parameters — represent an emerging supply chain threat that has no direct analogue in conventional software. An organization that deploys an agent with an MCP server from a third party without auditing that server's behavior is accepting a supply chain risk that may be invisible until the attack has already succeeded. The practice of "MCP security review" — auditing third-party MCP server implementations before deploying them in production — is a nascent but necessary discipline.
Model provider supply chain risk is a different but equally real concern. Enterprises that deploy agents built on third-party foundation models are dependent on those providers' security practices for the integrity of the model's behavior. Vendor Responsible Scaling Policies, such as Anthropic's RSP, provide some visibility into provider security practices — they specify the testing and safety evaluations conducted before model deployment — but they do not eliminate the dependency. Organizations deploying agents in high-risk domains should require vendor security attestations and, where possible, evaluate model behavior against known adversarial test cases before production deployment.
The Adversary Advantage
One of the less comfortable facts in the agentic threat landscape is that adversaries have, in several documented cases, developed and deployed attack techniques against agentic systems before the defending organizations had completed their threat models. This is not unusual in cybersecurity — the attacker advantage in time-to-exploit is well-documented — but the agentic surface presents some specific characteristics that accelerate adversary learning.
Agents are, at least partially, transparent about their capabilities: they tell users what tools they have, what they can access, and how they process requests. This transparency, designed to build user trust, also builds adversary knowledge. An attacker who can interact with an agent has a detailed map of its attack surface. The techniques for exploiting that surface — particularly prompt injection and tool misuse — are well-documented in open literature and require no specialized capabilities to attempt. The barrier to entry for agentic AI attacks is lower than for most classes of cyberattack.
Defense in Depth
The security architecture for enterprise agentic systems should follow the same defense-in-depth principles that govern conventional cybersecurity, with layers adapted to the specific threat surface. At the input layer: prompt injection detection, content classification for retrieved documents, and explicit validation of tool call parameters. At the execution layer: least-privilege tool access, rate limiting, and behavioral anomaly detection. At the output layer: output sandboxing, post-processing validation, and human review for high-consequence actions. At the infrastructure layer: secure MCP server deployment, model provider attestations, and audit logging with integrity guarantees.
No single control is sufficient. The depth is the point: an attacker who defeats one layer should encounter the next. The specific controls most important for a given deployment depend on the threat model — which adversaries are realistic, which attack vectors are accessible, which assets are most valuable to protect — and that threat model should be built explicitly, using OWASP ASI and MITRE ATLAS as structured inputs, before deployment rather than after the first incident.
"The agentic attack surface is not theoretical. The techniques in OWASP ASI and MITRE ATLAS are documented from real incidents against real production systems. The organizations that treat these as research documents rather than operational threat intelligence are making a choice about risk that they may not have made explicitly."