Building an Agentic Enterprise
A long-form, practitioner-grade report on deploying agentic AI inside enterprises — process mapping, NIST AI RMF governance, orchestration architecture, real case studies, risk mitigation, and ROI. Honest about what works and what doesn't.
Foundations — the agent itself
Seven chapters on what an agent actually is, why workflows are not agents, the five honest questions to ask before building one, and the reference stack you'll keep coming back to.
The Quiet Test
An honest opening: the gap between agentic AI on stage and agentic AI in production, and the small test that separates the two.
From Workflow to Agent
The honest mechanics of process mapping for agentic systems — using SIPOC and value-stream techniques to decide what becomes an agent and what stays an automation.
The Five Honest Questions
Five gating questions every agent project should answer before any code is written: value, judgement, reversibility, data, ownership.
Anatomy of an Agent
The minimal anatomy of an agent — model, prompt, tools, memory, guardrails, observability — and what each one is actually for.
The Reference Stack
A six-layer reference architecture for enterprise agents — what each layer is, who builds it, and why none can be safely skipped.
Memory, in Plain English
An honest tour of agent memory — short-term, long-term, episodic, semantic — with the governance requirements and failure modes of each.
Tools and Protocols
The protocols binding agents to tools and to each other — function calling, MCP, A2A — and the engineering tradeoffs each implies.
Governance — the spine
Seven chapters on NIST AI RMF as a working spine, the Agent Registry that turns shadow agents into governed ones, the risks that actually bite, and human-in-the-loop done honestly.
Why Governance Comes First
Governance is not the document you write after launch. It is the spine that makes launch survivable.
Mapping What an Agent Actually Does
The Map function of the NIST AI RMF, applied: what context, authority, and consequence each agent carries, written down before it ships.
Measure: Evals That Actually Catch Things
What a real eval suite looks like for an agent — golden sets, online metrics, drift detection, and the discipline of noticing degradation before users do.
Manage: Incidents, Rollback, the Off-Switch
Incident response, rollback procedures, and kill-switches for agentic systems — the Manage function as an operational discipline.
The Agent Registry
What an agent registry contains, why it is the single highest-leverage piece of governance infrastructure, and how to build one that does not rot.
The Risks That Bite
The risks that most often produce real incidents in agentic deployments — prompt injection (direct and indirect), excessive agency, sensitive information disclosure, and the agent IAM problem.
Human-in-the-Loop, Done Honestly
Where humans add value, where they merely rubber-stamp, and how to design checkpoints that catch real failures rather than create paperwork.
Deployment — ROI — the flywheel
Seven chapters on three real case studies (Klarna, JPMorgan, McDonald's), the platform map, the honest ROI arithmetic, the flywheel that compounds wins, and a 30-60-90 plan you can start running on Monday.
Klarna: The Most Honest Story on the Public Record
Klarna's customer-service agent has been the field's most-cited deployment, the most-misread, and — read carefully — the most useful template.
JPMorgan: LLM Suite and the Compliance-First Path
JPMorgan's LLM Suite is the most-deployed enterprise generative AI platform of 2025 — and the lesson is not the scale, but the order of operations.
McDonald's: How to Pull the Plug Without Embarrassment
McDonald's 2024 rollback of its IBM Automated Order Taking (AOT) pilot is widely cited as an enterprise AI failure. Read carefully, it is the field's best example of how to run a pilot.
The Platform Map
An honest tour of the major no-code, low-code, and code-first agent platforms — what each is good at, what the lock-in costs, and which to choose when.
ROI: The Cost Stack and the Honest Arithmetic
An honest ROI model for agentic AI — the full TCO stack, the value-side metrics that hold up, and the sensitivity variables that change the answer.
The Flywheel
The data flywheel — structured logging, feedback capture, eval pipeline, improvement cycle, deployment CI/CD — is the difference between agents that compound and agents that plateau.
A 30-60-90 Plan
An honest, sequenced 30-60-90 plan for an enterprise agent program — the artefacts, decisions, and small wins to land in each window.
A note before you begin
This is not a hype document. Every claim is anchored to a primary source. Where the public record is messy, the report is honest about the mess. Where wins are real, they are celebrated; where risks bite, they are named. Read the three parts in order for the full arc, or jump in wherever your problem lives. The last chapter is a 30-60-90 plan you can take to a steering committee on Monday.