The ninety-day plan is one of the most abused instruments in management. Executives produce such plans for onboarding, for strategy reviews, and for board decks, and then file them. What makes a ninety-day plan for an agentic AI programme different is that it is not a communication document; it is a commitment device. It names specific deliverables, assigns them to named owners, and creates accountability to a board or steering committee that can verify completion. Done right, ninety days is enough time to establish the institutional foundations that will determine whether the next three years of the programme succeed or fail.
Days 1–30: Orient, inventory, charter
The first thirty days are not for building. They are for learning what already exists and establishing the authority to govern it. Three deliverables are non-negotiable. First, the programme charter: a two-page document, signed by the most senior executive sponsor available, that names the governance owner, defines the scope of the programme (which agent deployments are in scope, which business units, which risk categories), and commits the organisation to the Scorecard baseline and twelve-month roadmap. A charter without a named governance owner is an aspiration, not a commitment.
Second, the NIST AI RMF gap assessment: a structured review of the organisation's current AI risk management practice against the four functions of the NIST AI Risk Management Framework (Govern, Map, Measure, and Manage). The gap assessment does not need to be exhaustive; it needs to be honest. Which of the four functions does the organisation currently perform, at what level of maturity, and what is missing? This document becomes the governance backlog that the programme will work down over the following sixty days.
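One way to keep the gap assessment honest is to record it as structured data rather than prose. The sketch below is illustrative only: the `FunctionAssessment` record, the 0-4 maturity scale, and the `build_backlog` helper are assumptions made for this example and are not defined by NIST or by the Scorecard Method.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The four functions of the NIST AI RMF, as named in the text.
FUNCTIONS = ("Govern", "Map", "Measure", "Manage")


@dataclass
class FunctionAssessment:
    function: str                  # one of FUNCTIONS
    performed: bool                # is the function performed at all today?
    maturity: int                  # hypothetical 0-4 scale, an assumption of this sketch
    gaps: List[str] = field(default_factory=list)  # what is missing, in plain language


def build_backlog(assessments: List[FunctionAssessment]) -> List[Tuple[str, str]]:
    """Flatten the assessment into a backlog, least mature functions first."""
    covered = {a.function for a in assessments}
    missing = [f for f in FUNCTIONS if f not in covered]
    if missing:
        # An assessment that silently skips a function is not an honest one.
        raise ValueError("No assessment recorded for: " + ", ".join(missing))
    ordered = sorted(assessments, key=lambda a: a.maturity)  # stable sort
    return [(a.function, gap) for a in ordered for gap in a.gaps]


# Example entries are invented for illustration.
assessment = [
    FunctionAssessment("Govern", True, 1, ["No named governance owner"]),
    FunctionAssessment("Map", False, 0, ["No agent inventory", "No data classification"]),
    FunctionAssessment("Measure", True, 2),
    FunctionAssessment("Manage", True, 1, ["No incident process"]),
]
backlog = build_backlog(assessment)
```

Refusing to build a backlog when a function has no entry is a deliberate choice here: it forces the team to write down "not performed" instead of leaving a blank.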
Third, the governance owner named and briefed: a specific individual, with the authority and the budget to make decisions, who accepts accountability for the programme's governance outcomes. Without this, every subsequent decision becomes a negotiation between teams that each have partial authority. The governance owner is not the most senior AI person in the organisation; they are the person most capable of saying no to a deployment that fails the risk bar, and being heard when they do.
"The first thirty days should produce three sheets of paper and one conversation — the conversation where someone accepts accountability." — adapted from MIT CISR digital transformation research.
Days 31–60: Baseline, scorecard, use-case scoping
With a charter and a governance owner in place, the programme can now make its first honest assessment. The pillar-by-pillar baseline is produced using the Scorecard Method from Chapter 30: a half-day facilitated session per pillar, producing a scored radar and an evidence log. This baseline is the contract against which progress will be measured. Every organisation discovers something uncomfortable in this phase; the value is in the discovery, not in the score.
The baseline feeds directly into the second deliverable of this phase: scoping the first three use cases. Scoping here does not mean selecting which three agents to build; it means deciding which three agent opportunities the organisation is prepared to govern, instrument, and evaluate. Each use case must have a named business owner, a stated success metric, a defined eval set (even a small one), an identified data source with its classification, and a kill criterion, meaning the specific condition under which the pilot will be terminated. Use cases without kill criteria are not pilots; they are commitments.
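The scoping requirements above can be treated as a completeness check on each use-case record. The following is a minimal sketch; the `UseCase` class, its field names, and the sample values are hypothetical and chosen only to mirror the list in the text.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class UseCase:
    name: str
    business_owner: str = ""
    success_metric: str = ""
    eval_set: str = ""              # label or location of the eval set, however small
    data_source: str = ""
    data_classification: str = ""
    kill_criterion: str = ""        # condition under which the pilot is terminated


def missing_fields(use_case: UseCase) -> List[str]:
    """Return the names of required fields that are blank."""
    return [f.name for f in fields(use_case)
            if not getattr(use_case, f.name).strip()]


def is_scoped(use_case: UseCase) -> bool:
    """A use case counts as scoped only when nothing is missing."""
    return not missing_fields(use_case)


# Sample values are invented for illustration.
candidate = UseCase(
    name="Invoice triage",
    business_owner="A. Rivera",
    success_metric="Median handling time under 2 minutes",
    eval_set="50 labelled invoices",
    data_source="Accounts payable inbox",
    data_classification="Internal",
)
gaps = missing_fields(candidate)   # kill_criterion has not been written yet
```

A use case that returns any missing field, and a missing kill criterion in particular, would stay off the scoped list until the gap is filled.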
The third deliverable of this phase is the scorecard, run as a shared document, distributed to the steering committee, and reviewed in a sixty-day check-in. The check-in is not a progress presentation; it is a gap-closure negotiation. For each pillar where the score is below the target, the governance owner must present the specific action that will close the gap by day ninety and name the person responsible for it.
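The agenda for the sixty-day check-in can be derived mechanically from the scorecard. The sketch below assumes integer pillar scores and targets; the pillar names, the score scale, and the `gap_closure_agenda` function are hypothetical and stand in for whatever the Scorecard Method actually defines.

```python
from typing import Dict, List, Tuple


def gap_closure_agenda(
    scores: Dict[str, int],
    targets: Dict[str, int],
    actions: Dict[str, Tuple[str, str]],
) -> List[dict]:
    """List every pillar below target, largest gap first.

    actions maps pillar -> (action, owner). A pillar below target with no
    action or no owner is marked not ready for the check-in.
    """
    agenda = []
    for pillar, target in targets.items():
        score = scores.get(pillar, 0)  # an unscored pillar is treated as 0
        if score < target:
            action, owner = actions.get(pillar, (None, None))
            agenda.append({
                "pillar": pillar,
                "gap": target - score,
                "action": action,
                "owner": owner,
                "ready": bool(action and owner),
            })
    return sorted(agenda, key=lambda row: row["gap"], reverse=True)


# Pillar names and values are invented for illustration.
scores = {"Governance": 1, "Evals": 2, "Identity": 3}
targets = {"Governance": 3, "Evals": 3, "Identity": 3}
actions = {"Governance": ("Publish deployment risk bar", "J. Okafor")}
agenda = gap_closure_agenda(scores, targets, actions)
```

Rows with `ready` set to false are the ones the negotiation has to resolve before the meeting ends.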
Days 61–90: First pilot, evals dashboard, roadmap signed
The final thirty days are about demonstrating that the institutional foundations built in days one through sixty are real enough to support a production agent. The first guarded pilot in production does not need to be impressive; it needs to be governed. An agent that processes a narrow, well-defined task with full audit logging, a documented eval set with a passing score, a scoped machine identity, and a named human accountable for every action the agent takes is worth more than ten pilots that are impressive and ungoverned. The first production pilot is the proof of concept for the governance model, not just for the agent.
The evals dashboard is the instrumentation that makes the pilot's governance visible. At minimum: a pass rate for the eval set, a latency distribution, a log of every tool call, and a flag for any action that exceeded the agent's defined permission scope. The dashboard need not be sophisticated — a spreadsheet updated daily is sufficient for a guarded pilot — but it must exist and must be reviewed by the governance owner at least once a week.
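The four minimum signals can be computed from two plain inputs: a list of eval outcomes and a log of tool calls. This is a sketch under stated assumptions; the `ToolCall` record, the `dashboard_summary` function, and the tool names are hypothetical, and the latency distribution is reduced to a median and a maximum for brevity.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Set


@dataclass
class ToolCall:
    tool: str           # name of the tool the agent invoked
    latency_ms: float   # how long the call took


def dashboard_summary(
    eval_results: List[bool],
    tool_calls: List[ToolCall],
    allowed_tools: Set[str],
) -> dict:
    """Summarise one review period for the governance owner."""
    latencies = sorted(call.latency_ms for call in tool_calls)
    out_of_scope = [call.tool for call in tool_calls
                    if call.tool not in allowed_tools]
    return {
        "pass_rate": sum(eval_results) / len(eval_results) if eval_results else None,
        "latency_median_ms": median(latencies) if latencies else None,
        "latency_max_ms": latencies[-1] if latencies else None,
        "tool_call_count": len(tool_calls),
        "out_of_scope_calls": out_of_scope,  # any entry here needs review
    }


# Sample data is invented for illustration.
calls = [
    ToolCall("search_invoices", 120.0),
    ToolCall("read_invoice", 80.0),
    ToolCall("send_email", 200.0),   # not in the permitted scope below
]
summary = dashboard_summary(
    eval_results=[True, True, False, True],
    tool_calls=calls,
    allowed_tools={"search_invoices", "read_invoice"},
)
```

The same dictionary could be appended as one row per day to the spreadsheet the text describes; nothing here requires dedicated tooling.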
The twelve-month roadmap, introduced in Chapter 33, must be signed off by the steering committee before day ninety closes. Signed off means more than approved: it means budgeted, staffed at least at the Centre of Excellence (CoE) seed level, and tied to a specific set of pillar-level targets on the Scorecard that the programme commits to reach by each quarter. A roadmap that is reviewed but not funded is a wish list.
What not to do in ninety days
What a ninety-day plan leaves out is as important as what it includes. Do not select an enterprise platform before the Scorecard baseline is complete: the platform choice should follow the maturity gap, not precede it. Do not attempt to deploy more than one guarded pilot in the first ninety days: the governance model is being proven for the first time, and a second pilot running in parallel doubles the complexity before the model is validated. Do not commission an external consultant to write the governance policy: policy that is written externally and owned internally by nobody is the single most reliable predictor of L1 Governance scores eighteen months later.
Ninety days is also not enough time to train a workforce, stand up a full CoE, complete an ISO 42001 readiness review, or select and onboard an orchestration vendor at enterprise scale. These are Q2 and Q3 activities, and attempting them in Q1 is how programmes run out of momentum before they have a win to show for it. The discipline of the ninety-day plan is knowing what to defer.