Every significant technology shift creates new roles before the labour market has names for them, and destroys old roles before the people in them have somewhere to go. Agentic AI is producing both effects simultaneously. The organisations that navigate this transition well are those that name the new roles clearly, build career paths for people who want to grow into them, and create a reskilling bridge for the people whose existing roles are being transformed. Organisations that wait for the labour market to do this for them will be late.
The AI product manager
The AI product manager is the most important role that most enterprises have not yet formally created. This is not a software product manager who happens to work on an AI project; it is a role specifically designed for the governance, scoping, and lifecycle management of agents and AI systems as products. The AI PM owns the use-case selection process — the value-feasibility scoring, the stakeholder alignment, the kill criteria — and is accountable for the agent's performance in production in the same way a traditional product manager is accountable for a software product's Net Promoter Score.
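As a rough illustration of the value-feasibility scoring and kill criteria described above, the sketch below ranks candidate use cases and checks production metrics against agreed thresholds. The 1-to-5 scales, the multiplicative score, the metric names, and the example use cases are all assumptions made for illustration; a real programme would agree its own rubric and thresholds with stakeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    value: int        # hypothetical scale: 1 (low) to 5 (high)
    feasibility: int  # hypothetical scale: 1 (hard) to 5 (easy)


@dataclass(frozen=True)
class KillCriteria:
    min_task_success_rate: float  # below this, the agent is retired
    max_cost_per_task: float      # above this, the agent is retired


def priority_score(use_case: UseCase) -> int:
    """Multiply the axes so a weak score on either one drags the total down."""
    return use_case.value * use_case.feasibility


def should_kill(criteria: KillCriteria, success_rate: float, cost_per_task: float) -> bool:
    """Return True when production metrics breach any agreed kill criterion."""
    return (
        success_rate < criteria.min_task_success_rate
        or cost_per_task > criteria.max_cost_per_task
    )


# Illustrative candidates only; the names and scores are invented.
candidates = [
    UseCase("invoice matching", value=4, feasibility=4),
    UseCase("contract drafting", value=5, feasibility=2),
    UseCase("meeting summaries", value=2, feasibility=5),
]
ranked = sorted(candidates, key=priority_score, reverse=True)
```

Here "invoice matching" ranks first with a score of 16, ahead of two candidates that tie on 10. The point of writing the kill criteria down as data is that they are agreed before launch, so retiring an underperforming agent is a pre-committed decision and not a negotiation.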
The AI PM must be fluent in three languages: the business domain (what does this process actually do, and who depends on it?), the technical architecture (what are the orchestration layer's constraints, and what can the evals harness actually measure?), and the governance framework (which regulatory requirements apply, and what does the risk owner need to see before approving deployment?). This combination is rare; organisations typically find it by promoting experienced product managers with an appetite for technical depth, rather than by hiring from ML research backgrounds.
Agent engineers
The agent engineer role sits at the intersection of software engineering and applied ML. Unlike a traditional ML engineer, who is primarily concerned with model training and inference infrastructure, the agent engineer is primarily concerned with the agent's runtime behaviour: the orchestration logic, the tool integrations, the memory architecture, the structured output contracts that allow the agent to interoperate with enterprise systems. The agent engineer writes the code that runs between the model's API and the business system's API — and that code is where most agent failures originate.
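A structured output contract of the kind mentioned above can be sketched as a strict parser that sits between the model's API and the business system. Everything specific here is an assumption for illustration: the `RefundDecision` shape, the allowed actions, and the `ContractViolation` error are hypothetical, and many teams would use a schema library in place of hand-written checks.

```python
import json
from dataclasses import dataclass

# Hypothetical set of actions the downstream business system accepts.
ALLOWED_ACTIONS = {"approve", "reject", "escalate"}


class ContractViolation(ValueError):
    """Raised when model output does not satisfy the agreed contract."""


@dataclass(frozen=True)
class RefundDecision:
    ticket_id: str
    action: str
    amount: float


def parse_refund_decision(raw_model_output: str) -> RefundDecision:
    """Validate raw model text before anything reaches the business system."""
    try:
        payload = json.loads(raw_model_output)
    except json.JSONDecodeError as exc:
        raise ContractViolation(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ContractViolation("top-level value must be a JSON object")

    missing = {"ticket_id", "action", "amount"} - payload.keys()
    if missing:
        raise ContractViolation(f"missing fields: {sorted(missing)}")

    ticket_id, action, amount = payload["ticket_id"], payload["action"], payload["amount"]
    if not isinstance(ticket_id, str):
        raise ContractViolation("ticket_id must be a string")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ContractViolation(f"action must be one of {sorted(ALLOWED_ACTIONS)}")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        raise ContractViolation("amount must be a non-negative number")

    return RefundDecision(ticket_id=ticket_id, action=action, amount=float(amount))
```

The design choice worth noting is that the parser rejects anything it does not recognise. A malformed or unexpected output then becomes a visible, countable failure in the agent layer, not a silent corruption in the business system.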
The key competencies for an agent engineer are proficiency in at least one orchestration framework; experience with structured output design and eval harness construction; an understanding of identity and access management sufficient to design a scoped agent credential; and the discipline to write evals before writing code, much as test-driven development rewired software engineering habits in the 2000s. This last competency is the most culturally difficult to instil, because it requires slowing down before speeding up, and agent engineering culture, at this stage in the technology's development, runs fast by default.
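To make the idea of a scoped agent credential concrete, the following minimal sketch shows a deny-by-default authorisation check. The `AgentCredential` type, the scope strings, and the one-hour lifetime are hypothetical; in practice this logic would live in the organisation's identity provider or policy engine and not in agent code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentCredential:
    agent_id: str
    allowed_scopes: frozenset  # exact scope strings the agent may use
    expires_at: datetime       # short-lived by design


def authorise(credential: AgentCredential, required_scope: str, now: datetime) -> bool:
    """Deny by default: allow only an unexpired credential holding the exact scope."""
    if now >= credential.expires_at:
        return False
    return required_scope in credential.allowed_scopes


# Illustrative values only: the agent may read invoices and propose payments,
# but it holds no scope that would let it execute a payment.
issued_at = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
credential = AgentCredential(
    agent_id="invoice-agent",
    allowed_scopes=frozenset({"invoices:read", "payments:propose"}),
    expires_at=issued_at + timedelta(hours=1),
)

can_read = authorise(credential, "invoices:read", issued_at + timedelta(minutes=10))
can_pay = authorise(credential, "payments:execute", issued_at + timedelta(minutes=10))
```

With these values, `can_read` is `True` and `can_pay` is `False`. Keeping the check to exact-match scopes and a hard expiry means every tool call, however deep in a chain, is evaluated against the same narrow grant.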
"The difference between an agent engineer and a prompt engineer is that the agent engineer is worried about what happens on the fifth tool call, not the first." — observation from an enterprise AI engineering workshop, 2025.
The evals lead
The evals lead is the newest and least well-understood of the core roles. This person owns the organisation's evaluation library: the test cases, the red-team scenarios, the automatic graders, the human-evaluation protocols. They act as the quality director of the agent estate, much as a QA director leads the quality function of a software engineering organisation. Without this role, evals are an afterthought: something the agent engineer runs before deployment and then forgets about.
The evals lead is responsible for three things: maintaining the eval library as the agent estate grows (new agents need new evals, but there should also be a shared library of safety and reliability tests that applies to every agent); running the red-team schedule (periodic adversarial exercises designed to find failures before adversaries do, informed by the OWASP Agentic AI threat taxonomy); and owning the incident post-mortem process (when an agent fails in production, the evals lead is responsible for translating the failure into new test cases that prevent recurrence).
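The first and third responsibilities above (a shared library alongside per-agent evals, and turning incidents into regression cases) can be sketched as a small data structure. This is a simplified model under stated assumptions: `EvalCase`, `EvalLibrary`, the grader signature, and the stub agent are all hypothetical, and real graders are usually far richer than a substring check.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumed grader shape: takes the agent's output text, returns pass or fail.
Grader = Callable[[str], bool]


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    prompt: str
    grader: Grader
    source: str = "authored"  # "incident" marks post-mortem regressions


@dataclass
class EvalLibrary:
    shared: list = field(default_factory=list)     # applies to every agent
    per_agent: dict = field(default_factory=dict)  # agent name -> list of cases

    def add_shared(self, case: EvalCase) -> None:
        self.shared.append(case)

    def add_for_agent(self, agent: str, case: EvalCase) -> None:
        self.per_agent.setdefault(agent, []).append(case)

    def suite_for(self, agent: str) -> list:
        """Every agent runs the shared cases plus its own."""
        return self.shared + self.per_agent.get(agent, [])

    def add_regression_from_incident(
        self, agent: str, incident_id: str, prompt: str, grader: Grader
    ) -> EvalCase:
        """Turn a production failure into a permanent test case for that agent."""
        case = EvalCase(f"incident-{incident_id}", prompt, grader, source="incident")
        self.add_for_agent(agent, case)
        return case


def run_suite(agent_fn: Callable[[str], str], cases: list) -> dict:
    """Run each case through the agent and record pass/fail by case id."""
    return {case.case_id: case.grader(agent_fn(case.prompt)) for case in cases}


def stub_agent(prompt: str) -> str:
    # Stand-in for a real agent call; always returns the same text.
    return "Hello, I cannot share credentials."
```

A usage pattern might be to call `library.suite_for("refund-agent")` in the deployment pipeline and pass the result to `run_suite`, so that a case added after an incident blocks any later release that reintroduces the same failure.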
Governance and risk roles
Established governance and risk functions — model risk management, information security, legal, compliance — need not create entirely new roles for agentic AI, but they do need to extend existing ones. The model risk manager who has historically reviewed statistical models for credit and fraud must now review LLM-based agents, which behave probabilistically in ways that traditional model validation frameworks were not designed to handle. The information security architect who has designed identity and access management for human users must extend those designs to machine identities that execute actions on behalf of those users. The legal team that has negotiated SaaS vendor contracts must now negotiate AI model provider contracts, which raise different questions about data usage, model versioning, and liability for model-generated outputs.
These extensions are not optional; they are required by the governance frameworks the organisation has already committed to. The ISO/IEC 42001 management system, the NIST AI RMF, and the EU AI Act all presuppose that the governance function has people who understand what agents are and how they fail.
Career paths for the new roles
Career paths for the new roles are still being invented, and any description here will be overtaken by events. But some patterns are already visible. Agent engineers who develop deep evals expertise tend to move toward evals lead roles or AI governance roles, where technical credibility is required to hold a line against engineering teams under deployment pressure. AI product managers who develop deep domain expertise in a specific vertical — financial services, healthcare, supply chain — become highly valuable as the organisations in that vertical start competing on the quality of their agent portfolios. Evals leads who build comprehensive red-team libraries develop a kind of institutional memory that makes them difficult to replace and easy to promote.
McKinsey's State of AI research finds that the organisations with the highest AI value realisation are those that invest in AI capability building for employees in existing roles, not just in hiring external AI specialists. The implication for the operating model is clear: the programme needs a learning and development component, not just a hiring plan.