The Agentic Enterprise  ·  Chapter 17

The Orchestration Stack

Pillar II — frameworks, runtimes, and the plumbing of multi-step work

If governance is the organizational layer that defines who owns the agent and what it is permitted to do, orchestration is the technical layer that makes the agent carry out that work. An orchestration stack is the collection of frameworks, runtimes, and plumbing components that translates a goal into a sequence of model calls, tool invocations, and state updates, and that manages the coordination between multiple agents when the task requires more than one. The choice of orchestration architecture is not merely a developer preference; it is a risk architecture decision that determines what kinds of failures the system is capable of, what kinds of interventions are possible at runtime, and how well the system's behavior can be observed and audited.

The Orchestration Layer Defined

The orchestration layer sits between the application that initiates a task and the models and tools that execute it. Its responsibilities include: decomposing the task into subtasks, sequencing those subtasks according to their dependencies, routing each subtask to the appropriate model or tool, managing the shared context that flows between steps, handling errors and retries at the step level, and surfacing the information needed for human review at appropriate checkpoints. In a single-agent system, orchestration is relatively straightforward — essentially a loop over the agent's plan-act-observe-remember cycle. In a multi-agent system, orchestration becomes substantially more complex because it must also manage the communication protocols, trust relationships, and state synchronization between agents that may be running on different runtimes with different capabilities and different permission sets.
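To make those responsibilities concrete, the following is a minimal sketch of a single-agent orchestrator in plain Python. It is not taken from any framework: the subtask names, the `run_model` and `run_tool` stand-ins, and the `orchestrate` function are all hypothetical. It shows dependency-ordered sequencing, routing by handler kind, a shared context, and step-level retries.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks: name -> (handler kind, names of subtasks it depends on).
SUBTASKS = {
    "fetch_invoice": ("tool", []),
    "extract_fields": ("model", ["fetch_invoice"]),
    "validate_totals": ("tool", ["extract_fields"]),
    "draft_summary": ("model", ["validate_totals"]),
}

_tool_calls: dict[str, int] = {}


def run_model(name: str, context: dict) -> str:
    # Stand-in for an inference API call.
    return f"model output for {name}"


def run_tool(name: str, context: dict) -> str:
    # Stand-in for a tool invocation; the first call to validate_totals
    # fails so that the retry path is exercised.
    _tool_calls[name] = _tool_calls.get(name, 0) + 1
    if name == "validate_totals" and _tool_calls[name] == 1:
        raise TimeoutError("transient tool failure")
    return f"tool output for {name}"


ROUTES = {"model": run_model, "tool": run_tool}


def orchestrate(subtasks: dict, max_attempts: int = 3):
    # Sequence subtasks so that every dependency runs before its dependents.
    graph = {name: deps for name, (_, deps) in subtasks.items()}
    context: dict[str, str] = {}  # shared context that flows between steps
    log: list[tuple[str, str, int]] = []
    for name in TopologicalSorter(graph).static_order():
        kind = subtasks[name][0]
        handler = ROUTES[kind]  # route to the model or tool handler
        for attempt in range(1, max_attempts + 1):
            try:
                context[name] = handler(name, context)
                log.append((name, kind, attempt))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    raise RuntimeError(
                        f"{name} failed after {attempt} attempts"
                    ) from exc
    return context, log


context, log = orchestrate(SUBTASKS)
for name, kind, attempt in log:
    print(f"{name}: handled by {kind} on attempt {attempt}")
```

A production orchestrator would add backoff between retries, distinguish retryable from fatal errors, and surface checkpoints for human review; the sketch omits those to keep the control flow visible.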

The orchestration layer is where most of the interesting engineering decisions in an agentic deployment are made, and it is where most of the security controls that matter at runtime are implemented. An orchestration framework that provides no hooks for runtime policy enforcement — no way to inspect a tool call before it executes, no way to modify the agent's context based on a policy decision, no way to pause execution for human review — cannot implement the governance charter's decision authority matrix in practice, regardless of how carefully that matrix has been designed.
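One way to picture such a hook is a function that every tool call must pass through before it executes. The sketch below is illustrative only: the `Decision` values, the `policy` rules, the tool names, and the 500 threshold are assumptions standing in for whatever an organization's decision authority matrix actually specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    tool: str
    args: dict


class PendingApproval(Exception):
    """Raised to pause the run until a human has reviewed the call."""


def policy(call: ToolCall) -> Decision:
    # Hypothetical rules; a real deployment would load these from the charter.
    if call.tool == "delete_records":
        return Decision.DENY
    if call.tool == "issue_refund" and call.args.get("amount", 0) > 500:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


def guarded_execute(call: ToolCall, tools: dict, approved: bool = False):
    # Inspect the call before it runs; only permitted calls reach the tool.
    decision = policy(call)
    if decision is Decision.DENY:
        raise PermissionError(f"{call.tool} is not permitted for {call.agent_id}")
    if decision is Decision.REQUIRE_APPROVAL and not approved:
        raise PendingApproval(f"{call.tool} needs human review")
    return tools[call.tool](**call.args)


# Hypothetical tools for illustration.
TOOLS = {
    "issue_refund": lambda amount: f"refunded {amount}",
    "delete_records": lambda table: f"deleted {table}",
}

small = ToolCall("agent-1", "issue_refund", {"amount": 100})
print(guarded_execute(small, TOOLS))
```

The important property is structural: the tool is only reachable through `guarded_execute`, so a policy decision cannot be bypassed by the model's reasoning.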

LangGraph: Directed Graphs for Complex Workflows

LangGraph is a prominent example of explicit, graph-based orchestration. Rather than expressing agent behavior as a monolithic loop, LangGraph asks the developer to model it as a directed graph of nodes and edges, where nodes are computation steps (model calls, tool calls, conditional branches) and edges are the transitions between them. This representation makes the agent's possible execution paths explicit and statically analyzable, which is a significant advantage from a governance perspective: the set of paths the agent can take is bounded by the graph structure rather than emerging solely from the model's reasoning. The model may still choose which branch to follow at a conditional node, but it cannot add a branch that the graph does not contain.
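The value of static analyzability is easiest to see with a toy graph. The sketch below uses plain Python dictionaries rather than LangGraph's own API, and the node names are hypothetical; it enumerates every path the agent could take and checks a governance property against them before anything runs.

```python
# Edges of a hypothetical workflow: node -> possible successors.
# "END" is a terminal marker, not a computation step.
EDGES = {
    "classify": ["answer", "lookup"],
    "lookup": ["answer", "escalate"],
    "answer": ["END"],
    "escalate": ["END"],
}


def enumerate_paths(edges: dict, start: str, end: str = "END") -> list[list[str]]:
    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node == end:
            paths.append(path)
            return
        for nxt in edges[node]:
            if nxt in path:  # skip revisits so enumeration terminates on cycles
                continue
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


paths = enumerate_paths(EDGES, "classify")
for p in paths:
    print(" -> ".join(p))

# A property that can be checked before deployment:
# escalation is only reachable after a lookup has been attempted.
escalation_requires_lookup = all("lookup" in p for p in paths if "escalate" in p)
print(escalation_requires_lookup)
```

The same kind of check is not available when the sequence of steps is decided entirely by the model at runtime, which is the governance point the graph representation buys.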

LangGraph's stateful execution model — where each node in the graph can read and write a typed state object that persists across the entire run — makes it well-suited for complex, multi-step workflows that require careful state management. The framework's support for human-in-the-loop checkpoints, where execution pauses and a human must approve or modify the state before the graph continues, is particularly valuable for agents that operate in high-stakes domains where the governance charter requires human oversight at specific decision points. LangGraph's persistence layer can checkpoint the full graph state to an external store, enabling long-running agents that can be interrupted and resumed and providing the forensic trail needed for post-incident analysis.
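The pause-and-resume pattern can be sketched independently of any framework. The code below is a simplified illustration in plain Python, not LangGraph's persistence API: the `RunState` fields, the node functions, the in-memory `STORE` standing in for an external checkpoint store, and the `approve` helper are all assumptions made for the example.

```python
import json
from typing import Optional, TypedDict


class RunState(TypedDict):
    request: str
    amount: float
    approved: bool
    result: str


def assess(state: RunState) -> dict:
    # Stand-in for a model call that sizes the payment.
    return {"amount": 1200.0}


def pay(state: RunState) -> dict:
    # Stand-in for a high-stakes tool call.
    return {"result": f"paid {state['amount']:.2f}"}


NODES = {"assess": assess, "pay": pay}
ORDER = ["assess", "pay"]          # a linear graph, for brevity
INTERRUPT_BEFORE = {"pay"}         # human checkpoint before this node
STORE: dict[str, str] = {}         # stands in for an external checkpoint store


def _save(run_id: str, state: dict, position: int) -> None:
    STORE[run_id] = json.dumps({"state": state, "position": position})


def run(run_id: str, initial: Optional[RunState] = None):
    if run_id in STORE:            # resume from the last checkpoint
        saved = json.loads(STORE[run_id])
        state, position = saved["state"], saved["position"]
    elif initial is not None:
        state, position = dict(initial), 0
    else:
        raise ValueError("unknown run_id and no initial state")
    while position < len(ORDER):
        node = ORDER[position]
        if node in INTERRUPT_BEFORE and not state["approved"]:
            _save(run_id, state, position)
            return "paused", state
        state = {**state, **NODES[node](state)}
        position += 1
        _save(run_id, state, position)
    return "done", state


def approve(run_id: str) -> None:
    # The human reviewer modifies the persisted state, then the run resumes.
    saved = json.loads(STORE[run_id])
    saved["state"]["approved"] = True
    STORE[run_id] = json.dumps(saved)


start = {"request": "vendor payment", "amount": 0.0, "approved": False, "result": ""}
status, state = run("run-1", start)
print(status)
approve("run-1")
status, state = run("run-1")
print(status, state["result"])
```

Because every step writes a checkpoint, the store doubles as a record of what the state looked like at each point in the run, which is the forensic trail described above.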

AutoGen and CrewAI: The Multi-Agent Dimension

AutoGen, Microsoft's multi-agent orchestration framework, takes a different approach: it models agents as conversational participants in a group chat, where agents exchange messages and the orchestrator routes those messages according to a defined policy. This conversational model makes it easy to build systems where the decomposition of a task emerges from the agents' interaction rather than being specified in advance. That is useful for exploratory tasks where the right decomposition is not known at design time, but it is harder to audit because the agents' behavior is less constrained by the framework's structure.
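The conversational model can be illustrated with a few lines of plain Python. This is not AutoGen's API: the agent functions, the `round_robin` routing policy, and the `DONE` termination convention are hypothetical stand-ins chosen to show the shape of the pattern.

```python
from typing import Callable

Message = tuple[str, str]  # (speaker, text)


def planner(history: list[Message]) -> str:
    # Opens with a plan, then ends the conversation on its next turn.
    return "Plan: gather figures, then draft." if len(history) == 1 else "DONE"


def researcher(history: list[Message]) -> str:
    return "Figures gathered."


def writer(history: list[Message]) -> str:
    return "Draft written."


AGENTS: dict[str, Callable[[list[Message]], str]] = {
    "planner": planner,
    "researcher": researcher,
    "writer": writer,
}


def round_robin(history: list[Message], names: list[str]) -> str:
    # Routing policy: fixed rotation, based on how many agent turns have passed.
    return names[(len(history) - 1) % len(names)]


def group_chat(task: str, agents: dict, select: Callable, max_turns: int = 8):
    history: list[Message] = [("user", task)]
    names = list(agents)
    for _ in range(max_turns):  # hard bound so the chat cannot run indefinitely
        speaker = select(history, names)
        text = agents[speaker](history)
        history.append((speaker, text))
        if text == "DONE":
            break
    return history


transcript = group_chat("Summarize Q3 results", AGENTS, round_robin)
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

Note what is and is not fixed here: the routing policy and the turn limit are defined in advance, but the content of each turn, and therefore the effective decomposition of the task, is whatever the agents say. The transcript is the only record of the path taken, which is why auditing relies on logs rather than on structure.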

CrewAI offers a higher-level abstraction: agents are defined as roles (Researcher, Writer, Reviewer) with associated tools and goals, and tasks are assigned to roles rather than to specific model instances. The framework handles the coordination — including parallel execution of independent tasks and sequential execution of dependent ones — and provides a structured output format that makes it easier to integrate agent work products into downstream systems. CrewAI's role-based model maps naturally onto organizational structures, which makes it attractive for enterprise teams that want to reason about their agent deployments in familiar terms.
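A rough sketch of the role-based pattern follows, written in plain Python rather than against CrewAI's API. The `Role` and `Task` classes, the task names, and the `perform` stand-in for a model call are all hypothetical. Tasks are bound to roles, independent tasks run in parallel, and dependent tasks wait for their inputs.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Role:
    name: str
    goal: str
    tools: list[str] = field(default_factory=list)


@dataclass
class Task:
    name: str
    role: str  # assigned to a role, not to a specific model instance
    depends_on: list[str] = field(default_factory=list)


ROLES = {
    "Researcher": Role("Researcher", "collect evidence", ["web_search"]),
    "Writer": Role("Writer", "produce the report"),
    "Reviewer": Role("Reviewer", "check the report"),
}

TASKS = [
    Task("market_scan", "Researcher"),
    Task("competitor_scan", "Researcher"),
    Task("draft_report", "Writer", ["market_scan", "competitor_scan"]),
    Task("review_report", "Reviewer", ["draft_report"]),
]


def perform(task: Task, role: Role) -> dict:
    # Stand-in for the model call; returns a structured work product.
    return {"task": task.name, "role": role.name, "inputs": sorted(task.depends_on)}


def run_crew(tasks: list[Task], roles: dict[str, Role]):
    by_name = {t.name: t for t in tasks}
    sorter = TopologicalSorter({t.name: t.depends_on for t in tasks})
    sorter.prepare()
    outputs: dict[str, dict] = {}
    waves: list[list[str]] = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())  # tasks whose inputs are complete
            waves.append(ready)
            futures = {
                name: pool.submit(perform, by_name[name], roles[by_name[name].role])
                for name in ready
            }
            for name, future in futures.items():
                outputs[name] = future.result()
                sorter.done(name)
    return outputs, waves


outputs, waves = run_crew(TASKS, ROLES)
print(waves)
```

The two research tasks land in the same wave and run concurrently; drafting and review follow in sequence. Each output is a plain dictionary, which is the kind of structured work product that downstream systems can consume without parsing free text.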

The choice between LangGraph's explicit graph model, AutoGen's conversational model, and CrewAI's role-based model is ultimately a choice of where to sit on the spectrum from control to flexibility. Organizations with strong governance requirements and well-defined workflows tend toward LangGraph; organizations building exploratory or research agents where the workflow is inherently open-ended tend toward AutoGen; organizations that want to align their agent architecture with their organizational structure tend toward CrewAI.

"The orchestration framework you choose is not a neutral technical decision. It determines what you can observe, what you can control, and what you can audit. Choose it with the same deliberateness you would apply to selecting a database architecture."

Runtimes and Plumbing

Beneath the orchestration framework sit the runtime components that actually execute the agent's actions: the inference API that calls the model, the tool execution environment that runs the functions the agent invokes, and the message broker that routes information between agents in a multi-agent system. Each of these has implications for the security and observability of the overall system that practitioners frequently underestimate when they are focused on the higher-level orchestration logic.

The tool execution environment deserves particular attention. A tool that the agent can call is, by definition, code that executes with some level of privilege in the organization's environment. The principle of least privilege — already well-established in conventional software security — applies to agent tools with additional force, because the agent may call tools in combinations and sequences that the tool developers did not anticipate. Tool execution should be sandboxed wherever possible, with explicit grants of the permissions each tool requires and automatic revocation when the agent's session ends. The AWS IAM model of fine-grained, time-limited, scope-specific permissions is a useful template for thinking about tool permission architecture.
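The permission side of this can be sketched as session-scoped grants that expire on their own and are revoked when the session closes. The example is a simplified illustration, not an AWS IAM client: the `Session` and `Grant` classes, the tool name, and the scope strings are hypothetical, and the sketch covers permission checks only, not process-level sandboxing.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Grant:
    tool: str
    scope: str         # hypothetical scope string, e.g. "crm:read"
    expires_at: float  # clock value after which the grant is void


class Session:
    """Holds the explicit grants for one agent session."""

    def __init__(self, agent_id: str, clock: Callable[[], float] = time.monotonic):
        self.agent_id = agent_id
        self._clock = clock
        self._grants: list[Grant] = []

    def grant(self, tool: str, scope: str, ttl_seconds: float) -> None:
        self._grants.append(Grant(tool, scope, self._clock() + ttl_seconds))

    def is_allowed(self, tool: str, scope: str) -> bool:
        # Default deny: only an exact, unexpired grant permits the call.
        now = self._clock()
        return any(
            g.tool == tool and g.scope == scope and now < g.expires_at
            for g in self._grants
        )

    def close(self) -> None:
        self._grants.clear()  # automatic revocation at session end

    def __enter__(self) -> "Session":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


with Session("agent-7") as session:
    session.grant("crm_lookup", "crm:read", ttl_seconds=60)
    print(session.is_allowed("crm_lookup", "crm:read"))
    print(session.is_allowed("crm_lookup", "crm:write"))
print(session.is_allowed("crm_lookup", "crm:read"))
```

The injectable `clock` parameter is a design convenience: it lets expiry be tested deterministically without sleeping, and it keeps the grant logic independent of wall-clock adjustments.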

The Orchestration Stack as Control Surface

A mature orchestration architecture treats the stack not just as a mechanism for executing tasks but as the primary control surface for runtime risk management. This means instrumenting every layer: traces at the model call level that capture the prompt, the response, the tool calls, and the latency; metrics at the workflow level that track success rates, escalation rates, and cost per run; alerts at the policy level that fire when the agent's behavior approaches or crosses a defined threshold. An orchestration stack that provides this instrumentation out of the box is not just more observable; it is more governable, because the governance charter's escalation thresholds can be implemented as runtime controls rather than as post-hoc audits.
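The three layers of instrumentation can be sketched as three small data shapes. Everything named below is an assumption made for illustration: the `Trace` and `RunRecord` fields mirror the items listed above, and the `THRESHOLDS` values are placeholders rather than recommended limits.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    # Model-call level: what was asked, what came back, and how long it took.
    prompt: str
    response: str
    tool_calls: list[str]
    latency_ms: float


@dataclass
class RunRecord:
    # Workflow level: the outcome of one complete run.
    succeeded: bool
    escalated: bool
    cost_usd: float
    traces: list[Trace] = field(default_factory=list)


def workflow_metrics(runs: list[RunRecord]) -> dict[str, float]:
    n = len(runs)
    if n == 0:
        return {"success_rate": 0.0, "escalation_rate": 0.0, "cost_per_run": 0.0}
    return {
        "success_rate": sum(r.succeeded for r in runs) / n,
        "escalation_rate": sum(r.escalated for r in runs) / n,
        "cost_per_run": sum(r.cost_usd for r in runs) / n,
    }


# Policy level: hypothetical limits that a governance charter might set.
THRESHOLDS = {"escalation_rate": 0.25, "cost_per_run": 0.50}


def policy_alerts(metrics: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    # Fire an alert for every metric that has crossed its limit.
    return [name for name, limit in thresholds.items() if metrics[name] > limit]


runs = [
    RunRecord(True, False, 0.25),
    RunRecord(True, True, 0.50),
    RunRecord(False, True, 0.25),
    RunRecord(True, False, 0.50),
]
metrics = workflow_metrics(runs)
print(metrics)
print(policy_alerts(metrics, THRESHOLDS))
```

Evaluating `policy_alerts` inside the orchestration loop, rather than in a weekly report, is what turns an escalation threshold from an audit finding into a runtime control.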