Part 1 argued that approved intent does not become governable authority on its own. In structured domains, Mission shaping can produce a bounded Mission Authority Model and a reviewable approval object. That is real progress.
It is still not enough.
The CFO example in Part 1 was the strong case: one enterprise, known systems, legible stages, and a control plane that can absorb transition friction. Even there, the hidden complexity was already visible.
The agent still runs in the world. It still encounters external content, discovers new facts, spawns child agents, accumulates partial state, and continues after the moment of initial approval. That is where the second problem begins. Mission shaping defines the authority envelope. Containment and runtime governance determine whether the system stays survivable when that envelope turns out to be incomplete.
Where Mission Shaping Breaks Under Pressure
The problems in this section all arise within a principled staged model, not only in undisciplined deployments. They are the pressure points where a coherent authority model begins to fray under operational reality.
Scope Creep
The staging model handles the case where an agent surfaces a discovered gap and requests governed expansion. Part 1’s context integrity section handles adversarial redirection. There is a third failure mode between them: quiet authority expansion.
An agent decides that accomplishing the approved task requires something not originally anticipated, concludes it is obviously necessary, and does it without surfacing the discovery. No adversarial input. No explicit re-planning request. The agent’s own judgment becomes the expansion mechanism.
Preventing this requires a minimum-authority principle. The shaped envelope has to be narrow enough that “obviously necessary” actions outside the original approval are not silently inside the authority region.
But minimum authority has its own failure mode: shape the envelope too narrowly and the system creates constant replanning churn. Operators get paged for predictable expansions. Teams respond by broadening every template until the model is no longer meaningfully bounded.
This is not only a theoretical tradeoff. Empirical work on semantic task-to-scope matching shows the same tension: over-scoping grants too much authority, but under-scoping can stall task completion when the matcher fails to include necessary permissions. IBAC’s AgentDojo evaluation makes the failure mode concrete: the permissive-mode breaches traced directly to over-scoped wildcard permissions that happened to cover the injected goal. The agent’s reasoning was fully compromised. The authorization surface let the compromised action through. Strict mode blocked all 240 injection attempts by scoping permissions to specific recipients and resources, at the cost of requiring escalation for predictable prerequisites.
That is why templates are necessary but not sufficient.
Task templates are pre-approved Mission archetypes for known task classes. In the structured case, they may also be pre-compiled authority archetypes. A human has already approved that “board packet preparation for a CFO” maps to approximately this set of resources and operations. The template becomes the shaping baseline. Individual Missions attenuate from it rather than starting from scratch. This anchors minimum authority in prior human approval rather than in each operator’s judgment call.
Alongside templates, pre-approved expansion paths address predictable edge cases: known situations where discovery commonly reveals a need for one additional resource class, pre-authorized for this task type without requiring full re-authorization. Together, these two mechanisms reduce the re-planning friction that makes quiet expansion attractive.
But templates accumulate exceptions. Left ungoverned, they grow until the template itself becomes a broad standing grant with a respectable name. This is the IE Trusted Sites failure described in Part 1: every enterprise application that did not work in the restricted zone got added to Trusted Sites because that was easier than fixing the application or tightening the template. The same pressure operates here. Every agent task that stalls on a missing permission gets the template broadened rather than the governed expansion path used. Eventually the template covers everything the agent might plausibly need and nothing is actually constrained.
So templates have to be governed: versioned, reviewed, and retired when they become broader than the task class they were originally meant to represent. That governance is not optional. It is the mechanism that keeps minimum authority meaningful over time.
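A minimal sketch of the attenuate-from-template pattern described above. The `Template` shape, field names, and `attenuate` helper are illustrative assumptions, not an interface defined in this series; the point is only that a Mission can narrow the pre-approved baseline but can never silently widen it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Template:
    """A pre-approved Mission archetype: the shaping baseline for a task class."""
    task_class: str
    version: int          # templates are versioned so they can be reviewed and retired
    resources: frozenset  # resource classes a human already approved
    actions: frozenset    # action classes a human already approved

def attenuate(template: Template, requested_resources: set, requested_actions: set):
    """Derive a Mission envelope by narrowing the template, never widening it.

    Anything outside the template is rejected rather than silently granted,
    so quiet expansion has to surface as an explicit, governed request.
    """
    extra_res = requested_resources - template.resources
    extra_act = requested_actions - template.actions
    if extra_res or extra_act:
        raise PermissionError(
            f"outside template {template.task_class} v{template.version}: "
            f"resources={sorted(extra_res)} actions={sorted(extra_act)}")
    return {"resources": requested_resources, "actions": requested_actions}

board_packet = Template(
    task_class="board-packet-preparation", version=3,
    resources=frozenset({"finance:ledger", "finance:reports", "docs:board"}),
    actions=frozenset({"read", "compare", "assemble"}))

# A Mission may ask for less than the template grants...
envelope = attenuate(board_packet, {"finance:ledger"}, {"read", "compare"})
# ...but a request for anything more raises PermissionError instead of expanding.
```

The version field is what makes the governance requirement enforceable: a broadened template is a new version with a new review, not an in-place edit.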
Delegation
When a Mission is delegated, the child Mission’s authority model must be derived from the parent. It cannot exceed the parent envelope. RFC 8693 (OAuth Token Exchange) provides the protocol plumbing for this derivation, but the semantic constraint, that the child authority is a proper subset of the parent, is a Mission shaping requirement. Enforcement of semantic attenuation is implementation-dependent and not a protocol guarantee.
This is not a hypothetical risk introduced by agents. In production OAuth token exchange deployments today, child tokens routinely inherit full parent scope unless the implementation explicitly attenuates them. The multi-agent case compounds a failure mode already present in standard OAuth delegation rather than introducing a new one. Concretely, an orchestrator that receives a board-packet Mission and spawns a sub-agent to handle number reconciliation has delegated a slice of the parent Mission. The child’s authority should be the parent’s read-and-compare envelope for finance systems, and nothing else.
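The proper-subset constraint can be sketched as an application-level check layered on top of token exchange. The function name and scope strings below are illustrative, not from RFC 8693; the RFC supplies the exchange mechanics, and this is the semantic attenuation an implementation must add itself.

```python
def derive_child_scope(parent_scope: set, requested: set) -> set:
    """Issue child authority only as a proper subset of the parent envelope.

    RFC 8693 token exchange will happily mint a child token carrying the full
    parent scope; the strict-subset check below is the application-level
    constraint that prevents inheritance-by-default.
    """
    if not requested:
        raise ValueError("child Mission must request an explicit, non-empty scope")
    if not requested < parent_scope:  # strict subset: equality means full inheritance
        overage = sorted(requested - parent_scope) or "full parent scope"
        raise PermissionError(f"child scope exceeds or equals parent: {overage}")
    return requested

# Orchestrator's board-packet envelope; the reconciliation sub-agent gets
# only the read-and-compare slice, nothing else.
parent = {"finance:read", "finance:compare", "docs:assemble"}
child = derive_child_scope(parent, {"finance:read", "finance:compare"})
```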

But that is not the only issue. Delegation also changes governability.
A child agent may:
- run in a different runtime
- expose weaker telemetry
- use different tool adapters
- have poorer provenance guarantees
The delegated authority may be narrower on paper and still be less governable in practice.
This means delegation must attenuate trust assumptions, not just authority bounds. Child Missions may need tighter budgets, stricter callbacks, or less autonomy simply because visibility and control quality are worse at that hop.
The mechanism for this attenuation should be explicit. The Mission state owner should assign a trust quality level to each hop based on observable signals: whether the child runtime emits verifiable tool-call provenance, whether telemetry is independently observed at a boundary rather than self-reported, and whether the child Mission has an established behavioral baseline or is operating cold-start. A child with low trust quality receives a more conservative budget policy and stricter callback thresholds regardless of whether its authority bounds are otherwise equivalent.
A harder case is spontaneous delegation: an orchestrator spawning a sub-agent at runtime without any prior human delegation event. This is common in practice and hard to govern cleanly. The Mission state owner must still issue child authority, but who triggers that issuance when the spawn is an autonomous runtime decision rather than a pre-authorized step? Without an answer to this, spontaneous sub-agents fall outside the governance model and effectively operate in the headless case, without the provenance hierarchy that headless governance requires.
This is also where requester context has to survive the delegation chain. If child agents cannot inherit and present the right task context at each hop, even well-bounded authority models become hard to evaluate consistently.
Headless Missions
In headless cases, no human is present to verify the proposed Mission. That means deterministic shaping is not enough. You also need authority provenance. A scheduled nightly reconciliation agent operating from a standing organizational policy with no user present is the canonical headless case.
What is the authority source being shaped?
- an organizational policy
- a runbook
- a prior approved Mission
- a service owner’s standing delegation
Those are not interchangeable sources of legitimacy.
So headless governance needs:
- machine-verifiable shaping rules
- machine-verifiable provenance for the authority source being shaped
If the provenance source cannot be independently verified by the Mission state owner without relying on the agent’s own claims, it is not a headless governance anchor.
Valid provenance sources are not interchangeable. In descending order of legitimacy:
- A prior human-approved template Mission: a human already confirmed this shaping output is acceptable for this task class.
- A standing organizational policy expressed in a formal, auditable policy language.
- A service owner’s cryptographically verifiable standing delegation.
What does not qualify: LLM inference about organizational intent, runtime configuration supplied by an agent or orchestration layer at execution time, or a general-purpose scope grant that predates the specific task.
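That provenance hierarchy can be sketched as an explicit gate. The enum names, the `independently_verified` flag, and the disqualified-source list mirror the categories above, but the shape of the check is an assumption for illustration, not a specified interface.

```python
from enum import IntEnum

class Provenance(IntEnum):
    """Headless authority sources, in descending order of legitimacy."""
    HUMAN_APPROVED_TEMPLATE = 3    # prior human-approved template Mission
    FORMAL_POLICY = 2              # standing policy in an auditable policy language
    SIGNED_STANDING_DELEGATION = 1 # service owner's verifiable standing delegation

# Sources that never qualify as headless governance anchors.
DISQUALIFIED = {"llm_inference", "runtime_config", "general_scope_grant"}

def headless_anchor(source_kind: str, independently_verified: bool) -> Provenance:
    """Accept a provenance source only if the Mission state owner verified it
    without relying on the agent's own claims."""
    if source_kind in DISQUALIFIED:
        raise PermissionError(f"{source_kind} is not a headless governance anchor")
    if not independently_verified:
        raise PermissionError("provenance must be verified independently of the agent")
    return Provenance[source_kind.upper()]

# The nightly reconciliation agent anchors to a verified standing policy:
anchor = headless_anchor("formal_policy", independently_verified=True)
```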
Headless governance keeps the same authority anchor requirement as human-present governance. It just moves provenance verification to the machine. The next failure mode shifts from no human to no stable authority state at all.
Resumption
Suspension is not necessarily terminal for long-running enterprise workflows.
A Mission interrupted by an anomalous action may need to resume once a human has reviewed what happened. But resumption is not the same as re-approval of the original Mission. The state has changed. Some reversible actions may have occurred. The trust budget has a consumption history.
A governable resumption model therefore needs:
- persistent execution state at suspension time
- a resumption decision that sees that state, not just the original intent
- trust budget continuity rather than full reset
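The three requirements above can be sketched together. The `SuspendedMission` fields and the `resume` signature are hypothetical; what matters is that the resumption decision sees the persisted state and that the budget continues from its consumption history rather than resetting.

```python
from dataclasses import dataclass

@dataclass
class SuspendedMission:
    """Execution state persisted at suspension time (illustrative fields)."""
    mission_id: str
    completed_actions: list   # some may be reversible, some already propagated
    budget_remaining: float   # consumption history carries across suspension
    suspension_reason: str

def resume(state: SuspendedMission, reviewer_approved: bool) -> dict:
    """A governed resumption: the decision is made against the suspended state,
    not just the original intent, and the trust budget continues rather than
    resetting to full."""
    if not reviewer_approved:
        raise PermissionError(f"{state.mission_id}: resumption requires human "
                              f"review of '{state.suspension_reason}'")
    if state.budget_remaining <= 0:
        raise PermissionError("budget exhausted; this needs re-approval, not resumption")
    return {"mission_id": state.mission_id,
            "budget": state.budget_remaining,  # continuity, not a fresh grant
            "reviewed_actions": len(state.completed_actions)}

state = SuspendedMission("m-42", ["read:ledger", "compare:q1"],
                         budget_remaining=12.5,
                         suspension_reason="anomalous external communication attempt")
decision = resume(state, reviewer_approved=True)
```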
Mission termination mid-execution also raises a distinct sub-problem: compensating or rolling back partial state from actions that have already completed. That is a workflow concern, but the governance layer must coordinate with it. A terminated Mission’s authority does not undo side effects that have already propagated.
This is another place where the architecture starts to look less like IAM and more like a workflow governance control plane. Durable execution state across suspension events, compensating actions for partially completed workflows, and governed resumption with inherited context are patterns from durable execution systems in workflow orchestration, not from traditional IAM. That boundary crossing is not accidental. Mission governance for long-running agents needs to borrow from both disciplines. IAM without workflow state is insufficient for multi-stage tasks. Workflow orchestration without authority governance is insufficient for delegated, externally observable agents. The architecture that emerges from taking both seriously is neither a pure IAM system nor a pure orchestrator.
That architectural breadth has a cost. Running this control plane at enterprise scale means policy owners to govern templates, operators to review expansions and resumptions, telemetry pipelines that can support trusted observation, and engineering teams willing to absorb friction in exchange for bounded risk. Any serious argument for this architecture has to be honest about that operational burden.
Each of these failure modes is a form of survivable incorrectness in practice. The architecture cannot prevent scope creep entirely, cannot eliminate spontaneous delegation, cannot guarantee headless provenance, and cannot make resumption seamless. What it can do is bound the damage from each: staged authority limits quiet expansion, trust quality attenuation limits the radius of a delegated failure, headless provenance hierarchies establish accountability chains even without a human present, and resumption requirements prevent stale authority from being silently reactivated. The goal is not a control plane that never fails. It is one that fails in bounded, detectable, and recoverable ways.
The Interoperability Problem
Each of the limits above operates within a single organization: shared purpose taxonomy, known systems, and a single Mission state owner. Cross-organizational deployment introduces complications the single-org model cannot absorb. Mission authority shaped in one trust domain cannot be evaluated semantically in another, even when the transport format is identical.
If two organizations deploy agent systems independently, each with their own purpose taxonomy and shaping logic, a Mission created in one cannot be meaningfully evaluated in the other. The mission_ref is portable. The semantics it encodes are not.
Interoperability requires more than a purpose taxonomy. It needs shared meaning for:
- resource classes
- action classes
- canonical subject and object identifiers
- selector and constraint semantics
- attenuation and expansion rules
- parent-to-child derivation rules
Without those, a “portable authority representation” is just a transport container for incompatible local meaning.
This identity resolution layer is easy to underestimate, and it is often where enterprise deployments actually break. “Board packet”, “final numbers”, “Q1”, and even the CFO’s own authority scope may resolve differently across finance systems, content repositories, and messaging tools.
Adjacent standards work helps with plumbing, but not with semantic authority:
- OAuth RAR (RFC 9396) provides a richer authorization request format than scopes, allowing structured authorization detail in the request itself. It is the closest existing standard to a Mission authority carrier. But RAR is a request language, not a shaping substrate; it still requires the semantic work to be done before the request is formed, and it does not govern the Mission lifecycle after issuance.
- Policy engines such as OPA and Cedar help express and evaluate policy once the world has already been reduced to stable resources, attributes, and actions, but they do not solve the shaping step that produces those stable inputs from natural-language intent.
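For concreteness, here is what a RAR payload for the board-packet example might look like, expressed as a Python structure. The field names (`type`, `locations`, `actions`, `datatypes`) are defined by RFC 9396; the `type` URI and the resource values are illustrative, and producing them from natural-language intent is exactly the shaping work RAR leaves upstream.

```python
# An RFC 9396 authorization_details payload. RAR standardizes this structure;
# it does not decide what belongs in it. The values below are hypothetical.
authorization_details = [
    {
        "type": "mission:board-packet-preparation",        # illustrative type URI
        "locations": ["https://finance.example.com/ledger"],
        "actions": ["read", "compare"],
        "datatypes": ["quarterly_report"],
    }
]

# "type" is the only field RFC 9396 requires in every detail object.
assert all("type" in detail for detail in authorization_details)
```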
The interoperability problem still points back to a missing semantic layer for Mission shaping.
RAR and token formats can carry authority. They do not supply the shared task semantics that make that authority interoperable.
What Containment Adds
In open-world environments, containment has to carry safety properties that Mission shaping cannot provide.
Containment means:
- narrow discovery envelopes by default
- strong tool mediation and trusted adapters
- short-lived, attenuated credentials
- boundary observation that does not rely on agent self-report
- irreversible actions gated by callbacks, checkpoints, or explicit human release
- runtime isolation strong enough that a compromised agent has limited room to act
A tool call, as used here, means any call to an external system through an adapter: file reads and writes, API calls, browser navigation, code execution, and message dispatch. Every such call is a potential observation and enforcement point. Containment is meaningful only if the tool mediation surface is consistently enforced. A single unmediated path through a privileged adapter negates the rest.
Model Context Protocol (MCP) is one framework for tool mediation at this layer. IBAC demonstrates one concrete implementation of the enforcement boundary itself: a higher-order invokeToolWithAuth wrapper that every tool call must pass through, with no path around it. The wrapper checks the request against the intent-derived authorization store before execution; unauthorized calls are denied before they reach the tool. Whether using MCP, a proprietary adapter pattern, or an enforcement wrapper of this kind, the architectural requirement is the same: tool calls must flow through a boundary that can enforce authority constraints, log call provenance, and interrupt on anomaly without relying on the agent to report its own behavior accurately. But tool mediation alone is not enough. A gateway can enforce requests and still lack the deeper runtime context needed to judge intent well. Direct integration with underlying systems can provide richer behavioral visibility and still lack decisive control. Runtime governance needs both: enforcement points that can block and trusted observation points that can explain why a given action is risky.
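A minimal sketch of an enforcement wrapper in the spirit of the boundary described above. The `authorized`, `audit_log`, and `pause` callables are assumed interfaces standing in for the intent-derived authorization store, provenance log, and suspension mechanism; none of this is IBAC's actual API.

```python
def make_governed_invoker(authorized, audit_log, pause):
    """Return the only callable through which tools are ever reached, so there
    is no path around the check.

    authorized(mission_id, tool_name, args) -> "allow" | "deny" | "pause"
    audit_log(...)  records call provenance independently of agent self-report
    pause(...)      suspends the Mission on anomaly
    """
    def invoke_tool_with_auth(tool, mission_id, **args):
        decision = authorized(mission_id, tool.__name__, args)
        # Provenance is logged at the boundary whether or not the call proceeds.
        audit_log(mission_id, tool.__name__, args, decision)
        if decision == "deny":
            raise PermissionError(f"{tool.__name__} denied for {mission_id}")
        if decision == "pause":
            pause(mission_id, tool.__name__)  # interrupt before execution
            raise RuntimeError(f"{mission_id} paused pending review")
        return tool(**args)
    return invoke_tool_with_auth
```

The higher-order shape matters: because the wrapper holds the only reference path to the tools, enforcement, logging, and interruption happen at the boundary rather than inside the agent's own reasoning.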
Agent containment differs from standard defense-in-depth in its threat model. Traditional defense-in-depth assumes threats arrive from outside the perimeter and that authenticated actors are trusted. Agent containment inverts that assumption: the threat is authorized misuse by the agent itself, with correct credentials but misaligned or injected intent. The attack surface is the agent’s context window, not network ingress. The goal is semantic integrity across the agent’s execution, not perimeter security at the organizational boundary.
NIST SP 800-207 (Zero Trust Architecture) provides the foundational framing: no implicit trust based on network location or credential possession, continuous verification of each request. Agent containment extends ZTA’s microsegmentation principle to the agent’s execution environment, treating each tool call as a request to be evaluated rather than trusting the agent to police its own behavior inside a broad authority grant. NIST AI 100-1 (AI Risk Management Framework) places this in the broader governance context: the Mission shaping and containment controls described here are the authorization-layer instantiation of AI RMF’s govern, map, measure, and manage functions for agentic AI systems.
This is not an alternative to Mission governance. It is what makes Mission governance survivable when the semantic layer is incomplete. Mission shaping gives you declared purpose, reviewable approval, and auditable authority boundaries. Containment gives you bounded blast radius, safer behavior when the semantics are wrong, and more reliable interruption when drift is detected.
Containment has costs. Tool mediation adds latency on every call. Callback gates for irreversible actions introduce availability dependencies. Narrow credentials create friction for legitimate scope expansions. There is also a staffing cost: someone has to configure and monitor the containment layer, triage pause events, and respond to budget exhaustion at runtime. Neither Mission shaping governance nor containment governance is free to operate. Treating containment as a first-class design decision means making these tradeoffs explicitly rather than discovering them under production load.
For open-world agents, containment is the more important layer.
When To Rely On Each
Containment is not a fallback activated when Mission shaping fails. The two layers are co-equal. The right balance depends on deployment context. There are signals that indicate which layer needs to carry more weight.
Mission shaping is stronger when:
- The domain is structured and the task class is well-defined (board packet preparation, not “do research”)
- The systems involved are known and enumerable at approval time
- The agent operates within a single organizational trust domain
- There are observable, gated transition points between reversible and irreversible phases
- The control plane can tolerate approval friction for stage transitions
Containment is more important when:
- The environment is partially observable (the control plane cannot independently verify what tool calls the agent is making)
- The agent may encounter external content it did not generate and cannot be fully trusted not to act on
- The task involves cross-domain or cross-organizational boundaries with incompatible shaping semantics
- The agent’s telemetry is self-reported rather than independently observed
- Failure needs to be survivable regardless of whether the semantic model is correct
The IE zone model is a useful map. Local Intranet corresponds to the structured case: known systems, organizational trust domain, observable transitions, controllable policy. The Internet zone corresponds to the containment-primary case: unknown origins, uncontrollable external content, incompatible semantics, weak observability. Most agent deployments sit somewhere between them, which is exactly where IE zone policy was hardest to get right.
The IE lesson about defaults applies directly: the right default for an unclassified agent task is the most restricted posture. Not “figure out what this task probably needs and grant approximately that.” The default is minimum authority, and the organizational process should be to elevate by governed exception with documented justification, not to restrict by exception from a broad default. Getting the default wrong is how Trusted Sites fills up.
The two examples in this series illustrate the poles. The CFO board-packet task is the structured case: one enterprise, known systems, structured approval path, legible stages. A shaping-centered architecture with staged compilation, governed expansion, and callbacks is tractable. The vendor research task (“research vendors, summarize options, and begin outreach where appropriate”) is the containment-primary case: open-ended scope, external content at every step, ambiguous action boundaries, no way to fully enumerate the authority model upfront. Mission shaping still provides a governance record and outer bounds, but the real safety properties come from mediated outreach, short-lived credentials, sandboxed browsing, and explicit release gates before any external communication is sent.
Most real deployments fall somewhere between the two poles. The practical answer for an enterprise finance workflow is: lead with staged Mission shaping for structured phases, often implemented as staged compilation, and surround it with containment controls that keep the blast radius bounded if the semantic model turns out to be wrong.
For open-world tool-using agents with no known execution path in advance, lead with containment and treat Mission shaping more as a governance record than as a safety guarantee.
Both layers still leave a residual problem: an agent can stay within its Mission authority envelope while drifting away from the user’s intent. A runtime layer is needed to detect that drift and act on it before it compounds.
Runtime Alignment: Trust Scores and Budgets
If Mission shaping defines the declared authority ceiling, runtime trust defines how much residual uncertainty the system is willing to tolerate inside it. What follows is one practical control-plane pattern, not a settled standard. Continuous observability is not governance, but it is the runtime evidence layer that makes behavioral analysis and dynamic intervention possible.
This should be understood as part of the same control plane, not as a second hidden policy engine. The Mission authority artifact remains authoritative for what the Mission permits. The runtime trust layer does not expand authority. It only decides whether the system is still willing to rely on the agent’s local judgment within already-shaped bounds.
Even conceptually, that assumes a lot from the underlying system. To score runtime alignment, the control plane needs structured tool calls, stage context, parent Mission linkage, and normalized provenance about where the action came from. Most enterprises do not have that by default, and the visibility gap varies by agent type. Homegrown agents expose different control surfaces than SaaS agent platforms or local workstation agents, and locally run tools are often the hardest for enterprise controls to observe reliably.
And visibility is not enough. If the agent runtime, orchestration layer, or tool adapter can freely invent the telemetry that the trust layer depends on, the system has recreated self-governance one layer down. Runtime alignment needs integrity as well as visibility. Telemetry must be attributable to a verifiable source, not taken on the agent’s word. SPIFFE/SPIRE workload identity provides one foundation: cryptographic attestation of which runtime component produced a given report. CAEP (Continuous Access Evaluation Protocol) and the Shared Signals Framework provide the transport: a standardized event stream for runtime security events flowing from the components that observe them to the Mission state owner that needs to act on them. Without both integrity and a signal pathway, the runtime alignment layer either trusts what agents report about themselves or polls blindly.
The right model treats residual runtime trust not as binary, but as a continuously scored, consumable resource. The pattern is just-in-time trust applied to Mission lifecycle. The system grants only the minimum authority needed at the moment it is needed and reconfirms rather than assumes continuity across the task’s execution. Recent runtime-governance work, including MI9, converges on a similar point: semantic telemetry, continuous authorization monitoring, drift detection, and graduated containment become necessary once agent behavior cannot be fully governed at approval time.
Each agent action is evaluated against a weighted intent score: does this align with the approved intent? Is it consistent with the agent’s established pattern for this task? How much damage could it cause if it turns out to result from context taint? How much of the Mission window remains? A low intent score on a high-risk action is grounds for pausing and requiring re-confirmation even when the action falls within the Mission authority envelope.
Each Mission carries a consumable trust budget. Actions that score low on intent alignment consume budget faster. Actions anomalous relative to the behavioral baseline consume faster still. Budget exhaustion triggers re-confirmation or Mission suspension. Budget state belongs with the Mission state owner. Exhaustion is a lifecycle event, not a side-channel signal.
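A toy model of that consumption dynamic. The cost function and weights are illustrative calibration choices, not derived constants; the only claims carried over from the text are that low alignment drains faster, anomalies drain faster still, and exhaustion is a lifecycle event.

```python
def consume(budget: float, intent_score: float, anomalous: bool) -> float:
    """Drain the trust budget for one action.

    intent_score in [0, 1]: alignment of the action with approved intent.
    Low alignment costs up to 5x the base rate; actions anomalous relative
    to the behavioral baseline cost double again. Weights are illustrative.
    """
    cost = 1.0 + (1.0 - intent_score) * 4.0
    if anomalous:
        cost *= 2.0
    return budget - cost

budget = 20.0
actions = [(0.9, False), (0.9, False), (0.3, True), (0.2, True), (0.1, True)]
for intent_score, anomalous in actions:
    budget = consume(budget, intent_score, anomalous)
    if budget <= 0:
        # Exhaustion is a lifecycle event owned by the Mission state owner:
        # re-confirmation or suspension, never silent continuation.
        break
```

Under these weights, the two well-aligned actions barely dent the budget, while the run of low-alignment anomalous actions exhausts it before the sequence completes.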
Practical calibration needs a signal model, a baseline, and a cold-start policy. Useful observable signals include resource type distribution relative to the declared purpose, rate of external communication attempts relative to task phase, and semantic similarity of each action to prior approved executions of the same task class. The baseline for a given task class is most reliably established from reviewed template Missions. Prior human-reviewed executions define the expected action distribution that a new run is scored against. The cold-start problem has no clean solution. A reasonable default is to open with the strictest callback thresholds in the relevant template class and relax the budget as an execution baseline accumulates.
Semantic drift detection is the operational form: monitoring whether the agent’s actions are drifting from the semantic neighborhood of the approved intent. A board-packet preparation Mission should produce mostly reads, comparisons, and internal assembly operations. If the action log starts showing HR system reads, treasury queries, or external communication attempts, the distribution has shifted outside the expected neighborhood for this purpose class even if each individual call was permitted.
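The board-packet drift example can be sketched as a distribution comparison. Using total-variation distance over action-type counts is an illustrative choice, and the baseline counts are invented for the example; any calibrated divergence against a template-derived baseline would serve the same role.

```python
from collections import Counter

def drift_score(baseline: Counter, observed: Counter) -> float:
    """Total-variation distance between the expected and observed action-type
    distributions for a purpose class (0 = identical, 1 = disjoint)."""
    keys = set(baseline) | set(observed)
    b_total, o_total = sum(baseline.values()), sum(observed.values())
    return 0.5 * sum(abs(baseline[k] / b_total - observed[k] / o_total)
                     for k in keys)

# Board-packet baseline: mostly reads, comparisons, internal assembly.
baseline = Counter({"finance:read": 70, "compare": 20, "docs:assemble": 10})

# Each individual call below may be permitted, yet the distribution has
# shifted outside the expected neighborhood for this purpose class.
observed = Counter({"finance:read": 30, "hr:read": 25, "external:send": 15,
                    "compare": 20, "docs:assemble": 10})

drift = drift_score(baseline, observed)  # ≈ 0.4: well outside a tight tolerance
```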
The governance boundary is precise. Shaped authority is authoritative for permit and deny decisions. Trust scores are authoritative for pause, callback, and suspension decisions. Trust scores do not expand Mission scope.
A trust budget is not additional privilege. It is a continued-reliance threshold for exercising already-shaped authority.
This is closer to risk-adaptive and continuous-evaluation control than to entitlement expansion.
The shaped Mission Authority Model answers “is this action permitted?” The trust layer answers “is the system still willing to rely on this agent’s judgment while it does it?”
Staged Mission shaping and trust budgets interact. A Stage 1 discovery envelope carries lower inherent risk (the agent cannot write, publish, or communicate externally), so the trust budget can be calibrated more permissively. When the Mission transitions to Stage 2, the budget policy should tighten: the same semantic drift tolerable during read-only discovery becomes a higher-stakes signal during write-capable execution. The Mission state owner issues a revised budget policy alongside the revised authority artifact at each stage transition.
Trust scoring is probabilistic, not deterministic. A sophisticated adversary may craft injected instructions that score well on intent alignment. The behavioral baseline must be established before drift can be detected, creating a cold-start problem. The scoring model itself becomes an attack surface. These limitations are real. The value of this layer is not certainty. A governed control plane that can act on partial confidence and bounded suspicion is more honest than one that pretends runtime alignment is either fully knowable or irrelevant.
Design Positions
The following positions have defensible answers. Each has a live counterargument worth naming. The first two concern Mission shaping directly; the last two concern containment and the runtime layer explored in this part.
LLM-based Mission shaping is acceptable as a proposal step, but should not be the sole authoritative one in high-assurance systems. The counterargument is that human review of a structured proposal does not actually close the verification gap: if the CFO cannot evaluate the compiled artifact, reviewing the LLM proposal does not help either. That objection is correct about the limits of review, but it argues for better review tooling (progressive disclosure, natural-language summaries of effective permissions), not for removing the human approval step. The LLM proposes. The authorizing system approves. The Mission state owner locks the result. Treating unconstrained LLM output as the authority artifact without that review is inference masquerading as authorization.
Mission shaping belongs with the Mission state owner, not in the agent. The counterargument is that the orchestration layer already owns execution state and is better positioned to perform shaping incrementally. That is architecturally attractive but it conflates semantic authority with execution management. If shaping lives in the agent or orchestrator, it is self-governance. The system of record for Mission state should own the authority artifact, staged expansions, and the runtime signals that can suspend or terminate it.
Confirmation thresholds for irreversible actions should be risk-based, not permission-based. Whether an action falls within the Mission authority envelope is the wrong question for high-stakes decisions. The right question is whether the action is irreversible and whether the risk magnitude justifies a human checkpoint regardless of authorization status. Financial transactions above a threshold, permanent data deletion, external communications on the user’s behalf: these warrant re-confirmation even when they are within authority bounds.
Containment controls should be first-class design decisions, not deployment accidents. Most deployments have some containment by accident. Treating containment as an intentional layer (deciding which tool calls are mediated, what credentials are issued, and which boundaries are independently observed) is a different and more defensible posture than hoping accidental constraints are sufficient.
Open Questions
These questions span both parts of the series.
Mission shaping
- What is the minimum viable purpose taxonomy for common agent use cases, and who defines it?
- Can Mission shaping be made deterministic enough to govern without losing the expressiveness that makes natural language intent useful?
- What does a verification UI look like that achieves better bounded comprehension without overwhelming the approver?
- Who can authorize a stage transition, and what mechanism prevents an agent from self-promoting to the next stage?
- At what point do accumulated expansion decisions make the executed authority model so different from the originally shaped one that the original approval no longer meaningfully covers the activity? This boundary defines when re-approval is required rather than another governed expansion; the architecture currently has no answer for it.
Authorization and enforcement
- What is the developer-facing API for agent teams to propose Missions, receive a Mission Authority Model, request expansions, and report telemetry? The governance model is described from the operator perspective; the agent side of the interface is not.
- How does Mission governance retrofit onto orchestration frameworks (LangGraph, CrewAI, AutoGen) that already create child agents in production without Mission authority? What is the integration surface?
- What is the right composition between enforcement-point-level authorization (IBAC or CaMeL style at the tool boundary, ASTRA style at token issuance) and Mission-level shaping? Are these complementary layers in the same control plane, or competing designs that reflect different threat models?
Runtime alignment
- How should trust budgets be sized and calibrated for different task classes?
- Can trust scoring resist adversarial optimization by an attacker who understands the scoring model?
- What is the right cold-start policy when no behavioral baseline exists for a new agent or task class?
Interoperability
- Is cross-organization authority portability achievable, or is it always federation with semantic gaps?
- Which task verticals are positioned to define domain-specific purpose taxonomies first?
Control-plane architecture
- Where should the Mission control plane boundary actually sit: inside the authorization server, in a separate authority service, or partially inside the orchestrator that already owns durable execution state?
- What is the governed failure mode when the Mission state owner is unavailable during an active Mission? An active execution that reaches Stage 2 while the Mission state owner loses quorum will continue honoring stale Stage 1 tokens at enforcement points; this is not a hypothetical edge case.
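One candidate posture for that last question, sketched with hypothetical names and thresholds, is to fail closed on state-owner staleness: an enforcement point stops honoring tokens once it has been out of contact with the Mission state owner for too long, rather than trusting a possibly superseded Stage 1 grant.

```python
import time
from typing import Optional


def accept_token(token: dict,
                 last_state_sync: float,
                 max_staleness_s: float = 30.0,
                 now: Optional[float] = None) -> bool:
    """Fail-closed enforcement check (a sketch, not a spec).

    last_state_sync is the timestamp of the enforcement point's last
    successful sync with the Mission state owner. Beyond max_staleness_s,
    the point refuses all tokens instead of honoring a stale stage.
    """
    now = time.time() if now is None else now
    if now - last_state_sync > max_staleness_s:
        return False  # state owner unreachable: refuse rather than trust
    return token.get("status") == "active"
```

Whether fail-closed is the right governed failure mode, and at what staleness budget, is exactly the open question; the sketch only makes the trade-off concrete.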
The Deeper Problem
The field is treating governance problems as engineering problems. Better token formats, tighter delegation chains, more expressive policy languages. Standards bodies are good at this work. But the protocol layer assumes the hard Mission shaping work is solved upstream.
It is not.
Some adjacent proposals, such as Authenticated Delegation and Authorized AI Agents, push more of the problem into the delegation and credential layer: authenticated delegation, agent-specific credentials, and auditable chains of accountability. Those are useful contributions. They still do not remove the need for Mission shaping, because binding a delegation chain to an agent does not by itself tell the system what bounded authority should exist for the task in the first place.
The systems that come closest to solving the enforcement problem are each explicit about this gap in their own limitations sections. IBAC’s model is scoped to a single request and a single agent; it does not address multi-agent delegation or long-running missions that span many requests. CaMeL handles data provenance within a single execution but not across delegation hops. The ASTRA work on semantic task-to-scope matching acknowledges directly that “applying delegated authorization in this way across chains of agents demands mechanisms for preserving requester context” that the approach does not yet provide. The gap this series addresses is not a gap the field has missed. It is a gap the field’s best current work has named and left open.
You can have a perfectly specified Mission-Bound OAuth deployment with a mission_ref on every token, lifecycle management at the AS, and verified actor continuity across every hop. And still have no idea whether the agent is operating within the bounds the user actually intended.
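To make that concrete: the claim set below is a hypothetical illustration of such a token (the claim names beyond standard JWT fields are assumptions, not a published profile). Every field can be verified mechanically, and none of them establish that the shaped Mission matches what the user meant.

```python
# Hypothetical claim set for a Mission-Bound OAuth access token.
token_claims = {
    "iss": "https://as.example.com",       # the authorization server (AS)
    "sub": "agent:report-builder-7",       # the acting agent
    "act": {"sub": "agent:planner-1"},     # actor chain for continuity checks
    "mission_ref": "mission:2024-q3-close",  # link to the approved Mission
    "mission_stage": 1,
    "scope": "ledger:read reports:write",
    "exp": 1735689600,
}


def structurally_valid(claims: dict) -> bool:
    # These checks pass even when the shaped Mission itself is wrong:
    # structure proves provenance, not intent.
    return "mission_ref" in claims and "act" in claims and "exp" in claims
```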
The conclusion is not that compilation is useless. It is that compilation is one disciplined form of Mission shaping, and Mission shaping alone, whether formal or informal, is insufficient without containment. Containment is not a fallback. It is the co-equal operational resilience layer. Better Mission shaping is necessary for structured domains where the task class is well-defined and semantic fidelity is achievable. Containment is equally necessary in open environments where the semantic model will always be incomplete. Mission governance requires both, and architectures should be explicit about which layer is carrying the weight in any given deployment.
The Mission shaping record also has a governance value independent of its safety properties. The Approved Mission is the prerequisite for meaningful audit: without a declared governance record of what was supposed to happen, the system cannot determine whether what happened was authorized, who approved it, and on what basis. In regulated environments where agent actions affect financial records, health data, or external communications, that audit prerequisite is not optional. Mission shaping may not prevent all misalignment. But it is the only mechanism that produces a reviewable approval object that post-hoc investigation can compare against the actual execution record.
The operational goal is survivable incorrectness: a control plane that can remain governable and limit damage even when the semantic model is partial, noisy, or wrong. Perfect semantic fidelity is not achievable at the scale and openness of real agent deployments. Survivable incorrectness is.
Deployment posture also depends on where the organization sits on the human approval spectrum. In practice there is a range: fully automated; human-on-the-loop, where a human reviews batches; supervised autonomy, where a human can interrupt; and human-in-the-loop, where a human approves each step. Position on that spectrum directly determines how much Mission shaping and how much containment the deployment requires. That calibration is an organizational policy decision, not a protocol question.
That also narrows some of the design uncertainty. The overhead of formal Mission shaping is justified first for structured enterprise workflows with stable systems, legible approval points, and tolerable governance friction. In those domains, Mission shaping may take the form of explicit compilation. In the middle ground between full upfront planning and containment-first execution, the more realistic target is bounded adaptation: the agent can discover, propose, and adjust, but only inside a staged Mission envelope with governed expansion and runtime checkpoints.
For open-world tool-using agents, organizations should expect to lead with containment and treat Mission shaping more as a governance record than as the primary safety guarantee. In some task classes, that should be an explicit policy decision: choose containment-first governance rather than waiting for semantic shaping to mature, because survivability matters more than semantic richness. In other cases, the right policy may be stricter still. Do not allow the task to run agentically at all until the runtime is sufficiently contained and the blast radius is acceptable. The evidence so far also suggests that a shared semantic layer for shaped Mission authority is more likely to emerge as domain-specific families than as one universal model for all agent behavior.
That is a more demanding view of what agent authorization requires. It is also the more operationally honest one.