Microsoft Agent Governance Toolkit: Runtime Security for AI Agents
On 2 April 2026, Microsoft published the Agent Governance Toolkit (AGT) on GitHub under the MIT licence. In a landscape where AI agents routinely hold API keys, shell access, and database credentials, AGT treats agent governance as a systems problem, not a prompt engineering one. The toolkit enforces security policies at runtime with the same rigour an operating system applies to process isolation: ring-based execution, kernel-level policy evaluation, and zero-trust identity between agents.
The problem with prompt-based safety
Prompt-based guardrails have a fundamental flaw: they ask a language model to police itself. Microsoft's own benchmarks quantify the gap. When safety rules are embedded in system prompts, the violation rate sits at 26.67%. The model can be socially engineered, distracted by complex instructions, or simply fail to follow rules under adversarial pressure. Every major prompt injection demonstration, from Simon Willison's early experiments to the "Comment and Control" attacks on Claude Code and GitHub Copilot, exploits this weakness.
AGT sidesteps the problem entirely. Rather than asking the model to behave, it enforces boundaries in the execution layer. Policies are evaluated by a dedicated policy kernel before any tool call reaches the target system. The result: a 0.00% violation rate across the same benchmark suite.
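To make the idea of execution-layer enforcement concrete, here is a minimal sketch of a gate that checks deny rules before a tool body runs. It does not use the AGT API: `DenyRule`, `PolicyViolation`, `guarded`, and the two stand-in tools are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable


@dataclass(frozen=True)
class DenyRule:
    """Hypothetical rule shape: a rule id plus a glob over action names."""
    rule_id: str
    action_pattern: str


class PolicyViolation(Exception):
    """Raised when a tool call is blocked before it executes."""


def guarded(action: str, rules: list[DenyRule]) -> Callable:
    """Wrap a tool so the deny rules are checked before the tool body runs."""
    def decorator(tool: Callable) -> Callable:
        def wrapper(*args, **kwargs):
            for rule in rules:
                if fnmatchcase(action, rule.action_pattern):
                    # The decision is made by plain code; no model is consulted.
                    raise PolicyViolation(f"{action} blocked by {rule.rule_id}")
            return tool(*args, **kwargs)
        return wrapper
    return decorator


RULES = [DenyRule("no-shell-access", "bash:*")]


@guarded("bash:exec", RULES)
def run_shell(command: str) -> str:
    return f"ran {command}"  # stand-in; a real tool would execute the command


@guarded("http:get", RULES)
def fetch(url: str) -> str:
    return f"fetched {url}"  # stand-in for a real HTTP request
```

Calling `fetch(...)` returns normally, while `run_shell(...)` raises `PolicyViolation` no matter what the prompt said, because the check never passes through the model.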
Architecture
The toolkit ships as seven composable packages, each addressing a distinct layer of the agent governance stack.
<svg viewBox="0 0 480 320" xmlns="http://www.w3.org/2000/svg" style="width:100%;max-width:480px;margin:1.5rem auto;display:block;font-family:system-ui,sans-serif"> <!-- Background --> <rect x="0" y="0" width="480" height="320" rx="8" fill="var(--background-secondary)"/> <!-- Title --> <text x="240" y="24" text-anchor="middle" fill="var(--foreground)" font-size="13" font-weight="600">Agent Governance Toolkit Architecture</text> <!-- Layer 3: Top --> <rect x="30" y="40" width="140" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="100" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Compliance</text> <text x="100" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Audit & reporting</text> <rect x="180" y="40" width="140" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="250" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Marketplace</text> <text x="250" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Policy sharing</text> <rect x="330" y="40" width="120" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="390" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Lightning</text> <text x="390" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">RL governance</text> <!-- Arrow down --> <line x1="240" y1="88" x2="240" y2="105" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- Layer 2: Middle --> <rect x="30" y="108" width="140" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="100" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">AgentMesh</text> <text x="100" y="141" 
text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Zero-trust identity</text> <rect x="180" y="108" width="140" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="250" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">Agent Runtime</text> <text x="250" y="141" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Ring 0-3 execution</text> <rect x="330" y="108" width="120" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="390" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">Agent SRE</text> <text x="390" y="141" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Circuit breakers</text> <!-- Arrow down --> <line x1="240" y1="156" x2="240" y2="173" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- Layer 1: Bottom (Agent OS) --> <rect x="60" y="176" width="360" height="50" rx="6" fill="var(--accent)" opacity="0.08" stroke="var(--accent)" stroke-width="2"/> <text x="240" y="198" text-anchor="middle" fill="var(--accent)" font-size="13" font-weight="700">Agent OS — Policy Kernel</text> <text x="240" y="215" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">YAML · OPA/Rego · Cedar · 0.011ms per evaluation</text> <!-- Arrow down to SDKs --> <line x1="240" y1="230" x2="240" y2="247" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- SDK row --> <rect x="30" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="65" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Python</text> <rect x="108" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="143" y="269" 
text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">TypeScript</text> <rect x="186" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="221" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Go</text> <rect x="264" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="299" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">.NET</text> <rect x="342" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5" stroke-dasharray="4 2"/> <text x="377" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Rust (dev)</text> <!-- Frameworks --> <text x="240" y="300" text-anchor="middle" fill="var(--foreground-secondary)" font-size="8">Integrates with LangChain · AutoGen · CrewAI · Semantic Kernel · LlamaIndex · 20+ frameworks</text> <!-- Arrow marker --> <defs><marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto"><path d="M 0 0 L 10 5 L 0 10 z" fill="var(--foreground-secondary)"/></marker></defs> </svg>
Agent OS
The policy kernel at the bottom of the stack. Supports three policy formats: YAML for simple allow/deny rules, OPA/Rego for complex conditional logic, and Cedar (AWS's policy language) for attribute-based access control. Policies are loaded at startup and evaluated on every tool invocation, with zero dependence on the LLM for enforcement.
Agent Runtime
Implements a four-ring execution model borrowed from CPU privilege architecture:
| Ring | Access Level | Use Case |
|---|---|---|
| Ring 0 | Full system access | Orchestration layer, privileged operations |
| Ring 1 | External tool calls | API requests, database queries, file I/O |
| Ring 2 | Internal computation | Data transformation, reasoning steps |
| Ring 3 | Sandbox only | User input processing, output formatting |
An agent running in Ring 1 cannot escalate to Ring 0 without an explicit policy grant. The runtime includes a kill switch that terminates an agent within 50ms of a policy violation.
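The escalation rule can be sketched in a few lines. This is an illustration of the ring model only, not the Agent Runtime's implementation: the `Ring` enum names, the `GRANTS` set, and `request_ring` are assumptions made for the example.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Lower number means more privilege, as in the table above."""
    SYSTEM = 0      # full system access
    EXTERNAL = 1    # external tool calls
    INTERNAL = 2    # internal computation
    SANDBOX = 3     # sandbox only


class EscalationDenied(Exception):
    """Raised when an agent asks for a more privileged ring without a grant."""


# Hypothetical grant store: (agent_id, target ring) pairs approved by policy.
GRANTS: set[tuple[str, Ring]] = {("orchestrator-helper", Ring.SYSTEM)}


def request_ring(agent_id: str, current: Ring, target: Ring) -> Ring:
    """Return the ring the agent may run in, or raise if escalation is not granted."""
    if target >= current:
        # Moving to an equal or less privileged ring never needs a grant.
        return target
    if (agent_id, target) in GRANTS:
        return target
    raise EscalationDenied(
        f"{agent_id}: ring {current.value} -> ring {target.value} needs a policy grant"
    )
```

Dropping privilege is always allowed in this sketch; only a move towards Ring 0 consults the grant store, which mirrors the "explicit policy grant" requirement described above.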
AgentMesh
Provides zero-trust identity between agents. Each agent receives a cryptographic identity using Ed25519 for signing and ML-DSA-65 (NIST's post-quantum signature standard) for quantum-resistant signatures. Agents verify each other's identity and permissions before any inter-agent communication, preventing spoofing and privilege escalation chains.
Agent SRE
Operational reliability layer with circuit breakers, rate limiting, and Prometheus-compatible metrics. When an agent enters a failure loop (repeated tool calls, escalating permissions, excessive output), the circuit breaker trips and the agent is isolated.
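A circuit breaker of this kind can be sketched with a sliding window of failure timestamps. The class below is a generic illustration, not Agent SRE's code; the thresholds, the injectable `clock` parameter, and the method names are assumptions.

```python
import time
from collections import deque


class CircuitBreaker:
    """Trips when too many failures land inside a sliding time window."""

    def __init__(self, max_failures: int = 3, window_seconds: float = 10.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the window can be tested
        self._failures = deque()     # timestamps of recent failures
        self.open = False            # an open circuit means the agent is isolated

    def record_failure(self) -> None:
        now = self._clock()
        self._failures.append(now)
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self.open = True

    def allow_call(self) -> bool:
        """Tool calls are refused for as long as the breaker is open."""
        return not self.open
```

This sketch stays open once tripped; a production breaker would normally add a cool-down or half-open state before letting the agent resume.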
OWASP Agentic Top 10 compliance
AGT maps directly to all ten categories in the OWASP Agentic Top 10, published in early 2026:
- Prompt Injection — intercepted before reaching the LLM
- Sensitive Information Disclosure — data exfiltration policies block unauthorised output
- Supply Chain Vulnerabilities — policy signing and verification for agent packages
- Data Poisoning — input validation at the kernel layer
- Model Denial of Service — rate limiting and resource budgets
- Excessive Agency — ring-based privilege boundaries
- System Prompt Leakage — prompt content never reaches tool execution layer
- Code Interpreter Abuse — sandboxed execution with allowlisted capabilities
- Improper Authorization — AgentMesh identity verification on every call
- Overreliance on LLM — deterministic policy enforcement, not LLM judgment
Performance
The policy kernel is designed for the hot path. Benchmark results from the published evaluation:
- 0.011ms per single policy evaluation
- 0.103ms for a full kernel evaluation (all active policies)
- 47,000 operations per second at 1,000 concurrent agents
- 10,000x faster than inserting a policy check as an LLM call
In production, one early adopter reported 473 unauthorised actions blocked over 11 days with a total enforcement overhead of 0.43 seconds. The policy kernel adds less latency than a typical network round-trip.
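A quick back-of-envelope check shows how these figures relate to one another. The inputs are the numbers quoted above; the derived values rest on stated assumptions (marked in the comments) and are not published results.

```python
# Figures quoted in the benchmark list above.
SINGLE_EVAL_MS = 0.011
FULL_EVAL_MS = 0.103
OPS_PER_SECOND = 47_000
CONCURRENT_AGENTS = 1_000
SPEEDUP = 10_000

# Production report quoted above.
TOTAL_OVERHEAD_S = 0.43

# Assumption: the 10,000x claim is measured against a single evaluation.
implied_llm_check_ms = SINGLE_EVAL_MS * SPEEDUP

# Throughput per agent at the quoted concurrency.
ops_per_agent = OPS_PER_SECOND / CONCURRENT_AGENTS

# Assumption: every production evaluation cost the full-kernel figure.
implied_evaluations = TOTAL_OVERHEAD_S / (FULL_EVAL_MS / 1000)

print(f"implied LLM check: {implied_llm_check_ms:.0f} ms")
print(f"per-agent throughput: {ops_per_agent:.0f} ops/s")
print(f"evaluations covered by 0.43 s: about {implied_evaluations:.0f}")
```

Under those assumptions, the comparison LLM check takes on the order of a hundred milliseconds, each agent accounts for a few dozen operations per second, and 0.43 seconds of overhead covers a few thousand full evaluations.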
Getting started
```bash
pip install "agent-governance-toolkit[full]"
```
```python
from agt import PolicyEvaluator, PolicyDocument, PolicyRule, Effect

policy = PolicyDocument(rules=[
    # Deny shell commands to agents outside Rings 0 and 1.
    PolicyRule(
        id="no-shell-access",
        effect=Effect.DENY,
        action="bash:*",
        condition="agent.ring > 1"
    ),
    # Deny writes to any resource whose name starts with "production_".
    PolicyRule(
        id="read-only-database",
        effect=Effect.DENY,
        action="sql:write",
        resource="production_*"
    )
])

evaluator = PolicyEvaluator(policy)

result = evaluator.evaluate(
    agent_id="code-reviewer",
    action="bash:rm -rf /",
    resource="/tmp/cache",
    context={"ring": 2}  # Ring 2 satisfies "agent.ring > 1", so the deny rule applies
)

# result.allowed == False
# result.matched_rule == "no-shell-access"
```
Where it fits
AGT enters a crowded space. NeMo Guardrails (NVIDIA) and Llama Guard (Meta) take the prompt-centric approach described above. Guardrails AI and Lakera focus on input/output filtering. Bedrock Guardrails provides AWS-native policy enforcement. What distinguishes AGT is the operating system analogy: governance as a kernel concern, not an application concern. The question for teams evaluating AGT is whether the added architectural complexity is justified by the security guarantees. For organisations running agents with access to production systems, the answer increasingly appears to be yes.
The toolkit is in public preview. Microsoft has stated its intention to move it to a foundation governance model once the API stabilises. The repository is at github.com/microsoft/agent-governance-toolkit.