
Microsoft Agent Governance Toolkit: Runtime Security for AI Agents


On 2 April 2026, Microsoft published the Agent Governance Toolkit (AGT) on GitHub under the MIT licence. In a landscape where AI agents routinely hold API keys, shell access, and database credentials, AGT treats agent governance as a systems problem, not a prompt engineering one. The toolkit enforces security policies at runtime with the same rigour an operating system applies to process isolation: ring-based execution, kernel-level policy evaluation, and zero-trust identity between agents.

The problem with prompt-based safety

Prompt-based guardrails have a fundamental flaw: they ask a language model to police itself. Microsoft's own benchmarks quantify the gap. When safety rules are embedded in system prompts, the violation rate sits at 26.67%. The model can be socially engineered, distracted by complex instructions, or pushed into ignoring its rules under adversarial pressure. Every major prompt injection demonstration, from Simon Willison's early experiments to the "Comment and Control" attacks on Claude Code and GitHub Copilot, exploits this weakness.

AGT sidesteps the problem entirely. Rather than asking the model to behave, it enforces boundaries in the execution layer. Policies are evaluated by a dedicated policy kernel before any tool call reaches the target system. The result: a 0.00% violation rate across the same benchmark suite.
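To make the distinction concrete, here is a minimal sketch of execution-layer enforcement. It is not AGT's API: `ToolCall`, `deny_shell`, `guarded_call`, and `echo_tool` are hypothetical names chosen for illustration. The point is only that the check runs in ordinary code before the tool is invoked, so nothing in the model's output can skip it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation requested by the model (hypothetical shape)."""
    agent_id: str
    action: str      # e.g. "bash:ls" or "http:get"
    argument: str


def deny_shell(call: ToolCall) -> bool:
    """Illustrative policy: return True if the call is allowed."""
    return not call.action.startswith("bash:")


def guarded_call(
    call: ToolCall,
    policy: Callable[[ToolCall], bool],
    tool: Callable[[str], Any],
) -> Any:
    """Run the policy first; the tool executes only if the policy allows it."""
    if not policy(call):
        raise PermissionError(f"{call.agent_id} denied: {call.action}")
    return tool(call.argument)


def echo_tool(argument: str) -> str:
    """Stand-in for a real tool."""
    return f"ran with {argument}"
```

Calling `guarded_call(ToolCall("reviewer", "bash:rm", "-rf /"), deny_shell, echo_tool)` raises `PermissionError` before `echo_tool` is ever reached, whatever the prompt said.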

Architecture

The toolkit ships as seven composable packages, each addressing a distinct layer of the agent governance stack.

<svg viewBox="0 0 480 320" xmlns="http://www.w3.org/2000/svg" style="width:100%;max-width:480px;margin:1.5rem auto;display:block;font-family:system-ui,sans-serif"> <!-- Background --> <rect x="0" y="0" width="480" height="320" rx="8" fill="var(--background-secondary)"/> <!-- Title --> <text x="240" y="24" text-anchor="middle" fill="var(--foreground)" font-size="13" font-weight="600">Agent Governance Toolkit Architecture</text> <!-- Layer 3: Top --> <rect x="30" y="40" width="140" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="100" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Compliance</text> <text x="100" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Audit &amp; reporting</text> <rect x="180" y="40" width="140" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="250" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Marketplace</text> <text x="250" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Policy sharing</text> <rect x="330" y="40" width="120" height="44" rx="6" fill="var(--accent)" opacity="0.15" stroke="var(--accent)" stroke-width="1.5"/> <text x="390" y="58" text-anchor="middle" fill="var(--accent)" font-size="11" font-weight="600">Agent Lightning</text> <text x="390" y="73" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">RL governance</text> <!-- Arrow down --> <line x1="240" y1="88" x2="240" y2="105" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- Layer 2: Middle --> <rect x="30" y="108" width="140" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="100" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">AgentMesh</text> <text x="100" y="141" 
text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Zero-trust identity</text> <rect x="180" y="108" width="140" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="250" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">Agent Runtime</text> <text x="250" y="141" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Ring 0-3 execution</text> <rect x="330" y="108" width="120" height="44" rx="6" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="1"/> <text x="390" y="126" text-anchor="middle" fill="var(--foreground)" font-size="11" font-weight="600">Agent SRE</text> <text x="390" y="141" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Circuit breakers</text> <!-- Arrow down --> <line x1="240" y1="156" x2="240" y2="173" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- Layer 1: Bottom (Agent OS) --> <rect x="60" y="176" width="360" height="50" rx="6" fill="var(--accent)" opacity="0.08" stroke="var(--accent)" stroke-width="2"/> <text x="240" y="198" text-anchor="middle" fill="var(--accent)" font-size="13" font-weight="700">Agent OS — Policy Kernel</text> <text x="240" y="215" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">YAML · OPA/Rego · Cedar · 0.011ms per evaluation</text> <!-- Arrow down to SDKs --> <line x1="240" y1="230" x2="240" y2="247" stroke="var(--foreground-secondary)" stroke-width="1" marker-end="url(#arrow)"/> <!-- SDK row --> <rect x="30" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="65" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Python</text> <rect x="108" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="143" y="269" 
text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">TypeScript</text> <rect x="186" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="221" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Go</text> <rect x="264" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5"/> <text x="299" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">.NET</text> <rect x="342" y="250" width="70" height="30" rx="4" fill="var(--background)" stroke="var(--foreground-secondary)" stroke-width="0.5" stroke-dasharray="4 2"/> <text x="377" y="269" text-anchor="middle" fill="var(--foreground-secondary)" font-size="9">Rust (dev)</text> <!-- Frameworks --> <text x="240" y="300" text-anchor="middle" fill="var(--foreground-secondary)" font-size="8">Integrates with LangChain · AutoGen · CrewAI · Semantic Kernel · LlamaIndex · 20+ frameworks</text> <!-- Arrow marker --> <defs><marker id="arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="6" markerHeight="6" orient="auto"><path d="M 0 0 L 10 5 L 0 10 z" fill="var(--foreground-secondary)"/></marker></defs> </svg>

Agent OS

The policy kernel at the bottom of the stack. Supports three policy formats: YAML for simple allow/deny rules, OPA/Rego for complex conditional logic, and Cedar (AWS's policy language) for attribute-based access control. Policies are loaded at startup and evaluated on every tool invocation, with zero dependence on the LLM for enforcement.
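As a rough illustration of what evaluating simple allow/deny rules on every tool invocation can look like, the sketch below matches an action and a resource against glob patterns. The rule shape, the `evaluate` function, and the default-allow fallback are all assumptions made for this example; they are not taken from AGT's policy schema.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# Hypothetical rule shape, mirroring what a simple YAML allow/deny file might hold.
RULES = [
    {"id": "no-shell", "effect": "deny", "action": "bash:*", "resource": "*"},
    {"id": "no-prod-writes", "effect": "deny", "action": "sql:write", "resource": "production_*"},
]


def evaluate(action: str, resource: str, rules=RULES) -> Tuple[bool, Optional[str]]:
    """Return (allowed, matched_rule_id); the first matching rule decides."""
    for rule in rules:
        if fnmatchcase(action, rule["action"]) and fnmatchcase(resource, rule["resource"]):
            return rule["effect"] == "allow", rule["id"]
    # Assumption for this sketch: no matching rule means allow.
    return True, None
```

A stricter deployment might prefer to deny when no rule matches; the fallback here is kept permissive only to keep the example short.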

Agent Runtime

Implements a four-ring execution model borrowed from CPU privilege architecture:

Ring    Access level          Use case
Ring 0  Full system access    Orchestration layer, privileged operations
Ring 1  External tool calls   API requests, database queries, file I/O
Ring 2  Internal computation  Data transformation, reasoning steps
Ring 3  Sandbox only          User input processing, output formatting

An agent running in Ring 1 cannot escalate to Ring 0 without an explicit policy grant. The runtime includes a kill switch that terminates an agent within 50ms of a policy violation.
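The escalation rule can be sketched in a few lines. This assumes the rings are strictly ordered, with a lower number meaning more privilege; `Ring`, `may_run`, and the `grants` set are hypothetical names for illustration rather than AGT's own types.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Lower number = more privilege, following the ring table above."""
    ORCHESTRATION = 0
    EXTERNAL_TOOLS = 1
    INTERNAL_COMPUTE = 2
    SANDBOX = 3


def may_run(agent_ring: Ring, required_ring: Ring, grants: frozenset = frozenset()) -> bool:
    """Allow an operation if the agent is privileged enough or holds an explicit grant."""
    if agent_ring <= required_ring:
        return True
    # Otherwise the agent needs a policy grant for the more privileged ring.
    return required_ring in grants
```

Under these assumptions, `may_run(Ring.EXTERNAL_TOOLS, Ring.ORCHESTRATION)` is false until a grant such as `frozenset({Ring.ORCHESTRATION})` is supplied.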

AgentMesh

Provides zero-trust identity between agents. Each agent receives a cryptographic identity that combines Ed25519 signatures with ML-DSA-65, a NIST-standardised post-quantum signature scheme. Agents verify each other's identity and permissions before any inter-agent communication, preventing spoofing and privilege escalation chains.

Agent SRE

Operational reliability layer with circuit breakers, rate limiting, and Prometheus-compatible metrics. When an agent enters a failure loop (repeated tool calls, escalating permissions, excessive output), the circuit breaker trips and the agent is isolated.
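A circuit breaker of this kind can be approximated with a sliding window of failure timestamps. The sketch below is a generic version of the pattern, not AGT's implementation; the `CircuitBreaker` class, its thresholds, and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import deque


class CircuitBreaker:
    """Trips when too many failures occur inside a sliding time window."""

    def __init__(self, max_failures: int = 3, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the breaker can be tested without sleeping
        self._failures = deque()
        self.open = False            # open = agent isolated

    def record_failure(self) -> None:
        now = self._clock()
        self._failures.append(now)
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self.open = True

    def allow(self) -> bool:
        """Return False once the breaker has tripped."""
        return not self.open
```

A caller would check `allow()` before each tool call and invoke `record_failure()` whenever it sees a symptom such as a repeated call or a denied escalation. This version stays open once tripped, which matches the idea of isolating the agent until someone intervenes.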

OWASP Agentic Top 10 compliance

AGT maps directly to all ten categories in the OWASP Agentic Top 10, published in early 2026:

  1. Prompt Injection — intercepted before reaching the LLM
  2. Sensitive Information Disclosure — data exfiltration policies block unauthorised output
  3. Supply Chain Vulnerabilities — policy signing and verification for agent packages
  4. Data Poisoning — input validation at the kernel layer
  5. Model Denial of Service — rate limiting and resource budgets
  6. Excessive Agency — ring-based privilege boundaries
  7. System Prompt Leakage — prompt content never reaches tool execution layer
  8. Code Interpreter Abuse — sandboxed execution with allowlisted capabilities
  9. Improper Authorization — AgentMesh identity verification on every call
  10. Overreliance on LLM — deterministic policy enforcement, not LLM judgment

Performance

The policy kernel is designed for the hot path. Benchmark results from the published evaluation:

  • 0.011ms per single policy evaluation
  • 0.103ms for a full kernel evaluation (all active policies)
  • 47,000 operations per second at 1,000 concurrent agents
  • 10,000x faster than inserting a policy check as an LLM call

In production, one early adopter reported 473 unauthorised actions blocked over 11 days with a total enforcement overhead of 0.43 seconds. The policy kernel adds less latency than a typical network round-trip.
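A quick back-of-the-envelope check puts these figures in proportion. The inputs are the numbers quoted above; the derived values rest on assumptions flagged in the comments, in particular that the "10,000x" baseline and the 0.43-second total can be divided through directly.

```python
# Figures quoted in the benchmark summary above.
single_eval_ms = 0.011
full_eval_ms = 0.103
ops_per_second = 47_000
concurrent_agents = 1_000
speedup_vs_llm = 10_000
total_overhead_s = 0.43

# Throughput per agent if load is spread evenly (assumption).
ops_per_agent = ops_per_second / concurrent_agents

# LLM-check latency implied by the speedup, under two readings of the baseline.
implied_llm_ms_single = single_eval_ms * speedup_vs_llm
implied_llm_ms_full = full_eval_ms * speedup_vs_llm

# Assumption: every evaluation in the 11-day report was a full kernel evaluation.
implied_evaluations = total_overhead_s * 1000 / full_eval_ms

print(f"{ops_per_agent:.0f} ops/s per agent")
print(f"implied LLM check: {implied_llm_ms_single:.0f} ms to {implied_llm_ms_full:.0f} ms")
print(f"about {implied_evaluations:.0f} full evaluations in 0.43 s of overhead")
```

On these assumptions the numbers work out to roughly 47 operations per second per agent, an implied LLM-based check of somewhere between about 110 ms and about 1 second, and on the order of 4,000 full evaluations behind the reported overhead.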

Getting started

Install the package:

pip install "agent-governance-toolkit[full]"

Then define a policy and evaluate a request against it:

from agt import PolicyEvaluator, PolicyDocument, PolicyRule, Effect

policy = PolicyDocument(rules=[
    PolicyRule(
        id="no-shell-access",
        effect=Effect.DENY,
        action="bash:*",
        condition="agent.ring > 1"
    ),
    PolicyRule(
        id="read-only-database",
        effect=Effect.DENY,
        action="sql:write",
        resource="production_*"
    )
])

evaluator = PolicyEvaluator(policy)
result = evaluator.evaluate(
    agent_id="code-reviewer",
    action="bash:rm -rf /",
    resource="/tmp/cache",
    context={"ring": 2}  # ring 2 satisfies "agent.ring > 1", so the deny rule applies
)
# result.allowed == False
# result.matched_rule == "no-shell-access"

Where it fits

AGT enters a crowded space. NeMo Guardrails (NVIDIA) and Llama Guard (Meta) work at the prompt and model level. Guardrails AI and Lakera focus on input/output filtering. Bedrock Guardrails provides AWS-native policy enforcement. What distinguishes AGT is the operating system analogy: governance as a kernel concern, not an application concern. The question for teams evaluating AGT is whether the added architectural complexity is justified by the security guarantees. For organisations running agents with access to production systems, the trade-off increasingly favours runtime enforcement.

The toolkit is in public preview. Microsoft has stated its intention to move it to a foundation governance model once the API stabilises. The repository is at github.com/microsoft/agent-governance-toolkit.