Zero Trust for AI Agents

Anthropic wrote the framework. emisar enforces it.

The lab behind Claude published Zero Trust for AI Agents, a security framework for giving autonomous agents access to real systems. Its core controls are least agency, deny-by-default tools, a human in the loop for high-risk actions, and an immutable audit trail. Those are the controls emisar enforces at the boundary between an LLM and your infrastructure. The framework files several of them under its Enterprise and Advanced tiers; emisar ships them on its Free plan.

  • Least agency, by contract
  • Human approval for high-risk actions
  • Tamper-evident, hash-chained audit
  • Built for "assume breach"

emisar is not affiliated with, endorsed by, or sponsored by Anthropic. “Zero Trust for AI Agents” is Anthropic's publication; we cite it because emisar is built to the control set it describes. The framework is, in its own words, “offered as a framework for your own evaluation, not as legal, compliance, or security assurance for any particular environment.”

The design test

Impossible, not tedious.

“Does this make the attack impossible, or just tedious? … Prefer a control that removes a capability over a control that throttles it.”
— Zero Trust for AI Agents, Anthropic

emisar is built to pass that test the hard way. An action that isn't declared in a trusted pack doesn't exist for the agent to call — there is no rate limit to grind through, no unusual port to find, no prompt to talk past. The capability is simply absent. A compromised agent can't escalate to a command the runner was never taught, because the runner refuses anything outside the contract — by construction, not by friction.
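To make "absent, not throttled" concrete, here is a minimal Python sketch of a deny-by-default dispatcher. It is an illustration only, not emisar's code or pack format: the names CATALOG, Action, IntArg, build_argv, and the journal.tail action are all hypothetical. It assumes a catalog of declared actions, each with a fixed argv prefix and bounded integer arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntArg:
    """Hypothetical per-argument bound: an integer clamped to [lo, hi]."""
    lo: int
    hi: int


@dataclass(frozen=True)
class Action:
    """Hypothetical typed contract: a fixed argv prefix plus bounded arguments."""
    argv: tuple
    args: dict


# The catalog is the allowlist. Anything not declared here does not exist.
CATALOG = {
    "journal.tail": Action(
        argv=("journalctl", "--no-pager", "-n"),
        args={"lines": IntArg(lo=1, hi=500)},
    ),
}


class UndeclaredAction(Exception):
    """Raised when the caller names an action the catalog never declared."""


def build_argv(name, params):
    """Return an argv list for a declared action, or raise.

    The result is a list of separate arguments, never a shell string.
    """
    action = CATALOG.get(name)
    if action is None:
        raise UndeclaredAction(name)
    if set(params) != set(action.args):
        raise ValueError(f"arguments must be exactly {sorted(action.args)}")
    argv = list(action.argv)
    for key, bound in action.args.items():
        value = params[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an integer")
        argv.append(str(max(bound.lo, min(bound.hi, value))))
    return argv
```

In this sketch, build_argv("journal.tail", {"lines": 100000}) returns ['journalctl', '--no-pager', '-n', '500'], while any name missing from CATALOG raises before an argv is ever built. The list could then be handed to something like subprocess.run(argv) without shell=True, so no argument is ever interpreted by a shell.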

Control by control

The framework, mapped to the product.

Anthropic groups controls into three tiers — Foundation, Enterprise, Advanced. Here is where emisar lands on the ones it owns: the boundary between the agent and your hosts.

Control in the framework | Tier | How emisar enforces it
Deny-by-default tool allow-listing | Foundation | The trusted pack catalog is the allowlist. The runner refuses any action it doesn't declare; the model can't even see an undeclared command, let alone run it.
Least agency: typed, scoped capability | Foundation | Every action is a typed contract with per-argument bounds. Per-user and per-group runner scopes limit which hosts an agent can reach at all.
Short-lived, IdP-issued credentials (no static keys) | Foundation | OAuth 2.1 for remote MCP clients; scoped, revocable keys (actions:execute, audit:read) for everything else. No long-lived shell key sitting on the host.
Parameter validation, tool-side | Foundation | The runner re-validates every argument against the action schema and clamps options to their min/max before exec, using argv arrays, never a shell string.
Version-controlled, integrity-checked configuration | Enterprise | Packs are versioned, content-addressed YAML. A changed hash blocks dispatch until an admin trusts it; the runner recomputes the hash and refuses a mismatch.
Immutable audit trail | Enterprise | Every action is recorded to the searchable portal audit (the cloud system of record) and to an append-only, hash-chained JSONL journal on the host, where each line carries the prior line's SHA-256 so emisar audit verify catches any edited entry or break in the chain.
Human-in-the-loop approval for high-risk actions | Advanced | Risk-tiered policy holds destructive actions for a person. The approver sees the actor, the args, the target host, and the rule that fired: one click to allow or deny, recorded forever.
Just-in-time / just-enough access | Advanced | One-use approvals and scoped standing grants. The agent gets the capability for the task, not standing access that outlives it.
Real-time streaming to SIEM | Advanced | NDJSON audit export over a dedicated audit:read key with keyset cursor pagination, so your SIEM correlates emisar activity without ever holding a token that can execute an action.

The tiers are Anthropic's. The approval gate, JIT access, and SIEM streaming are all filed under Advanced, and emisar ships them on the Free plan.
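The hash-chained journal idea can be sketched in a few lines of Python. This is a generic illustration of the technique, not emisar's journal format or the logic behind emisar audit verify: the "prev" field name, the all-zero GENESIS value, and both function names are assumptions made for the example.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed "prev" value for the very first entry


def append_entry(lines: list, event: dict) -> None:
    """Append an event whose "prev" field is the SHA-256 of the prior line."""
    prev = hashlib.sha256(lines[-1].encode()).hexdigest() if lines else GENESIS
    lines.append(json.dumps({**event, "prev": prev}, sort_keys=True))


def first_broken_line(lines: list) -> Optional[int]:
    """Return the 1-based number of the first line that breaks the chain.

    Returns None when every line's "prev" matches the hash of the line
    before it.
    """
    expected = GENESIS
    for number, line in enumerate(lines, start=1):
        if json.loads(line).get("prev") != expected:
            return number
        expected = hashlib.sha256(line.encode()).hexdigest()
    return None
```

In this sketch, editing line k changes its hash, so the mismatch surfaces at line k+1, and deleting a line breaks the chain at the line that followed it. A chain alone cannot vouch for its final line; that needs the latest hash held somewhere else, which is one reason to keep a second record off the host.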

Where emisar stops

One pillar, not the whole framework.

The framework is broader than any one product, and we'd rather be precise than oversell. emisar owns the tool-access pillar: what an agent can do to your infrastructure, and the record of it. It does not cover the following, and doesn't claim to:

  • Model supply chain — poisoned weights, fine-tuning backdoors, AI-BOM
  • Memory & context poisoning across an agent's sessions
  • Prompt-injection classifiers, spotlighting, input sanitization
  • Behavioral anomaly detection and ML baselining

Here's the reframe that makes the scope a feature. emisar doesn't try to keep the agent pure — it assumes the agent may already be jailbroken or prompt-injected, and makes sure a compromised agent still can't exceed declared, gated, audited actions. That is the framework's first principle — assume breach — applied to the one place an AI agent can do real damage: the keyboard in front of your servers.

Operations

Automate the bookkeeping, not the decisions.

“Models should take notes, capture artifacts, pursue parallel investigation tracks, and draft the postmortem. Humans should make the containment calls.”
— Zero Trust for AI Agents, Anthropic

That is exactly the emisar loop. The agent investigates through read actions, correlates, and drafts the fix; the one destructive step stops for a person and lands on the record. We have a real incident written up end to end — a CSI driver that reformatted a live volume and wiped 33 hours of metrics, contained on a single approval.
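The loop above (reads flow through, the destructive step waits for a person) can be sketched as a small policy gate in Python. Everything here is hypothetical: the POLICY table, the two risk tiers, the action names, and the decide function are illustrative assumptions, not emisar's policy engine.

```python
from enum import Enum


class Risk(Enum):
    READ = "read"
    DESTRUCTIVE = "destructive"


class Decision(Enum):
    ALLOW = "allow"
    HOLD = "hold"  # wait for a human approver
    DENY = "deny"


# Hypothetical policy: action name -> risk tier. Unknown actions are denied.
POLICY = {
    "metrics.query": Risk.READ,
    "volume.format": Risk.DESTRUCTIVE,
}


def decide(action: str, approvals: set) -> Decision:
    """Allow reads, hold destructive actions until approved, deny the rest.

    An approval is one-use: it is consumed when the action is allowed.
    """
    risk = POLICY.get(action)
    if risk is None:
        return Decision.DENY
    if risk is Risk.READ:
        return Decision.ALLOW
    if action in approvals:
        approvals.remove(action)
        return Decision.ALLOW
    return Decision.HOLD
```

Because the approval is removed from the set when it is used, a second attempt at the same destructive action goes back to HOLD. That is the one-use behavior in miniature: the capability lasts for the task, not beyond it.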