
Least Privilege for AI Agents: A Practical Playbook

We've spent two decades learning to apply least privilege to human users and service accounts. AI agents are a new kind of principal — they hold credentials, run commands, and call cloud APIs, but they also act autonomously and can be steered by the very text they read. That combination raises the stakes. This is a practical playbook for giving an agent exactly the access it needs and not one permission more.

Why agents are a different kind of principal

A traditional script does exactly what it was written to do. An AI agent decides what to do at runtime, in response to inputs, including the documents, web pages, tool outputs, and tickets it reads along the way. That opens a failure mode classic IAM never had to worry about: prompt injection, where hostile text in the data convinces the agent to misuse its own legitimate permissions. As we covered in "Security in the Age of AI," the real-world incidents of agents wiping databases and infrastructure weren't exotic hacks; they were over-privileged agents doing what they were (accidentally) told. Least privilege is the control that turns "catastrophe" into "contained mistake."

The playbook

1. Give each agent its own identity

Don't let an agent borrow a human's credentials or a shared admin role. Provision a dedicated identity per agent or per task (an IAM role, a service principal, a scoped token) so its actions are attributable, its permissions are tailored, and you can revoke it without disrupting anyone else. "The agent ran as me" is a sentence you never want to say during an incident review.
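
As a rough illustration of the idea, the sketch below models one identity per agent task, with revocation that touches only that identity. The `AgentIdentity` and `IdentityRegistry` names are hypothetical; in a real deployment the cloud provider's IAM roles or service principals would play this part.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AgentIdentity:
    """A dedicated principal for one agent task (hypothetical model)."""
    agent_name: str
    task: str
    principal_id: str = field(default_factory=lambda: f"agent-{uuid.uuid4().hex[:12]}")
    revoked: bool = False


class IdentityRegistry:
    """Tracks identities so actions are attributable and individually revocable."""

    def __init__(self) -> None:
        self._by_id = {}

    def provision(self, agent_name: str, task: str) -> AgentIdentity:
        # Every call mints a fresh principal; nothing is shared or borrowed.
        identity = AgentIdentity(agent_name=agent_name, task=task)
        self._by_id[identity.principal_id] = identity
        return identity

    def revoke(self, principal_id: str) -> None:
        # Revoking one identity leaves every other principal untouched.
        self._by_id[principal_id].revoked = True

    def is_active(self, principal_id: str) -> bool:
        identity = self._by_id.get(principal_id)
        return identity is not None and not identity.revoked
```

Because each task gets its own `principal_id`, a log line carrying that ID points at one agent run rather than at a person or a shared role.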

2. Read-only by default; writes are a deliberate grant

Most agent work — reading code, querying data, investigating — needs no write access at all. Default to read-only, and make every mutating capability an explicit, justified addition. This pairs naturally with multi-agent designs: let read-only sub-agents do the exploring and route the few write actions through a single, closely-watched parent or a human.
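
One way to picture "writes are a deliberate grant" is a tool registry that refuses mutating tools until someone records a justification. This is a minimal sketch under that assumption; `ToolRegistry` and its methods are hypothetical names, not the API of any particular agent framework.

```python
from typing import Callable


class ToolRegistry:
    """Exposes tools to an agent; mutating tools need an explicit grant."""

    def __init__(self) -> None:
        self._tools = {}          # tool name -> (callable, is_mutating)
        self._write_grants = {}   # tool name -> recorded justification

    def register(self, name: str, fn: Callable[..., object], mutating: bool = False) -> None:
        self._tools[name] = (fn, mutating)

    def grant_write(self, name: str, justification: str) -> None:
        if not justification.strip():
            raise ValueError("a write grant needs a stated justification")
        self._write_grants[name] = justification

    def call(self, name: str, *args: object, **kwargs: object) -> object:
        fn, mutating = self._tools[name]
        # Read tools run freely; write tools run only after an explicit grant.
        if mutating and name not in self._write_grants:
            raise PermissionError(f"{name} is a write tool and has not been granted")
        return fn(*args, **kwargs)
```

Read-only sub-agents would simply be handed a registry with no grants at all.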

3. Scope tightly — resources, not just actions

"Least privilege" means narrowing what and where, not only which verb. Grant access to the specific bucket, table, repository, or namespace the task touches, not a wildcard (*). Use resource constraints, conditions, and tags so a token that can write to one project can't reach another. This scoping defines the blast radius of a compromised or confused agent.
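
To make resource scoping concrete, here is a sketch that builds an AWS-style policy document limited to one bucket prefix, plus a small lint that flags wildcard grants. The function names and the bucket and prefix values passed to them are illustrative assumptions; the policy shape follows the standard IAM JSON grammar.

```python
def scoped_s3_policy(bucket: str, prefix: str, allow_write: bool = False) -> dict:
    """Build an IAM-style policy limited to objects under one bucket prefix."""
    actions = ["s3:GetObject"]
    if allow_write:
        actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                # One prefix of one bucket, never a bare "*".
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }


def find_wildcards(policy: dict) -> list:
    """Flag Allow statements whose action or resource is "*" or ends in ":*"."""
    findings = []
    for i, stmt in enumerate(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for key in ("Action", "Resource"):
            value = stmt.get(key, [])
            values = [value] if isinstance(value, str) else value
            if any(v == "*" or v.endswith(":*") for v in values):
                findings.append(f"statement {i}: wildcard in {key}")
    return findings
```

A check like `find_wildcards` could run in CI against every agent policy, so a broad grant has to be argued for rather than slipped in.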

4. Make credentials short-lived

Long-lived API keys sitting in an agent's environment are a standing liability. Prefer short-lived, automatically-rotated credentials — temporary role assumption (e.g. STS), workload identity federation, or vault-issued leases that expire in minutes. If a credential does leak, a tight expiry caps the damage window. Never bake static secrets into prompts, configs, or logs.
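
The expiry mechanic itself is simple enough to sketch with the standard library. The example below is only an illustration of why a short lifetime caps the damage window; `issue_token` and `verify_token` are hypothetical helpers, and in practice the credential would come from STS, workload identity federation, or a vault rather than hand-rolled code.

```python
import hashlib
import hmac
import time
from typing import Optional


def issue_token(secret: bytes, subject: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return 'subject.expiry.signature', valid for ttl_seconds."""
    issued_at = time.time() if now is None else now
    expires_at = int(issued_at) + ttl_seconds
    payload = f"{subject}.{expires_at}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Accept a token only if the signature matches and it has not expired."""
    current = time.time() if now is None else now
    try:
        subject, expires_at, sig = token.rsplit(".", 2)
        expiry = int(expires_at)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, f"{subject}.{expires_at}".encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, then the expiry check.
    return hmac.compare_digest(sig.encode(), expected.encode()) and current < expiry
```

A token issued with `ttl_seconds=300` is useless to an attacker five minutes later, with no rotation step required.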

5. Sandbox the execution environment

IAM scoping limits what the agent's credentials can do; sandboxing limits what its process can do. Run agents inside an OS-level sandbox or isolated container with filesystem boundaries and network egress controls, the model that leading agentic tools now ship (macOS Seatbelt, Linux namespaces and seccomp, hosted ephemeral VMs). Turn network access off by default and allow-list only the endpoints a task needs; that single control blunts most data-exfiltration paths, including injection-driven ones.
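
The allow-list decision can be sketched in a few lines. This is an application-level check only, shown to illustrate the "default off, allow-list the rest" logic; it assumes a hypothetical `is_egress_allowed` helper and does not replace enforcement at the sandbox or network layer.

```python
from urllib.parse import urlsplit


def is_egress_allowed(url: str, allowed_hosts: frozenset) -> bool:
    """Allow only HTTPS requests to exactly the listed hostnames."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # unparseable URLs are denied
    if parts.scheme != "https":
        return False
    # Exact hostname match: no suffix or substring matching, so
    # look-alike domains and userinfo tricks do not slip through.
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

An empty allow-list denies everything, which matches the network-off default.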

6. Gate irreversible actions on a human

Speed is the point of agents, but not for everything. Put an approval checkpoint in front of actions you can't take back: deletes, production deploys, force-pushes, IAM changes, spending. Let the agent move fast on reversible work and pause for a human on the rest. Most modern agent runners support exactly this — auto-approve safe operations, prompt on dangerous ones.
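
A checkpoint like this can be a thin wrapper around action execution. The sketch below assumes a hypothetical set of action names and an `approve` callback that asks a human; real agent runners define their own taxonomy and prompting mechanism.

```python
from typing import Callable

# Hypothetical action names for things that cannot be taken back.
IRREVERSIBLE = frozenset({"delete", "deploy_prod", "force_push", "iam_change", "spend"})


def run_action(
    action: str,
    execute: Callable[[], object],
    approve: Callable[[str], bool],
) -> object:
    """Run reversible actions immediately; ask a human before irreversible ones."""
    if action in IRREVERSIBLE and not approve(action):
        raise PermissionError(f"{action} was not approved")
    return execute()
```

Reversible work never calls `approve`, so the agent keeps its speed everywhere the gate is not needed.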

7. Defend against prompt injection at the boundary

Treat any external content an agent ingests as untrusted input. Keep high-privilege actions away from untrusted-data contexts — an agent summarizing arbitrary web pages should not also hold deploy keys. Separate the "read the world" agent from the "change our systems" agent, and require the privileged one to act only on validated, structured instructions rather than free-form text it scraped.
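
"Validated, structured instructions" could look like the sketch below: the privileged agent accepts only a JSON object that matches a small schema and rejects everything else, including free-form text. The action names, parameter rules, and the `parse_instruction` helper are assumptions made up for this example.

```python
import json
import re

# Hypothetical schema: the only actions the privileged agent will accept,
# mapped to the exact parameter names each one requires.
ALLOWED_ACTIONS = {
    "restart_service": {"service"},
    "scale_service": {"service", "replicas"},
}
SERVICE_NAME = re.compile(r"[a-z][a-z0-9-]{0,62}")


def parse_instruction(raw: str) -> dict:
    """Parse a JSON instruction and reject anything outside the schema."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("instruction is not valid JSON") from exc
    if not isinstance(msg, dict) or set(msg) != {"action", "params"}:
        raise ValueError("instruction must be an object with action and params")
    action, params = msg["action"], msg["params"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not isinstance(params, dict) or set(params) != ALLOWED_ACTIONS[action]:
        raise ValueError("params do not match the schema for this action")
    service = params["service"]
    if not isinstance(service, str) or not SERVICE_NAME.fullmatch(service):
        raise ValueError("invalid service name")
    if "replicas" in params:
        replicas = params["replicas"]
        if type(replicas) is not int or not 1 <= replicas <= 20:
            raise ValueError("replicas must be an integer from 1 to 20")
    return {"action": action, "params": params}
```

Text scraped from a web page fails at the first step, so an injected "ignore previous instructions" never reaches the code path that holds deploy keys.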

8. Log everything and review it

Because an agent acts on its own, the audit trail is how you reconstruct what happened. Capture the tools it called, the resources it touched, and the decisions it made, ship those logs somewhere tamper-resistant, and alert on the high-risk ones. Least privilege reduces what can go wrong; audit tells you what did.
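
One common way to make a log tamper-evident is to chain entries by hash, so editing an earlier record invalidates everything after it. The following is a minimal in-memory sketch of that technique; the function names and event fields are hypothetical, and a production setup would also ship entries to separate storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes the same way.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{prev_hash}{body}".encode()).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Return False if any entry was altered, reordered, or cut from the middle."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

The chain alone cannot detect entries trimmed from the end, so the most recent hash should also be recorded somewhere the agent cannot write.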

A quick maturity check

| Question | If the answer is "no"… |
| --- | --- |
| Does each agent have its own scoped identity? | Start here: shared/admin creds are the biggest risk. |
| Is the default read-only? | Flip it; make writes opt-in. |
| Are credentials short-lived? | Move to temporary, rotated credentials. |
| Is execution sandboxed with egress control? | Add a sandbox; default network off. |
| Do irreversible actions require approval? | Add a human gate for deletes/deploys. |
| Can you audit what the agent did? | Turn on and centralize logging now. |

The takeaway

AI agents don't need a new security philosophy; they need the old one applied rigorously to a faster, more autonomous, more manipulable principal. Give each agent its own identity, default to read-only, scope tightly, keep credentials short-lived, sandbox execution, gate the irreversible, separate untrusted reading from privileged acting, and audit all of it. Do that and an agent stays what it should be: a powerful assistant whose worst day is an inconvenience, not an outage. These are also the muscles certified cloud-security practitioners already train; see our take on why that expertise still matters.

