Feb 28, 2026 · 4 min read

Permissioned Tools for AI Agents: Stop Giving the Bot Root

An agent with every tool is a liability. Here's how I design scopes, budgets, and audit trails so agents stay useful without being dangerous.


Every tool you give an agent is a capability you accept responsibility for.

If you hand it root, it will eventually do something you can't explain or undo. So I design tools like a security engineer, not a prompt engineer.

I assume mistakes will happen. The only question is how big the blast radius is when they do.

The goal isn't zero risk. It's controlled risk you can explain and undo.

I won't compromise on this.

Scope Tools Like APIs

Tools are not magic. They're APIs with blast radius. I scope them aggressively.

Scope: What the tool can touch, not just what it can do.

| Tool | Allowed scope | Denied scope |
| --- | --- | --- |
| crm.update | Accounts tagged managed=true | Any enterprise account |
| email.send | Customer support replies | Marketing campaigns |
| billing.refund | Amount <= $50 | Anything above |
| repo.write | /agent-sandbox/* | Production branches |

I treat scope like a permission boundary. If I can't explain the boundary in one sentence, it's too loose.

I scope by resource, environment, and time. Production is a different world than staging. “Today” is different from “last quarter.” Those are separate permissions.

Scope tests: I write policy tests that assert the agent is blocked outside the boundary. If it can slip through in staging, it will slip through in prod.
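A scope test can be a plain assertion that the policy engine denies anything outside the boundary. A minimal sketch mirroring the table above; the `isInScope` helper and rule shapes are illustrative, not a real policy engine:

```typescript
// Hypothetical scope rules mirroring the table above; names are illustrative.
type ScopeRule = { tool: string; allow: (args: Record<string, unknown>) => boolean };

const rules: ScopeRule[] = [
  { tool: "billing.refund", allow: (a) => typeof a.amount === "number" && a.amount <= 50 },
  { tool: "repo.write", allow: (a) => typeof a.path === "string" && a.path.startsWith("/agent-sandbox/") },
];

function isInScope(tool: string, args: Record<string, unknown>): boolean {
  const rule = rules.find((r) => r.tool === tool);
  return rule ? rule.allow(args) : false; // unknown tools are denied by default
}

// Policy tests: the agent must be blocked outside the boundary.
console.assert(isInScope("billing.refund", { amount: 25 }) === true);
console.assert(isInScope("billing.refund", { amount: 500 }) === false);
console.assert(isInScope("repo.write", { path: "/prod/main.ts" }) === false);
```

The deny-by-default branch for unknown tools is the part worth testing hardest: new tools should start with no scope at all.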

Default to Read-Only

Read-only tools are the fastest way to ship useful agents without fear.

Mode first: Read, then write, then destructive.

| Mode | Example tool | Risk level |
| --- | --- | --- |
| Read | metrics.query, crm.read | Low |
| Write | crm.update, repo.write | Medium |
| Destructive | billing.refund, user.delete | High |

I start every agent in read-only mode and earn the right to add writes. It forces discipline and exposes where human review is actually needed.

When I add writes, I start in shadow mode. The agent proposes the write, I log it, and I compare it to what a human would have done.
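Shadow mode reduces to: record the proposal, execute nothing, diff it against the human's action later. A sketch under those assumptions; `proposeWrite` and `shadowLog` are illustrative names:

```typescript
type ProposedWrite = { tool: string; args: Record<string, unknown>; at: string };

const shadowLog: ProposedWrite[] = [];

// In shadow mode the agent's write is logged, never executed.
function proposeWrite(tool: string, args: Record<string, unknown>): void {
  shadowLog.push({ tool, args, at: new Date().toISOString() });
}

// Later: compare the agent's proposal against what the human actually did.
function agreesWithHuman(proposal: ProposedWrite, humanArgs: Record<string, unknown>): boolean {
  return JSON.stringify(proposal.args) === JSON.stringify(humanArgs);
}
```

The agreement rate over a few weeks of shadow logs tells you whether the write is ready to go live.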

Enforce Budgeted Access

Capabilities need budgets: time, cost, and frequency. This is the same safety pattern as circuit breakers.

Budget: A hard cap that forces the agent to stop or ask for help.

# tool-budget.json
{
  "billing.refund": { "daily_limit": 5, "max_amount": 50 },
  "repo.write": { "max_files": 3, "max_lines": 200 },
  "search.web": { "cost_ceiling_usd": 1.00, "timeout_ms": 4000 }
}

I don't rely on the model to self-regulate. I rely on budgets. I covered the same logic in circuit breakers for AI.

Budgets should fail closed. If a tool times out or hits the cap, the agent escalates or stops. “Try again later” is safer than a runaway loop.

I also add cool-down windows. A tool that just failed should not be retried 20 times in a minute.
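A budget checker that fails closed might look like this sketch. The `daily_limit` field matches the tool-budget.json above; the `cooldown_ms` field and the simplification of applying the cool-down after every call (not just failures) are my assumptions here:

```typescript
type Budget = { daily_limit?: number; cooldown_ms?: number };

const budgets: Record<string, Budget> = {
  "billing.refund": { daily_limit: 5, cooldown_ms: 60_000 },
};

const calls: Record<string, number[]> = {}; // tool -> timestamps of recent calls

function mayCall(tool: string, now: number): boolean {
  const b = budgets[tool];
  if (!b) return false; // fail closed: no budget, no call
  const history = calls[tool] ?? [];
  if (b.daily_limit !== undefined && history.length >= b.daily_limit) return false;
  const last = history[history.length - 1];
  if (b.cooldown_ms !== undefined && last !== undefined && now - last < b.cooldown_ms) return false;
  (calls[tool] = history).push(now);
  return true;
}
```

A real version would reset the daily counter at midnight and track failures separately, but the two defaults that matter are here: unknown tools are denied, and a recent call blocks a retry storm.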

Use Capability Tokens

Prompts can be tricked. Tokens can't.

Capability token: A signed, short-lived grant attached to a specific action.

// pseudo-code
const token = sign({
  tool: "billing.refund",
  amount: 25,
  reason: "duplicate charge",
  expiresIn: "5m",
});

agent.callTool("billing.refund", { amount: 25, token });

The agent can't invent a token. It has to earn it through a guardrail or a human approval.

I tie tokens to policy decisions, not just actions. If the policy changes, old tokens become invalid immediately.

This is how I prevent “prompt drift.” Even if the prompt changes, the token rules don't.
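One way to get that behavior is to bake the policy version into the signature, so bumping the version invalidates every outstanding token at once. A sketch using Node's built-in HMAC; `POLICY_VERSION`, the secret handling, and the field layout are all assumptions for illustration:

```typescript
import { createHmac } from "node:crypto";

const SECRET = "dev-only-secret"; // in production: a managed key, never a literal
let POLICY_VERSION = 3;

type Grant = { tool: string; amount: number; exp: number };

function sign(grant: Grant): string {
  const payload = JSON.stringify({ ...grant, policy: POLICY_VERSION });
  const mac = createHmac("sha256", SECRET).update(payload).digest("hex");
  return `${Buffer.from(payload).toString("base64")}.${mac}`;
}

function verify(token: string, now: number): Grant | null {
  const [body, mac] = token.split(".");
  const payload = Buffer.from(body, "base64").toString();
  const expected = createHmac("sha256", SECRET).update(payload).digest("hex");
  if (mac !== expected) return null; // forged or tampered
  const data = JSON.parse(payload);
  if (data.policy !== POLICY_VERSION) return null; // stale policy: token dies immediately
  if (data.exp <= now) return null; // expired
  return { tool: data.tool, amount: data.amount, exp: data.exp };
}
```

The agent never holds the secret, so no amount of prompt injection can mint a token; the worst it can do is present a token it was legitimately granted.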

Add Human Gates for High-Risk Actions

Some actions are too risky for automation. I put a human gate there and move on.

Gate: Escalate when the decision is irreversible or expensive.

flowchart LR
  Request["Agent proposes action"] --> Risk{"High risk?"}
  Risk -- "no" --> Execute["Execute tool"]
  Risk -- "yes" --> Review["Human review"]
  Review --> Execute

This is not a failure of automation. It's respect for reality. I go deeper on this in human-in-the-loop done right.

I define “high risk” explicitly: money moves, data deletion, external communication, and anything with legal or compliance impact. If it hits one of those, a human signs off.
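The gate itself can be a plain predicate over those four categories. A sketch; the flag names on `Action` are my own shorthand for the risk list above:

```typescript
type Action = {
  tool: string;
  movesMoney?: boolean;     // refunds, payouts, pricing changes
  deletesData?: boolean;    // irreversible destruction
  external?: boolean;       // leaves the building: email, posts, webhooks
  legalImpact?: boolean;    // compliance, contracts, regulated data
};

function isHighRisk(a: Action): boolean {
  return Boolean(a.movesMoney || a.deletesData || a.external || a.legalImpact);
}

function route(a: Action): "execute" | "human_review" {
  return isHighRisk(a) ? "human_review" : "execute";
}
```

Keeping the classifier this dumb is the point: a gate you can read in ten seconds is a gate you can trust during an incident.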

Audit Everything

If a tool call can't be traced, it didn't happen. Audits are the safety net.

Audit log: Enough context to replay the decision.

2026-02-28T12:14:22Z
agent=triage-bot
tool=billing.refund
input.amount=25
input.order_id=ord_2841
policy=refund_under_50
decision=approved
latency_ms=812

This is how you debug incidents and explain behavior to stakeholders. It also powers your evals and observability stack.

I log the policy version, the exact tool arguments, and the model ID. If you can't replay it, you can't fix it.
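Emitting that line format is trivial once everything lives in one structured record. A sketch; the field names come from the example entry above, with `policy_version` and `model` added per the paragraph here:

```typescript
type AuditEntry = {
  agent: string;
  tool: string;
  input: Record<string, unknown>;
  policy: string;
  policy_version: number;
  model: string;
  decision: "approved" | "denied";
  latency_ms: number;
};

// Render one audit record in the key=value line format shown above.
function formatAudit(e: AuditEntry, at: Date): string {
  const inputs = Object.entries(e.input).map(([k, v]) => `input.${k}=${v}`);
  return [
    at.toISOString(),
    `agent=${e.agent}`, `tool=${e.tool}`, ...inputs,
    `policy=${e.policy}`, `policy_version=${e.policy_version}`,
    `model=${e.model}`, `decision=${e.decision}`, `latency_ms=${e.latency_ms}`,
  ].join("\n");
}
```

Because the record is structured before it's rendered, the same object can feed your eval suite and your log pipeline without re-parsing.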

Audits are also how you build better systems. They feed your eval suites, show where humans disagree, and prove compliance when someone inevitably asks.


Give agents fewer tools, and they become more trustworthy.
