Why AI agents need a safety layer (before you let them touch real APIs)

Why letting an agent make raw API calls is risky, and the safety loop I use: dry-run → review → apply → verify → receipt.

Qwayk

The quick version

Agents are powerful, but raw API calls are risky.

One wrong ID can edit the wrong thing. One missing check can publish by mistake. One careless log can leak a secret.

So I built Qwayk around one idea: a safety layer between the agent and the API.

The real problem

When you “just let an agent call an API”, these are the failure modes I see in real life:

  • Wrong ID (edits the wrong post/customer/asset)
  • Wrong environment (staging vs production)
  • Partial updates (some fields updated, some not, and now you’re in a weird state)
  • No audit trail (you can’t prove what happened later)
  • Silent data leaks (tokens, emails, or auth headers end up in logs/screenshots)

The scary part is not one mistake. It’s that agents can make mistakes at scale.

The safety layer (plain English)

This is the loop I recommend:

1) Dry-run (plan): show what would change, but don’t write.
2) Review: you (or Codex) check intent.
3) Apply: you rerun with --apply (and --yes for risky actions).
4) Verify: the tool re-fetches and checks expected vs observed.
5) Receipt: you get a record of what happened.

And if the tool isn’t sure, it should refuse instead of guessing.
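To make the loop concrete, here is a minimal sketch in Python. It is not Qwayk's actual implementation: the in-memory STORE, the RISKY_FIELDS set, and the function names make_plan and run are all assumptions made up for illustration. The apply and yes arguments stand in for the --apply and --yes flags.

```python
import json
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for a real API (assumption, for illustration only).
STORE = {"post_1": {"meta_title": "Old title", "status": "draft"}}

# Assumption: changes to these fields count as "risky" and need the extra yes flag.
RISKY_FIELDS = {"status"}


def make_plan(post_id, changes):
    """Step 1, dry-run: describe what would change without writing anything."""
    current = STORE[post_id]
    operations = [
        {"action": "update", "field": field, "from": current.get(field), "to": value}
        for field, value in changes.items()
        if current.get(field) != value
    ]
    return {
        "mode": "plan",
        "apply": False,
        "target": {"resource": "post", "id": post_id},
        "operations": operations,
    }


def run(post_id, changes, apply=False, yes=False):
    """Run the loop: plan only by default, write only when apply is set."""
    if post_id not in STORE:
        return {"mode": "refused", "reason": f"no post with id {post_id}"}

    plan = make_plan(post_id, changes)
    if not apply:
        return plan  # Step 2, review: stop here so a human (or agent) can check intent.

    risky = any(op["field"] in RISKY_FIELDS for op in plan["operations"])
    if risky and not yes:
        return {"mode": "refused", "reason": "risky change requires --yes"}

    # Step 3, apply: perform exactly the operations listed in the plan.
    for op in plan["operations"]:
        STORE[post_id][op["field"]] = op["to"]

    # Step 4, verify: re-read the record and compare expected vs observed.
    observed = dict(STORE[post_id])
    mismatches = [
        op["field"] for op in plan["operations"] if observed.get(op["field"]) != op["to"]
    ]

    # Step 5, receipt: a record of what was done and whether it checked out.
    return {
        "mode": "receipt",
        "target": plan["target"],
        "operations": plan["operations"],
        "verified": not mismatches,
        "mismatches": mismatches,
        "applied_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Dry-run only: prints the plan and leaves STORE untouched.
    print(json.dumps(run("post_1", {"meta_title": "New title"}), indent=2))
```

The design choice that matters is the default: calling run without apply can never write, so the unsafe path has to be asked for explicitly.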

Example: a dry-run plan (JSON)

{
  "tool": "ghost-api-tool",
  "mode": "plan",
  "apply": false,
  "target": { "resource": "post", "id": "<post_id>" },
  "operations": [
    { "action": "update", "field": "meta_title", "from": "<old>", "to": "<new>" }
  ],
  "verification_plan": { "read_back": true, "idempotence_check": true }
}
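The two flags in verification_plan can be read in more than one way. The sketch below shows one plausible interpretation, under assumptions of my own: read_back means "re-fetch and compare each field to its planned value", and idempotence_check means "applying the same operations a second time changes nothing". The function names and the "append" action are hypothetical; "append" is included only to show what a non-idempotent operation looks like.

```python
import copy


def apply_ops(state, operations):
    """Return a new state with the operations applied (pure function, no I/O)."""
    new_state = copy.deepcopy(state)
    for op in operations:
        if op["action"] == "update":
            new_state[op["field"]] = op["to"]
        elif op["action"] == "append":  # hypothetical action, used only for contrast
            new_state.setdefault(op["field"], []).append(op["value"])
        else:
            raise ValueError(f"unknown action: {op['action']}")
    return new_state


def read_back_ok(plan, observed):
    """Read-back: every updated field in the observed record equals its planned value."""
    return all(
        observed.get(op["field"]) == op["to"]
        for op in plan["operations"]
        if op["action"] == "update"
    )


def is_idempotent(state, operations):
    """Idempotence: applying the operations twice gives the same state as applying once."""
    once = apply_ops(state, operations)
    twice = apply_ops(once, operations)
    return once == twice


PLAN = {
    "operations": [
        {"action": "update", "field": "meta_title", "from": "Old", "to": "New"}
    ]
}
```

With this reading, setting a field to a fixed value passes the idempotence check, while something like appending a tag fails it, which is a useful signal that a retry could do damage.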

Example: a good refusal

This is what “safe” looks like:

Refused: multiple posts match slug <slug>.
I won’t guess. Please use --id <post_id> or run posts list --filter ... to narrow it down.

It’s annoying for 10 seconds, but it prevents disasters.
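A refusal like this can be sketched as a small lookup that only returns a target when exactly one record matches. This is an illustration under stated assumptions, not the tool's real code: the Refusal exception, the resolve_post function, and the sample POSTS data are all hypothetical.

```python
class Refusal(Exception):
    """Raised when the tool cannot pick a single target safely."""


# Hypothetical sample data; a real tool would fetch this from the API.
POSTS = [
    {"id": "a1", "slug": "hello"},
    {"id": "b2", "slug": "hello"},
    {"id": "c3", "slug": "pricing"},
]


def resolve_post(posts, slug=None, post_id=None):
    """Return the single matching post, or raise Refusal instead of guessing."""
    if post_id is not None:
        key, value = "id", post_id
    elif slug is not None:
        key, value = "slug", slug
    else:
        raise Refusal("Refused: no --id or slug given.")

    matches = [post for post in posts if post[key] == value]

    if not matches:
        raise Refusal(f"Refused: no post matches {key} {value}.")
    if len(matches) > 1:
        raise Refusal(
            f"Refused: multiple posts match {key} {value}. "
            "I won't guess. Please use --id <post_id> to narrow it down."
        )
    return matches[0]
```

Raising an exception rather than returning the first match means the caller cannot accidentally carry on with the wrong post.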

What changes for you

The goal is not “full automation with no thinking”.

The goal is:

  • you move faster because the agent does the busywork
  • you sleep better because the tool refuses and verifies
  • you can prove what happened (plan + receipt)

What to do next