Classify Agent Actions by Risk
Before you add a single guardrail, you need a risk taxonomy. This topic teaches the two axes that matter, blast radius and reversibility, and shows how Microsoft's Responsible AI pattern (Discover, Protect, Govern) maps onto an action-by-action classification you can defend in a design review.
Every guardrail you will ever add answers a question that should have been asked first: how bad is this if it goes wrong? The exam expects you to be able to walk into a design review with a defensible risk classification for every action your agent can take.
The 2x2 that matters
Two axes do almost all the work:
| | Reversible | Irreversible |
| --- | --- | --- |
| Small blast radius | Read a file, open a draft PR | Send an internal email, post to a team channel |
| Large blast radius | Run a sandboxed eval, build in CI | Deploy to prod, charge a customer, force-push to main |
Anything in the bottom-right cell needs explicit human authorization. Anything in the top-left can usually run with logging only. The two off-diagonals are where teams get into trouble: irreversible-but-small actions look harmless until the agent does 10,000 of them.
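The 2x2 can be sketched as a small lookup over the two axes. This is a minimal illustration, not a prescribed implementation: the `Action` type, the `quadrant_policy` function, and the policy labels (`log_only`, `review_required`, `human_authorization`) are all hypothetical names chosen for this example, and treating both off-diagonal cells as "review required" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class BlastRadius(Enum):
    SMALL = "small"
    LARGE = "large"


@dataclass(frozen=True)
class Action:
    """One thing the agent can do, described on the two axes."""
    name: str
    blast_radius: BlastRadius
    reversible: bool


def quadrant_policy(action: Action) -> str:
    """Map an action's cell in the 2x2 to a hypothetical handling policy."""
    if action.blast_radius is BlastRadius.LARGE and not action.reversible:
        return "human_authorization"  # bottom-right cell
    if action.blast_radius is BlastRadius.SMALL and action.reversible:
        return "log_only"  # top-left cell
    # Off-diagonal cells: assumed here to need a closer look rather than
    # a fixed rule, since volume or scope can change the picture.
    return "review_required"


examples = [
    Action("read_file", BlastRadius.SMALL, True),
    Action("send_internal_email", BlastRadius.SMALL, False),
    Action("build_in_ci", BlastRadius.LARGE, True),
    Action("deploy_to_prod", BlastRadius.LARGE, False),
]

for a in examples:
    print(f"{a.name}: {quadrant_policy(a)}")
```

Keeping the two axes as separate fields, rather than collapsing them into one "danger" score, is what lets the off-diagonal cells stay visible.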
Map to the Responsible AI pattern
Microsoft's Responsible AI guidance for Foundry organises trust work into three stages: Discover risks, Protect at the model and agent runtime levels, and Govern through tracing and monitoring. Risk classification is a Discover activity. You cannot pick a content filter, a permission scope, or an approval policy until you have an enumerated list of actions with tiers attached.
Exam tip: when a question asks "what is the first thing you do before adding guardrails?" the answer is almost always "classify and assess the risks", not "turn on a filter."
A working taxonomy
A defensible three-tier taxonomy:
- Tier 1 (Low): read-only, auditable, in-tenant. Log it; no approval.
- Tier 2 (Medium): writes that are reversible inside the system of record (open PR, create issue, schedule a job). Require a reviewable artifact.
- Tier 3 (High): irreversible, externally-visible, or touching production. Require explicit per-action human authorization and an audit trail.
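One way to make the three tiers concrete is to derive the tier from a few boolean attributes of each action. The sketch below is illustrative only: `ActionProfile`, `assign_tier`, and the attribute names are assumptions made for this example, and the sketch models "in-tenant" simply as "not externally visible".

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ActionProfile:
    """Hypothetical description of one agent action."""
    name: str
    read_only: bool
    reversible: bool
    externally_visible: bool = False
    touches_production: bool = False


# Guardrail profile per tier, paraphrasing the taxonomy above.
GUARDRAILS = {
    Tier.LOW: "log it; no approval",
    Tier.MEDIUM: "require a reviewable artifact",
    Tier.HIGH: "per-action human authorization and audit trail",
}


def assign_tier(profile: ActionProfile) -> Tier:
    """Check the high-tier conditions first so they always win."""
    if (
        not profile.reversible
        or profile.externally_visible
        or profile.touches_production
    ):
        return Tier.HIGH
    if profile.read_only:
        return Tier.LOW
    return Tier.MEDIUM  # reversible write inside the system of record


open_pr = ActionProfile("open_pr", read_only=False, reversible=True)
print(assign_tier(open_pr).name, "->", GUARDRAILS[assign_tier(open_pr)])
```

Checking the Tier 3 conditions before anything else means a single high-risk attribute cannot be masked by an otherwise benign profile.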
Quick check
Which pair of axes is most useful for classifying agent actions into risk tiers?
Where this shows up on the exam
GH-600 will hand you a scenario and ask which guardrail to apply. The shortcut is: classify the action first, then the right guardrail is usually the minimum control that matches the tier. Over-guarding low-tier actions is just as wrong as under-guarding high-tier ones.
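The "minimum control that matches the tier" idea can be sketched as a comparison between the control a tier requires and the control being proposed. The control names, their ordering, and the `assess_guardrail` function below are hypothetical, chosen only to illustrate that both mismatches are flagged.

```python
# Hypothetical controls ordered by strictness (higher means stricter).
CONTROL_STRENGTH = {
    "log_only": 1,
    "reviewable_artifact": 2,
    "human_authorization": 3,
}

# Minimum control assumed for each tier in this sketch.
REQUIRED_CONTROL = {
    "low": "log_only",
    "medium": "reviewable_artifact",
    "high": "human_authorization",
}


def assess_guardrail(tier: str, proposed: str) -> str:
    """Compare a proposed control with the minimum the tier calls for."""
    required = CONTROL_STRENGTH[REQUIRED_CONTROL[tier]]
    actual = CONTROL_STRENGTH[proposed]
    if actual < required:
        return "under-guarded"
    if actual > required:
        return "over-guarded"
    return "matched"


print(assess_guardrail("low", "human_authorization"))
print(assess_guardrail("high", "log_only"))
print(assess_guardrail("medium", "reviewable_artifact"))
```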
Key terms
- Blast radius: The scope of users, systems, or data an action can affect if it goes wrong. A staging branch has a small blast radius; production billing has a large one.
- Reversibility: Whether the action can be undone without external coordination. Opening a PR is reversible; sending an email or charging a card is not.
- Risk tier: A discrete bucket (e.g., low / medium / high) that pairs blast radius and reversibility with a required guardrail profile.
- Discover / Protect / Govern: Microsoft's Responsible AI pattern: discover risks before deployment, protect at model and runtime layers, govern through tracing and monitoring in production.
Common pitfalls
- Treating risk as a single dimension ("is it dangerous?") instead of a 2x2 of blast radius and reversibility. A tiny but irreversible action still belongs in the highest tier.
- Classifying by the *tool* (e.g., "shell is high risk") instead of by the *action* ("git status" is read-only; "rm -rf" is not). The taxonomy lives at the action level.
- Forgetting that read-only data access can still be high-risk when the data itself is sensitive (PII, customer financial records, security incident detail).
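The last two pitfalls can be illustrated with a classifier that looks at the concrete command rather than the tool, and that escalates read-only access when the target looks sensitive. Everything here is a hypothetical sketch: the prefix lists, the sensitivity markers, and the choice to treat unrecognized commands as medium are assumptions, and substring matching on paths is far too crude for real use.

```python
import shlex

# Illustrative lists only; a real deployment would maintain its own.
READ_ONLY_PREFIXES = [("git", "status"), ("git", "log"), ("ls",), ("cat",)]
DESTRUCTIVE_PREFIXES = [("rm",), ("git", "push", "--force")]
SENSITIVE_MARKERS = ("pii", "billing", "incident")


def _matches(tokens: tuple, prefixes: list) -> bool:
    """True if the command starts with any of the given token prefixes."""
    return any(tokens[: len(p)] == p for p in prefixes)


def classify_command(command: str) -> str:
    """Classify one shell command by what it does, not by the tool."""
    tokens = tuple(shlex.split(command))
    if _matches(tokens, DESTRUCTIVE_PREFIXES):
        return "high"
    if _matches(tokens, READ_ONLY_PREFIXES):
        # Read-only is low risk only when the data itself is not sensitive.
        if any(marker in command.lower() for marker in SENSITIVE_MARKERS):
            return "high"
        return "low"
    # Assumption: anything unrecognized gets reviewed rather than trusted.
    return "medium"


for cmd in ["git status", "rm -rf build", "cat exports/customer_pii.csv"]:
    print(f"{cmd!r}: {classify_command(cmd)}")
```

The same tool (the shell) lands in all three tiers here, which is the point: the taxonomy attaches to the action and the data it touches.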