Designing Inspectable Artifacts
An inspectable artifact is what turns an agent run from a black box into a reviewable change. Learn which artifacts the exam expects, how they compose, and the pitfalls that make them useless in practice.
An agent that runs without leaving an artifact is, for review purposes, a rumour. The exam expects you to know which artifacts to produce, which properties make them actually inspectable, and which traps make them theatre.
The minimum artifact bundle
For any side-effecting run, the agent should leave behind:
- The plan: the structured plan that was approved at the boundary.
- The trace: every model call and tool call, joined by a stable trace ID.
- The diff (or equivalent output): what actually changed in the world.
- The outcome: success / failure with a classified reason.
The canonical packaging for a code-modifying agent is a pull request: plan in the description, diff in the files, tool calls and trace linked from the body.
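As a rough sketch of how the four pieces might travel together, the bundle can be modelled as one record keyed by the trace ID. The class name `ArtifactBundle`, its field names, and the `record_call` helper are hypothetical, chosen only for illustration; a real system would more likely store these as a pull request plus linked logs.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ArtifactBundle:
    """Hypothetical container for the four artifacts of one side-effecting run."""

    trace_id: str                 # stable ID shared by every artifact of the run
    plan: list[str]               # the approved plan, one step per entry
    trace: list[dict] = field(default_factory=list)  # model and tool calls, in order
    diff: str = ""                # what actually changed in the world
    outcome: str = "success"      # "success" or "failure"
    reason: str = ""              # classified reason, e.g. "validation_failed"

    def record_call(self, kind: str, name: str, **details) -> None:
        """Append one model or tool call, stamped with the bundle's trace ID."""
        self.trace.append(
            {"trace_id": self.trace_id, "kind": kind, "name": name, **details}
        )

    def to_json(self) -> str:
        """Serialise the whole bundle so it can be stored or linked from a PR."""
        return json.dumps(asdict(self), indent=2)


# Illustrative usage with made-up values.
bundle = ArtifactBundle(trace_id="run-0001", plan=["read config", "raise timeout"])
bundle.record_call("tool", "write_file", path="config.yaml")
bundle.diff = "-timeout: 30\n+timeout: 60"
```

Keeping the trace ID on the bundle and on every recorded call means each piece can still be traced back to its run if the pieces end up stored separately.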
Properties that make an artifact inspectable
| Property | Why it matters |
| --- | --- |
| Durable | Outlives the agent's session, the deploy, the on-call rotation. |
| Addressable | Has a stable URL or ID a reviewer can paste into Slack two weeks later. |
| Granular | Line-by-line, step-by-step; coarse summaries hide the bugs. |
| Reproducible | The inputs are captured well enough to re-run the agent. |
| Accessible | Reviewers have permission to read it without a ticket. |
Failure-path artifacts matter more
A common trap: the agent emits gorgeous artifacts on the happy path and nothing but a bare "error: 500" on failure. This is exactly backwards. On failure you need more detail, not less:
- the last successful step,
- the tool call that broke,
- the classified error (permission_denied / validation_failed / transient),
- the partial plan up to the failure,
- a one-line "what would I need to retry?" hint.
Exam tip: when a question contrasts two designs, the one that produces a richer failure artifact is almost always the right answer.
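The failure-path fields listed above could be captured along the following lines. This is a minimal sketch, not a prescribed design: `FailureArtifact`, `classify`, and `run_plan` are hypothetical names, the mapping from Python exceptions to error classes is an assumption, and the extra "unclassified" fallback is not one of the three classes named in the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FailureArtifact:
    trace_id: str
    last_successful_step: Optional[str]  # None if the very first step failed
    failed_tool_call: dict               # the call that broke, with its error
    error_class: str                     # classified error
    partial_plan: list[str]              # plan steps up to and including the failure
    retry_hint: str                      # one-line "what would I need to retry?"


def classify(exc: Exception) -> str:
    """Assumed mapping from exception types to the error classes in the text."""
    if isinstance(exc, PermissionError):
        return "permission_denied"
    if isinstance(exc, (ValueError, TypeError)):
        return "validation_failed"
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return "transient"
    return "unclassified"  # assumption: a fallback beyond the three named classes


def run_plan(
    trace_id: str, plan: list[str], execute: Callable[[str], None]
) -> Optional[FailureArtifact]:
    """Run steps in order; return None on success, a FailureArtifact on first error."""
    done: list[str] = []
    for step in plan:
        try:
            execute(step)
        except Exception as exc:
            error_class = classify(exc)
            return FailureArtifact(
                trace_id=trace_id,
                last_successful_step=done[-1] if done else None,
                failed_tool_call={"step": step, "error": repr(exc)},
                error_class=error_class,
                partial_plan=done + [step],
                retry_hint=f"resolve {error_class} on step '{step}', then resume from it",
            )
        done.append(step)
    return None
```

The point of the sketch is that the failure branch produces a structured record with more fields than the success branch, rather than a bare status code.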
The trace ID is the join key
Every artifact carries the same trace ID. That ID is what joins:
plan_v1 -- trace_id --> tool_call_1 -- trace_id --> diff -- trace_id --> deploy_log
Without a stable trace ID you have files, not an investigation.
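To make the join concrete, here is a small sketch that groups mixed artifact records by their trace ID. The record shape (dicts with `trace_id` and `kind` keys) and the function name `join_by_trace_id` are assumptions for illustration only.

```python
from collections import defaultdict


def join_by_trace_id(records: list[dict]) -> tuple[dict, list[dict]]:
    """Group artifact records by trace_id; records without one are returned as orphans."""
    runs: dict = defaultdict(list)
    orphans: list[dict] = []
    for rec in records:
        trace_id = rec.get("trace_id")
        if trace_id:
            runs[trace_id].append(rec)
        else:
            orphans.append(rec)  # cannot be tied to any run
    return dict(runs), orphans


# Illustrative records from different stores, in no particular order.
records = [
    {"kind": "deploy_log", "trace_id": "run-0001"},
    {"kind": "plan_v1", "trace_id": "run-0001"},
    {"kind": "diff", "trace_id": "run-0002"},
    {"kind": "tool_call_1"},  # missing trace_id
]
runs, orphans = join_by_trace_id(records)
```

In this example the record without a trace ID lands in `orphans`: it exists as a file, but it cannot be attached to any run.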
Quick check
Which of the following is the strongest single example of an inspectable artifact for a code-modifying agent?
Where this shows up on the exam
Several Domain 1 questions phrase inspectability as "the next morning". Picture a reviewer arriving with a coffee, opening Slack, and trying to understand what the agent did overnight. If your design lets them do that in under five minutes, with plan, diff, trace, and classified outcome all at hand, you are answering the question correctly.
Key terms
- Inspectable artifact: A durable, addressable, time-stable output of an agent run that a human reviewer can read, comment on, and reproduce later.
- Trace: An end-to-end record of an agent run (every model call, tool call, plan, and outcome), joined by a stable trace ID.
- Pull request as artifact: The canonical inspectable artifact for code-changing agents: plan in the description, diff in the files, tool calls linked from the body, trace ID in metadata.
- Reproducibility: The ability to re-run the agent against the same inputs and see why it produced the same (or different) output.
Common pitfalls
- Rotating trace logs aggressively to save storage, then being unable to investigate an incident filed a week later.
- Storing the inspectable artifact behind authenticated dashboards that the actual reviewers don't have access to.
- Producing artifacts only for *successful* runs. Failed runs are where you need inspectability the most.