Designing Inspectable Artifacts

An inspectable artifact is what turns an agent run from a black box into a reviewable change. Learn which artifacts the exam expects, how they compose, and the pitfalls that make them useless in practice.

After this topic, you'll be confident about four concepts: inspectable artifact, trace, pull request as artifact, and reproducibility.

An agent that runs without leaving an artifact is, for review purposes, a rumour. The exam expects you to know which artifacts to produce, which properties make them actually inspectable, and which traps make them theatre.

The minimum artifact bundle

For any side-effecting run, the agent should leave behind:

The plan — the structured plan that was approved at the boundary.
The trace — every model call and tool call, joined by a stable trace ID.
The diff (or equivalent output) — what actually changed in the world.
The outcome — success / failure with a classified reason.
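
The four items above can be sketched as a single record. This is a minimal illustration, not a prescribed schema: the class names `ArtifactBundle` and `ToolCall`, the field names, and the rule that a failure must carry a reason are all assumptions made for the example.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional
import json


@dataclass
class ToolCall:
    """One entry in the trace (hypothetical shape)."""
    name: str
    arguments: dict
    result_summary: str


@dataclass
class ArtifactBundle:
    """Plan, trace, diff, and outcome for one side-effecting run."""
    trace_id: str                 # shared by every artifact of the run
    plan: List[str]               # the plan approved at the boundary
    trace: List[ToolCall]         # tool calls made during the run
    diff: str                     # what actually changed
    outcome: str                  # "success" or "failure"
    reason: Optional[str] = None  # classified reason, expected on failure

    def validate(self) -> None:
        """Reject bundles that would be unreviewable (assumed rules)."""
        if self.outcome not in ("success", "failure"):
            raise ValueError(f"unknown outcome: {self.outcome!r}")
        if self.outcome == "failure" and not self.reason:
            raise ValueError("a failed run needs a classified reason")

    def to_json(self) -> str:
        """Serialize the whole bundle so it can be stored durably."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Validating before storing keeps a half-empty bundle from being mistaken for a complete one.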

The canonical packaging for a code-modifying agent is a pull request: plan in the description, diff in the files, tool calls and trace linked from the body.
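
As a rough sketch of that packaging, the function below renders a pull request description from a plan, a list of tool calls, and a trace link. The function name `render_pr_body`, the section headings, and the example URL are hypothetical; a real agent would follow whatever template its code host and team use.

```python
from typing import List


def render_pr_body(plan_steps: List[str], tool_calls: List[str],
                   trace_id: str, trace_url: str) -> str:
    """Build a PR description: plan first, then tool calls, then the trace link."""
    lines = ["## Plan", ""]
    for number, step in enumerate(plan_steps, start=1):
        lines.append(f"{number}. {step}")

    lines += ["", "## Tool calls", ""]
    for call in tool_calls:
        lines.append(f"- `{call}`")

    # The trace ID appears in the body so a reviewer can find the full trace.
    lines += ["", "## Trace", "",
              f"Trace ID: `{trace_id}`",
              f"Full trace: {trace_url}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_pr_body(
        plan_steps=["Bump the dependency", "Run the test suite"],
        tool_calls=["run_tests(path='.')"],
        trace_id="t-123",
        trace_url="https://example.invalid/traces/t-123",  # placeholder URL
    ))
```

The diff itself needs no rendering here, because the pull request's changed files already carry it.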

Properties that make an artifact inspectable

| Property | Why it matters |
| --- | --- |
| Durable | Outlives the agent's session, the deploy, the on-call rotation. |
| Addressable | Has a stable URL or ID a reviewer can paste into Slack two weeks later. |
| Granular | Line-by-line, step-by-step; coarse summaries hide the bugs. |
| Reproducible | The inputs are captured well enough to re-run the agent. |
| Accessible | Reviewers have permission to read it without a ticket. |

Failure-path artifacts matter more

A common trap: the agent emits gorgeous artifacts on the happy path and nothing but a bare "error: 500" on failure. This is exactly backwards. On failure you need more detail, not less:

the last successful step,
the tool call that broke,
the classified error (permission_denied / validation_failed / transient),
the partial plan up to the failure,
a one-line "what would I need to retry?" hint.
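
One way to capture the five items above is to build the failure record at the point where a step raises. The sketch below is illustrative only: `FailureArtifact`, `classify_error`, `run_steps`, the mapping from Python exception types to error classes, the retry hints, and the extra `unclassified` fallback are all assumptions, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical retry hints, keyed by the classified error.
RETRY_HINTS = {
    "permission_denied": "Grant the missing permission, then re-run.",
    "validation_failed": "Fix the rejected input, then re-run.",
    "transient": "Re-run as is; the failure may not repeat.",
    "unclassified": "Inspect the trace before retrying.",
}


@dataclass
class FailureArtifact:
    trace_id: str
    last_successful_step: Optional[str]
    failed_tool_call: str
    error_class: str
    partial_plan: List[str]
    retry_hint: str


def classify_error(exc: Exception) -> str:
    """Map an exception to an error class (assumed mapping)."""
    if isinstance(exc, PermissionError):
        return "permission_denied"
    if isinstance(exc, ValueError):
        return "validation_failed"
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return "transient"
    return "unclassified"  # fallback added for this sketch


def run_steps(trace_id: str,
              steps: List[Tuple[str, Callable[[], None]]]
              ) -> Optional[FailureArtifact]:
    """Run steps in order; return a failure artifact for the first one that raises."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            error_class = classify_error(exc)
            return FailureArtifact(
                trace_id=trace_id,
                last_successful_step=completed[-1] if completed else None,
                failed_tool_call=name,
                error_class=error_class,
                partial_plan=completed + [name],
                retry_hint=RETRY_HINTS[error_class],
            )
        completed.append(name)
    return None  # every step succeeded
```

Returning a structured record instead of re-raising means the failure path produces an artifact of the same quality as the happy path.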

Exam tip: when a question contrasts two designs, the one that produces a richer failure artifact is almost always the right answer.

The trace ID is the join key

Every artifact carries the same trace ID. That ID is what joins:

plan_v1 ── trace_id ──> tool_call_1 ── trace_id ──> diff ── trace_id ──> deploy_log

Without a stable trace ID you have files, not an investigation.
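
To make the join concrete, the sketch below groups loose records by trace ID and reports which kinds of artifact a run is missing. The record shape (dicts with `trace_id` and `kind` keys) and the set of required kinds are assumptions chosen to mirror the diagram above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Assumed artifact kinds, mirroring plan -> tool call -> diff -> deploy log.
REQUIRED_KINDS = {"plan", "tool_call", "diff", "deploy_log"}


def join_by_trace_id(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group records from different stores into one list per run."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["trace_id"]].append(record)
    return dict(grouped)


def missing_kinds(run: List[dict]) -> Set[str]:
    """Return the artifact kinds a reviewer would not find for this run."""
    return REQUIRED_KINDS - {record["kind"] for record in run}


if __name__ == "__main__":
    records = [
        {"trace_id": "t-1", "kind": "plan"},
        {"trace_id": "t-2", "kind": "plan"},
        {"trace_id": "t-1", "kind": "tool_call"},
        {"trace_id": "t-1", "kind": "diff"},
        {"trace_id": "t-1", "kind": "deploy_log"},
    ]
    for trace_id, run in join_by_trace_id(records).items():
        print(trace_id, sorted(missing_kinds(run)))
```

A record without the shared ID cannot be placed in any group, which is the practical meaning of having files rather than an investigation.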

Where this shows up on the exam

Several Domain 1 questions phrase inspectability as "the next morning". Picture a reviewer arriving with a coffee, opening Slack, and trying to understand what the agent did overnight. If your design lets them do that in under five minutes — plan, diff, trace, classified outcome — you are answering the question correctly.

Anchor concepts

Key terms

Inspectable artifact: A durable, addressable, time-stable output of an agent run that a human reviewer can read, comment on, and reproduce later.
Trace: An end-to-end record of an agent run — every model call, tool call, plan, and outcome — joined by a stable trace ID.
Pull request as artifact: The canonical inspectable artifact for code-changing agents: plan in the description, diff in the files, tool calls linked from the body, trace ID in metadata.
Reproducibility: The ability to re-run the agent against the same inputs and understand why its output matched or differed from the original run.

Watch out

Common pitfalls

Rotating trace logs aggressively to save storage — then being unable to investigate an incident filed a week later.
Storing the inspectable artifact behind authenticated dashboards that the actual reviewers don't have access to.
Producing artifacts only for *successful* runs. Failed runs are where you need inspectability the most.