Tune Instructions, Workflows and Constraints

Tuning an agent is rarely about changing the model — it is about tightening the instructions, the workflow steps, and the constraints around tool use. This topic walks through how to iterate on those levers safely using evaluators as a guard.

Most "we need a better model" instincts are misdiagnoses. The cheaper, safer levers are instructions, workflow steps, and constraints. Microsoft Foundry's agent model — Model + Instructions + Tools — makes the levers explicit, and version snapshotting makes iteration reversible.

The four levers, from cheapest to most expensive

Lever	When to use it	Risk
Instructions edit	Behaviour drift, missing reminders, formatting	Bloated prompt, contradictions
Workflow / constraint change	Missing or mis-ordered steps	More rigid agent, less flexibility
Tool config change	Wrong tool selected, bad schema, ambiguous description	Breaks existing flows that depended on the old behaviour
Model swap	Reasoning failures the other levers cannot fix	Largest blast radius; revalidate everything

The rule: change one lever at a time, re-run a fixed evaluation dataset, and keep the version snapshot so you can revert.
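
The rule above can be sketched as a small loop. This is a minimal illustration, not Foundry's API: `AgentConfig`, `tune_once`, and `toy_evaluator` are hypothetical names, and the evaluator is a stand-in for whatever evaluators you actually run. The history list plays the role of version snapshots, so a rejected change leaves the previous version in place.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical snapshot of the four levers; not a Foundry type."""
    model: str
    instructions: str
    workflow: Tuple[str, ...]
    tools: Tuple[str, ...]


def mean_score(config, dataset, evaluator):
    """Average evaluator score over the fixed dataset."""
    return sum(evaluator(config, case) for case in dataset) / len(dataset)


def tune_once(history, lever, new_value, dataset, evaluator):
    """Change one lever, re-evaluate, then keep the change or leave history untouched."""
    baseline = history[-1]
    candidate = replace(baseline, **{lever: new_value})
    improved = (mean_score(candidate, dataset, evaluator)
                > mean_score(baseline, dataset, evaluator))
    if improved:
        history.append(candidate)  # new snapshot; older ones remain for rollback
    return improved


def toy_evaluator(config, case):
    """Stand-in evaluator: passes only if verification precedes the refund."""
    steps = config.workflow
    if "verify_ownership" not in steps or "issue_refund" not in steps:
        return 0.0
    return 1.0 if steps.index("verify_ownership") < steps.index("issue_refund") else 0.0


dataset = ["order-1", "order-2", "order-3"]  # fixed between runs
history = [AgentConfig("model-a", "Handle refunds.", ("issue_refund",), ("refund_tool",))]

# One lever per call: first the workflow, then (separately) the instructions.
kept = tune_once(history, "workflow", ("verify_ownership", "issue_refund"),
                 dataset, toy_evaluator)
rejected = tune_once(history, "instructions", "Handle refunds politely.",
                     dataset, toy_evaluator)
```

Because each call changes exactly one field, any score movement can be attributed to that lever, and `history[0]` is still available if a later change needs to be reverted.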

Try a tune and see the outcome

Your refund agent occasionally issues refunds without first verifying that the customer owns the order. Evaluation shows the failures cluster in a single failure mode. You have to pick a tune.

Which tune do you try first?

Quick check

You change four things in the agent — model, instructions, tool list, and workflow branch — and re-run the eval. Score improves. What is wrong with this tune?

Where this shows up on the exam

GH-600 questions will hand you a failure cluster and four candidate tunes. The right answer is almost always the cheapest lever that actually addresses the failure mode — and it is almost never "swap the model". When the failure is a missing step, the right answer is a workflow constraint, not a longer prompt.
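
One way to internalise "cheapest lever that addresses the failure mode" is to treat the lever table as an ordered lookup. The sketch below is illustrative only: the failure-mode labels are paraphrased from the table earlier in this topic and are not an official taxonomy, and `cheapest_lever` is a hypothetical helper.

```python
# Levers ordered cheapest first, each with the failure modes it addresses.
# Labels are illustrative strings taken from the lever table, not a standard.
LEVERS = [
    ("instructions edit", {"behaviour drift", "missing reminder", "formatting"}),
    ("workflow / constraint change", {"missing step", "mis-ordered step"}),
    ("tool config change", {"wrong tool selected", "bad schema", "ambiguous description"}),
    ("model swap", {"reasoning failure"}),
]


def cheapest_lever(failure_mode: str) -> str:
    """Return the first (cheapest) lever that addresses the given failure mode."""
    for lever, addresses in LEVERS:
        if failure_mode in addresses:
            return lever
    raise ValueError(f"unknown failure mode: {failure_mode!r}")


print(cheapest_lever("missing step"))
```

Note that "model swap" sits last in the list, so it is only returned when none of the cheaper levers covers the failure.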

Anchor concepts

Key terms

Instructions: The agent's prompt-based definition of goals, constraints, and behaviour. In Microsoft Foundry, prompt agents are defined almost entirely through instructions plus tool config.
Workflow agent: A declarative orchestration of multiple steps or agents — built visually or in YAML — that supports branching, human-in-the-loop, and group-chat patterns.
Constraint: An explicit rule the agent must follow: required tool, forbidden action, mandatory verification step, output schema, refusal condition.
Tuning loop: Iterative cycle of observe failures → form a hypothesis → change one lever (instructions, workflow step or constraint, tool config, or, as a last resort, model) → re-evaluate against a fixed dataset → keep or revert.
Versioning: Foundry snapshots agent versions automatically so you can roll back or A/B compare; tuning without versioning makes regressions irreversible.
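
To make the Constraint term concrete, here is a minimal sketch of a mandatory verification step enforced in the workflow itself rather than requested in the prompt. It uses the refund scenario from earlier in this topic. `RefundWorkflow`, `ConstraintViolation`, and the step names are hypothetical; this is plain Python, not Foundry's workflow format.

```python
class ConstraintViolation(Exception):
    """Raised when a step runs before its mandatory prerequisite."""


class RefundWorkflow:
    """Hypothetical workflow wrapper; step names are illustrative."""

    def __init__(self, owners):
        self._owners = owners    # order_id -> customer_id (stand-in for a lookup tool)
        self._verified = set()   # (customer_id, order_id) pairs verified in this run
        self.refunded = []

    def verify_ownership(self, customer_id, order_id):
        """Record the pair as verified only if the customer owns the order."""
        if self._owners.get(order_id) == customer_id:
            self._verified.add((customer_id, order_id))
            return True
        return False

    def issue_refund(self, customer_id, order_id):
        # The constraint lives in code, so prompt wording cannot skip it.
        if (customer_id, order_id) not in self._verified:
            raise ConstraintViolation("verify_ownership must succeed before issue_refund")
        self.refunded.append(order_id)
        return f"refunded {order_id}"


wf = RefundWorkflow({"order-1": "alice"})
try:
    wf.issue_refund("alice", "order-1")  # no verification yet
    blocked = False
except ConstraintViolation:
    blocked = True

wf.verify_ownership("alice", "order-1")
result = wf.issue_refund("alice", "order-1")
```

The design choice is that the refund step checks for a prior successful verification, so a missing step fails loudly instead of depending on the model remembering an instruction.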

Watch out

Common pitfalls

Changing several levers at once. When the score moves you cannot tell which change caused it.
Tuning to the model output you saw last time instead of to a fixed evaluation dataset. You end up optimising for noise.
Adding more text to the instructions for every failure. The instructions become bloated and contradictory, and their quality drops.
Skipping versioning and snapshotting, so rolling back a bad tune means rewriting the previous version from memory.
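
The first pitfall can be caught mechanically before an evaluation run. The sketch below assumes each agent version is available as a plain dictionary keyed by lever name; that representation and the helper names are assumptions for illustration, not how Foundry stores versions.

```python
def changed_levers(before: dict, after: dict) -> list:
    """Return the sorted names of levers whose values differ between two snapshots."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))


def assert_single_lever(before: dict, after: dict) -> str:
    """Raise unless exactly one lever changed; return its name otherwise."""
    changed = changed_levers(before, after)
    if len(changed) != 1:
        raise ValueError(f"expected exactly one changed lever, got {changed}")
    return changed[0]


v1 = {
    "model": "model-a",
    "instructions": "Handle refunds.",
    "tools": ["refund_tool"],
    "workflow": ["issue_refund"],
}
# A disciplined tune: only the instructions differ.
v2 = {**v1, "instructions": "Handle refunds. Verify ownership first."}
# The quiz scenario: model, instructions, tool list, and workflow all changed.
v3 = {
    "model": "model-b",
    "instructions": "Handle refunds carefully.",
    "tools": ["refund_tool", "order_lookup"],
    "workflow": ["verify_ownership", "issue_refund"],
}

print(changed_levers(v1, v2))
print(changed_levers(v1, v3))
```

Calling `assert_single_lever(v1, v3)` raises, which is the point: a score change between those two versions could not be attributed to any one lever.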