Run a Refinement Loop Today: A Sequential Procedure

This is a procedure, not an essay. If you have a task in front of you right now and want to refine a model's output into something good, follow these steps in order. Each step tells you exactly what to do and what to produce before moving on. The whole sequence is designed so you never have to guess what comes next.

The procedure assumes you understand why looping beats one-shot prompting. If you do not, read Mastering the Generate-Critique-Revise Cycle With Prompts first for the rationale, then come back here to execute. Beginners may want Refining Model Output by Looping: A Plain Introduction for a gentler on-ramp.

Work through the steps once on a real task. The procedure is short enough to internalize after a few runs, at which point it becomes second nature rather than a checklist.

Step One: Write the Standard First

Before you generate anything, write down what a good output looks like.

Do this

List three to five specific qualities the output must have.
Make each one checkable: you can point at the output and say yes or no.
Tie them to the actual task and reader, not generic quality.

Produce this

A short written standard you will hold every draft against. Do not skip to generation without it; a missing standard is the top cause of loops that never finish.

Step Two: Generate a Probe Draft

Now get a first draft, treating it as a probe rather than a product.

Do this

Write a plain request for what you want.
Do not over-engineer it; details get fixed in the loop.
Generate and read.

Produce this

A concrete first draft to react to. Its only job is to give you something real to evaluate against the standard from step one.

Step Three: Compare the Draft to the Standard

Hold the draft against your written standard, item by item.

Do this

Go through each standard item and mark pass or fail.
For each failure, note the location and the specific gap.
Ignore your general impression; judge against the list.

Produce this

A short list of specific failures, each with a place and a problem. This is your critique. Vague critique here produces vague revision later, so be precise. The reader-fitness version of this check is in Tailoring Prompts to Readers: Direct Answers to Real Questions.

Step Four: Pick the One Highest-Value Failure

Do not try to fix everything at once. Choose one.

Do this

From your failure list, pick the single one that matters most.
Usually this is the failure that would most stop someone from using the output.
Set the rest aside for later loops.

Produce this

One target for this iteration. Fixing one thing at a time is what keeps cause and effect legible, which the mistakes piece, 7 Common Mistakes with Prompting for Iterative Refinement Loops, treats as essential.

Step Five: Revise With a Targeted Instruction

Ask for exactly that one change and nothing else.

Do this

Name the location and the change you want.
Explicitly say to keep everything else the same.
Generate the revised draft.

Produce this

A new draft that should address your one target. Phrasing matters: "Rewrite the opening so it states the conclusion first; leave the rest unchanged" beats "make the opening better."

Step Six: Verify the Fix Without Regression

Check that the change worked and did not break anything else.

Do this

Confirm the target failure now passes.
Re-check the items that previously passed to make sure they still do.
Note any new failure the change introduced.

Produce this

An updated pass/fail picture. If the fix worked cleanly, move on. If it broke something, decide whether to keep the change and fix the new issue, or revert and try a different instruction.

Step Seven: Decide Loop or Stop

Make a single decision: go around again or finish.

Stop if

Every standard item now passes.
Remaining failures cost more to fix than they are worth.
The last revision produced no improvement anyone would notice.

Loop if

Standard items still fail.
Return to step four, pick the next highest-value failure, and continue.

Produce this

Either a finished output or a clear reason to run one more iteration. The discipline of stopping is covered in Prompting for Iterative Refinement Loops: Best Practices That Actually Work.

Running the Procedure at Speed

Once you have done this a few times, the steps compress.

What it becomes

Standard written quickly from memory of the task.
Probe, compare, pick one, revise, verify in a fast rhythm.
Stop decision made almost automatically against the standard.

Keeping the discipline

The temptation as you speed up is to drop the standard or batch changes. Resist both. The speed comes from familiarity with the steps, not from skipping them. The full rationale for each step lives in the complete guide.

A Worked Example of the Full Sequence

To make the procedure concrete, here is the whole sequence applied to a single task: turning a rough draft of a product announcement into a finished one.

The run, step by step

Standard: it must lead with the customer benefit, stay under 150 words, and name one clear next step.
Probe draft: a plain request produces an announcement that leads with company history and runs long.
Compare: it fails the benefit-first item, fails the length item, passes the next-step item.
Pick one: the benefit-first failure matters most, since a reader who does not see the benefit will not read on.
Revise: "Rewrite the opening to lead with what the customer gains; keep the rest." The new opening states the benefit.
Verify: benefit-first now passes, next-step still passes, length still fails.
Loop: pick the length failure, instruct "cut to under 150 words without losing the benefit or the next step," verify all three pass.
Stop: every standard item passes, so the loop ends.

Two iterations, each changing one thing, produced a finished announcement. Notice that the standard did all the heavy lifting: it told you what to fix, what order roughly to fix it in, and exactly when to stop.

Adapting the Procedure to Harder Tasks

The same seven steps scale to more complex work; only the standard and the number of loops change.

What shifts on a harder task

The standard has more items, so the compare step takes longer.
Picking the highest-value failure requires more judgment when several matter.
Verifying for regression matters more, because changes interact more in complex output.

What stays the same

The shape never changes: standard first, probe, compare, pick one, revise, verify, decide. A harder task means more loops, not different steps. Keeping the procedure identical across easy and hard tasks is what lets it become automatic. The opinionated reasoning behind holding this shape is in Prompting for Iterative Refinement Loops: Best Practices That Actually Work.

Frequently Asked Questions

Can I skip writing the standard if the task is simple?

You can hold a simple standard in your head, but write it down for anything with real stakes. The cost is a minute; the benefit is a loop that finishes. Most loops that wander do so because the standard was never made explicit, simple task or not.

What if no single failure is clearly the most important?

Pick the one a reader would hit first, or the one blocking the others. Order matters less than picking exactly one. The goal of choosing a single target is to keep each revision legible, so any reasonable choice beats trying to fix several at once.

How do I phrase a revision instruction well?

Name the location, state the change, and say to leave the rest alone. Specific and bounded beats broad and vague every time. "Tighten the third paragraph to two sentences, keep the others" works; "make it punchier" does not give the model enough to act on.

What do I do when a fix breaks something else?

Decide between two paths: keep the change and add the new issue to your failure list, or revert and try a different instruction. Because you changed only one thing, reverting is clean. This is the payoff of the one-change-at-a-time rule.

How many times should I run the loop?

Until the standard fully passes, which is usually three to five iterations. If you are going much longer, your standard is probably vague or you are batching changes. Both shorten the loop once corrected. The standard is also your finish line.

Can I automate this procedure?

The structure automates well: a fixed standard, a fixed critique format, and one change per step are exactly what a repeatable process needs. The judgment of what good looks like and which failure matters most stays human, but the mechanics around it standardize cleanly across many tasks.

Key Takeaways

Write a concrete, checkable standard before generating anything.
Treat the first draft as a probe to react to, not a finished product.
Compare against the standard item by item and produce specific, located failures.
Fix one highest-value failure per iteration and verify it without regression.
Stop when the standard fully passes; the standard is your finish line.

Work through the steps once on a real task. The procedure is short enough to internalize after a few runs, at which point it becomes second nature rather than a checklist.

Step One: Write the Standard First

Before you generate anything, write down what a good output looks like.

Do this

List three to five specific qualities the output must have.
Make each one checkable: you can point at the output and say yes or no.
Tie them to the actual task and reader, not generic quality.

Produce this

A short written standard you will hold every draft against. Do not skip to generation without it; a missing standard is the top cause of loops that never finish.

Step Two: Generate a Probe Draft

Now get a first draft, treating it as a probe rather than a product.

Do this

Write a plain request for what you want.
Do not over-engineer it; details get fixed in the loop.
Generate and read.

Produce this

A concrete first draft to react to. Its only job is to give you something real to evaluate against the standard from step one.

Step Three: Compare the Draft to the Standard

Hold the draft against your written standard, item by item.

Do this

Go through each standard item and mark pass or fail.
For each failure, note the location and the specific gap.
Ignore your general impression; judge against the list.

Produce this

Step Four: Pick the One Highest-Value Failure

Do not try to fix everything at once. Choose one.

Do this

From your failure list, pick the single one that matters most.
Usually this is the failure that would most stop someone from using the output.
Set the rest aside for later loops.

Produce this

Step Five: Revise With a Targeted Instruction

Ask for exactly that one change and nothing else.

Do this

Name the location and the change you want.
Explicitly say to keep everything else the same.
Generate the revised draft.

Produce this

A new draft that should address your one target. Phrasing matters: "Rewrite the opening so it states the conclusion first; leave the rest unchanged" beats "make the opening better."

Step Six: Verify the Fix Without Regression

Check that the change worked and did not break anything else.

Do this

Confirm the target failure now passes.
Re-check the items that previously passed to make sure they still do.
Note any new failure the change introduced.

Produce this

An updated pass/fail picture. If the fix worked cleanly, move on. If it broke something, decide whether to keep the change and fix the new issue, or revert and try a different instruction.

Step Seven: Decide Loop or Stop

Make a single decision: go around again or finish.

Stop if

Every standard item now passes.
Remaining failures cost more to fix than they are worth.
The last revision produced no improvement anyone would notice.

Loop if

Standard items still fail.
Return to step four, pick the next highest-value failure, and continue.

Produce this

Either a finished output or a clear reason to run one more iteration. The discipline of stopping is covered in Prompting for Iterative Refinement Loops: Best Practices That Actually Work.

Running the Procedure at Speed

Once you have done this a few times, the steps compress.

What it becomes

Standard written quickly from memory of the task.
Probe, compare, pick one, revise, verify in a fast rhythm.
Stop decision made almost automatically against the standard.

Keeping the discipline

A Worked Example of the Full Sequence

To make the procedure concrete, here is the whole sequence applied to a single task: turning a rough draft of a product announcement into a finished one.

The run, step by step

Standard: it must lead with the customer benefit, stay under 150 words, and name one clear next step.
Probe draft: a plain request produces an announcement that leads with company history and runs long.
Compare: it fails the benefit-first item, fails the length item, passes the next-step item.
Pick one: the benefit-first failure matters most, since a reader who does not see the benefit will not read on.
Revise: "Rewrite the opening to lead with what the customer gains; keep the rest." The new opening states the benefit.
Verify: benefit-first now passes, next-step still passes, length still fails.
Loop: pick the length failure, instruct "cut to under 150 words without losing the benefit or the next step," verify all three pass.
Stop: every standard item passes, so the loop ends.

Adapting the Procedure to Harder Tasks

The same seven steps scale to more complex work; only the standard and the number of loops change.

What shifts on a harder task

The standard has more items, so the compare step takes longer.
Picking the highest-value failure requires more judgment when several matter.
Verifying for regression matters more, because changes interact more in complex output.

What stays the same

Frequently Asked Questions

Can I skip writing the standard if the task is simple?

What if no single failure is clearly the most important?

How do I phrase a revision instruction well?

What do I do when a fix breaks something else?

How many times should I run the loop?

Can I automate this procedure?

Key Takeaways

Write a concrete, checkable standard before generating anything.
Treat the first draft as a probe to react to, not a finished product.
Compare against the standard item by item and produce specific, located failures.
Fix one highest-value failure per iteration and verify it without regression.
Stop when the standard fully passes; the standard is your finish line.

Run a Refinement Loop Today: A Sequential Procedure

Step One: Write the Standard First

Do this

Produce this

Step Two: Generate a Probe Draft

Do this

Produce this

Step Three: Compare the Draft to the Standard

Do this

Produce this

Step Four: Pick the One Highest-Value Failure

Do this

Produce this

Step Five: Revise With a Targeted Instruction

Do this

Produce this

Step Six: Verify the Fix Without Regression

Do this

Produce this

Step Seven: Decide Loop or Stop

Stop if

Loop if

Produce this

Running the Procedure at Speed

What it becomes

Keeping the discipline

A Worked Example of the Full Sequence

The run, step by step

Adapting the Procedure to Harder Tasks

What shifts on a harder task

What stays the same

Frequently Asked Questions

Can I skip writing the standard if the task is simple?

What if no single failure is clearly the most important?

How do I phrase a revision instruction well?

What do I do when a fix breaks something else?

How many times should I run the loop?

Can I automate this procedure?

Key Takeaways

Agency Script Editorial

Related Articles

Rolling Out AI Hallucinations Across a Team

A Model Behind an API Is Only Potential

Case Study: Large Language Models in Practice

Ready to certify your AI capability?

Run a Refinement Loop Today: A Sequential Procedure

Step One: Write the Standard First

Do this

Produce this

Step Two: Generate a Probe Draft

Do this

Produce this

Step Three: Compare the Draft to the Standard

Do this

Produce this

Step Four: Pick the One Highest-Value Failure

Do this

Produce this

Step Five: Revise With a Targeted Instruction

Do this

Produce this

Step Six: Verify the Fix Without Regression

Do this

Produce this

Step Seven: Decide Loop or Stop

Stop if

Loop if

Produce this

Running the Procedure at Speed

What it becomes

Keeping the discipline

A Worked Example of the Full Sequence

The run, step by step

Adapting the Procedure to Harder Tasks

What shifts on a harder task

What stays the same

Frequently Asked Questions

Can I skip writing the standard if the task is simple?

What if no single failure is clearly the most important?

How do I phrase a revision instruction well?