Running Stateful Conversations Without Losing the Thread

A playbook is different from a guide. A guide explains concepts; a playbook tells you what to do, when to do it, and who is responsible. This is the operating playbook for keeping dialogue state coherent across a long conversation — a set of named plays, the triggers that fire each one, the owner accountable for it, and the sequence that ties them together.

Treat it as something you can hand to a team and have them execute without you in the room. Each play below is small and self-contained. The value is in the sequencing: contradiction handling fires before compaction, validation fires before commit, and reconciliation fires on every consequential turn. Run the plays out of order and the system breaks in familiar ways, such as a summary that preserves a value that was about to be overwritten.

The plays assume the architecture established elsewhere in this series — a canonical state object owned by your code, with the model proposing updates. If that foundation is unfamiliar, start with Tracking Conversation State When Prompts Get Complicated and come back.

Play 1: Initialize State

The first play fires when a conversation begins.

Trigger and Owner

Trigger: a new session starts. Owner: the conversation runtime.

The Play

Create an empty state object from the canonical schema.
Seed it with known facts — authenticated user identity, account tier, entitlements pulled from your application.
Establish anchor fields that may never be compacted away.

Starting with a schema-shaped, pre-seeded object means every later play has a consistent structure to operate on.
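The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `SCHEMA` fields, the `ConversationState` class, and `initialize_state` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical canonical schema: field name -> expected Python type.
SCHEMA: dict[str, type] = {
    "user_id": str,
    "account_tier": str,
    "entitlements": list,
    "disputed_charge": str,
}

@dataclass
class ConversationState:
    values: dict[str, Any] = field(default_factory=dict)
    anchors: set[str] = field(default_factory=set)  # fields compaction may never drop

def initialize_state(known_facts: dict[str, Any], anchor_fields: set[str]) -> ConversationState:
    """Create an empty schema-shaped state, seed known facts, then mark anchors."""
    state = ConversationState(values={name: None for name in SCHEMA})
    for name, value in known_facts.items():
        if name not in SCHEMA:
            raise KeyError(f"unknown field: {name}")
        if not isinstance(value, SCHEMA[name]):
            raise TypeError(f"{name} must be {SCHEMA[name].__name__}")
        state.values[name] = value
    unknown = anchor_fields - set(SCHEMA)
    if unknown:
        raise KeyError(f"unknown anchor fields: {sorted(unknown)}")
    state.anchors = set(anchor_fields)
    return state

# Seed from facts your application already knows (values here are made up).
state = initialize_state(
    {"user_id": "u-123", "account_tier": "pro", "entitlements": ["refunds"]},
    anchor_fields={"user_id", "account_tier"},
)
```

Every schema field exists from the first turn, even if its value is still `None`, so later plays never have to guess whether a key is missing or merely unset.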

Play 2: Propose and Validate an Update

This play fires on every user turn.

Trigger and Owner

Trigger: a new user message arrives. Owner: the update pipeline.

The Play

The model reads the current state plus recent turns and proposes a structured update.
Your code validates the proposed update against the schema.
Reject malformed updates with a repair prompt instead of committing them.

Validation before commit is the single play that prevents the most production incidents. The cost of skipping it is detailed in When Tracked Conversation State Quietly Breaks Your Agent.
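One way to picture validate-before-commit is the sketch below. It assumes the model's proposed update arrives as a plain dictionary and that the schema is a simple name-to-type mapping; `SCHEMA`, `validate_update`, `repair_prompt`, and `commit_if_valid` are hypothetical names, and a real system might use a fuller schema library instead.

```python
from typing import Any, Optional

# Hypothetical schema: field name -> expected type.
SCHEMA: dict[str, type] = {"disputed_charge": str, "amount_cents": int}

def validate_update(update: Any) -> list[str]:
    """Return a list of problems; an empty list means the update may be committed."""
    if not isinstance(update, dict):
        return ["update must be an object with named fields"]
    errors = []
    for name, value in update.items():
        if name not in SCHEMA:
            errors.append(f"unknown field '{name}'")
        elif not isinstance(value, SCHEMA[name]):
            errors.append(f"field '{name}' must be {SCHEMA[name].__name__}")
    return errors

def repair_prompt(errors: list[str]) -> str:
    """Build the message sent back to the model instead of committing."""
    return "Your proposed update was rejected. Fix these problems and resend:\n- " + "\n- ".join(errors)

def commit_if_valid(state: dict[str, Any], update: Any) -> tuple[dict[str, Any], Optional[str]]:
    """Commit a valid update; otherwise leave state untouched and return a repair prompt."""
    errors = validate_update(update)
    if errors:
        return state, repair_prompt(errors)
    return {**state, **update}, None

state = {"disputed_charge": None}
rejected_state, prompt = commit_if_valid(state, {"amount_cents": "12"})
accepted_state, no_prompt = commit_if_valid(state, {"amount_cents": 1200})
```

The rejected update never touches the state object; the only thing it produces is the repair prompt.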

Play 3: Resolve Contradictions

This play fires when an update conflicts with existing state.

Trigger and Owner

Trigger: a proposed update changes a value that is already set. Owner: the update pipeline.

The Play

Apply explicit override rules: newer and more specific usually wins, but locked constraints resist casual revision.
Record what was overwritten in an audit trail.
For high-stakes conflicts, confirm with the user rather than guessing.
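The override rules can be made explicit in code. The sketch below is one possible encoding, assuming a simple rule order (high-stakes fields go to the user, locked fields keep their value, everything else lets the newer value win). It does not model "more specific wins", and the `State` class and `resolve` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class State:
    values: dict[str, Any] = field(default_factory=dict)
    locked: set[str] = field(default_factory=set)       # constraints that resist casual revision
    high_stakes: set[str] = field(default_factory=set)  # conflicts here go to the user
    audit: list[dict[str, Any]] = field(default_factory=list)

def resolve(state: State, name: str, new_value: Any, turn: int) -> str:
    """Apply one proposed field change; return 'applied', 'kept', or 'confirm'."""
    old = state.values.get(name)
    if old is None or old == new_value:
        state.values[name] = new_value
        return "applied"   # nothing was set, so nothing is contradicted
    if name in state.high_stakes:
        return "confirm"   # ask the user rather than guessing
    if name in state.locked:
        return "kept"      # locked constraint wins over the casual revision
    state.audit.append({"turn": turn, "field": name, "old": old, "new": new_value})
    state.values[name] = new_value
    return "applied"       # newer value wins, and the overwrite is recorded

s = State(
    values={"seat": "aisle", "budget": 500, "refund_amount": 20},
    locked={"budget"},
    high_stakes={"refund_amount"},
)
outcomes = [
    resolve(s, "seat", "window", turn=7),
    resolve(s, "budget", 900, turn=8),
    resolve(s, "refund_amount", 200, turn=9),
]
```

Only the `seat` change lands in the audit trail, because it is the only overwrite that was actually applied.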

Play 4: Reconcile With Ground Truth

This play fires on every consequential turn.

Trigger and Owner

Trigger: any turn that drives an action or a factual claim. Owner: the update pipeline.

The Play

Compare model-tracked state against your application's real data — cart, account, order status.
When they diverge, trust the application and correct the state.
Log the divergence so you can find systematic drift.

This play is what keeps the assistant honest, and it pairs directly with the testing discipline in A Repeatable Process for Carrying State Between Turns.
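A reconciliation pass can be very small. The sketch below assumes both the tracked state and the application's data can be read as flat dictionaries; the `reconcile` function and the field names are hypothetical, and the logging uses Python's standard `logging` module.

```python
import logging
from typing import Any

logger = logging.getLogger("state.reconcile")

def reconcile(
    tracked: dict[str, Any], ground_truth: dict[str, Any]
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return a corrected copy of the tracked state plus every divergence found."""
    corrected = dict(tracked)
    divergences = []
    for name, actual in ground_truth.items():
        if corrected.get(name) != actual:
            divergences.append({"field": name, "tracked": corrected.get(name), "actual": actual})
            corrected[name] = actual  # the application wins
    for item in divergences:
        logger.warning("state divergence: %s", item)  # feeds later drift analysis
    return corrected, divergences

tracked = {"plan": "cancelled", "invoice_id": "INV-1", "mood": "frustrated"}
application = {"plan": "pro", "invoice_id": "INV-1"}
corrected, divergences = reconcile(tracked, application)
```

Fields the application knows nothing about, such as `mood` here, pass through unchanged; only fields with a ground-truth counterpart are corrected.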

Play 5: Compact the Conversation

This play fires when the context approaches its budget.

Trigger and Owner

Trigger: token count crosses a threshold. Owner: the conversation runtime.

The Play

Keep recent turns verbatim, summarize the middle, reduce the distant past to durable facts.
Exclude all anchor facts from the lossy pass.
Verify the summary preserves commitments before discarding the raw turns.

Compaction fires after contradiction resolution so you never summarize a value that is about to be overwritten.
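The sketch below shows a simplified, single-pass version of this play. It collapses the middle and distant tiers into one summary, estimates tokens with a crude word count, and takes the summarizer as a caller-supplied function, so `estimate_tokens`, `compact`, and the `commitments` list are all assumptions made for illustration.

```python
from typing import Callable

def estimate_tokens(turns: list[str]) -> int:
    # Crude stand-in: whitespace word count. Swap in your real tokenizer.
    return sum(len(turn.split()) for turn in turns)

def compact(
    turns: list[str],
    anchors: dict[str, str],
    commitments: list[str],
    summarize: Callable[[list[str]], str],
    budget: int,
    keep_recent: int = 4,
) -> list[str]:
    """Compact only when over budget, and only if the summary keeps every commitment."""
    if estimate_tokens(turns) <= budget or len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(older)
    if any(commitment not in summary for commitment in commitments):
        return turns  # keep the raw turns rather than lose a commitment
    # Anchors are written out verbatim; they never go through the lossy pass.
    anchor_lines = [f"{name}: {value}" for name, value in sorted(anchors.items())]
    return anchor_lines + [f"Summary: {summary}"] + recent

turns = [f"turn {i} with a few extra words" for i in range(10)]
anchors = {"invoice_id": "INV-1"}
good = compact(turns, anchors, ["refund offered"],
               lambda older: "Customer disputes a charge; refund offered.", budget=10, keep_recent=2)
lossy = compact(turns, anchors, ["refund offered"],
                lambda older: "Customer discussed billing.", budget=10, keep_recent=2)
```

A substring check is a blunt way to verify commitments; it is used here only to make the verify-then-discard order visible.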

Play 6: Recover From a Failed Turn

This play fires when a turn errors or returns garbage.

Trigger and Owner

Trigger: a model call fails, times out, or returns an invalid update. Owner: the conversation runtime.

The Play

Roll back to the last valid state snapshot.
Retry idempotently, or ask the user to clarify.
Never carry forward a partially applied or invalid update.
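Rollback and idempotent retry can be combined by starting every attempt from the same snapshot. The sketch below assumes state is a dictionary and that the propose and apply steps are passed in as functions; `InvalidUpdate`, `run_with_recovery`, and the demo functions are hypothetical.

```python
import copy
from typing import Any, Callable, Optional

class InvalidUpdate(Exception):
    """Raised by the hypothetical apply step when an update cannot be committed."""

State = dict[str, Any]

def run_with_recovery(
    state: State,
    propose: Callable[[State], State],
    apply: Callable[[State, State], None],
    max_attempts: int = 2,
) -> tuple[State, Optional[str]]:
    """Return (new_state, None) on success, or (last_valid_state, clarifying_question)."""
    snapshot = copy.deepcopy(state)        # last valid state
    for _ in range(max_attempts):
        working = copy.deepcopy(snapshot)  # every retry starts from the same snapshot
        try:
            apply(working, propose(working))
            return working, None
        except (InvalidUpdate, TimeoutError):
            continue                       # discard `working`, including partial writes
    return snapshot, "I lost track of that last step. Could you restate what you need?"

calls = {"n": 0}

def flaky_apply(working: State, update: State) -> None:
    calls["n"] += 1
    if calls["n"] == 1:
        working["half_written"] = True     # simulate a partial write before the failure
        raise InvalidUpdate("injected failure")
    working.update(update)

def always_times_out(working: State) -> State:
    raise TimeoutError("model call timed out")

original = {"tier": "pro"}
recovered, question = run_with_recovery(original, lambda s: {"amount_cents": 1200}, flaky_apply)
fallback, fallback_question = run_with_recovery(original, always_times_out, flaky_apply)
```

Because each attempt works on a fresh copy, a retry cannot double-apply anything the failed attempt wrote.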

Sequencing and Ownership Summary

The order is the playbook's real content.

The Standard Sequence

Per turn: propose and validate, resolve contradictions, reconcile with ground truth, then act. Compaction runs when the budget threshold trips; recovery runs only on failure. Initialization runs once at session start.
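The per-turn order can be pinned down in a single function so that nobody reorders it by accident. In the sketch below each play is passed in as a callable, and the choice to check the compaction budget at the end of the turn is an assumption; `run_turn`, `step`, and the lambda stand-ins are hypothetical.

```python
from typing import Any, Callable

State = dict[str, Any]
trace: list[str] = []

def run_turn(state: State, message: str, *, propose, validate, resolve,
             reconcile, act, over_budget, compact) -> tuple[State, str]:
    """Run one turn in the standard order; an invalid update stops before commit."""
    update = propose(state, message)
    errors = validate(update)
    if errors:
        return state, "repair: " + "; ".join(errors)
    state = resolve(state, update)   # contradictions settled first
    state = reconcile(state)         # then checked against ground truth
    reply = act(state)               # only now does anything reach the user
    if over_budget(state):
        state = compact(state)       # conditional, and always after resolution
    return state, reply

def step(name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a play so the demo can record the order in which plays fire."""
    def wrapper(*args: Any) -> Any:
        trace.append(name)
        return fn(*args)
    return wrapper

final_state, reply = run_turn(
    {"plan": "pro"}, "I cancelled my plan",
    propose=step("propose", lambda s, m: {"plan": "cancelled"}),
    validate=step("validate", lambda u: []),
    resolve=step("resolve", lambda s, u: {**s, **u}),
    reconcile=step("reconcile", lambda s: {**s, "plan": "pro"}),
    act=step("act", lambda s: f"Your plan is {s['plan']}."),
    over_budget=lambda s: False,
    compact=step("compact", lambda s: s),
)
```

In the demo, reconciliation overrides the customer's claim before `act` runs, so the reply reflects the application's data rather than the proposed update.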

Who Owns What

Conversation runtime: initialization, compaction, recovery.
Update pipeline: validation, contradiction resolution, reconciliation.
A named standard owner: the schema, the anchor-fact policy, and the override rules — the same governance role described in Standardizing Stateful Prompts Across Every Conversation Designer.

Adapting the Plays to Your Context

The plays are a template, not a straitjacket. Tune them to your stakes and scale.

Dial Up or Down by Risk

A low-stakes internal assistant can skip ground-truth reconciliation on most turns and compact loosely. A regulated, customer-facing system should reconcile aggressively, confirm contradictions explicitly, and keep a complete audit trail. Run the same plays, but adjust how strictly each one fires based on what a wrong answer costs you. The myths that lead teams to under-invest here are unpacked in What People Get Wrong About Stateful Prompt Design.
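One way to make the risk dial concrete is a small policy object that each play consults. The sketch below is illustrative only: the `PlayPolicy` fields, the two profiles, and every number in them are placeholders, and treating "compact loosely" as skipping summary verification is one reading of that phrase, not the only one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlayPolicy:
    reconcile_every_n_turns: int       # 1 = reconcile on every consequential turn
    confirm_contradictions: bool       # ask the user before overwriting
    audit_trail: str                   # "overwrites" or "complete"
    verify_compaction_summary: bool    # False is one reading of "compact loosely"

# Hypothetical profiles; tune the values to what a wrong answer costs you.
INTERNAL_ASSISTANT = PlayPolicy(
    reconcile_every_n_turns=5,
    confirm_contradictions=False,
    audit_trail="overwrites",
    verify_compaction_summary=False,
)
REGULATED_SUPPORT = PlayPolicy(
    reconcile_every_n_turns=1,
    confirm_contradictions=True,
    audit_trail="complete",
    verify_compaction_summary=True,
)

def should_reconcile(policy: PlayPolicy, consequential_turn_index: int) -> bool:
    """Decide whether the reconciliation play fires on this consequential turn."""
    return consequential_turn_index % policy.reconcile_every_n_turns == 0
```

The plays stay the same across both profiles; only the strictness values differ.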

Add Plays for New Capabilities

As you add tools, sub-tasks, or persistent user memory, add corresponding plays — a focus-switch play when you introduce multiple concurrent tasks, a consent-check play when you introduce cross-session memory. The playbook grows with the product rather than being rewritten.

Running the Playbook in Practice

A playbook only earns its keep if people actually follow it under pressure.

Make the Plays Observable

Instrument each play so you can see it firing: log validations, reconciliation corrections, compaction events, and rollbacks. When something goes wrong, the logs tell you which play misbehaved instead of leaving you to guess from a transcript. Observability turns the playbook from a document into a diagnosable system.
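Instrumentation does not need much machinery to start with. The sketch below assumes a single helper that every play calls with its name and outcome; `record_play`, the outcome labels, and the example details are hypothetical, and the helper relies only on Python's standard `logging` and `collections` modules.

```python
import logging
from collections import Counter
from typing import Any

logger = logging.getLogger("playbook")
play_counts: Counter = Counter()

def record_play(play: str, outcome: str, **details: Any) -> None:
    """Count and log one firing of a play so it can be found later."""
    play_counts[(play, outcome)] += 1
    logger.info("play=%s outcome=%s details=%r", play, outcome, details)

# Example events, with made-up details.
record_play("validate", "rejected", errors=["unknown field 'foo'"])
record_play("validate", "accepted")
record_play("reconcile", "corrected", field="plan", tracked="cancelled", actual="pro")
record_play("compact", "ran", tokens_before=9100, tokens_after=2400)
record_play("recover", "rolled_back", reason="timeout")
```

Counts per play and outcome give a quick signal, such as a rising share of rejected validations, before anyone has to read a transcript.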

Rehearse the Failure Plays

Initialization and validation run constantly, so they get tested by normal traffic. Recovery and contradiction handling fire rarely, which means they rot quietly. Deliberately exercise them — inject a failed turn, feed a contradictory sequence — so the rarely-used plays still work the day you actually need them. This rehearsal discipline mirrors the deterministic-replay testing in A Repeatable Process for Carrying State Between Turns.
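A rehearsal can be as plain as two tests that force the rare paths. The sketch below uses a deliberately tiny stand-in for a turn so that it stays self-contained; `apply_turn`, `ALLOWED_FIELDS`, and the test names are hypothetical, and in practice you would point the same tests at your real pipeline.

```python
import copy
from typing import Any, Callable

ALLOWED_FIELDS = {"plan", "invoice_id"}  # hypothetical schema

def apply_turn(state: dict[str, Any], model_call: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Tiny stand-in for a turn: audit overwrites, roll back on any failure."""
    snapshot = copy.deepcopy(state)
    try:
        for name, value in model_call().items():
            if name not in ALLOWED_FIELDS:
                raise ValueError(f"unknown field: {name}")
            if state.get(name) not in (None, value):
                state.setdefault("_audit", []).append((name, state[name], value))
            state[name] = value
        return state
    except Exception:
        return snapshot  # discard partial writes

def timed_out() -> dict[str, Any]:
    raise TimeoutError("injected failure")

def test_failed_turn_rolls_back() -> None:
    # The bad update writes "plan" before failing on "bogus".
    after = apply_turn({"plan": "pro"}, lambda: {"plan": "basic", "bogus": 1})
    assert after == {"plan": "pro"}
    assert apply_turn({"plan": "pro"}, timed_out) == {"plan": "pro"}

def test_contradiction_is_audited() -> None:
    state = apply_turn({"plan": None}, lambda: {"plan": "pro"})
    state = apply_turn(state, lambda: {"plan": "basic"})
    assert state["plan"] == "basic"
    assert state["_audit"] == [("plan", "pro", "basic")]
```

The first test injects a failure after a partial write, which is the case that normal traffic almost never exercises.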

A Worked Example

Concrete sequencing is easier to grasp through a single scenario. Consider a support assistant handling a billing dispute.

How the Plays Fire in Order

Initialization seeds the state with the authenticated customer, their plan, and their open invoice, all pulled from the application and all marked as anchors. As the customer explains the problem, each turn proposes and validates an update, recording the disputed charge and the resolution they want. When the customer says the charge is for a plan they cancelled, that claim conflicts with the seeded state, so contradiction resolution holds it as tentative and reconciliation checks it against ground truth: the application shows the cancellation was processed after the billing date, so the model's tentative state is corrected rather than accepted. Twenty turns in, compaction summarizes the back-and-forth but pins the invoice ID, the cancellation date, and the resolution offered.

Where the Playbook Earns Its Keep

If reconciliation had not fired, the assistant might have agreed the customer was wrongly charged based purely on their account of events. If compaction had not protected anchors, the agreed resolution could have evaporated before it was applied. The plays, in sequence, are what keep a high-stakes conversation both empathetic and correct — the kind of reliability the career case in Conversation State Skills That Make You Hard to Replace is built on.

Frequently Asked Questions

What order should the plays run in on a normal turn?

Propose and validate the update, resolve any contradiction, reconcile against ground truth, then act. Compaction and recovery are conditional — they fire only when the token budget trips or a turn fails. Running validation and reconciliation before acting is what keeps wrong state from reaching the user.

Who should own the state schema and override rules?

A single named standard owner, ideally an engineer who has shipped a conversational system. Centralizing the schema, anchor-fact policy, and override rules prevents the divergence that happens when every team improvises its own conventions.

When exactly should compaction fire?

When the context token count crosses a defined threshold, not on every turn. Compacting on every turn adds a summarization call each time, and that cost can exceed the cost of the context it trims. Tie it to budget and always run it after contradiction resolution.

What should happen when a turn fails mid-update?

Roll back to the last valid snapshot and either retry idempotently or ask the user to clarify. Never commit a partially applied update. Designing updates to be idempotent makes safe retries possible.

How is this playbook different from just following best practices?

Best practices tell you what is good; the playbook tells you the trigger, owner, and sequence for each action. The sequencing — validate before commit, resolve contradictions before compaction, reconcile before acting — is what turns scattered practices into a reliable system.

Key Takeaways

Run six named plays: initialize, validate, resolve contradictions, reconcile, compact, and recover.
The per-turn sequence is validate, resolve, reconcile, then act — order is the point.
Validation before commit and reconciliation against ground truth prevent the most production incidents.
Compaction and recovery are conditional plays tied to budget thresholds and failures.
Assign clear ownership: runtime owns lifecycle plays, the pipeline owns update plays, a standard owner governs schema and rules.

Agency Script Editorial
