Running Stateful Conversations Without Losing the Thread

Agency Script Editorial

Editorial Team

January 16, 2021 · 8 min read

A playbook is different from a guide. A guide explains concepts; a playbook tells you what to do, when to do it, and who is responsible. This is the operating playbook for keeping dialogue state coherent across a long conversation — a set of named plays, the triggers that fire each one, the owner accountable for it, and the sequence that ties them together.

Treat it as something you can hand to a team and have them execute without you in the room. Each play below is small and self-contained. The value is in the sequencing: knowing that contradiction handling fires before compaction, that validation fires before commit, and that reconciliation fires on every consequential turn. Run the plays out of order and the system breaks in familiar ways, even when each play is implemented correctly.

The plays assume the architecture established elsewhere in this series — a canonical state object owned by your code, with the model proposing updates. If that foundation is unfamiliar, start with Tracking Conversation State When Prompts Get Complicated and come back.

Play 1: Initialize State

The first play fires when a conversation begins.

Trigger and Owner

Trigger: a new session starts. Owner: the conversation runtime.

The Play

  • Create an empty state object from the canonical schema.
  • Seed it with known facts — authenticated user identity, account tier, entitlements pulled from your application.
  • Establish anchor fields that may never be compacted away.

Starting with a schema-shaped, pre-seeded object means every later play has a consistent structure to operate on.
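A minimal sketch of this play in Python, under stated assumptions: the `ConversationState` class, the `initialize_state` function, and the field names (`user_id`, `account_tier`, `entitlements`) are hypothetical and stand in for whatever your canonical schema defines.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ConversationState:
    """Canonical state object owned by application code (hypothetical schema)."""
    facts: dict[str, Any] = field(default_factory=dict)
    # Names of fields that compaction must never drop or summarize.
    anchors: set[str] = field(default_factory=set)


def initialize_state(known_facts: dict[str, Any], anchor_fields: set[str]) -> ConversationState:
    """Create the state object and seed it with facts pulled from the application."""
    missing = anchor_fields - set(known_facts)
    if missing:
        # An anchor with no value cannot be protected later, so fail early.
        raise ValueError(f"anchor fields missing from seed data: {sorted(missing)}")
    return ConversationState(facts=dict(known_facts), anchors=set(anchor_fields))


state = initialize_state(
    {"user_id": "u-123", "account_tier": "pro", "entitlements": ["export"]},
    anchor_fields={"user_id", "account_tier"},
)
```

Refusing to start when an anchor has no seed value is one possible design choice; a looser system might allow anchors to be filled in later.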

Play 2: Propose and Validate an Update

This play fires on every user turn.

Trigger and Owner

Trigger: a new user message arrives. Owner: the update pipeline.

The Play

  • The model reads the current state plus recent turns and proposes a structured update.
  • Your code validates the proposed update against the schema.
  • Reject malformed updates with a repair prompt instead of committing them.

Validation before commit is the single play that prevents the most production incidents. The cost of skipping it is detailed in When Tracked Conversation State Quietly Breaks Your Agent.
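The validate-then-commit step might look like the following sketch. The `SCHEMA` mapping, its field names, and the wording of the repair prompt are assumptions for illustration; a production system would more likely use a full schema validator.

```python
from typing import Any, Optional

# Hypothetical schema: field name -> required Python type.
SCHEMA: dict[str, type] = {"issue": str, "order_id": str, "quantity": int}


def validate_update(update: Any) -> list[str]:
    """Return a list of problems; an empty list means the update may be committed."""
    if not isinstance(update, dict):
        return ["update must be an object"]
    errors = []
    for key, value in update.items():
        if key not in SCHEMA:
            errors.append(f"unknown field: {key}")
        elif not isinstance(value, SCHEMA[key]):
            errors.append(f"{key} must be {SCHEMA[key].__name__}")
    return errors


def commit_or_repair(state: dict[str, Any], update: Any) -> tuple[dict[str, Any], Optional[str]]:
    """Commit a valid update to a copy of the state, or return a repair prompt instead."""
    errors = validate_update(update)
    if errors:
        repair = "Your last update was rejected. Fix these problems and resend: " + "; ".join(errors)
        return state, repair  # state is returned unchanged
    return {**state, **update}, None
```

The caller sends the repair prompt back to the model and tries again; nothing reaches the canonical state until `validate_update` returns an empty list.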

Play 3: Resolve Contradictions

This play fires when an update conflicts with existing state.

Trigger and Owner

Trigger: a proposed update changes a value that is already set. Owner: the update pipeline.

The Play

  • Apply explicit override rules: newer and more specific usually wins, but locked constraints resist casual revision.
  • Record what was overwritten in an audit trail.
  • For high-stakes conflicts, confirm with the user rather than guessing.
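As a rough sketch, the three rules above can be expressed as a single pass over the proposed update. The `Resolution` class, the `resolve` function, and the `locked` and `high_stakes` sets are hypothetical names; which fields belong in each set is a policy decision this example does not make for you.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Resolution:
    state: dict[str, Any]
    audit: list[dict[str, Any]] = field(default_factory=list)
    needs_confirmation: list[str] = field(default_factory=list)


def resolve(state, update, locked=frozenset(), high_stakes=frozenset()) -> Resolution:
    """Apply override rules: newer wins unless the field is locked or high-stakes."""
    result = Resolution(state=dict(state))
    for key, new in update.items():
        if key not in state or state[key] == new:
            result.state[key] = new  # no conflict: new field or same value
            continue
        entry = {"field": key, "old": state[key], "new": new}
        if key in locked:
            # Locked constraint: keep the old value, but record the attempt.
            result.audit.append({**entry, "applied": False})
        elif key in high_stakes:
            # Do not guess: leave the value alone and ask the user.
            result.needs_confirmation.append(key)
        else:
            result.state[key] = new  # newer value wins
            result.audit.append({**entry, "applied": True})
    return result
```

Recording rejected overrides in the same audit trail as applied ones is an assumption of this sketch; it makes it easier to see later why a locked value did not change.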

Play 4: Reconcile With Ground Truth

This play fires on every consequential turn.

Trigger and Owner

Trigger: any turn that drives an action or a factual claim. Owner: the update pipeline.

The Play

  • Compare model-tracked state against your application's real data — cart, account, order status.
  • When they diverge, trust the application and correct the state.
  • Log the divergence so you can find systematic drift.

This play is what keeps the assistant honest, and it pairs directly with the testing discipline in A Repeatable Process for Carrying State Between Turns.
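A minimal reconciliation sketch, assuming both the tracked state and the application data are available as flat dictionaries. The `reconcile` function and the logger name are illustrative.

```python
import logging
from typing import Any

logger = logging.getLogger("state.reconcile")


def reconcile(tracked: dict[str, Any], ground_truth: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Overwrite tracked fields with application data wherever the two disagree."""
    corrected = dict(tracked)
    diverged = []
    for key, truth in ground_truth.items():
        # A field missing from tracked state also counts as a divergence here.
        if corrected.get(key) != truth:
            logger.warning("divergence on %s: tracked=%r app=%r", key, corrected.get(key), truth)
            corrected[key] = truth  # the application wins
            diverged.append(key)
    return corrected, diverged
```

Fields that exist only in the tracked state, such as the user's stated issue, are left untouched because the application has no opinion on them.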

Play 5: Compact the Conversation

This play fires when the context approaches its budget.

Trigger and Owner

Trigger: token count crosses a threshold. Owner: the conversation runtime.

The Play

  • Keep recent turns verbatim, summarize the middle, reduce the distant past to durable facts.
  • Exclude all anchor facts from the lossy pass.
  • Verify the summary preserves commitments before discarding the raw turns.

Compaction fires after contradiction resolution so you never summarize a value that is about to be overwritten.
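The sketch below shows one way the play could be wired, with simplifications that should be read as assumptions: it collapses the middle and distant tiers into a single summary, takes the summarizer as a caller-supplied function (`summarize` stands in for a model call), and checks commitments with a plain substring match.

```python
from typing import Callable


def compact(
    turns: list[str],
    anchors: dict[str, str],
    summarize: Callable[[list[str]], str],
    commitments: list[str],
    token_count: int,
    budget: int,
    keep_recent: int = 4,
) -> list[str]:
    """Return a compacted context, or the original turns if compaction is unsafe or unneeded."""
    if token_count < budget or len(turns) <= keep_recent:
        return turns  # threshold not crossed: do nothing
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(older)
    # Verify before discarding: every commitment must survive the lossy pass.
    lost = [c for c in commitments if c not in summary]
    if lost:
        return turns  # keep the raw turns rather than lose a commitment
    # Anchors bypass the summarizer entirely and are restated verbatim.
    anchor_block = "; ".join(f"{k}={v}" for k, v in anchors.items())
    return [f"ANCHORS: {anchor_block}", f"SUMMARY: {summary}", *recent]
```

Falling back to the uncompacted turns when verification fails keeps the play safe but does not relieve the budget; a real runtime would need a second strategy, such as retrying the summary.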

Play 6: Recover From a Failed Turn

This play fires when a turn errors or returns garbage.

Trigger and Owner

Trigger: a model call fails, times out, or returns an invalid update. Owner: the conversation runtime.

The Play

  • Roll back to the last valid state snapshot.
  • Retry idempotently, or ask the user to clarify.
  • Never carry forward a partially applied or invalid update.
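One way to sketch the rollback-and-retry behavior is to run every attempt against a scratch copy of the last valid snapshot. The `run_turn_with_recovery` function and the `apply_update` callable are hypothetical; `apply_update` stands in for the model call plus validation.

```python
import copy
from typing import Any, Callable


def run_turn_with_recovery(
    state: dict[str, Any],
    apply_update: Callable[[dict[str, Any]], dict[str, Any]],
    max_attempts: int = 2,
) -> tuple[dict[str, Any], bool]:
    """Return (state, ok). On failure, the untouched snapshot is returned."""
    snapshot = copy.deepcopy(state)
    for _ in range(max_attempts):
        # Each retry starts from the same snapshot, so retries are idempotent
        # with respect to state and partial writes never carry forward.
        scratch = copy.deepcopy(snapshot)
        try:
            return apply_update(scratch), True
        except Exception:
            continue
    return snapshot, False  # caller should ask the user to clarify
```

Because the update only ever mutates the scratch copy, a failure partway through leaves the canonical state exactly as it was before the turn.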

Sequencing and Ownership Summary

The order is the playbook's real content.

The Standard Sequence

Per turn: propose and validate, resolve contradictions, reconcile with ground truth, then act. Compaction runs when the budget threshold trips; recovery runs only on failure. Initialization runs once at session start.
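The ordering can be made explicit in code. In this sketch the plays are passed in as a dictionary of functions keyed by hypothetical names, and compaction is placed after the act step, which is one placement consistent with the rule that it follows contradiction resolution.

```python
import copy
from typing import Any, Callable

Play = Callable[[dict[str, Any]], dict[str, Any]]

# The per-turn order from the playbook; the names are illustrative.
PER_TURN_ORDER = ["propose_and_validate", "resolve_contradictions", "reconcile", "act"]


def run_turn(state: dict[str, Any], plays: dict[str, Play], over_budget: bool) -> dict[str, Any]:
    """Run the per-turn plays in order, then the conditional plays."""
    snapshot = copy.deepcopy(state)
    try:
        for name in PER_TURN_ORDER:
            state = plays[name](state)
        if over_budget:
            state = plays["compact"](state)  # conditional: budget threshold tripped
    except Exception:
        return plays["recover"](snapshot)  # conditional: only on failure
    return state
```

Keeping the order in a single list means a reviewer can check the sequence at a glance, and a test can assert it.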

Who Owns What

  • Conversation runtime: initialization, compaction, recovery.
  • Update pipeline: validation, contradiction resolution, reconciliation.
  • A named standard owner: the schema, the anchor-fact policy, and the override rules — the same governance role described in Standardizing Stateful Prompts Across Every Conversation Designer.

Adapting the Plays to Your Context

The plays are a template, not a straitjacket. Tune them to your stakes and scale.

Dial Up or Down by Risk

A low-stakes internal assistant can skip ground-truth reconciliation on most turns and compact loosely. A regulated, customer-facing system should reconcile aggressively, confirm contradictions explicitly, and keep a complete audit trail. Run the same plays, but adjust how strictly each one fires based on what a wrong answer costs you. The myths that lead teams to under-invest here are unpacked in What People Get Wrong About Stateful Prompt Design.
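A risk dial of this kind could be captured as a small policy object. Everything here is a hypothetical illustration: the `PlayPolicy` fields, the preset names, and the numbers are placeholders to tune against what a wrong answer costs you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayPolicy:
    reconcile_every_n_turns: int  # 1 means reconcile on every turn
    confirm_contradictions: bool
    full_audit_trail: bool


# Hypothetical presets for the two ends of the risk range.
POLICIES = {
    "internal_low_stakes": PlayPolicy(
        reconcile_every_n_turns=5, confirm_contradictions=False, full_audit_trail=False
    ),
    "regulated_customer_facing": PlayPolicy(
        reconcile_every_n_turns=1, confirm_contradictions=True, full_audit_trail=True
    ),
}


def should_reconcile(policy: PlayPolicy, turn_index: int) -> bool:
    """Decide whether the reconciliation play fires on this turn."""
    return turn_index % policy.reconcile_every_n_turns == 0
```

The plays themselves stay the same; only the policy object changes between deployments.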

Add Plays for New Capabilities

As you add tools, sub-tasks, or persistent user memory, add corresponding plays — a focus-switch play when you introduce multiple concurrent tasks, a consent-check play when you introduce cross-session memory. The playbook grows with the product rather than being rewritten.

Running the Playbook in Practice

A playbook only earns its keep if people actually follow it under pressure.

Make the Plays Observable

Instrument each play so you can see it firing: log validations, reconciliation corrections, compaction events, and rollbacks. When something goes wrong, the logs tell you which play misbehaved instead of leaving you to guess from a transcript. Observability turns the playbook from a document into a diagnosable system.
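A small sketch of per-play instrumentation using Python's standard `logging` and `json` modules. The `log_play` helper and the event fields are assumptions; the point is one structured event per play firing.

```python
import json
import logging
import time

logger = logging.getLogger("playbook")


def log_play(play: str, outcome: str, **details) -> dict:
    """Emit one structured event per play firing and return it for inspection."""
    event = {"ts": time.time(), "play": play, "outcome": outcome, **details}
    # default=str keeps the call from failing on values json cannot encode.
    logger.info(json.dumps(event, default=str))
    return event


log_play("validate", "rejected", errors=["quantity must be int"])
log_play("reconcile", "corrected", field="order_status")
log_play("recover", "rolled_back", attempts=2)
```

With events shaped like this, a query for `play == "reconcile"` and `outcome == "corrected"` shows how often tracked state drifts from the application.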

Rehearse the Failure Plays

Initialization and validation run constantly, so they get tested by normal traffic. Recovery and contradiction handling fire rarely, which means they rot quietly. Deliberately exercise them — inject a failed turn, feed a contradictory sequence — so the rarely-used plays still work the day you actually need them. This rehearsal discipline mirrors the deterministic-replay testing in A Repeatable Process for Carrying State Between Turns.
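Injecting a failed turn can be as simple as wrapping the model call in a test double that fails on a chosen call. The `inject_failure` helper below is a hypothetical sketch and is not part of any testing library.

```python
from typing import Any, Callable


def inject_failure(fn: Callable[..., Any], fail_on_call: int) -> Callable[..., Any]:
    """Wrap fn so that the Nth call raises, simulating a timed-out model call."""
    calls = {"n": 0}

    def wrapper(*args, **kwargs):
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise TimeoutError("injected failure")
        return fn(*args, **kwargs)

    return wrapper


# Rehearsal: the second call fails, the others succeed.
model_call = inject_failure(lambda message: {"issue": message}, fail_on_call=2)
```

Pointing the recovery play at `model_call` in a test confirms that the rollback path still works before production traffic needs it.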

A Worked Example

Concrete sequencing is easier to grasp through a single scenario. Consider a support assistant handling a billing dispute.

How the Plays Fire in Order

Initialization seeds the state with the authenticated customer, their plan, and their open invoice, all pulled from the application and all marked as anchors. As the customer explains the problem, each turn proposes and validates an update, recording the disputed charge and the resolution they want. When the customer says the charge is for a plan they cancelled, that claim conflicts with the seeded invoice, so contradiction resolution fires and reconciliation checks the claim against ground truth. The application shows the cancellation was processed after the billing date, so the model's tentative state is corrected rather than accepted. Twenty turns in, compaction summarizes the back-and-forth but pins the invoice ID, the cancellation date, and the resolution offered.

Where the Playbook Earns Its Keep

If reconciliation had not fired, the assistant might have agreed the customer was wrongly charged based purely on their account of events. If compaction had not protected anchors, the agreed resolution could have evaporated before it was applied. The plays, in sequence, are what keep a high-stakes conversation both empathetic and correct — the kind of reliability the career case in Conversation State Skills That Make You Hard to Replace is built on.

Frequently Asked Questions

What order should the plays run in on a normal turn?

Propose and validate the update, resolve any contradiction, reconcile against ground truth, then act. Compaction and recovery are conditional — they fire only when the token budget trips or a turn fails. Running validation and reconciliation before acting is what keeps wrong state from reaching the user.

Who should own the state schema and override rules?

A single named standard owner, ideally an engineer who has shipped a conversational system. Centralizing the schema, anchor-fact policy, and override rules prevents the divergence that happens when every team improvises its own conventions.

When exactly should compaction fire?

When the context token count crosses a defined threshold, not on every turn. Compacting reflexively adds model-call cost that can exceed the bloat it prevents. Tie it to budget and always run it after contradiction resolution.

What should happen when a turn fails mid-update?

Roll back to the last valid snapshot and either retry idempotently or ask the user to clarify. Never commit a partially applied update. Designing updates to be idempotent makes safe retries possible.

How is this playbook different from just following best practices?

Best practices tell you what is good; the playbook tells you the trigger, owner, and sequence for each action. The sequencing — validate before commit, resolve contradictions before compaction, reconcile before acting — is what turns scattered practices into a reliable system.

Key Takeaways

  • Run six named plays: initialize, validate, resolve contradictions, reconcile, compact, and recover.
  • The per-turn sequence is validate, resolve, reconcile, then act — order is the point.
  • Validation before commit and reconciliation against ground truth prevent the most production incidents.
  • Compaction and recovery are conditional plays tied to budget thresholds and failures.
  • Assign clear ownership: runtime owns lifecycle plays, the pipeline owns update plays, a standard owner governs schema and rules.
