
When a Prompt Forgets the User Already Paid: State Examples

Agency Script Editorial

Editorial Team

January 24, 2021 · 8 min read

A support chatbot confidently asks a customer for their order number. The customer provided it three turns ago. The model lost it. From the user's perspective, the assistant has the memory of a goldfish, and trust evaporates in a single exchange. This is the failure mode that dialogue state management exists to prevent, and it shows up far more often than teams expect.

Dialogue state management is the practice of explicitly tracking what has been established, decided, or collected across the turns of a conversation, then feeding that state back into each prompt so the model reasons from current reality rather than guessing. In a single-shot prompt, there is no state to manage. In a multi-turn assistant, state is the difference between a coherent agent and a confused one.

The examples below are drawn from common patterns in production assistants: a checkout flow, a scheduling agent, a troubleshooting bot, and a multi-step form, followed by a fifth case in which state has to survive across sessions. For each, we walk through what state needed tracking, how the prompt represented it, and the specific decision that determined whether the interaction held together.

A note on how to read these: pay less attention to the domains and more to the shape of the fix. The same three or four moves recur across wildly different assistants, which is the real lesson. Once you can spot those moves, you can apply them to a domain none of these examples cover.

Example One: The Checkout Assistant That Forgot Payment

A retail assistant guides users through selecting a product, confirming shipping, and paying. The hard part is not any single step. It is remembering that the user already completed payment so the assistant does not re-ask or, worse, double-charge.

What state needed tracking

  • cart_items: the products selected
  • shipping_confirmed: boolean
  • payment_status: one of pending, authorized, completed
  • order_id: assigned once payment succeeds

What made it work

The team injected an explicit state block at the top of every prompt:

CURRENT ORDER STATE:
- payment_status: completed
- order_id: 48213

Because payment_status was a named field rather than something the model had to infer from chat history, the assistant stopped asking for payment the moment the field flipped to completed. The lesson: derive nothing the application already knows. If your backend has the truth, put the truth in the prompt verbatim.
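As a minimal sketch of that idea, the state block can be rendered directly from whatever the backend reports. The function names `render_state_block` and `build_prompt`, the instruction text, and the prompt layout are illustrative assumptions, not the team's actual code.

```python
from typing import Any, Mapping


def render_state_block(title: str, state: Mapping[str, Any]) -> str:
    """Render backend state as a labeled block, one '- key: value' line per field."""
    lines = [f"{title}:"]
    for key, value in state.items():
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)


def build_prompt(instructions: str, state: Mapping[str, Any], user_message: str) -> str:
    # The state block goes first so the model reads current facts before the user's turn.
    state_block = render_state_block("CURRENT ORDER STATE", state)
    return f"{state_block}\n\n{instructions}\n\nUser: {user_message}"


# Values copied verbatim from the (hypothetical) backend record, never inferred from chat.
order_state = {"payment_status": "completed", "order_id": 48213}
prompt = build_prompt(
    "Do not ask for payment when payment_status is completed.",
    order_state,
    "Where is my order?",
)
print(prompt)
```

The point of the design is that the prompt builder never computes `payment_status` itself; it only formats what the application already holds.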

Example Two: The Scheduling Agent and Pronoun Drift

A scheduling agent books meetings. A user says "move it to Thursday." The model has to resolve "it" to the meeting discussed two turns ago. Without state, the model often resolves the wrong referent, especially after the conversation branches.

Where it failed first

In the naive version, the prompt simply appended raw conversation history. When the user discussed two possible meetings before deciding, "it" became ambiguous and the agent rescheduled the wrong one roughly a fifth of the time in testing.

The fix that held

The team added a focused_entity field updated after each user turn. When a user named a specific meeting, that meeting became the focus. Pronouns resolved against the focus, not against the entire transcript. This mirrors the discipline covered in "A Reusable Model for Tracking Dialogue State in Prompts": name the entity in focus instead of asking the model to re-derive it every turn.
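One way to picture the focused_entity update is the sketch below. It assumes meetings are identified by name and that a simple substring match is enough to detect a mention; a production system would likely use something more robust. The `SchedulingState` class, the meeting names, and the rule of leaving the focus unchanged when zero or several meetings are mentioned are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SchedulingState:
    meetings: Dict[str, str] = field(default_factory=dict)  # meeting name -> day
    focused_entity: Optional[str] = None


def update_focus(state: SchedulingState, user_message: str) -> None:
    """Move the focus only when the message names exactly one known meeting."""
    text = user_message.lower()
    mentioned = [name for name in state.meetings if name.lower() in text]
    if len(mentioned) == 1:
        state.focused_entity = mentioned[0]
    # Zero or several mentions: keep the previous focus (an assumption of this sketch).


def render_focus_block(state: SchedulingState) -> str:
    focus = state.focused_entity or "none (ask which meeting the user means)"
    return f"CONVERSATION STATE:\n- focused_entity: {focus}"


state = SchedulingState(meetings={"design review": "Tuesday", "budget sync": "Wednesday"})
for message in [
    "I have the design review and the budget sync this week.",  # two named: no change
    "The budget sync is the one that clashes.",                 # one named: focus moves
    "Move it to Thursday.",                                     # pronoun only: focus kept
]:
    update_focus(state, message)

print(render_focus_block(state))
```

With the focus rendered into the prompt, "it" in the last message has exactly one candidate referent instead of the whole transcript.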

Example Three: The Troubleshooting Bot That Looped

A technical support bot walks users through fixes. Its worst behavior was looping — suggesting "restart the router" after the user already reported that step done.

What state needed tracking

  • steps_attempted: a list
  • steps_succeeded: a list
  • current_hypothesis: the suspected root cause

Why naming attempted steps mattered

By maintaining steps_attempted, the prompt could instruct the model: "Never suggest a step already in steps_attempted." The loop disappeared. The broader principle, explored in "Concrete Scenarios That Reveal Whether Your Dialogue State Holds", is that negative constraints anchored to explicit state are more reliable than hoping the model notices repetition on its own.
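A small sketch of how that constraint might be rendered follows. The block title, the `playbook` list, and the `remaining_steps` helper are hypothetical; the helper is an optional application-side check added here for illustration and is not something the example above describes.

```python
from typing import List


def render_troubleshooting_block(
    steps_attempted: List[str],
    steps_succeeded: List[str],
    current_hypothesis: str,
) -> str:
    """Render the three tracked fields plus the negative constraint tied to them."""

    def fmt(items: List[str]) -> str:
        return ", ".join(items) if items else "none"

    return "\n".join([
        "TROUBLESHOOTING STATE:",
        f"- steps_attempted: {fmt(steps_attempted)}",
        f"- steps_succeeded: {fmt(steps_succeeded)}",
        f"- current_hypothesis: {current_hypothesis}",
        "",
        "Never suggest a step already in steps_attempted.",
    ])


def remaining_steps(playbook: List[str], steps_attempted: List[str]) -> List[str]:
    """Optional safety net: the steps the application still considers valid to suggest."""
    attempted = set(steps_attempted)
    return [step for step in playbook if step not in attempted]


playbook = ["restart the router", "check the cable", "update the firmware"]
attempted = ["restart the router"]

block = render_troubleshooting_block(attempted, [], "loose cable")
remaining = remaining_steps(playbook, attempted)
print(block)
print(remaining)
```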

Example Four: The Multi-Step Intake Form

An onboarding assistant collects company name, team size, use case, and budget across a natural conversation rather than a rigid form. The challenge: users provide fields out of order and sometimes revise earlier answers.

Slot filling done well

The prompt maintained a slots object:

SLOTS:
- company_name: "Northwind"
- team_size: null
- use_case: "content drafting"
- budget: null

The instruction was simple: ask only for slots that are null, and confirm any slot the user revises. When a user changed their use case mid-conversation, the assistant updated the slot and re-confirmed downstream answers that depended on it. This out-of-order tolerance is what separates a conversational intake from a glorified form.
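The slot rules above can be sketched in a few lines. The `DEPENDENTS` map, which says that revising use_case should trigger a re-confirmation of budget, is a made-up example of a downstream dependency, and the function names are illustrative only.

```python
from typing import Dict, List, Optional

SLOT_ORDER = ["company_name", "team_size", "use_case", "budget"]

# Hypothetical dependency map: revising the key means re-confirming the listed slots.
DEPENDENTS: Dict[str, List[str]] = {"use_case": ["budget"]}


def missing_slots(slots: Dict[str, Optional[str]]) -> List[str]:
    """Slots the assistant is still allowed to ask for (those that are null)."""
    return [name for name in SLOT_ORDER if slots.get(name) is None]


def apply_update(slots: Dict[str, Optional[str]], name: str, value: str) -> List[str]:
    """Overwrite a slot and return the slots that now need confirmation.

    A first-time fill needs no confirmation; a revision returns the revised slot
    plus any already-filled slots that depend on it.
    """
    is_revision = slots.get(name) is not None and slots[name] != value
    slots[name] = value
    if not is_revision:
        return []
    dependents = [d for d in DEPENDENTS.get(name, []) if slots.get(d) is not None]
    return [name] + dependents


slots: Dict[str, Optional[str]] = {
    "company_name": "Northwind",
    "team_size": None,
    "use_case": "content drafting",
    "budget": None,
}

to_ask_before = missing_slots(slots)                             # out-of-order fills are fine
first_fill = apply_update(slots, "budget", "5000")               # new value, nothing to confirm
revision = apply_update(slots, "use_case", "support automation") # revision with a dependent
to_ask_after = missing_slots(slots)
```

Because the revision returns the affected slots explicitly, the next prompt can list exactly which values to confirm rather than leaving the model to guess what changed.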

Patterns Across All Four Examples

Looking across the scenarios, the successes share a structure. Each represented state as named fields, kept the application as the source of truth, and used the state to constrain the model rather than to merely inform it.

The recurring success factors

  • Explicit beats implicit. Every reliable example put state in a labeled block, never relying on the model to re-read history.
  • Constrain with state. The most valuable use of state was telling the model what not to do — do not re-ask, do not re-suggest, do not re-charge.
  • One source of truth. When the backend knew a fact, the prompt repeated the backend's value rather than letting the model reconstruct it.

For teams weighing whether to build this themselves, "Tooling That Tracks Conversation State Across Prompt Turns" covers when a framework earns its keep.

A Fifth Example: The Returning User Across Sessions

The four scenarios above all lived inside a single conversation. The harder case is state that has to survive a user leaving and coming back days later. A subscription assistant faced exactly this: a user started a plan change on Monday, abandoned it, and returned Thursday expecting the assistant to pick up where they left off.

What broke in the naive version

The assistant treated each session as a blank slate. On Thursday it greeted the returning user as if they had never spoken, forcing them to re-explain the plan change they had already half-configured. Users experienced this as the assistant having amnesia between visits, which is even more jarring than forgetting mid-conversation.

What made the cross-session version work

The team persisted the state object to durable storage keyed by user, not just to the in-memory session. On return, the assistant rendered the saved state into the opening prompt:

RETURNING USER STATE:
- pending_action: plan_change
- new_plan: "Pro"
- step_remaining: confirm_billing

The assistant then opened with "Last time you were upgrading to Pro and had one step left — want to finish that?" The difference between a forgettable bot and a memorable one was a storage key and a single rendered block.
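To make "a storage key and a single rendered block" concrete, here is a minimal sketch that persists state as one JSON file per user. The file layout, the function names, and the `user-123` identifier are assumptions; a real deployment would more likely use a database, and would validate the user ID before using it in a path. The temporary directory only stands in for durable storage so the example is self-contained.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def save_state(store_dir: Path, user_id: str, state: Dict[str, Any]) -> None:
    """Persist state keyed by user so it outlives the in-memory session."""
    store_dir.mkdir(parents=True, exist_ok=True)
    # Assumes user_id is a safe identifier (no path separators).
    (store_dir / f"{user_id}.json").write_text(json.dumps(state), encoding="utf-8")


def load_state(store_dir: Path, user_id: str) -> Optional[Dict[str, Any]]:
    """Return the saved state, or None for a user with nothing pending."""
    path = store_dir / f"{user_id}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def render_returning_user_block(state: Dict[str, Any]) -> str:
    lines = ["RETURNING USER STATE:"]
    lines.extend(f"- {key}: {value}" for key, value in state.items())
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "dialogue_state"
    # Monday: the user abandons a plan change part-way through.
    save_state(store, "user-123", {
        "pending_action": "plan_change",
        "new_plan": "Pro",
        "step_remaining": "confirm_billing",
    })
    # Thursday: a new session starts and reloads by user key.
    restored = load_state(store, "user-123")
    unknown = load_state(store, "user-999")

block = render_returning_user_block(restored)
print(block)
```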

Why this generalizes

Cross-session state is the same render-and-constrain discipline applied to a longer time horizon. Nothing about the technique changes; only the lifetime of the storage does. This is also the bridge to agentic memory, where state persists not just across sessions but across entirely separate tasks the user pursues over time.

What Separated Success From Failure

Stepping back across all five scenarios, the failures were never caused by a weak model. The model was capable of the right behavior in every case. The failures came from the surrounding system asking the model to remember things it had no reliable way to remember.

The diagnostic pattern

  • If the assistant re-asks, a fact that should be in rendered state is missing from the prompt.
  • If the assistant repeats an action, an attempted-actions list is absent or not being checked.
  • If the assistant contradicts a decision, a finalized state value is not being treated as sticky.
  • If the assistant forgets across visits, state is living in session memory instead of durable storage.

Each symptom points to a specific, fixable gap rather than to a vague need for a better prompt. That precision is what makes these examples useful as a debugging reference rather than just illustrations.

Frequently Asked Questions

How much state should I put in the prompt?

Only what the current turn needs to behave correctly. Dumping the entire conversation history into every prompt wastes tokens and tends to degrade response quality as the context grows. Track named fields and inject only the ones relevant to the current turn.
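One possible way to inject only the relevant fields is to keep a per-stage allowlist, sketched below with the checkout fields from the first example. The stage names and the `FIELDS_BY_STAGE` mapping are hypothetical; how a conversation is divided into stages depends on the assistant.

```python
from typing import Any, Dict, List

# Hypothetical mapping from conversation stage to the fields that stage needs.
FIELDS_BY_STAGE: Dict[str, List[str]] = {
    "shipping": ["cart_items", "shipping_confirmed"],
    "payment": ["cart_items", "shipping_confirmed", "payment_status"],
    "post_purchase": ["payment_status", "order_id"],
}


def select_fields(state: Dict[str, Any], stage: str) -> Dict[str, Any]:
    """Return only the fields the current stage needs, in allowlist order."""
    wanted = FIELDS_BY_STAGE.get(stage, [])
    return {key: state[key] for key in wanted if key in state}


full_state = {
    "cart_items": ["desk lamp"],
    "shipping_confirmed": True,
    "payment_status": "completed",
    "order_id": 48213,
}

snapshot = select_fields(full_state, "post_purchase")
print(snapshot)
```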

Should state live in the prompt or in application code?

The source of truth should live in application code or a database. The prompt receives a rendered snapshot of that state each turn. The model never owns the canonical state; it consumes a copy.

What is the difference between conversation history and dialogue state?

History is the raw transcript of everything said. State is the distilled, structured summary of what matters now — collected slots, decisions made, the entity in focus. State is derived from history but is far smaller and more actionable.

How do I handle a user changing an earlier answer?

Treat revisions as first-class. When a user updates a slot, overwrite it and re-confirm any downstream values that depended on it. The intake-form example above shows this pattern in action.

Do small assistants need formal state management?

A two-turn assistant rarely does. The need scales with conversation length and the cost of errors. A checkout flow needs it badly; a one-shot summarizer does not.

How do I debug state-related bugs?

Log the exact state block injected into each prompt alongside the model's response. Most state bugs are visible the instant you can see what the model actually received versus what you assumed it received.
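A minimal sketch of that logging habit, using Python's standard logging and json modules, is shown below. The logger name, the record fields, and the `log_turn` helper are illustrative assumptions rather than a prescribed format.

```python
import json
import logging

logger = logging.getLogger("dialogue_state")


def log_turn(turn_id: int, state_block: str, response: str) -> str:
    """Log the literal state block next to the response it produced."""
    line = json.dumps({
        "turn_id": turn_id,
        "state_block": state_block,  # exactly what the model received
        "response": response,
    })
    logger.info(line)
    return line


logging.basicConfig(level=logging.INFO)

line = log_turn(
    turn_id=4,
    state_block="CURRENT ORDER STATE:\n- payment_status: completed\n- order_id: 48213",
    response="Your order 48213 is confirmed. No further payment is needed.",
)
```

Writing one JSON object per turn keeps the state and the response on the same line, so a re-ask or a repeated action can be traced to the exact snapshot that preceded it.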

Key Takeaways

  • Dialogue state management prevents the assistant from forgetting, re-asking, and looping across turns.
  • The strongest examples represent state as named, labeled fields injected into every prompt.
  • Use state to constrain behavior — do not re-ask, do not re-suggest — not just to inform it.
  • Keep the canonical state in application code; the prompt gets a rendered snapshot each turn.
  • Treat user revisions as first-class events that overwrite slots and re-confirm dependents.
  • When debugging, log the literal state block the model received so assumptions become visible.

