Make the Model Show Its Receipts: A Citation Checklist

Agency Script Editorial

Editorial Team

·November 15, 2020·7 min read

A language model will happily produce a confident paragraph with a footnote attached to nothing. The footnote looks like proof, but it points at a fact the model invented or a document it never read. For agency teams shipping research summaries, client-facing briefs, and retrieval-augmented assistants, that gap between the appearance of a citation and an actual verifiable source is where reputations get damaged.

The fix is rarely a single magic instruction. It is a set of small, deliberate constraints applied consistently across the prompt, the retrieval layer, and the review pass. This article gives you a checklist you can paste into your own runbook. Each item includes the reason it earns a spot, because a checklist nobody understands is a checklist nobody follows.

Treat the items below as defaults you adapt, not commandments. Some apply only when you have a retrieval system attached; others apply to any model you can prompt. Mark the ones that fit your workflow and revisit the list whenever a citation slips through that should not have.

Before You Write the Prompt

Confirm the model actually has sources to cite

A model cannot cite what it cannot see. If you are asking for citations without supplying documents, you are inviting the model to fabricate plausible-looking references. Decide up front whether this is a closed-book task (the model answers from training data and should say so) or an open-book task (the model answers from provided context and must cite it).

  • Attach the source material as context, or wire up retrieval, before demanding citations.
  • If no sources are available, instruct the model to say so rather than guess.

Assign stable identifiers to every source

Give each document a short, unambiguous label such as [S1], [S2], or a filename. Models are far more reliable at reproducing a token they were handed than at reconstructing a long URL or a full bibliographic string from memory.

  • Number sources in the order you supply them.
  • Keep identifiers short so the model copies them exactly.
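As a minimal sketch of the labeling step, the snippet below numbers documents in supply order and builds a context block to paste into a prompt. The function name `label_sources` and the exact block layout are illustrative assumptions, not a prescribed format.

```python
def label_sources(documents: list[str]) -> tuple[str, dict[str, str]]:
    """Assign S1, S2, ... in the order supplied and build a context block.

    Hypothetical helper: the name and layout are assumptions for illustration.
    """
    by_id = {f"S{i}": text for i, text in enumerate(documents, start=1)}
    # Each source is introduced by its bracketed label on its own line.
    context = "\n\n".join(f"[{sid}]\n{text}" for sid, text in by_id.items())
    return context, by_id


context, sources = label_sources([
    "Revenue grew 12% in Q3.",
    "Headcount was flat year over year.",
])
print(context)
```

Keeping the returned dictionary around is useful later: the same identifiers can drive automated checks on the model's output.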

Writing the Instruction

State the citation format explicitly

Do not assume the model knows you want inline brackets, footnotes, or a reference list. Spell it out, and show one example. Ambiguity here produces inconsistent output that is painful to parse downstream.

  • Specify inline markers like [S1] immediately after the claim they support.
  • Provide a single worked example in the prompt so the model matches the shape.

Require a citation for every factual claim

The instruction that does the heavy lifting is a rule that ties each non-obvious statement to a source. Phrase it as a hard requirement, not a suggestion. This mirrors the discipline discussed in A Citation Discipline You Can Actually Reuse, where structure beats one-off wording.

  • Tell the model: every factual sentence must end with at least one source marker.
  • Allow uncited sentences only for reasoning, transitions, or clearly labeled opinion.
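The rule above can be checked mechanically once the output comes back. This is a rough sketch that assumes markers look like [S1] and sit before the sentence's closing punctuation; the sentence splitter is deliberately naive and the function name is hypothetical.

```python
import re

MARKER = re.compile(r"\[S\d+\]")
# Naive splitter: break on whitespace that follows ., !, or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def uncited_sentences(answer: str) -> list[str]:
    """Return sentences that carry no source marker at all."""
    sentences = [s.strip() for s in SENTENCE_END.split(answer) if s.strip()]
    return [s for s in sentences if not MARKER.search(s)]


answer = (
    "Revenue grew 12% in Q3 [S1]. Headcount was flat. "
    "That suggests rising productivity [S1][S2]."
)
for sentence in uncited_sentences(answer):
    print("No marker:", sentence)
```

The check cannot tell a factual claim from reasoning or a transition, so its output is a list for a reviewer to triage, not an automatic rejection.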

Forbid invented sources in plain language

Models pattern-match on the format of citations, which is exactly why they fabricate them. A direct prohibition, stated in the negative, reduces invented references, though it does not eliminate them on its own.

  • Add: do not invent sources; only cite from the provided list.
  • Instruct the model to flag any claim it cannot support rather than fabricate one.
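Because the prohibition alone does not guarantee compliance, it helps to confirm that every identifier in the output was actually on the provided list. A minimal sketch, assuming [S1]-style markers; `unknown_citations` is a hypothetical name.

```python
import re


def unknown_citations(answer: str, allowed_ids: set[str]) -> set[str]:
    """Return cited identifiers that were never supplied to the model."""
    cited = set(re.findall(r"\[(S\d+)\]", answer))
    return cited - allowed_ids


invented = unknown_citations(
    "Revenue grew 12% [S1]. Churn fell sharply [S4].",
    allowed_ids={"S1", "S2"},
)
print(invented)
```

Any identifier this returns points at a source the model was never given, which is a strong signal of fabrication.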

Controlling the Output

Demand quoted spans for high-stakes claims

For numbers, dates, names, and anything a client might act on, ask the model to include a short verbatim quote from the source alongside the citation. A quote is far easier to verify than a bare reference and exposes paraphrase drift.

  • Request a quoted snippet of fewer than 25 words for each critical fact.
  • Check each quote against its source; with the snippet in hand, this is a five-second check.
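The quote check is also easy to script. The sketch below assumes the model was told to write each quote in straight double quotes followed directly by its marker, for example "Revenue grew 12% in Q3" [S1]; that output convention and the function names are assumptions for illustration.

```python
import re

# Matches: "some quoted text" [S1]
QUOTED = re.compile(r'"([^"]+)"\s*\[(S\d+)\]')


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wraps do not cause false misses."""
    return " ".join(text.split())


def failed_quotes(answer: str, sources: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source_id, quote) pairs whose quote is not verbatim in that source."""
    failures = []
    for quote, sid in QUOTED.findall(answer):
        if _normalize(quote) not in _normalize(sources.get(sid, "")):
            failures.append((sid, quote))
    return failures


sources = {"S1": "Revenue grew 12% in Q3, driven by renewals."}
answer = (
    'Revenue rose: "Revenue grew 12% in Q3" [S1]. '
    'Margins improved: "margins grew 5%" [S1].'
)
print(failed_quotes(answer, sources))
```

A passing quote shows only that the words exist in the source. Whether they support the claim still needs a human read.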

Make uncertainty visible

A model that must signal when support is weak gives your reviewers a map of where to look. Without this, every sentence looks equally trustworthy.

  • Ask the model to mark low-confidence or partially supported claims.
  • Treat any flagged claim as unpublishable until a human confirms it.
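One way to enforce the second bullet is a small publishing gate. This sketch assumes the prompt asks the model to append a literal tag, here the hypothetical `[LOW-CONFIDENCE]`, to any weakly supported line, and that reviewers record the lines they have confirmed.

```python
FLAG = "[LOW-CONFIDENCE]"  # hypothetical tag; use whatever your prompt specifies


def flagged_lines(answer: str) -> list[str]:
    """Return every line the model marked as weakly supported."""
    return [line.strip() for line in answer.splitlines() if FLAG in line]


def ready_to_publish(answer: str, confirmed: set[str]) -> bool:
    """True only when a human has confirmed every flagged line."""
    return all(line in confirmed for line in flagged_lines(answer))


answer = "Revenue grew 12% [S1].\nChurn likely fell [S2] [LOW-CONFIDENCE]."
print(flagged_lines(answer))
print(ready_to_publish(answer, confirmed=set()))
```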

The Review Pass

Spot-check citations against the actual text

Automated generation does not remove the human checkpoint; it relocates it. The cheapest insurance against a public error is a reviewer who opens two or three cited sources and confirms they say what the model claims. This connects to the measurement habits in Counting What a Good Citation Actually Looks Like.

  • Verify a sample of citations on every output, and all of them on high-stakes work.
  • Log every miss so you can tighten the prompt that produced it.

Close the loop on failures

A citation that slips through is a free lesson if you capture it. Feed recurring failure patterns back into your prompt, your retrieval filters, or your reviewer guidance. Teams that skip this step relearn the same mistake every week, a pattern explored in The Usual Ways Citation Prompts Quietly Fail.

  • Keep a short log of citation failures and their root cause.
  • Update the prompt template when the same failure appears twice.
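The log does not need to be elaborate. A minimal sketch of the "same failure twice" rule follows; the record fields, template names, and root-cause labels are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationFailure:
    prompt_template: str  # which template produced the bad output
    root_cause: str       # short label chosen by the reviewer


def templates_needing_update(
    log: list[CitationFailure], threshold: int = 2
) -> list[tuple[str, str]]:
    """Return (template, root_cause) pairs seen at least `threshold` times."""
    counts = Counter((f.prompt_template, f.root_cause) for f in log)
    return sorted(key for key, n in counts.items() if n >= threshold)


log = [
    CitationFailure("brief-v1", "invented identifier"),
    CitationFailure("brief-v1", "paraphrase drift"),
    CitationFailure("brief-v1", "invented identifier"),
]
print(templates_needing_update(log))
```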

Edge Cases Worth a Checklist Item

Handle claims that span multiple sources

Some statements draw on two or three documents at once, and a single marker undersells the support. Instruct the model to attach every relevant identifier rather than picking one, so a reviewer can see the full basis for the claim. This matters most for synthesized conclusions, which are exactly the claims a reader is likely to challenge.

  • Allow and request multiple markers when a claim rests on several sources.
  • Treat a synthesized conclusion with a single source as a flag to inspect.

Decide what to do with common knowledge

Not every sentence needs a citation. Widely known facts and your own reasoning do not require a source, and forcing citations onto them produces noise that buries the citations that matter. Draw the line explicitly in your instruction so the model knows what to leave uncited.

  • Exempt genuinely common knowledge and clearly labeled reasoning from the citation rule.
  • Keep the exemption narrow so it does not become a loophole for unsupported claims.

Preserve citations through downstream edits

A citation is only useful if it survives the editing and formatting steps that follow generation. Teams often strip markers during cleanup and ship unsupported prose. Make citation preservation an explicit step in your production process, not an afterthought a copy editor quietly undoes.

  • Confirm that markers and quotes survive every formatting and editing pass.
  • Re-verify a sample after final formatting, since edits can break the link.
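A quick guard against stripped markers is to compare the marker counts before and after editing. This sketch assumes [S1]-style markers, and `lost_markers` is a hypothetical name.

```python
import re
from collections import Counter


def lost_markers(draft: str, final: str) -> Counter:
    """Return markers that appear fewer times in the final text than in the draft."""
    before = Counter(re.findall(r"\[S\d+\]", draft))
    after = Counter(re.findall(r"\[S\d+\]", final))
    # Counter subtraction keeps only positive differences.
    return before - after


draft = "Revenue grew [S1]. Headcount was flat [S2]. Margins rose [S1]."
final = "Revenue grew [S1]. Headcount was flat. Margins rose [S1]."
print(lost_markers(draft, final))
```

This only counts markers. It will not notice one that moved to the wrong sentence, which is why re-verifying a sample after final formatting is still worth doing.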

Frequently Asked Questions

Do I need a retrieval system to get reliable citations?

Not always, but it helps enormously. Without retrieval, the model cites from training data, which it cannot reliably reproduce or verify. With retrieval, you control exactly which documents are available, and citations point at real text you can check. For any task where accuracy matters, supplying the sources yourself is the more dependable path.

Why do models invent citations even when told not to?

Citations are a learned format, and models reproduce formats fluently whether or not the underlying fact is real. A bare prohibition reduces fabrication but does not eliminate it. Pairing the prohibition with stable source identifiers and a requirement to quote verbatim spans makes invention much harder, because the model has to point at text that either exists or does not.

How many citations should one answer contain?

Enough that every factual claim is supported, and no more. Over-citing dilutes signal and makes review harder; under-citing leaves claims unsupported. A practical target is one source marker per factual sentence, with multiple markers when a claim draws on several documents.

Can I automate the verification step?

Partially. You can automate checks that a cited identifier exists in the source list and that a quoted span appears verbatim in the named document. What you cannot fully automate is judging whether the source genuinely supports the claim's meaning. Keep a human in the loop for high-stakes output.

What is the single highest-impact item on this list?

Assigning stable identifiers to your sources and supplying them as context. Once the model is citing from a known, labeled set rather than its memory, almost every other item becomes enforceable. It turns citation from an act of recall into an act of copying.

Key Takeaways

  • Models fabricate citations because citation is a format they imitate; constrain the format and the source set, not just the wording.
  • Supply labeled sources as context before demanding citations, and forbid the model from inventing references.
  • Require a source marker for every factual claim and verbatim quotes for high-stakes facts.
  • Make uncertainty visible so reviewers know exactly where to look.
  • Keep a human verification pass and log every miss to tighten the prompt over time.
