Most teams approach grounding as a bag of tactics: chunk the documents, retrieve some passages, write a careful instruction. The tactics are sound, but without a structure to hang them on, it is hard to know which one to reach for when answers go wrong. A framework fixes that. It gives you named stages, so when something fails you can point to the stage responsible instead of flailing across the whole pipeline.
This article introduces the SOURCE model, a reusable structure for any grounded system. The name is a mnemonic for its six stages: Select, Organize, Unite, Restrict, Cite, and Evaluate. Each stage has a job, a common failure, and a clear handoff to the next. The model is deliberately simple, because a framework you cannot remember is one you will not use.
Walk through the six stages once and you will have a map. After that, every grounding decision you face has a home, and every failure has an address. That addressing is the whole point of a framework: it converts the vague feeling that something is wrong into a precise question about which stage is responsible, and a precise question is one you can actually answer.
Select: Getting the Right Facts in Front of the Model
The Job of This Stage
Select is retrieval: given a question, find the passages most likely to contain the answer. Everything downstream depends on this stage doing its job, because the model can only reason over what Select hands it.
The Failure to Watch
The classic failure is returning passages that do not contain the answer. Catch it by inspecting retrieved chunks directly, before the model runs. If Select fails, no later stage can recover. This is why retrieval inspection leads the workflow in "Build a Grounded Prompt Pipeline in Eight Concrete Steps". Because Select sits at the head of the run-time chain, time spent strengthening it returns more than time spent anywhere else. Teams that obsess over prompt wording while neglecting Select are optimizing the wrong stage.
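A minimal sketch of inspecting Select before the model runs. The retriever here is a toy keyword-overlap scorer standing in for whatever search you actually use; `retrieve`, `inspect_retrieval`, and the sample chunks are hypothetical names and data chosen for illustration.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Toy retriever (an assumption): rank chunks by the share of question words they contain."""
    q = tokenize(question)
    scored = [(len(q & tokenize(c)) / max(len(q), 1), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def inspect_retrieval(question: str, chunks: list[str], must_contain: str, k: int = 3) -> bool:
    """Print the top-k chunks and report whether any contains the expected answer text."""
    hits = retrieve(question, chunks, k)
    for score, chunk in hits:
        print(f"{score:.2f}  {chunk}")
    return any(must_contain.lower() in chunk.lower() for _, chunk in hits)


chunks = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Shipping is free for orders over 50 dollars.",
]
found = inspect_retrieval("How many days do refunds take?", chunks, must_contain="14 days", k=1)
print("answer-bearing chunk retrieved:", found)
```

If the check comes back false, the problem is in Select (or in Organize behind it), and no prompt wording will repair it.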
Organize: Shaping Documents So Retrieval Can Find Them
The Job of This Stage
Organize covers chunking and indexing, the preparation that makes Select possible. How you split documents determines whether retrieval can find coherent, answer-bearing passages at all.
The Failure to Watch
Splitting on raw character counts cuts ideas in half, so retrieval returns fragments. Organize correctly by splitting on natural boundaries, such as paragraphs or headings, with slight overlap between adjacent chunks. A weak Organize stage quietly sabotages a strong Select.
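One way to split on natural boundaries with overlap, sketched under the assumption that blank lines separate paragraphs. The function name `chunk_paragraphs` and its `max_chars` and `overlap` parameters are hypothetical; real documents may call for heading-aware or sentence-aware splitting instead.

```python
from __future__ import annotations


def chunk_paragraphs(text: str, max_chars: int = 500, overlap: int = 1) -> list[str]:
    """Group whole paragraphs into chunks, repeating the last `overlap` paragraphs in the next chunk."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        candidate = current + [para]
        if current and len("\n\n".join(candidate)) > max_chars:
            chunks.append("\n\n".join(current))
            # Carry the tail of the finished chunk forward so an idea that
            # spans a boundary appears in both chunks.
            current = current[-overlap:] if overlap > 0 else []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


text = "Alpha paragraph.\n\nBeta paragraph.\n\nGamma paragraph."
for chunk in chunk_paragraphs(text, max_chars=40, overlap=1):
    print(repr(chunk))
```

Paragraphs are never cut mid-sentence in this sketch, so a single paragraph longer than `max_chars` is kept whole rather than split.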
Unite: Assembling Context Into a Coherent Prompt
The Job of This Stage
Unite combines the retrieved passages, the instruction, and the question into a single prompt. Its job is arrangement: keeping context lean, marking it clearly, and ordering passages so the strongest sits where the model weights it most.
The Failure to Watch
The common error is uniting too much, stuffing in twenty passages when four would do. Excess context dilutes attention and raises cost. Unite favors precision over volume.
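A small sketch of the Unite step, assuming passages arrive as (score, text) pairs. `build_prompt` and the bracketed `[n]` labels are illustrative choices, and placing the strongest passage first is itself an assumption; where a given model weights context most is something to test rather than take for granted.

```python
from __future__ import annotations


def build_prompt(
    question: str,
    instruction: str,
    scored_passages: list[tuple[float, str]],
    max_passages: int = 4,
) -> str:
    """Assemble the instruction, labeled context, and question into one prompt."""
    # Keep only the highest-scoring passages so the context stays lean.
    top = sorted(scored_passages, key=lambda p: p[0], reverse=True)[:max_passages]
    # Label each passage so it can be referred to by number later.
    context = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(top, start=1))
    return f"{instruction}\n\nContext:\n{context}\n\nQuestion: {question}"


prompt = build_prompt(
    question="How long do refunds take?",
    instruction="Answer using only the context below.",
    scored_passages=[
        (0.2, "Our office is closed on public holidays."),
        (0.9, "Refunds are issued within 14 days of a return request."),
        (0.5, "Returns require the original receipt."),
    ],
    max_passages=2,
)
print(prompt)
```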
Restrict: Keeping the Model Inside the Evidence
The Job of This Stage
Restrict is the instruction that confines the model to the supplied context and permits it to decline when the answer is absent. It is the guardrail that stops the model from blending in training knowledge.
The Failure to Watch
Skipping Restrict lets the model fabricate fluently, presenting guesses as sourced facts. The fix is one or two explicit sentences. This guardrail is the corrective for the most damaging mistake in "7 Common Mistakes with Grounding Prompts with Retrieved Context".
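The "one or two explicit sentences" might look something like the following. The wording, the constant names, and the `declined` helper are all hypothetical; the point of the sketch is only that a fixed decline phrase gives downstream code something it can detect.

```python
# Hypothetical decline phrase: fixing the exact wording lets code detect it.
DECLINE_MARKER = "I don't know based on the provided context."

# Hypothetical Restrict instruction: confine the model and permit it to decline.
RESTRICT_INSTRUCTION = (
    "Answer using only the context provided below. "
    f'If the context does not contain the answer, reply exactly: "{DECLINE_MARKER}"'
)


def declined(answer: str) -> bool:
    """Return True when the model used the agreed decline phrase."""
    return DECLINE_MARKER.lower() in answer.lower()


print(RESTRICT_INSTRUCTION)
print(declined("I don't know based on the provided context."))
print(declined("Refunds are issued within 14 days [1]."))
```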
Cite: Making Every Claim Traceable
The Job of This Stage
Cite requires the model to attribute each claim to the specific chunk that supports it. Citation turns answers from opaque assertions into verifiable statements.
The Failure to Watch
Omitting Cite hides fabrication, because an invented claim looks identical to a sourced one in fluent prose. With Cite in place, a claim with no matching source stands out immediately. The trust this builds is explored in "Grounding Prompts with Retrieved Context: Best Practices That Actually Work".
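To illustrate how a claim with no matching source can stand out, here is a rough audit sketch. It assumes the model was told to cite with bracketed numbers such as `[1]` placed before the sentence's final punctuation; `audit_citations` and that citation format are assumptions for illustration, not a standard.

```python
from __future__ import annotations

import re

CITATION = re.compile(r"\[(\d+)\]")


def audit_citations(answer: str, num_chunks: int) -> dict[str, list]:
    """List sentences with no citation and citation numbers that match no supplied chunk."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited = [s for s in sentences if not CITATION.search(s)]
    cited_ids = {int(n) for n in CITATION.findall(answer)}
    unknown = sorted(i for i in cited_ids if not 1 <= i <= num_chunks)
    return {"uncited": uncited, "unknown_ids": unknown}


answer = (
    "Refunds take 14 days [1]. "
    "Shipping is always free. "
    "Returns need a receipt [7]."
)
report = audit_citations(answer, num_chunks=3)
print(report)
```

The sentence splitter is deliberately naive; abbreviations and decimals would need a proper sentence tokenizer.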
Evaluate: Measuring Whether the System Actually Works
The Job of This Stage
Evaluate runs a standing set of real questions with known answers after every change, converting impressions into measurements. It is the stage that tells you whether a tweak helped.
The Failure to Watch
Evaluating on a single happy example breeds false confidence that collapses on real traffic. Build a varied test set and change one variable at a time so each result is attributable.
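A standing test set can be as simple as a list of question and expected-answer pairs run through the pipeline after each change. In this sketch `evaluate`, `fake_pipeline`, and the sample questions are hypothetical, and substring matching is a crude stand-in for whatever scoring fits your answers.

```python
from __future__ import annotations

from typing import Callable


def evaluate(answer_fn: Callable[[str], str], test_set: list[tuple[str, str]]) -> float:
    """Return the fraction of questions whose answer contains the expected text."""
    if not test_set:
        raise ValueError("test_set must contain at least one question")
    passed = 0
    for question, expected in test_set:
        ok = expected.lower() in answer_fn(question).lower()
        passed += int(ok)
        print(f"{'PASS' if ok else 'FAIL'}  {question}")
    return passed / len(test_set)


def fake_pipeline(question: str) -> str:
    """Stand-in for the real grounded pipeline (an assumption for this sketch)."""
    canned = {"How long do refunds take?": "Refunds are issued within 14 days [1]."}
    return canned.get(question, "I don't know based on the provided context.")


test_set = [
    ("How long do refunds take?", "14 days"),
    ("Is shipping free?", "over 50 dollars"),
]
score = evaluate(fake_pipeline, test_set)
print(f"score: {score:.2f}")
```

Recording the score alongside the single variable you changed keeps each result attributable.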
Applying the Whole Model
Diagnose by Stage
When a grounded answer is wrong, walk the stages in order. Did Select return the right chunks? Did Organize give it good material to work with? Did Unite arrange them well? Was Restrict in place? Did Cite expose anything? Did Evaluate catch the regression? The first stage that fails is where your fix belongs.
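The walk described above can be written down as an ordered list of checks that stops at the first failure. This is only a sketch: `first_failing_stage` is a hypothetical helper, and the lambdas stand in for real checks you would write for each stage.

```python
from __future__ import annotations

from typing import Callable, Optional


def first_failing_stage(checks: list[tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run stage checks in order and return the name of the first that fails, or None."""
    for name, check in checks:
        if not check():
            return name
    return None


# Placeholder checks (assumptions): each would wrap a real inspection of that stage.
checks = [
    ("Select", lambda: True),
    ("Organize", lambda: True),
    ("Unite", lambda: False),
    ("Restrict", lambda: True),
    ("Cite", lambda: True),
    ("Evaluate", lambda: True),
]
print(first_failing_stage(checks))
```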
Improve One Stage at a Time
Because the stages are distinct, you can strengthen them independently. Improving Select rarely requires touching Restrict. This separation is what makes the model a working tool rather than a slogan, and it keeps your tuning disciplined.
Where the Stages Interact
Upstream Stages Constrain Downstream Ones
The stages are separable for diagnosis but not independent in effect. A weak Organize stage caps how good Select can ever be, because retrieval cannot find a coherent passage that chunking never produced. Likewise, no amount of careful Unite or Restrict can rescue an answer when Select handed over the wrong material. This is why diagnosis walks from the earliest stage forward: an upstream failure makes everything after it look broken, and fixing a downstream stage while an upstream one is failing wastes effort.
Restrict and Cite Reinforce Each Other
Restrict and Cite are technically separate, but they compound. Restrict tells the model to stay within the evidence; Cite forces it to show which evidence it used. Together they create a feedback loop you can inspect: if a cited claim does not actually appear in the cited chunk, you have caught a Restrict violation that Cite made visible. Neither stage alone gives you that. Run them as a pair and you gain a self-checking property that pure instruction cannot provide.
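The paired check can be approximated in code. The sketch below flags a cited sentence when too few of its words appear in the chunks it cites; word overlap is a rough heuristic chosen for illustration, and `unsupported_claims`, the `[n]` citation format, and the `min_overlap` threshold are all assumptions.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_claims(answer: str, chunks: dict[int, str], min_overlap: float = 0.6) -> list[str]:
    """Return cited sentences whose words are mostly absent from the chunks they cite."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        ids = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        if not ids:
            continue  # Uncited sentences are a separate problem, not handled here.
        claim = tokens(re.sub(r"\[\d+\]", "", sentence))
        evidence: set[str] = set()
        for i in ids:
            evidence |= tokens(chunks.get(i, ""))
        if claim and len(claim & evidence) / len(claim) < min_overlap:
            flagged.append(sentence)
    return flagged


chunks = {1: "Refunds are issued within 14 days of a return request."}
answer = "Refunds are issued within 14 days [1]. Refunds include a 20 percent bonus [1]."
print(unsupported_claims(answer, chunks))
```

A heuristic like this will miss paraphrases and can be fooled by shared vocabulary, so treat a flag as a prompt for human inspection rather than a verdict.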
Evaluate Watches the Whole Chain
Evaluate is the only stage that sees the end-to-end result, which makes it your detector for problems that no single upstream stage reveals. A regression that emerges only from the interaction of two stages (a chunking change that subtly degrades retrieval, for instance) surfaces in Evaluate before it surfaces anywhere else. Treat the standing test set as the integration test for the entire SOURCE pipeline, not merely a check on the model's wording.
Frequently Asked Questions
Do the stages have to run in this exact order?
The order reflects dependency: Organize enables Select, Select feeds Unite, and so on. You build them in roughly this order, but at run time Select through Cite happen together for each question, with Organize and Evaluate as surrounding activities.
Where do most failures land in the SOURCE model?
In Select, by a wide margin. Retrieval returning the wrong passages is the most common root cause, which is why the model puts it first and why inspecting it is the first diagnostic step.
Is this framework tied to any particular tools?
No. SOURCE describes the work, not the technology. It applies whether you use keyword search or vector retrieval, a hosted model or a local one. The stages stay the same as tools change.
How is Restrict different from Cite?
Restrict keeps the model inside the supplied context; Cite makes its use of that context traceable. Restrict prevents fabrication; Cite reveals it. You want both, because each catches what the other misses.
Key Takeaways
- SOURCE breaks grounding into six nameable stages: Select, Organize, Unite, Restrict, Cite, and Evaluate.
- Each stage has a distinct job and a characteristic failure, so problems get an address instead of vague blame.
- Select (retrieval quality) is where most failures originate and where diagnosis should begin.
- Restrict and Cite work as a pair: one keeps the model in the evidence, the other makes its use traceable.
- Diagnose by walking the stages in order and improve them one at a time for disciplined, attributable progress.