Scattered techniques for getting reliable numbers from a language model do work, but a loose collection of tips is hard to apply consistently under pressure. A named structure fixes that. FRAME is a five-stage method that organizes the moves that matter into an order you can follow every time: Frame the problem, Reason in steps, Arithmetic offloaded, Measure the result, Encode for reuse. Each stage addresses a distinct way numerical work goes wrong, and together they form a sequence you can run without rethinking the approach from scratch.
The value of a named structure is not novelty — most of its parts will be familiar — but consistency. When you have a method with named stages, you can teach it, audit against it, and notice which stage you skipped when something fails. This piece defines each stage, explains what it protects against, and says when it matters most, because not every task needs every stage at full strength.
Use FRAME as a default operating structure for numerical prompting. On simple tasks you will lean on the first two stages; on consequential ones you will run all five. Knowing which stage to emphasize for a given task is itself part of the skill.
F: Frame the Problem
The first stage handles the errors that happen before any calculation.
What This Stage Does
Framing means stating every quantity with its unit, defining what each number refers to, and specifying the expected answer format. It removes the ambiguity that produces correct answers to the wrong question.
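As a minimal sketch of what a framed problem can look like in practice, the snippet below builds a prompt that names each quantity, its unit, what it refers to, and the expected answer format. The `Quantity` class, the `frame_problem` helper, and the trip example are hypothetical names and data chosen for illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    name: str     # short label used in the prompt
    value: float  # the number itself
    unit: str     # explicit unit, never implied
    meaning: str  # what the number refers to


def frame_problem(question: str, quantities: list[Quantity], answer_format: str) -> str:
    """Build a prompt that states each quantity, its unit, and the answer format."""
    lines = ["Given quantities:"]
    for q in quantities:
        lines.append(f"- {q.name} = {q.value} {q.unit} ({q.meaning})")
    lines.append(f"Question: {question}")
    lines.append(f"Answer format: {answer_format}")
    return "\n".join(lines)


# Illustrative inputs only; any real task supplies its own quantities.
prompt = frame_problem(
    question="What is the average speed for the whole trip?",
    quantities=[
        Quantity("distance", 150, "km", "total road distance, one way"),
        Quantity("duration", 2.5, "hours", "door-to-door time including stops"),
    ],
    answer_format="a single number in km/h, rounded to one decimal place",
)
print(prompt)
```

Writing the meaning next to each value is the part that tends to surface hidden assumptions, such as whether a duration includes stops.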
When It Matters Most
Framing matters most when the problem comes from a human in natural language, where implicit assumptions hide. A precisely framed problem is half-solved, and a poorly framed one cannot be rescued by any later stage. The framing discipline is detailed in Build a Repeatable Workflow for Math You Can Rely On.
R: Reason in Steps
The second stage forces the model to think out loud rather than guess.
What This Stage Does
Reasoning in steps means requiring the model to show each intermediate calculation before the final answer. This turns one hard prediction into several easier ones and produces an audit trail you can inspect.
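One possible way to make the audit trail machine-checkable is to ask for a fixed final-answer line and then split the response into steps and result. The sketch below assumes a `FINAL ANSWER: <number>` convention, which is an illustrative choice rather than a standard; the sample response is written by hand, not real model output.

```python
import re

# Instruction appended to any prompt; the exact wording is an assumption.
STEP_INSTRUCTION = (
    "Show each intermediate calculation on its own numbered line. "
    "Then give the result on a last line of the form 'FINAL ANSWER: <number>'."
)

_FINAL_RE = re.compile(r"^FINAL ANSWER:\s*(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE)


def with_steps(prompt: str) -> str:
    """Append the step-by-step instruction to an existing prompt."""
    return f"{prompt}\n\n{STEP_INSTRUCTION}"


def split_response(response: str) -> tuple[list[str], float]:
    """Return (step lines, final answer); raise if the final line is missing."""
    match = _FINAL_RE.search(response)
    if match is None:
        raise ValueError("response has no 'FINAL ANSWER: <number>' line")
    before = response[: match.start()]
    steps = [line.strip() for line in before.splitlines() if line.strip()]
    return steps, float(match.group(1))


# Hypothetical model output, written by hand for illustration.
sample = """1. distance = 150 km
2. duration = 2.5 hours
3. speed = 150 / 2.5 = 60 km/h
FINAL ANSWER: 60.0"""

steps, answer = split_response(sample)
print(len(steps), answer)
```

Rejecting a response that lacks the final line is deliberate: a missing line usually means the model skipped the format, and the steps then deserve a closer look.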
When It Matters Most
This stage is nearly always on. Suppress visible reasoning only for trivial arithmetic under tight latency or token constraints, and never for compound problems. It is the workhorse stage of the method. The mechanism behind why it works is in Getting Language Models to Do Math They Can Actually Trust.
A: Arithmetic Offloaded
The third stage moves exact computation away from the model.
What This Stage Does
Offloading means having the model set up the calculation — write the formula, supply the inputs — while a deterministic tool or code performs the arithmetic exactly. It plays to the model's strength (reasoning) and away from its weakness (computation).
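A minimal sketch of the offload, assuming the model has been asked to return a formula string plus named inputs: the code below evaluates that formula with a small whitelist of arithmetic operators instead of trusting the model's own computation. The `evaluate_formula` helper and the request shape are hypothetical. Plain floats are used here for brevity; `decimal.Decimal` or `fractions.Fraction` could be substituted where exact decimal results matter.

```python
import ast
import operator

# Only these operators are permitted; anything else is rejected.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def evaluate_formula(formula: str, inputs: dict[str, float]) -> float:
    """Evaluate an arithmetic formula deterministically from named inputs."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
            raise ValueError("only numeric constants are allowed")
        if isinstance(node, ast.Name):
            if node.id not in inputs:
                raise KeyError(f"missing input: {node.id}")
            return inputs[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(formula, mode="eval"))


# Hypothetical model output: the setup only, with no computed result.
request = {
    "formula": "principal * (1 + rate) ** years",
    "inputs": {"principal": 1000, "rate": 0.05, "years": 10},
}
result = evaluate_formula(request["formula"], request["inputs"])
print(round(result, 2))
```

Walking the syntax tree rather than calling `eval` keeps the deterministic step limited to arithmetic, so a malformed or unexpected formula fails loudly instead of running arbitrary code.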
When It Matters Most
This stage matters whenever exact values are consequential and the operation can be expressed as code or a function. If no tool is available, offload by hand: take the formula and inputs from the model and compute the final arithmetic yourself in a spreadsheet or calculator. For large, unusual, or compounding numbers, this stage is where most arithmetic errors disappear.
M: Measure the Result
The fourth stage verifies before you trust.
What This Stage Does
Measuring means checking the result against plausibility, obvious constraints, and — for high-stakes figures — an independent recomputation. It is the gate that stops a wrong number from reaching anyone.
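The gate can be sketched as a function that returns a list of problems, where an empty list means the figure passed. The `measure` helper, the bounds, and the savings example below are illustrative assumptions: the lower bound is the simple-interest total, which compound growth must exceed, and the upper bound is a doubling, which 5% growth does not reach in ten years.

```python
import math
from typing import Callable, Optional


def measure(
    value: float,
    lower: float,
    upper: float,
    recompute: Optional[Callable[[], float]] = None,
    rel_tol: float = 1e-9,
) -> list[str]:
    """Return a list of problems found; an empty list means the value passed."""
    problems = []
    if not math.isfinite(value):
        problems.append("result is not a finite number")
    elif not (lower <= value <= upper):
        problems.append(f"{value} is outside the plausible range [{lower}, {upper}]")
    if recompute is not None:
        # High-stakes path: compare against a calculation done a different way.
        independent = recompute()
        if not math.isclose(value, independent, rel_tol=rel_tol):
            problems.append(f"independent recomputation gave {independent}, not {value}")
    return problems


def compound_by_loop() -> float:
    """Recompute 1000 at 5% for 10 years year by year, not via the closed form."""
    balance = 1000.0
    for _ in range(10):
        balance *= 1.05
    return balance


reported = 1000 * 1.05 ** 10  # figure under test
print(measure(reported, lower=1500, upper=2000, recompute=compound_by_loop))
print(measure(16288.95, lower=1500, upper=2000))  # misplaced decimal point
```

For casual work the range check alone is the quick glance; passing `recompute` is the heavier option reserved for figures that matter.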
When It Matters Most
Measurement scales with stakes. A quick plausibility glance suffices for casual work; a full independent recomputation is warranted for figures with money or credibility attached. The mistakes this stage guards against are catalogued in 7 Mistakes That Wreck Numerical Reasoning Prompts.
E: Encode for Reuse
The fifth stage turns a working sequence into a durable asset.
What This Stage Does
Encoding means saving the prompt structure that reliably handled a class of task — the framing language, the step instruction, the verification ask — so you run a tested process next time instead of improvising. It also means noting which version is the high-stakes routine and which is the lightweight one.
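One lightweight way to encode patterns is a small library keyed by task and stakes level. The sketch below is an assumption about how such a library might look; the `PromptPattern` class, the `unit_rate` task name, and the template wording are all hypothetical.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PromptPattern:
    task: str      # class of task the pattern was tested on
    stakes: str    # "light" or "heavy"
    template: str  # string.Template text with a $framing placeholder


# Saved patterns; wording is illustrative, not a tested production prompt.
LIBRARY = {
    ("unit_rate", "light"): PromptPattern(
        task="unit_rate",
        stakes="light",
        template="$framing\n\nShow each step, then give the final answer.",
    ),
    ("unit_rate", "heavy"): PromptPattern(
        task="unit_rate",
        stakes="heavy",
        template=(
            "$framing\n\nShow each step. Write the formula and inputs separately "
            "so the arithmetic can be run in code. State one plausibility check "
            "for the result."
        ),
    ),
}


def render(task: str, stakes: str, framing: str) -> str:
    """Fill a saved pattern with the framed problem for this run."""
    pattern = LIBRARY[(task, stakes)]
    return Template(pattern.template).substitute(framing=framing)


print(render("unit_rate", "heavy", "distance = 150 km; duration = 2.5 hours"))
```

Keying on stakes makes the choice between the lightweight and the high-stakes routine explicit at call time rather than a matter of memory.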
When It Matters Most
Encoding pays off for recurring numerical tasks, which is most of them in practice. The compounding benefit of reused patterns is a theme of Field Practices That Make Model Math Dependable.
Applying FRAME by Task Difficulty
The method flexes to the task rather than demanding full effort everywhere.
Light Tasks
For a simple, low-stakes calculation, run Frame and Reason, glance at the result, and move on. The remaining stages add overhead a throwaway estimate does not justify.
Heavy Tasks
For a consequential, multi-step calculation, run all five stages at full strength: careful framing, visible reasoning, tool-based arithmetic, independent measurement, and an encoded pattern for next time. Worked applications of this full sequence appear in Where Numerical Reasoning Prompts Earn Their Keep.
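The light and heavy profiles above can be written down as a small lookup so the choice is made once and applied the same way each time. This is a sketch under the assumption that effort has three levels, `full`, `glance`, and `skip`; those labels and the `stage_plan` function are hypothetical.

```python
STAGES = ("Frame", "Reason", "Arithmetic", "Measure", "Encode")


def stage_plan(stakes: str) -> dict[str, str]:
    """Map each FRAME stage to an effort level for the given stakes."""
    if stakes == "heavy":
        # Consequential, multi-step work: every stage at full strength.
        return {stage: "full" for stage in STAGES}
    if stakes == "light":
        # Throwaway estimate: frame, reason, glance at the result, move on.
        return {
            "Frame": "full",
            "Reason": "full",
            "Arithmetic": "skip",
            "Measure": "glance",
            "Encode": "skip",
        }
    raise ValueError(f"unknown stakes level: {stakes!r}")


print(stage_plan("light"))
```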
Diagnosing Failures by Stage
One of the quiet benefits of a named structure is that it turns debugging from guesswork into a checklist. When a number comes back wrong, you can ask which stage let it through.
Tracing an Error to Its Stage
Each kind of failure maps to a stage you can inspect:
- A correct answer to the wrong question points at the Frame stage — the problem was ambiguous or under-specified.
- A plausible approach with a wrong final number points at Reason or Arithmetic — either a reasoning slip or the model computing exact values it should have offloaded.
- A confidently wrong figure that nobody caught points at Measure — the verification gate was missing or too weak for the stakes.
- The same error recurring across tasks points at Encode — no proven pattern was captured, so each run improvised and re-introduced risk.
This mapping is what makes the method more than a mnemonic. A loose pile of tips gives you nowhere to look when something breaks; a staged structure tells you where to start. The failure-mode catalogue in 7 Mistakes That Wreck Numerical Reasoning Prompts lines up cleanly against these stages.
Strengthening the Weak Stage
Once you know which stage failed, the fix is targeted rather than a blind rewrite. A framing failure means tightening the problem statement; an arithmetic failure means adding tool offload; a measurement failure means adding a verification gate proportional to stakes. You repair the specific stage rather than reworking the entire prompt, which is faster and less likely to introduce new errors.
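The symptom-to-stage mapping and the targeted fixes can be kept as a lookup table so debugging starts from the same checklist every time. The sketch below is illustrative: the symptom keys are hypothetical shorthand, and the fixes for the Reason and Encode stages are inferred from the stage descriptions rather than quoted from them.

```python
from enum import Enum


class Stage(Enum):
    FRAME = "Frame"
    REASON = "Reason"
    ARITHMETIC = "Arithmetic"
    MEASURE = "Measure"
    ENCODE = "Encode"


# Symptom keys are hypothetical shorthand for the failure kinds listed above.
DIAGNOSIS: dict[str, tuple[list[Stage], str]] = {
    "wrong_question": ([Stage.FRAME], "tighten the problem statement"),
    "wrong_final_number": (
        [Stage.REASON, Stage.ARITHMETIC],
        "inspect the visible steps, then add tool offload",
    ),
    "uncaught_wrong_figure": (
        [Stage.MEASURE],
        "add a verification gate proportional to stakes",
    ),
    "recurring_error": ([Stage.ENCODE], "save a proven prompt pattern"),
}


def diagnose(symptom: str) -> str:
    """Name the stage or stages to inspect and the targeted fix."""
    stages, fix = DIAGNOSIS[symptom]
    names = " or ".join(stage.value for stage in stages)
    return f"Inspect {names}: {fix}."


print(diagnose("wrong_final_number"))
```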
Frequently Asked Questions
How is FRAME different from just listing good techniques?
The techniques are largely the same; the difference is structure and order. A named, sequenced method lets you apply the moves consistently, teach them, and diagnose failures by identifying which stage was skipped. A loose list of tips is easy to forget under pressure, whereas a method with named stages becomes a routine you can run reliably.
Do I have to run all five stages every time?
No. The method is meant to flex with stakes. Light tasks use Frame and Reason plus a quick check; heavy tasks run all five at full strength. Applying every stage to a throwaway estimate wastes effort, and skipping framing on a consequential task invites error. Matching stage emphasis to the task is part of using the method well.
Which stage do people most often skip?
Measure, the verification stage. It depends on discipline and gets dropped exactly when people are busy, which is when errors are most likely. Building measurement into the routine as a named stage, rather than leaving it to memory, is precisely why a structured method helps — it makes the easily-forgotten step a standing part of the process.
What if I cannot offload arithmetic to a tool?
The Arithmetic stage still applies; it just changes form. Without code execution, you have the model produce the formula and inputs clearly, then compute the exact arithmetic yourself in a spreadsheet or calculator. The principle — keep exact computation out of the model's head — holds regardless of whether the deterministic step is automated or manual.
How does the Encode stage actually save time?
Recurring numerical tasks have similar shapes, so a prompt structure that worked once will likely work again. Saving it means you run a proven sequence instead of reinventing the phrasing and risking a fresh error. Noting which saved version is the high-stakes routine versus the lightweight one also lets you match effort to consequence instantly.
Key Takeaways
- FRAME organizes numerical prompting into five named stages: Frame, Reason, Arithmetic offloaded, Measure, Encode.
- A named structure adds consistency, teachability, and the ability to trace a failure to the stage that was skipped or too weak.
- Framing prevents pre-calculation errors, and reasoning in steps is the near-always-on workhorse stage.
- Offloading arithmetic and measuring the result address the model's computational weakness and the risk of trusting unverified figures.
- The method flexes by stakes: light tasks use Frame and Reason plus a quick check, consequential tasks run all five, and encoding turns working sequences into reusable assets.