The DIVET Model for Generating Hypotheses With AI

Ad hoc prompting produces ad hoc results. When you run hypothesis sessions often enough, you start to notice that the productive ones share a hidden structure, and the frustrating ones skip part of it. A named model makes that structure explicit, so you can run it deliberately rather than rediscovering it each time.

This article introduces DIVET, a five-stage model for AI-assisted hypothesis generation: Define, Inventory, Vary, Examine, Test. Each stage has a clear job and a clear handoff to the next. The point of naming it is not novelty; it is repeatability. Once a process has a name and stages, you can teach it, audit it, and improve it.

D: Define the Problem

Everything begins with a precise problem definition. This stage produces the input that determines the quality of everything downstream.

What Define Produces

A complete definition includes the observation with real numbers, the timeframe, recent changes on your side, what you have ruled out, and what a solved problem looks like. The output of this stage is a problem statement specific enough that the model can tailor hypotheses to your actual situation.
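One way to keep yourself honest about those five parts is to hold them in a small structure that reports what is still empty. The sketch below is only an illustration: the `ProblemStatement` class, its field names, and the example values are hypothetical, not part of DIVET itself.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemStatement:
    """Hypothetical container for the five parts of a Define-stage statement."""
    observation: str                      # what you saw, with real numbers
    timeframe: str                        # when it started, how long it has lasted
    recent_changes: list[str] = field(default_factory=list)
    ruled_out: list[str] = field(default_factory=list)
    solved_looks_like: str = ""

    def missing_parts(self) -> list[str]:
        """Return the names of any parts that are still empty."""
        parts = {
            "observation": self.observation,
            "timeframe": self.timeframe,
            "recent_changes": self.recent_changes,
            "ruled_out": self.ruled_out,
            "solved_looks_like": self.solved_looks_like,
        }
        return [name for name, value in parts.items() if not value]

    def to_prompt(self) -> str:
        """Render the statement as plain text to paste into a prompt."""
        changes = "; ".join(self.recent_changes) or "none noted"
        ruled_out = "; ".join(self.ruled_out) or "nothing yet"
        solved = self.solved_looks_like or "not yet defined"
        return (
            f"Observation: {self.observation}\n"
            f"Timeframe: {self.timeframe}\n"
            f"Recent changes: {changes}\n"
            f"Ruled out: {ruled_out}\n"
            f"Solved looks like: {solved}"
        )


# Illustrative values only; substitute your own numbers.
statement = ProblemStatement(
    observation="Median ticket resolution time rose from 4 to 6 hours",
    timeframe="Over the past three weeks",
    recent_changes=["Switched to a new ticketing tool"],
)
print(statement.missing_parts())
print(statement.to_prompt())
```

If `missing_parts()` returns anything, the statement is probably not ready to hand to the model yet.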

Skipping or rushing Define is the most common cause of disappointing sessions. The discipline here matches the framing step in A Sequential Process for Drafting Testable Ideas With AI.

I: Inventory the Possibilities

The Inventory stage is pure breadth. You prompt the model for as many distinct hypotheses as it can produce, without judging any of them.

The job here is coverage, not quality. Ask for fifteen or more candidates and explicitly request both obvious and non-obvious explanations. Resist the urge to evaluate; that comes later. A large, unfiltered inventory makes it far more likely that the useful but non-obvious hypothesis, the one you would never have thought of, lands on the list. This breadth-first discipline is core to Opinionated Habits That Make Hypothesis Prompts Pay Off.
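As a rough sketch, an Inventory prompt can be assembled from the problem statement plus the three instructions above. The function name `inventory_prompt` and the exact wording are assumptions for illustration; adjust the phrasing to suit your model.

```python
def inventory_prompt(problem_statement: str, minimum: int = 15) -> str:
    """Build a breadth-first prompt that asks for candidates without evaluation."""
    return (
        f"{problem_statement}\n\n"
        f"List at least {minimum} distinct hypotheses that could explain this. "
        "Include both obvious and non-obvious explanations. "
        "Do not rank, filter, or evaluate them yet."
    )


print(inventory_prompt("Ticket resolution time increased suddenly."))
```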

V: Vary the Angles

A raw inventory often clusters around one or two themes. The Vary stage deliberately breaks that clustering.

Forcing Genuine Diversity

Categories: Ask for hypotheses grouped by domain, such as behavioral, technical, external, and measurement.
Perspectives: Prompt for how different stakeholders would explain the problem.
Comfort: Explicitly request uncomfortable hypotheses that implicate your own decisions.
The null: Always add the hypothesis that the effect is noise or an artifact.

Vary turns a narrow list into a genuinely diverse one. This is the stage that surfaces the explanations your biases were filtering out, and it is the one most people skip.
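The four angles above can be kept as a reusable set of follow-up prompts, one per angle, so none is forgotten under time pressure. This is a minimal sketch: the `VARY_ANGLES` dictionary, the `vary_prompts` helper, and the prompt wording are all hypothetical.

```python
# Hypothetical follow-up prompts, one per Vary angle.
VARY_ANGLES = {
    "categories": (
        "Group the hypotheses by domain: behavior, technical, external, "
        "measurement. Add new hypotheses to any group that looks thin."
    ),
    "perspectives": (
        "How would different stakeholders explain this problem? "
        "Give at least one hypothesis per stakeholder."
    ),
    "comfort": (
        "Add uncomfortable hypotheses that implicate our own decisions."
    ),
    "null": (
        "Add the hypothesis that the effect is noise or a measurement artifact."
    ),
}


def vary_prompts(angles: dict[str, str] = VARY_ANGLES) -> list[str]:
    """Return one labeled follow-up prompt per angle, in a fixed order."""
    return [f"[{name}] {text}" for name, text in angles.items()]


for prompt in vary_prompts():
    print(prompt)
```

Sending these one at a time, rather than as a single combined request, tends to make it easier to see which angle added something new.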

E: Examine and Sharpen

Now you switch from generation to judgment. The Examine stage converts loose ideas into precise, testable statements.

For each promising hypothesis, rewrite it to name its mechanism (the causal chain by which the cause produces the effect) and attach a feasible test method. Any hypothesis that cannot be made testable gets reframed or set aside. The output of Examine is a shortlist of sharp, checkable propositions rather than a long list of vague ideas. The difference between a claim and a mechanism is covered in Seven Ways Hypothesis Prompts Quietly Go Wrong.
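The Examine rule can be expressed as a simple filter: a hypothesis reaches the shortlist only when it carries both a mechanism and a test method. The `Hypothesis` class and `examine` function below are hypothetical names, and the test method in the example is an assumed illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    claim: str
    mechanism: str = ""     # causal chain from cause to effect
    test_method: str = ""   # a feasible way to check it

    def is_testable(self) -> bool:
        """Testable here means both a mechanism and a test method are filled in."""
        return bool(self.mechanism.strip()) and bool(self.test_method.strip())


def examine(candidates: list[Hypothesis]) -> tuple[list[Hypothesis], list[Hypothesis]]:
    """Split candidates into a shortlist and a set-aside pile to reframe later."""
    shortlist = [h for h in candidates if h.is_testable()]
    set_aside = [h for h in candidates if not h.is_testable()]
    return shortlist, set_aside


candidates = [
    Hypothesis(
        claim="The new tool inflated resolution time",
        mechanism="The tool reclassified some tickets, inflating measured duration",
        test_method="Recompute durations using the old classification",  # assumed
    ),
    Hypothesis(claim="Customers are just harder to please lately"),
]
shortlist, set_aside = examine(candidates)
print([h.claim for h in shortlist])
print([h.claim for h in set_aside])
```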

T: Test in Priority Order

The final stage moves from thinking to learning. You prioritize the shortlist and start checking.

Prioritizing the Shortlist

Score each hypothesis on impact if true and cost to test. Favor candidates that are both high impact and cheap to check, and run the cheapest decisive tests first to eliminate options quickly. Pick three to investigate, gather evidence, and update your beliefs. Then, crucially, loop back to Define with what you learned, because results reshape the problem. The prioritization logic is expanded in Weighing the Competing Ways to Prompt for Hypotheses.
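A minimal sketch of that scoring step follows. The 1 to 5 scales, the impact-to-cost ratio, and the names `ScoredHypothesis` and `pick_to_test` are assumptions chosen for illustration; the model only asks that you favor high impact and low cost, not that you use this particular formula.

```python
from dataclasses import dataclass


@dataclass
class ScoredHypothesis:
    claim: str
    impact: int  # assumed scale: 1 (minor) to 5 (major) if true
    cost: int    # assumed scale: 1 (cheap) to 5 (expensive) to test

    @property
    def priority(self) -> float:
        """Higher is better: high impact and low cost both raise the ratio."""
        return self.impact / self.cost


def pick_to_test(shortlist: list[ScoredHypothesis], limit: int = 3) -> list[ScoredHypothesis]:
    """Order by priority, breaking ties in favor of the cheaper test."""
    ranked = sorted(shortlist, key=lambda h: (-h.priority, h.cost))
    return ranked[:limit]


# Illustrative scores only.
shortlist = [
    ScoredHypothesis("Staffing gap on weekends", impact=4, cost=4),
    ScoredHypothesis("Metric definition changed", impact=4, cost=1),
    ScoredHypothesis("Ticket mix shifted to harder issues", impact=3, cost=2),
    ScoredHypothesis("New tool slows agents down", impact=5, cost=3),
]
for h in pick_to_test(shortlist):
    print(f"{h.priority:.2f}  {h.claim}")
```

A ratio is only one reasonable choice; a weighted sum or a simple two-by-two grid would serve the same purpose of putting cheap, decisive checks first.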

When to Use the Full Model

DIVET is a complete model, but not every problem needs all five stages run formally.

For a quick, low-stakes question, you might run Define and Inventory lightly and skip the formal Vary stage. For a high-stakes problem where a wrong conclusion is expensive, run every stage deliberately and document each. The model scales with the stakes. The key insight is that the stages always happen in some form; the question is only how much rigor each one gets.

How DIVET Handles Failure

A good model does not just describe the happy path; it tells you what to do when a session goes sideways. DIVET's stage structure is also a diagnostic tool, because most failures trace to a specific stage.

Diagnosing by Stage

When a session disappoints, you can usually locate the breakdown:

Generic hypotheses point to a weak Define stage; the problem statement lacked specifics.
Repetitive, clustered ideas point to a skipped or shallow Vary stage.
Interesting but untestable ideas point to a missing Examine stage; you never attached mechanisms and test methods.
Lots of testing with little learning points to a weak Test stage; you tested low-impact or expensive hypotheses first instead of cheap, decisive ones.

This diagnostic property is the practical payoff of naming the stages. Instead of vaguely concluding that a session was unproductive, you can pinpoint which stage failed and rerun just that part. The specific failure symptoms map closely to the issues in Seven Ways Hypothesis Prompts Quietly Go Wrong.
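The symptom-to-stage mapping above is small enough to write down as a lookup table, which can be handy in a team checklist or retro template. The symptom keys and the `stage_to_rerun` helper are hypothetical labels for the four cases listed.

```python
# Hypothetical symptom labels mapped to the stage that most likely failed.
SYMPTOM_TO_STAGE = {
    "generic_hypotheses": "Define",
    "clustered_ideas": "Vary",
    "untestable_ideas": "Examine",
    "testing_without_learning": "Test",
}


def stage_to_rerun(symptom: str) -> str:
    """Return the stage to rerun for a known symptom, or raise if unrecognized."""
    try:
        return SYMPTOM_TO_STAGE[symptom]
    except KeyError:
        known = ", ".join(sorted(SYMPTOM_TO_STAGE))
        raise ValueError(f"Unknown symptom {symptom!r}; expected one of: {known}")


print(stage_to_rerun("clustered_ideas"))
```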

DIVET in a Concrete Pass

To make the model tangible, walk through how it would run on a real problem: a support team whose ticket resolution time suddenly increased.

In Define, you write the exact numbers, the timeframe, recent changes like a new ticketing tool, and what you have ruled out. In Inventory, you prompt for fifteen explanations spanning staffing, tooling, ticket mix, and measurement. In Vary, you force categories and explicitly ask for explanations where the new tool or a process change is at fault, plus the null-style hypothesis that the increase is an artifact of a changed metric definition. In Examine, you rewrite the strongest candidates with mechanisms, for example "resolution time rose because the new tool reclassified some tickets, inflating the measured duration," and attach test methods. In Test, you score by impact and cost, check the cheap measurement hypothesis first, and loop back to Define with what you learn. The same end-to-end arc plays out in How a Stalled Trial Funnel Got Diagnosed by AI Prompts.

Why a Staged Model Beats Ad Hoc Prompting

It is fair to ask whether a named, five-stage model is worth the overhead compared to just talking to the model and seeing what comes out. The answer comes down to what structure buys you.

Ad hoc prompting works fine when you are skilled and the problem is easy, because an experienced operator runs the stages implicitly without thinking about them. The trouble is that implicit stages get skipped under pressure, and the skips are invisible. You do not notice that you never diversified or never attached a test method; you just end up with a disappointing result and no idea why. A named model makes each stage a deliberate, checkable act. It converts a fuzzy skill into a teachable procedure, which matters most when the work is shared across a team or when the stakes make a missed stage expensive. The same argument for structure over improvisation drives the decision rule in Weighing the Competing Ways to Prompt for Hypotheses, and the failure modes a staged approach prevents are listed in Seven Ways Hypothesis Prompts Quietly Go Wrong.

Frequently Asked Questions

Why does the model need a name?

A named model is teachable, auditable, and repeatable. When a process has discrete stages with clear handoffs, a team can run it consistently, review where a session went wrong, and improve specific stages. Unnamed ad hoc prompting cannot be examined the same way.

Which stage do people skip most often?

Vary. After building an inventory, people jump straight to evaluating it, missing the deliberate diversification that surfaces non-obvious and uncomfortable hypotheses. Skipping Vary is why so many sessions only return ideas the user already had.

Can I run DIVET in a single conversation?

Yes. The stages are conceptual, not separate tools. You run them as a sequence of prompts within one session, switching from generation mode in Inventory and Vary to judgment mode in Examine and Test.

How is Examine different from Test?

Examine sharpens hypotheses into precise, testable statements and attaches a test method, all on paper. Test is where you actually run those checks against real data. Examine is preparation; Test is execution and learning.

Does the loop back to Define ever end?

It ends when you have a confident answer or when further investigation is not worth the cost. Each loop narrows the problem using new evidence. Most problems resolve in one or two loops; complex ones may take several.

Key Takeaways

DIVET is a five-stage model: Define, Inventory, Vary, Examine, Test.
Define produces a specific problem statement; Inventory casts a wide, unjudged net.
Vary forces genuine diversity through categories, perspectives, comfort, and a null hypothesis.
Examine sharpens ideas into testable statements with mechanisms; Test prioritizes and checks them.
Run the full model for high-stakes problems and a lighter version for quick questions, then loop back to Define.

Agency Script Editorial
