When Valid JSON Is Still the Wrong Answer

The danger of structured output is not that it fails loudly. It is that it succeeds quietly while being wrong. A model returns a perfectly formed object, your validator passes it, your code accepts it, and a subtly incorrect value flows into a database, a report, or a transaction. Nothing alerts. The structure was never the problem.

This is the central irony of the topic: the better your enforcement, the more confidence you place in values that may be meaningless. Enforced structure solves the easy, visible failures and can mask the hard, invisible ones. Teams that understand the syntax-versus-meaning distinction still routinely build systems that assume conformance equals correctness.

This article surfaces the non-obvious risks of structured output, the governance gaps they create, and the mitigations that address the causes rather than the symptoms.

The Risk of Confident Wrongness

Valid Shape, Invalid Meaning

The headline risk is semantic error inside valid structure. A date in the wrong format passes a string field. A total that does not match its line items passes a number field. An invented value passes a required field because the model would rather fabricate than leave it blank. Schema validation catches none of these, and strict decoding makes you more sure of the answer, not more right.

Mitigation: layer business-rule validation on top of schema validation. Encode the relationships that must hold — sums, ranges, referential consistency, closed value sets — and reject violations explicitly. Our Best Practices That Actually Work piece details this layered guard, and the Real-World Examples and Use Cases collection shows where semantic errors typically hide.
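A minimal sketch of such a business-rule layer, run after schema validation has already passed. The invoice shape, the field names, the `check_invoice` helper, and the currency set are all assumptions made for illustration, and amounts are assumed to arrive as decimal strings.

```python
from datetime import date
from decimal import Decimal

# Hypothetical closed value set; substitute whatever your domain allows.
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}


def check_invoice(invoice: dict) -> list[str]:
    """Return business-rule violations; an empty list means the record passed."""
    errors = []
    total = Decimal(invoice["total"])

    # Sum rule: the stated total must equal the sum of the line items.
    line_sum = sum(Decimal(item["amount"]) for item in invoice["line_items"])
    if line_sum != total:
        errors.append(f"total {total} does not match line items {line_sum}")

    # Range rule: a negative total is well-typed but not meaningful here.
    if total < 0:
        errors.append("total is negative")

    # Closed value set: a string field accepts anything, the business does not.
    if invoice["currency"] not in ALLOWED_CURRENCIES:
        errors.append(f"unknown currency {invoice['currency']!r}")

    # Format rule: the schema only says "string"; require an ISO date.
    try:
        date.fromisoformat(invoice["issued_on"])
    except ValueError:
        errors.append(f"issued_on {invoice['issued_on']!r} is not an ISO date")

    return errors


# Schema-valid output that is still wrong in two ways.
invoice = {
    "total": "30.00",
    "currency": "USD",
    "issued_on": "03/04/2024",
    "line_items": [{"amount": "19.99"}, {"amount": "5.01"}],
}
for problem in check_invoice(invoice):
    print(problem)
```

Returning a list of violations rather than raising on the first one keeps every failed rule visible, which is useful both for rejecting the record and for counting failures later.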

Fabrication to Satisfy the Schema

A schema that marks a field required pressures the model to produce a value even when the input does not contain one. Strict enforcement can make this worse, because the model cannot return malformed output as a signal of uncertainty; it must comply, and compliance can mean invention.

Mitigation: make fields optional where the data may genuinely be absent, and provide an explicit unknown or not-present value the model can choose. Give it a legitimate way to say "I don't know" so it has no reason to invent an answer.
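One way this might look in a JSON Schema, written here as a Python dictionary. The schema, its field names, and the `unresolved_fields` helper are hypothetical. Some strict modes may require every property to be listed as required, in which case a nullable type and an explicit "unknown" enum member play the role of optional.

```python
# Hypothetical schema: every field has a truthful "absent" answer available.
CONTACT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        # Nullable: the model may return null when no phone number appears.
        "phone": {"type": ["string", "null"]},
        # Closed set that includes an explicit "unknown" member.
        "contract_type": {"enum": ["fixed", "hourly", "unknown"]},
    },
    "required": ["name", "phone", "contract_type"],
    "additionalProperties": False,
}


def unresolved_fields(record: dict) -> list[str]:
    """List the fields the model declined to fill, for follow-up or review."""
    return sorted(
        key for key, value in record.items() if value is None or value == "unknown"
    )


record = {"name": "Ada Lovelace", "phone": None, "contract_type": "unknown"}
print(unresolved_fields(record))
```

Downstream code then treats an unresolved field as a known gap to route or review, not as a value to trust.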

Governance Gaps That Hide in the Structure

No Single Owner of the Schema

When schemas are defined inline across many services, no one owns the contract. A field's meaning drifts, two teams interpret the same key differently, and there is no review gate when someone changes a required field. The structure looks governed because it is typed; it is not.

Mitigation: put schemas under version control with review, ideally in a shared registry, so the contract has an owner and a change history. The Rolling Out Structured Output Across a Team piece covers this governance directly.

Silent Model Upgrades

A provider updates the model behind the same API name, and behavior shifts. Conformance can regress, or worse, semantic quality can degrade while conformance stays perfect — the structure still validates, the meaning quietly worsens. Without re-evaluation, you discover this from a customer.

Mitigation: pin model versions where the provider allows it, and maintain an evaluation set you re-run on every upgrade before promoting it.
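A rough sketch of a promotion gate built on an evaluation set. The `extract` callables stand in for calls to the pinned model and the candidate model; they, the case format, and the tolerance parameter are assumptions, not a particular provider's API.

```python
from typing import Callable

# Each case pairs an input text with the exact output we expect back.
EvalCase = tuple[str, dict]


def pass_rate(extract: Callable[[str], dict], cases: list[EvalCase]) -> float:
    """Fraction of cases where the extracted values match the expected ones."""
    passed = sum(1 for text, expected in cases if extract(text) == expected)
    return passed / len(cases)


def safe_to_promote(
    candidate: Callable[[str], dict],
    baseline: Callable[[str], dict],
    cases: list[EvalCase],
    tolerance: float = 0.0,
) -> bool:
    """Allow the upgrade only if the candidate is not worse than the baseline."""
    return pass_rate(candidate, cases) >= pass_rate(baseline, cases) - tolerance


# Stand-ins for real model calls, for illustration only.
def pinned_model(text: str) -> dict:
    return {"total": text.split()[-1]}


def upgraded_model(text: str) -> dict:
    return {"total": "0.00"}  # valid shape, wrong meaning


cases = [("Invoice total 25.00", {"total": "25.00"}),
         ("Amount due 7.50", {"total": "7.50"})]
print(safe_to_promote(upgraded_model, pinned_model, cases))
```

The comparison is on values, not on schema conformance, because the regression described above is the one where conformance stays perfect.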

Untracked Reliability

Many teams never measure conformance or silent-failure rates, so they have no idea how often their structured output is wrong. The absence of alarms is mistaken for the absence of problems.

Mitigation: instrument conformance, repair rate, and business-rule failures, as covered in our Metrics That Matter piece. You cannot govern what you do not measure.
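As an illustration, the three signals can be tracked with a few counters. The `OutputMetrics` class and its field names are hypothetical; a real system would more likely emit these to whatever metrics backend it already uses.

```python
from collections import Counter


class OutputMetrics:
    """In-memory tally of structured-output reliability signals."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, *, conformed: bool, repaired: bool, rule_errors: int) -> None:
        # conformed: the first response parsed and matched the schema.
        # repaired: a retry or fix-up step was needed to get usable output.
        # rule_errors: number of business-rule violations found afterwards.
        self.counts["calls"] += 1
        self.counts["nonconforming"] += int(not conformed)
        self.counts["repaired"] += int(repaired)
        self.counts["rule_failures"] += int(rule_errors > 0)

    def rates(self) -> dict[str, float]:
        calls = self.counts["calls"] or 1  # avoid dividing by zero before any call
        keys = ("nonconforming", "repaired", "rule_failures")
        return {key: self.counts[key] / calls for key in keys}


metrics = OutputMetrics()
metrics.record(conformed=True, repaired=False, rule_errors=0)
metrics.record(conformed=False, repaired=True, rule_errors=0)
metrics.record(conformed=True, repaired=False, rule_errors=2)
metrics.record(conformed=True, repaired=False, rule_errors=0)
print(metrics.rates())
```

The rule-failure rate is the one that approximates silent failure: those calls conformed to the schema and were still wrong.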

Security and Data Risks

Injection Through Extracted Content

When structured output extracts values from untrusted input — a user's email, an uploaded document — those values can carry injection payloads into downstream systems. A field that looks like a name might be a SQL fragment or a prompt-injection instruction aimed at the next model in the chain.

Mitigation: treat extracted values as untrusted input regardless of the clean structure. Sanitize, escape, and parameterize exactly as you would any external data. Structure is not sanitization.
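A small example of the parameterization step, using Python's built-in `sqlite3` module. The table and the extracted record are made up for illustration; the point is that the value is passed as a bound parameter and never spliced into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT)")

# A schema-valid string field whose content is hostile.
extracted = {"name": "Robert'); DROP TABLE contacts;--"}

# Unsafe would be: f"INSERT INTO contacts (name) VALUES ('{extracted['name']}')"
# Safe: the driver binds the value, so it is stored as data, not executed as SQL.
conn.execute("INSERT INTO contacts (name) VALUES (?)", (extracted["name"],))

rows = conn.execute("SELECT name FROM contacts").fetchall()
print(rows)
```

The same principle applies to shells, templates, and prompts for a downstream model: pass extracted values through the channel meant for data, not the one meant for instructions.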

Over-Trusting Output in Automated Actions

The most consequential risk is wiring structured output directly into an irreversible action — a payment, a deletion, a sent message — on the assumption that valid structure means safe to execute. A confidently wrong value then causes a real, unrecoverable effect.

Mitigation: gate irreversible actions behind business-rule validation and, where the stakes warrant it, human confirmation. Reserve full automation for actions you can reverse or for output you have measured to a known reliability.
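The gate can be as simple as a function that decides what happens to a proposed action. This is a sketch under assumptions: the `Decision` enum, the `decide` function, and the amount threshold as a proxy for stakes are all hypothetical choices.

```python
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_HUMAN = "needs_human"
    REJECT = "reject"


def decide(
    rule_errors: list[str],
    reversible: bool,
    amount: float,
    review_threshold: float = 100.0,  # assumed cutoff for "stakes warrant review"
) -> Decision:
    """Route a model-proposed action based on validation results and stakes."""
    if rule_errors:
        # Valid structure with invalid meaning never reaches execution.
        return Decision.REJECT
    if not reversible and amount >= review_threshold:
        # Irreversible and high-stakes: a person confirms before it runs.
        return Decision.NEEDS_HUMAN
    return Decision.EXECUTE


print(decide([], reversible=False, amount=2500.0))
print(decide(["total does not match line items"], reversible=True, amount=5.0))
```

Keeping the decision separate from the execution code means the gate can be tested and audited on its own.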

Operational Risks That Surface Over Time

Some risks do not appear at launch. They accumulate, which makes them easy to discount until they bite.

Schema Drift Without a Migration Path

As requirements evolve, schemas change — a field added, a type tightened, an enum extended. Without a versioning and migration discipline, old records validated under the previous schema and new records validated under the current one coexist with no record of which is which. Months later, a query that assumes one shape silently mishandles the other.

Mitigation: version your schemas explicitly and tag stored records with the schema version they were validated against, so a consumer always knows which contract a record honors. The Rolling Out Structured Output Across a Team piece covers the governance that makes this routine.
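A sketch of version tagging at write time and dispatch at read time. The envelope shape, the version strings, and the change between versions (a float `amount` in version 1, an integer `amount_cents` in version 2) are invented for illustration.

```python
CURRENT_SCHEMA_VERSION = "2"  # hypothetical version label


def wrap(payload: dict, version: str = CURRENT_SCHEMA_VERSION) -> dict:
    """Store the validated payload together with the contract it honors."""
    return {"schema_version": version, "payload": payload}


def amount_in_cents(record: dict) -> int:
    """Read a stored record without guessing which shape it has."""
    version = record["schema_version"]
    payload = record["payload"]
    if version == "1":
        # Assumed old contract: amount as a float in currency units.
        return round(payload["amount"] * 100)
    if version == "2":
        # Assumed current contract: amount as integer cents.
        return payload["amount_cents"]
    raise ValueError(f"unsupported schema_version {version!r}")


old_record = wrap({"amount": 12.5}, version="1")
new_record = wrap({"amount_cents": 1250})
print(amount_in_cents(old_record), amount_in_cents(new_record))
```

Raising on an unrecognized version is deliberate: a consumer that meets a contract it does not know should fail visibly instead of guessing.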

Cost Creep From Defensive Bloat

A team burned by a failure tends to respond by adding fields, examples, and retries everywhere, which quietly inflates token cost and latency across the whole system. The cure for one incident becomes a tax on every call.

Mitigation: treat schema and prompt size as a budget. Add constraints where a measured failure justifies them, and periodically trim fields and examples that no longer earn their keep. Let the data, not the last scary incident, drive what you add.

Over-Reliance on a Single Provider

Building your reliability story entirely on one vendor's enforcement mechanism is a risk that stays invisible until that vendor changes pricing, deprecates a feature, or degrades a model behind the same API name.

Mitigation: keep a thin abstraction between your application and the provider's specific format, and maintain a validation layer that does not depend on any one vendor's guarantees, so switching is a swap rather than a rewrite.
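One possible shape for that abstraction, sketched with a `typing.Protocol`. The `StructuredClient` interface, its `complete_json` method, and the `FakeClient` stand-in are hypothetical; a real adapter would wrap one vendor's SDK behind the same method.

```python
import json
from typing import Callable, Protocol


class StructuredClient(Protocol):
    """The only surface the application sees; one adapter per vendor."""

    def complete_json(self, prompt: str, schema: dict) -> str: ...


def extract(
    client: StructuredClient,
    prompt: str,
    schema: dict,
    validate: Callable[[dict], list[str]],
) -> dict:
    """Call any provider, then apply validation that does not depend on it."""
    raw = client.complete_json(prompt, schema)
    data = json.loads(raw)  # do not assume the vendor guarantees parseable JSON
    errors = validate(data)
    if errors:
        raise ValueError(f"validation failed: {errors}")
    return data


class FakeClient:
    """Stand-in adapter for illustration and tests."""

    def complete_json(self, prompt: str, schema: dict) -> str:
        return '{"total": "25.00"}'


def require_total(data: dict) -> list[str]:
    return [] if "total" in data else ["missing total"]


print(extract(FakeClient(), "Extract the total.", {"type": "object"}, require_total))
```

Because parsing and validation live in `extract` and not in the adapter, replacing the provider means writing a new small class, and the checks carry over unchanged.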

Building a Risk-Aware Posture

You do not eliminate these risks; you manage them in proportion to stakes. A practical posture:

Classify each use case by consequence of a wrong value, and apply heavier validation and review where the consequence is high.
Default to optional fields and explicit unknowns so the model is never forced to fabricate.
Validate meaning, not just shape, on every consequential path.
Treat extracted values as untrusted and sanitize before downstream use.
Measure reliability continuously so a regression surfaces from your dashboard, not your customer.

A risk-aware posture is not paranoia; it is proportion. Our A Framework for Structured Output and JSON Mode piece gives you a structured way to classify use cases by stakes and apply the right weight of validation and review to each, so you spend your governance effort where a wrong value actually costs something.

Frequently Asked Questions

If strict enforcement guarantees conformance, what is left to worry about?

Meaning. Strict enforcement guarantees the output matches the schema's shape and types, not that the values are correct, sensible, or non-fabricated. The dangerous failures — wrong totals, invented fields, out-of-range values — all live inside perfectly valid structure and require business-rule validation to catch.

How does required-field enforcement cause fabrication?

When a field is required and the input lacks the information, the model cannot return malformed output as a signal of uncertainty under strict mode, so it produces a plausible value instead. Making such fields optional, or offering an explicit unknown value, gives the model a truthful way out and prevents invention.

Why are extracted values a security concern if the structure is clean?

Clean structure says nothing about the content of the values. A well-formed string field can contain an injection payload pulled from untrusted input. Treat every extracted value as external, untrusted data and sanitize it before it touches a database, a shell, or another model.

What is the biggest mistake teams make with structured output risk?

Wiring valid output directly into irreversible actions on the assumption that conformance means correctness. A confidently wrong value then causes an unrecoverable effect. Gate consequential actions behind semantic validation and, where warranted, human confirmation.

How do we govern schemas that are scattered across services?

Move them into version control with review, ideally a shared registry, so each contract has an owner and a change history. Inline schemas drift in meaning and change without a gate, which is a governance gap masquerading as type safety.

Key Takeaways

The core risk is confident wrongness: valid structure containing incorrect or fabricated meaning that no schema check catches.
Required fields can pressure models to fabricate; prefer optional fields and explicit unknown values.
Govern schemas with version control and ownership, and re-evaluate on every model upgrade to catch silent regressions.
Treat extracted values as untrusted input and sanitize before downstream use; structure is not sanitization.
Gate irreversible actions behind business-rule validation, and measure reliability so regressions surface from your dashboard.

Agency Script Editorial
