Model Risk Management Frameworks (SR 11-7 and Beyond)

Agency Script Editorial

Editorial Team

March 20, 2026 · 13 min read

A regional bank hired an AI agency to build a commercial real estate loan pricing model. The agency's data scientists built an excellent model: strong predictive performance, clean code, well-documented. When the bank submitted the model for internal review before deployment, the model risk management team rejected it. Not because the model was bad, but because the documentation did not meet the bank's SR 11-7 requirements.

The model development documentation lacked a conceptual soundness assessment. The validation was performed by the same team that built the model, so it was not independent. There was no ongoing monitoring plan. The limitations section was a single paragraph instead of the detailed analysis the bank's regulators expected.

The agency spent eight additional weeks, and $95,000 in unbilled work, rewriting documentation, conducting independent validation, and building a monitoring plan. The model itself did not change. Only the governance around it changed. That experience taught the agency a fundamental lesson: in regulated industries, the governance around a model is as important as the model itself.

SR 11-7 is the Federal Reserve's "Supervisory Guidance on Model Risk Management," issued in 2011 jointly with the OCC, which published the same text as OCC Bulletin 2011-12. Despite being over a decade old, it remains the definitive framework for model risk management in financial services, and its principles apply far beyond banking. Understanding SR 11-7 is essential for any AI agency serving regulated industries, and the framework's concepts are increasingly relevant for AI governance in all sectors.

What SR 11-7 Requires

SR 11-7 defines a model as "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates." This definition clearly encompasses AI and machine learning models.

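Its requirements can be grouped into three pillars. (The guidance itself treats ongoing monitoring as a core element of validation and adds a separate section on governance, policies, and controls; monitoring is broken out here because it is the piece agencies most often omit.)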

Pillar 1: Model Development, Implementation, and Use

Conceptual soundness. The model must be built on a sound theoretical and empirical basis. For AI models, this means:

  • The choice of modeling approach must be justified. Why was a gradient boosted tree used instead of a neural network, or vice versa? What are the trade-offs?
  • The model's assumptions must be documented and their validity assessed
  • The features used must be justified. Why was each feature included? Are there features that were excluded, and why?
  • The model's limitations must be documented—what the model cannot do, where it is likely to fail, and what conditions would invalidate its use

Data quality. The data used to develop and run the model must be accurate, complete, and appropriate:

  • Data sources must be documented, including their reliability and any known quality issues
  • Data preparation and transformation steps must be documented
  • Data quality assessments must be performed and documented
  • Ongoing data quality monitoring must be in place

Testing. The model must be thoroughly tested before deployment:

  • In-sample testing (performance on training data)
  • Out-of-sample testing (performance on holdout data not used in training)
  • Out-of-time testing (performance on data from a different time period)
  • Sensitivity analysis (how do changes in inputs affect outputs?)
  • Stress testing (how does the model perform under extreme conditions?)
  • Benchmarking (how does the model perform compared to simpler alternatives?)
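
Two of these tests are easy to make concrete. The sketch below splits loan records at a cutoff date for out-of-time testing and scores a naive benchmark that predicts the development-period mean. The record fields, dates, and spread values are invented for illustration and do not come from SR 11-7 or any real portfolio.

```python
from datetime import date


def out_of_time_split(records, cutoff):
    """Split records into a development set (before cutoff) and an
    out-of-time set (on or after cutoff)."""
    development = [r for r in records if r["as_of"] < cutoff]
    out_of_time = [r for r in records if r["as_of"] >= cutoff]
    return development, out_of_time


def mean_absolute_error(actuals, predictions):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


# Hypothetical loan records: observation date and realized spread (basis points).
records = [
    {"as_of": date(2023, 3, 1), "spread_bps": 210},
    {"as_of": date(2023, 9, 1), "spread_bps": 230},
    {"as_of": date(2024, 2, 1), "spread_bps": 250},
    {"as_of": date(2024, 8, 1), "spread_bps": 270},
]
development, out_of_time = out_of_time_split(records, cutoff=date(2024, 1, 1))

# Naive benchmark: predict the development-period mean for every later loan.
baseline = sum(r["spread_bps"] for r in development) / len(development)
actuals = [r["spread_bps"] for r in out_of_time]
benchmark_mae = mean_absolute_error(actuals, [baseline] * len(actuals))
print(benchmark_mae)  # 40.0
```

A candidate model justifies its complexity only if its out-of-time error is meaningfully lower than the benchmark's.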

Documentation. All aspects of model development must be documented at a level sufficient for a knowledgeable third party to understand, replicate, and evaluate the model:

  • Model purpose and intended use
  • Data description and preparation
  • Methodology description and justification
  • Testing results and analysis
  • Limitations and conditions for use
  • Implementation details

Pillar 2: Model Validation

Model validation is the independent assessment of model quality and appropriateness. SR 11-7 requires validation to be performed by parties independent of the development team.

Independence. Validators must be independent of the model development team. They should have no vested interest in the model's approval. For AI agencies, this means:

  • The team that validates the model should not be the team that built it
  • If the agency is too small for separate teams, use external validators
  • Independence must be both actual and perceived

Validation scope. Validation must cover:

  • Conceptual soundness review: Is the modeling approach appropriate? Are the assumptions reasonable? Are the features justified?
  • Process verification: Was the model developed according to documented procedures? Were development standards followed?
  • Outcomes analysis: Does the model perform as expected? Are performance metrics within acceptable ranges? How does the model perform across different segments?
  • Benchmarking: How does the model compare to alternative approaches? Would a simpler model perform comparably?
  • Sensitivity analysis: How sensitive is the model to changes in inputs, parameters, and assumptions?
  • Stability analysis: Is the model's performance stable over time? Does it degrade under different conditions?
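
As an illustration of the sensitivity analysis item, here is a minimal one-at-a-time sketch: bump each input by a relative amount and record how far the output moves. The `toy_model` function and its `ltv` and `dscr` inputs are hypothetical stand-ins for whatever model is under validation.

```python
def sensitivity(model, base_inputs, bump=0.01):
    """Return the change in model output when each input is increased
    by a relative amount, one input at a time."""
    base_output = model(base_inputs)
    deltas = {}
    for name, value in base_inputs.items():
        bumped = dict(base_inputs)          # copy so other inputs stay at base
        bumped[name] = value * (1 + bump)
        deltas[name] = model(bumped) - base_output
    return deltas


def toy_model(x):
    """Hypothetical pricing function used only to exercise the sketch."""
    return 2.0 * x["ltv"] + 0.5 * x["dscr"]


deltas = sensitivity(toy_model, {"ltv": 0.7, "dscr": 1.3}, bump=0.10)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.3f}")
```

One-at-a-time bumps miss interactions between inputs, so a validator would normally pair this with joint stress scenarios.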

Validation report. The validation must produce a written report that documents:

  • The scope and approach of the validation
  • Findings, both positive and negative
  • An overall assessment of model fitness for use
  • Conditions or restrictions on use (if applicable)
  • Required remediations before deployment (if applicable)
  • Recommended monitoring and ongoing validation activities

Pillar 3: Ongoing Monitoring

Models are not static. Their performance changes over time as the world changes around them. SR 11-7 requires ongoing monitoring to detect and address model degradation.

Performance monitoring. Track model accuracy, error rates, and other performance metrics in production:

  • Compare production performance to development and validation benchmarks
  • Track performance over time to identify trends and degradation
  • Break performance down by relevant segments (geography, customer type, product line)
  • Set thresholds that trigger action when performance degrades
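
A threshold check of this kind might look like the sketch below. The metric (AUC), the segment names, and the 0.02 and 0.05 drop thresholds are assumptions chosen for illustration; real thresholds should come from the model's validation report.

```python
def performance_status(baseline, current, warn_drop=0.02, action_drop=0.05):
    """Classify a higher-is-better metric by how far it has dropped
    from its validation baseline."""
    drop = baseline - current
    if drop >= action_drop:
        return "action"
    if drop >= warn_drop:
        return "warning"
    return "ok"


validation_auc = 0.82
# Hypothetical production AUC by property-type segment.
production_auc_by_segment = {"office": 0.815, "retail": 0.79, "industrial": 0.75}

statuses = {
    segment: performance_status(validation_auc, auc)
    for segment, auc in production_auc_by_segment.items()
}
print(statuses)
```

Breaking the check out by segment matters because an acceptable overall number can hide a segment that has already crossed the action threshold.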

Outcomes analysis. Compare model predictions to actual outcomes:

  • Back-test predictions against realized results
  • Track prediction accuracy by segment and over time
  • Investigate systematic over- or under-prediction
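
Systematic over- or under-prediction shows up as a mean signed error that stays away from zero. The following sketch computes it per segment; the segments and the predicted and realized values are made up for the example.

```python
from collections import defaultdict


def mean_signed_error_by_segment(rows):
    """rows: iterable of (segment, predicted, actual).
    Positive result = over-prediction, negative = under-prediction."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for segment, predicted, actual in rows:
        sums[segment] += predicted - actual
        counts[segment] += 1
    return {segment: sums[segment] / counts[segment] for segment in sums}


# Hypothetical back-test rows: predicted vs. realized loss rate.
rows = [
    ("office", 0.05, 0.07),
    ("office", 0.04, 0.06),
    ("retail", 0.03, 0.03),
]
bias = mean_signed_error_by_segment(rows)
for segment, value in bias.items():
    print(f"{segment}: {value:+.3f}")
```

In this toy data the model under-predicts office losses by about two percentage points while retail is unbiased, which is the kind of pattern an outcomes analysis should surface and explain.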

Stability monitoring. Monitor for changes in model inputs and behavior:

  • Track input data distributions for drift
  • Monitor feature importance stability
  • Track output distribution changes
  • Identify changes in the model's operating environment that may affect validity
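
Input drift in credit models is often summarized with the Population Stability Index (PSI), which compares the binned distribution of a feature at development time with its distribution in production. The sketch below is one common formulation; the bin counts are invented, and the 0.10 and 0.25 cut-offs mentioned in the comments are an industry rule of thumb rather than anything SR 11-7 prescribes.

```python
import math


def psi(expected_counts, actual_counts, floor=1e-6):
    """Population Stability Index over matching bins.
    `floor` avoids log(0) when a bin is empty."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_share = max(e / expected_total, floor)
        a_share = max(a / actual_total, floor)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total


# Hypothetical counts of loans per loan-to-value bin.
development_bins = [25, 25, 25, 25]
production_bins = [10, 20, 30, 40]

drift = psi(development_bins, production_bins)
# Rule of thumb (assumption): < 0.10 stable, 0.10-0.25 moderate shift, > 0.25 major shift.
print(round(drift, 3))
```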

Periodic revalidation. Conduct full revalidation on a regular schedule:

  • Annual revalidation for high-risk models
  • Revalidation triggered by material changes (new data, model updates, use case changes)
  • Revalidation triggered by monitoring alerts (significant performance degradation, drift)
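
The three triggers above can be expressed as a simple check. This is a sketch under assumptions: the function name, the flags, and the 365-day default are illustrative, and the schedule for lower-risk models would be set by the client's own policy.

```python
from datetime import date


def revalidation_due(last_validated, today, material_change,
                     monitoring_alert, max_age_days=365):
    """Return the list of reasons a full revalidation is due
    (an empty list means none is due)."""
    reasons = []
    if (today - last_validated).days > max_age_days:
        reasons.append("scheduled")
    if material_change:
        reasons.append("material change")
    if monitoring_alert:
        reasons.append("monitoring alert")
    return reasons


reasons = revalidation_due(
    last_validated=date(2025, 1, 15),
    today=date(2026, 3, 20),
    material_change=False,
    monitoring_alert=True,
)
print(reasons)  # ['scheduled', 'monitoring alert']
```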

Applying SR 11-7 to AI Models

SR 11-7 was written before modern AI and machine learning were widely used in financial services. Applying its principles to AI models requires interpretation and extension.

Explainability Challenge

Traditional statistical models (linear regression, logistic regression) are inherently interpretable. You can look at the coefficients and understand how each input affects the output. Complex AI models (deep neural networks, large ensemble models) are not inherently interpretable. This creates tension with SR 11-7's requirements for conceptual soundness and documentation.

Practical approaches:

  • Use inherently interpretable models where regulatory requirements demand it
  • Where complex models are used, implement post-hoc explainability methods (SHAP, LIME, integrated gradients)
  • Document the trade-off between model complexity and interpretability, and justify the choice
  • Provide both global explanations (how the model generally works) and local explanations (why the model made a specific decision)
  • Acknowledge explainability limitations honestly in documentation

Testing AI Models

AI models require additional testing beyond what SR 11-7 originally contemplated.

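Bias testing. Test for disparate impact across protected demographic categories. SR 11-7 does not explicitly require this, but fair lending laws (in the United States, the Equal Credit Opportunity Act and the Fair Housing Act) prohibit discriminatory outcomes, and examiners expect to see the testing.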
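
One simple screening metric for disparate impact is the adverse impact ratio: the approval rate of a protected group divided by that of a reference group. The sketch below uses invented counts, and the 0.8 screening level is the "four-fifths" heuristic borrowed from US employment guidelines. It is an assumption here, not a fair lending legal standard, and a low ratio is a prompt for deeper analysis rather than a conclusion.

```python
def adverse_impact_ratio(approved_protected, total_protected,
                         approved_reference, total_reference):
    """Approval rate of the protected group divided by the approval
    rate of the reference group."""
    protected_rate = approved_protected / total_protected
    reference_rate = approved_reference / total_reference
    return protected_rate / reference_rate


# Hypothetical approval counts from a model's decisions on a test set.
ratio = adverse_impact_ratio(
    approved_protected=30, total_protected=50,
    approved_reference=80, total_reference=100,
)
SCREENING_LEVEL = 0.8  # four-fifths heuristic (assumption, see lead-in)
needs_review = ratio < SCREENING_LEVEL
print(f"ratio={ratio:.2f} needs_review={needs_review}")
```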

Adversarial testing. Test the model's robustness to adversarial inputs—deliberately crafted inputs designed to fool the model. This is particularly important for models that process external data.

Edge case testing. AI models can behave unpredictably at the boundaries of their training data. Test extensively with edge cases, extreme values, and unusual combinations.

Stability testing. Test how the model's behavior changes across time periods, market conditions, and population segments. AI models can be more sensitive to distribution shift than traditional models.

Validation of AI Models

Validating AI models requires specialized skills and approaches.

Replication challenges. AI model training can be non-deterministic (producing slightly different results each time). Validation must account for this by assessing whether differences from replication are within acceptable bounds.
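
In practice, "within acceptable bounds" needs a number agreed before replication starts. A minimal sketch, assuming the validator retrains the model several times with different random seeds and compares one headline metric to the value the developer reported (all values below are hypothetical):

```python
def replication_within_tolerance(reported_metric, replicated_metrics, tolerance):
    """True if every replicated run lands within `tolerance` of the
    metric reported by the development team."""
    return all(abs(m - reported_metric) <= tolerance for m in replicated_metrics)


reported_auc = 0.812
replicated_aucs = [0.810, 0.815, 0.809]   # hypothetical reruns with different seeds

replicated_ok = replication_within_tolerance(reported_auc, replicated_aucs, tolerance=0.005)
print(replicated_ok)  # True
```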

Data leakage assessment. Validate that the model does not benefit from data leakage—information that would not be available at the time of prediction. Data leakage is a common issue in AI model development and can dramatically inflate apparent performance.
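
One basic leakage check compares the date each feature value first became available with the date the prediction would have been made. The sketch below assumes the validator has that availability date for every feature; the feature names and dates are hypothetical.

```python
from datetime import date


def leaking_features(feature_available_on, prediction_date):
    """Return the names of features whose values only became available
    after the prediction date, sorted alphabetically."""
    return sorted(
        name
        for name, available in feature_available_on.items()
        if available > prediction_date
    )


# Hypothetical feature availability for a loan priced on 2025-06-15.
feature_available_on = {
    "appraised_value": date(2025, 5, 1),
    "occupancy_rate": date(2025, 6, 1),
    "days_past_due_90": date(2025, 9, 30),   # observed after pricing
}
leaks = leaking_features(feature_available_on, prediction_date=date(2025, 6, 15))
print(leaks)  # ['days_past_due_90']
```

Timestamp checks catch only one kind of leakage. Leakage through target-derived features or through preprocessing fitted on the full dataset needs a separate review.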

Overfitting assessment. Evaluate whether the model is overfitting to training data. Compare training performance to validation performance, use cross-validation, and assess performance on truly out-of-sample data.

Feature importance validation. Validate that the features the model relies on are conceptually sound and that the model's feature importance aligns with domain expertise. A model that produces good predictions for the wrong reasons will fail when conditions change.

Extending Beyond SR 11-7

While SR 11-7 provides an excellent foundation, AI governance requires additional dimensions that the guidance does not fully address.

Fairness and Ethics

SR 11-7 focuses on model accuracy and reliability but does not explicitly address fairness and ethics. Modern AI governance requires:

  • Systematic bias testing across protected categories
  • Fairness metrics integrated into model validation
  • Ethical review of model use cases
  • Ongoing fairness monitoring in production

Data Lineage and Provenance

SR 11-7 addresses data quality but does not specifically require comprehensive data lineage. For AI models, especially those using large and complex datasets, data lineage is essential for:

  • Regulatory compliance (demonstrating data provenance)
  • Debugging (tracing problems to their data source)
  • Impact analysis (understanding what changes when data changes)

Third-Party Model Governance

SR 11-7 addresses vendor model risk but was written before the era of third-party AI APIs and foundation models. Modern governance must additionally address:

  • Governance of models you do not own or control
  • Model behavior changes from provider updates
  • Data privacy in third-party model interactions
  • License and intellectual property considerations

Automated Decision-Making

SR 11-7 focuses on models as decision-support tools but does not fully address the governance of automated decision-making. When AI models make decisions without human intervention, additional governance is needed:

  • Clear criteria for when automated decisions are appropriate
  • Human override mechanisms
  • Appeal processes for affected individuals
  • Enhanced monitoring for automated systems

Implementing Model Risk Management in Your Agency

For Agencies Serving Financial Services

If your clients are banks, insurance companies, or other regulated financial institutions, SR 11-7 compliance is not optional. Build it into your standard delivery.

Development standards. Create development standards that produce SR 11-7-compliant documentation as a natural byproduct of the development process. If your engineers document as they go, using standard templates, the final documentation package comes together with minimal additional effort.

Validation capability. Build or access independent validation capability. This might mean:

  • A separate validation team within your agency (for larger agencies)
  • Partnerships with independent validation firms
  • Clear processes that ensure development and validation independence

Monitoring packages. Include ongoing monitoring as a standard deliverable, not an optional add-on. Design monitoring dashboards, define thresholds, and document monitoring procedures as part of every model delivery.

For Agencies Serving Non-Financial Clients

Even if your clients are not in financial services, the SR 11-7 framework is valuable. Adapt it by:

Scaling proportionally. Apply the full framework to high-risk AI systems and a simplified version to lower-risk systems. Every AI system benefits from documentation, testing, and monitoring—the rigor should match the risk.

Focusing on what matters. Not every element of SR 11-7 applies to every AI system. Focus on the elements that create the most value: documentation, independent review, and ongoing monitoring.

Using it as a differentiator. If your non-financial clients are not expecting SR 11-7-level governance, delivering it anyway differentiates your agency and prepares clients for future regulatory requirements.

Your Next Step

Take the model documentation for your most recent AI project and evaluate it against SR 11-7's three pillars. Does the development documentation include conceptual soundness justification, data quality assessment, and comprehensive testing? Was the model validated independently? Is there an ongoing monitoring plan? Score each dimension as green (meets the standard), yellow (partially meets), or red (does not meet). The reds are your immediate priorities. The yellows are your near-term priorities. Address them before your next delivery, and build the standards into your development process so that future projects meet the bar from the start.
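
The scoring exercise can be kept in something as simple as a dictionary. In the sketch below, the dimension names follow the questions above and the scores are placeholders to be replaced with your own assessment.

```python
PRIORITY = {"red": "immediate", "yellow": "near-term", "green": "none"}
ORDER = ["red", "yellow", "green"]


def prioritize(scores):
    """Sort (dimension, score) pairs so reds come first, then yellows,
    then greens. Ties keep their original order."""
    return sorted(scores.items(), key=lambda item: ORDER.index(item[1]))


# Placeholder self-assessment of one project.
scores = {
    "conceptual soundness": "yellow",
    "data quality assessment": "green",
    "comprehensive testing": "yellow",
    "independent validation": "red",
    "ongoing monitoring plan": "red",
}
ranked = prioritize(scores)
for dimension, score in ranked:
    print(f"{score:<6} {PRIORITY[score]:<10} {dimension}")
```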
