NAIC AI Systems Evaluation Tool Pilot: A Guide for Insurers


I’ve been saying for a while now that good AI needs great governance—not just for compliance, but because it’s how we build systems we can actually trust. Well, US insurance regulators are officially moving from "thinking" to "doing."

The NAIC is launching its AI Systems Evaluation Tool Pilot starting in March 2026, and it’s a big milestone for our industry. If you’re running an insurance company, this isn’t just another "check the box" exercise—it’s a window into the future of how your operations might be evaluated.

Fast Facts: The 2026 NAIC AI Systems Evaluation Pilot

  • Timeline: March 2026 – September 2026.
  • Participating States (12): CA, CO, CT, FL, IA, LA, MD, PA, RI, VT, VA, and WI.
  • Core Focus: AI Governance Frameworks, Data Integrity (Exhibit D), and Third-Party Risk.
  • Integration: Reviews will be part of Market Conduct and Financial Exams.

What are the Regulator Goals?

Regulators aren't trying to stifle innovation; they’re trying to understand it. The Pilot is designed to:

  • Bridge the gap: Determine if the Tool helps insurers clearly explain their AI governance and helps regulators actually understand how those systems are applied.
  • Refine the toolkit: Use the experience to understand how AI regulatory oversight could work and identify where regulators need more training. One really important point: the “Tool” is not a literal system or application; it is a template to guide the questions.
  • Focus on what matters: They are following the idea of proportionality—spending more time on high-risk systems that could lead to consumer or financial problems, and less time on low-risk, back-office systems.
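The proportionality principle above can be pictured as a simple triage rule. This is a purely illustrative sketch; the function, field names, and tier labels are my assumptions, not NAIC terminology.

```python
# Illustrative sketch only: a toy triage rule reflecting proportionality
# (more scrutiny for systems with consumer or financial impact).
# Names and tier labels are assumptions, not NAIC terminology.

def review_tier(consumer_impact: bool, financial_impact: bool) -> str:
    """Assign a review depth to an AI system based on its potential impact."""
    if consumer_impact or financial_impact:
        return "in-depth review"   # e.g., claims, underwriting, pricing models
    return "light-touch review"    # e.g., low-risk back-office automation

# A claims-triage model touches consumers directly -> in-depth review
print(review_tier(consumer_impact=True, financial_impact=False))
```

The point of the toy rule is that the trigger is impact, not technology: a mundane model with consumer or solvency consequences outranks a flashy one confined to the back office.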

My Take: The Good and the "Work in Progress"

I’m generally an optimist about this because I believe it will be a huge win for the industry if regulators can establish a common pattern for how they ask questions and perform these exams. Right now, the lack of a standardized "language" for AI governance creates a lot of confusion... having a shared playbook should eventually make things more predictable for everyone involved.

Most of what’s in the Tool—like the focus on governance frameworks—is just fundamentally good business. If you want to use AI to change the world (or even just your underwriting), you should want to have a clear policy, a defined role for your Board, and a handle on your risks.

However, we have to stay grounded in the practical reality of these requirements:

  • Data Hygiene vs. AI Risk: I expect Exhibit D in particular to be challenging. It asks for a lot of detail across many potential data elements—lineage, quality checks, and exhaustive documentation. The line between generalized data governance and an AI-centric evaluation can blur quickly, and if we aren't careful, both regulators and industry could sink a lot of effort into documentation without clear value or results.
  • The "Third-Party" Puzzle: There’s still a lot of fog around how, or if, regulators might evaluate vendors directly. We know insurers are ultimately responsible for their own compliance, but the "how" of vendor oversight—especially what regulatory capacity exists or what an evaluation would consist of—is still being sorted out.
  • Inconsistency: While the goal is to develop and maintain some regulatory consistency, each jurisdiction has the authority to modify or adapt the questions to meet its specific needs. This means "compliance" might look slightly different depending on where you are domiciled.

3 Steps to Prepare for a Potential AI System Evaluation

If you have a financial exam coming up or you’re in a participating state, don’t wait for the knock on the door.

  1. Be Informed: Review the work being done by the Big Data and Artificial Intelligence Working Group now. Do you have a written AI program? Is your Board involved?
  2. Inventory Everything: Make sure you actually know where your AI is living. You can't govern what you can't see. Consider the full definition of AI in the Tool and other regulatory guidance: your inventory needs to go beyond generative AI use cases.
  3. High-Risk = High-Impact: While Exhibit C says “High-Risk,” don’t assume regulators view only direct consumer impact as high-risk. Exhibit A asks about your material financial impact systems as well; regulators are equally interested in how AI might affect solvency or financial risk.
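Step 2's inventory can start as something as simple as one structured record per system. Here is a minimal sketch; every field name is my assumption and should be aligned with the Tool's actual definitions, which cover far more than generative AI.

```python
# Illustrative sketch of a minimal AI system inventory record.
# Field names are assumptions; align them with the Tool's own definitions,
# which reach beyond generative AI (predictive models, rules+ML hybrids, etc.).
from dataclasses import dataclass

@dataclass
class AISystemRecord:
    name: str
    category: str          # e.g., "predictive model", "generative AI"
    owner: str             # accountable business owner
    third_party: bool      # vendor-supplied or built in-house?
    consumer_impact: bool  # affects policyholders directly?
    financial_impact: bool # material solvency/financial exposure?

inventory = [
    AISystemRecord("claims-severity-model", "predictive model",
                   "Claims Analytics", third_party=True,
                   consumer_impact=True, financial_impact=True),
    AISystemRecord("mailroom-ocr", "document AI",
                   "Operations", third_party=True,
                   consumer_impact=False, financial_impact=False),
]

# Surface the systems that warrant the deepest review first
high_risk = [r.name for r in inventory if r.consumer_impact or r.financial_impact]
print(high_risk)  # -> ['claims-severity-model']
```

Even a spreadsheet with these columns beats discovering an ungoverned model during an exam; the third_party flag also feeds directly into the vendor-oversight questions discussed above.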

At Monitaur, we’ve built our entire platform to map directly to these expectations. We can help you produce the necessary reporting or manage the "high-risk" details exactly to the NAIC's specs.

AI has the potential to make our lives so much better... we just have to be brave enough to govern it properly.