Financial organizations are moving fast on AI — using it to answer regulatory questions, prepare for exams, surface policy language, and support compliance work. Before your organization goes further, one question needs a clear answer: if an examiner asks how you reached a conclusion, can you show your work?
That's not hypothetical, especially for the 22% of financial organizations that have already adopted AI in compliance, according to the 2026 Future of Compliance Survey. It's the question that determines whether AI becomes an asset or a liability you didn't see coming.
Download the free checklist: How to Audit Artificial Intelligence
The Black Box Problem
A black box AI system produces an output without explaining how it got there. You ask a question, you get an answer, and the trail ends. No source citations. No reasoning log. No record of what the model prioritized, what it may have gotten wrong, or whether the answer was appropriate for your organization's specific regulatory environment.
For general productivity tasks, that trade-off may be acceptable. For compliance work, it isn't.
Compliance decisions carry legal, operational, and reputational consequences. The outputs that flow from them shape how your organization manages risk, responds to examinations, trains staff, and demonstrates adherence to the regulations that govern your existence. When those outputs can't be explained, you don't just have an AI problem — you have a documentation problem, a governance problem, and potentially an examination problem.
Related: What is AI Auditing and Why Does It Matter?
What Examiners Expect
Regulatory guidance on AI has been building for years across the OCC, FDIC, CFPB, Federal Reserve, SEC, and NCUA, as well as Fannie Mae and Freddie Mac. While regulators haven't issued a single unified AI standard, the underlying expectations are consistent and increasingly explicit. They come down to three things: explainability, accountability, and auditability.
Explainability
Examiners expect organizations to be able to explain the basis for risk and compliance decisions. When AI is involved, that expectation extends to the AI layer. Saying "the AI told us" is not sufficient. The organization must be able to articulate what inputs informed the output, what sources were consulted, and why the conclusion is defensible.
Accountability
Human accountability doesn't disappear because AI was involved. Third-party risk management guidance is consistent: management remains responsible for the adequacy of risk management processes regardless of whether technology assisted in executing them. Someone at the organization must own the output — and that ownership requires enough transparency to evaluate what was produced. (That's easier said than done: 72% of financial institutions are only partially aware of which vendors are using AI, and not a single organization feels “extremely confident” managing AI-related risks, according to the State of Third-Party Risk Management 2026 Survey Report.)
Auditability
Auditors and examiners need a trail. They need to follow the chain from question to answer, from input to output, and from AI-generated material to the human decision it influenced. Organizations that can't reconstruct that trail during an exam are in the same position as organizations that can't produce meeting minutes or risk committee notes. The absence of records is itself a finding.
The AI Documentation Standard
Adequate documentation in an AI-assisted compliance environment goes beyond saving a copy of the output. The standard should capture five elements:
- What was asked: the full query or prompt, with relevant context
- What was submitted: any documents provided to the AI as part of generating the response
- What was returned: the complete response, not a summary or excerpt
- What sources were cited: specific regulatory texts, guidance documents, or institutional policies, with enough specificity to verify
- What human review occurred: who reviewed the output, what they evaluated it against, and whether and how it was modified before use
In practice, that means different things depending on the work. When AI is used to research a regulatory question, retain the original query, the full response, the sources cited, the reviewer's name, and any adjustments made before the output was relied upon — stored in a system of record accessible for examination and audit, not buried in a chat log or a personal folder. When AI assists in exam preparation, an audit, or a control assessment, document the scope of the review, the specific areas where AI output was incorporated, and the human review process that validated those outputs.
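For organizations that want to operationalize this, the five elements map naturally onto a single structured record per AI interaction. The sketch below is a minimal illustration in Python; the `AIUsageRecord` name and its fields are hypothetical placeholders, not a reference to any particular system of record.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical record structure for the five documentation elements above.
# Class and field names are illustrative, not any particular vendor's schema.
@dataclass
class AIUsageRecord:
    asked: str                       # what was asked: full query or prompt, with context
    submitted_documents: list[str]   # what was submitted: documents provided to the AI
    returned: str                    # what was returned: the complete response, not a summary
    sources_cited: list[str]         # what sources were cited, specific enough to verify
    reviewed_by: str                 # what human review occurred: who reviewed the output
    review_notes: str                # what it was evaluated against and how it was modified
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```

Whatever the format, the point is that every one of those fields exists somewhere retrievable before the output is relied upon.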
Organizations that maintain this standard are positioned to demonstrate that AI was a tool in the hands of qualified professionals, not a substitute for professional judgment. Those that don't are left with a gap between what they did and what they can prove they did. In an examination context, that gap is the risk.
Glass Box AI: What It Means and Why It Matters
The alternative to black box AI is what we call glass box AI. Glass box AI is a system that doesn’t just produce answers but shows its work. It surfaces the sources that informed the response and provides enough transparency into its reasoning that a trained compliance professional can evaluate the output, verify its accuracy, and document the basis for any decision that follows.
Glass box AI is a design philosophy built for regulated financial organizations. In a compliance context, the value of an AI tool isn’t just the quality of its answers — it’s how well those answers can be reviewed, challenged, corrected, and documented by the people accountable for the outcome.
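To make the contrast concrete, here is a rough sketch of what the two kinds of responses might look like to the compliance professional receiving them. The keys and values are purely illustrative assumptions, not the output format of any specific tool.

```python
# Purely illustrative; keys and values are hypothetical, not a real tool's output.
black_box_response = {
    "answer": "Yes, the disclosure is required.",  # the trail ends here
}

glass_box_response = {
    "answer": "Yes, the disclosure is required.",
    "sources": [
        "Specific regulatory citation, down to the section",   # verifiable
        "Internal policy document, with version and date",
    ],
    "reasoning_summary": "How the cited sources support the conclusion",
    "calibration": {"charter_type": "...", "jurisdiction": "..."},
}
```

The table below summarizes the same contrast across the dimensions examiners care about.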
| Category | Black Box AI | Glass Box AI |
| --- | --- | --- |
| Explainability | No reasoning trail; output appears without context | Sources and reasoning surfaced with every response; work is shown |
| Source citations | None provided; no way to verify which regulations informed the answer | Specific regulatory texts and guidance documents cited and verifiable |
| Auditability | No retrievable record of inputs, outputs, or decision chain | Full input/output logging in a system of record accessible for exam and audit |
| Accountability | "The AI told us" — insufficient transparency to evaluate or own the output | Human review protocol between AI output and reliance; ownership is documented |
| Regulatory calibration | Generic answers from undifferentiated public data; no charter-type awareness | Calibrated to the organization's specific regulatory environment and jurisdiction |
| Error detection | Wrong in ways difficult to detect without SME expertise; confident but unverifiable | Outputs can be reviewed, challenged, corrected, and documented by qualified professionals |
| Documentation standard | Outputs buried in chat logs or personal folders; no structured retention | Captures query, documents submitted, full response, sources cited, and human review |
| Exam readiness | Gap between what was done and what can be proven; absence of records is a finding | Demonstrates AI was a tool in qualified hands, not a substitute for judgment |
| Governance risk | Creates documentation, governance, and examination problems simultaneously | Built for regulated environments; transparency is the design philosophy |
| Bottom line | Compliance management failure — will be evaluated as one | Positioned for AI-related exam scrutiny with controls, documentation, and accountability |
Before deploying any AI-assisted compliance tool, an organization should be able to answer these questions:
- Can the system identify the specific regulatory sources it drew upon?
- Does it produce responses that are verifiable against those sources?
- Can the organization log and retain inputs and outputs in a retrievable format?
- Is the system calibrated to the organization's specific regulatory environment, or is it producing generic answers from undifferentiated public data?
- Has the organization established a human review protocol between AI output and reliance?
If any of these can't be answered affirmatively, the organization is carrying meaningful risk.
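Several of those questions, particularly logging and retention, can be addressed at the point of use. The sketch below assumes a generic `query_ai` callable standing in for whatever tool the organization actually uses, and appends each interaction to a JSON Lines file; the function names and storage choice are illustrative only, not a prescribed implementation.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Illustrative destination; a real system of record would be access-controlled
# and retained per the organization's records policy, not a local file.
AUDIT_LOG = Path("ai_audit_log.jsonl")

def audited_query(query_ai, prompt: str, documents: list[str], user: str) -> str:
    """Call an AI tool and retain the full input/output pair for exam and audit.

    `query_ai` is a placeholder for whatever function actually calls the AI tool.
    """
    response = query_ai(prompt, documents)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "prompt": prompt,          # what was asked
        "documents": documents,    # what was submitted
        "response": response,      # what was returned, in full
        "reviewed_by": None,       # completed by the human review step before reliance
    }
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return response
```

A wrapper like this doesn't answer the calibration or human-review questions, but it does ensure the organization never loses the input/output trail an examiner will ask for.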
Related: AI Governance for Financial Institutions: Using AI Safely
The General LLM Problem
General purpose large language models (LLMs) are trained on broad public data and optimized to produce confident, fluent responses. They are not trained on your organization's policies. They don't know your charter type, your supervisory history, or the specific findings previously raised in your examinations. They can't distinguish between guidance that was superseded and guidance that is current. And they have no mechanism for knowing what they don't know — which means they will often produce authoritative-sounding answers to questions where the correct answer is nuanced, jurisdiction-specific, or simply different from what the model learned.
The danger isn't that these tools are always wrong. The danger is that they're wrong in ways that are difficult to detect without subject matter expertise — and that their outputs leave no traceable record of how the answer was produced.
Using a general-purpose AI tool for compliance work without explainability controls is accepting work from an unaccountable assistant — without adequate review and with no record of how the conclusion was reached. That’s not an AI governance failure. It’s a compliance management failure, and it will be evaluated as one.
Related: How Generative AI Impacts Your FI’s Risk Management Program
The Bottom Line
The organizations best positioned for the next wave of AI-related exam scrutiny aren't necessarily the ones that moved slowest. They're the ones that were thoughtful about which AI they use, how they use it, and what controls they built around it.
That means demanding glass box capability from AI tools deployed in compliance and risk functions. It means treating AI-assisted work with the same documentation rigor as any other significant compliance activity. And it means ensuring human accountability is a documented reality and not just a policy statement.
Examiners aren't opposed to AI. They're opposed to AI that can't be explained, verified, or connected to human judgment. The question isn't whether to use it. The question is whether, when an examiner asks how you reached a conclusion, you can open the box and show them exactly what happened.
AI isn't going away — and neither are the questions around it. Join our webinar AI Demystified for the clear, practical briefing every FI leader needs.

