Model Explainability: What Regulators Are Asking About Your Fraud Detection

Model explainability in fraud detection has moved from a research concern to a practical compliance requirement on a timeline that has caught some payment processors and their technology vendors unprepared. The pressure comes from multiple directions simultaneously: card network rules programs that require fraud methodology documentation, banking regulators who are extending adverse action notice requirements to automated fraud decisions, and acquiring banks who need to demonstrate to card networks that their subacquirers (ISOs, payment facilitators) have adequate fraud controls in place.

The explainability question regulators and acquiring banks are now asking is not abstract. It is specific, operational, and tied to documentation requirements that processors must satisfy with actual artifacts, not policy language. This post covers what the ask looks like in practice, where the documentation gaps typically are, and what an explainability architecture needs to produce to satisfy current inquiries.

What card network programs actually require

Visa's Fraud Monitoring Program and Mastercard's Excessive Chargeback Program both include documentation requirements for processors that enter monitored status — and increasingly, for processors seeking to demonstrate good-faith fraud controls even before they reach monitoring thresholds. The specific documentation ask varies by program and by the processor's tier, but the common elements are:

A description of the fraud detection methodology in use — not a vendor name, but a description of what signals are evaluated and what the decision logic is at a high level
Evidence that the methodology is being actively maintained and updated — retraining cadence, rules review cadence, or equivalent
A process for individual case review: if a specific transaction was declined by the fraud model, what information is available to explain why, and what is the process for a merchant to dispute a block?
For higher-tier monitoring: statistical evidence of model effectiveness (false positive rate, fraud detection rate, or chargeback ratio trend data)
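The effectiveness figures in the last item can be computed from labelled outcomes. Below is a minimal sketch, assuming each scored transaction is later labelled fraud or legitimate (for example from chargeback data) and that a score at or above a threshold counts as a flag. The function name, the threshold of 70, and the sample data are illustrative assumptions, not any network's official definition.

```python
from dataclasses import dataclass


@dataclass
class EffectivenessReport:
    false_positive_rate: float   # share of legitimate transactions that were flagged
    fraud_detection_rate: float  # share of fraudulent transactions that were flagged (recall)
    precision: float             # share of flagged transactions that were actually fraud


def effectiveness(scores, is_fraud, threshold):
    """Compute operating-point metrics for transactions flagged at score >= threshold."""
    tp = fp = tn = fn = 0
    for score, fraud in zip(scores, is_fraud):
        flagged = score >= threshold
        if flagged and fraud:
            tp += 1
        elif flagged and not fraud:
            fp += 1
        elif not flagged and fraud:
            fn += 1
        else:
            tn += 1
    return EffectivenessReport(
        false_positive_rate=fp / (fp + tn) if (fp + tn) else 0.0,
        fraud_detection_rate=tp / (tp + fn) if (tp + fn) else 0.0,
        precision=tp / (tp + fp) if (tp + fp) else 0.0,
    )


# Hypothetical sample: eight scored transactions with confirmed outcomes.
sample_scores = [91, 82, 74, 65, 40, 33, 20, 12]
sample_labels = [True, True, False, True, False, False, False, False]
report = effectiveness(sample_scores, sample_labels, threshold=70)
```

Running the same function over successive monthly windows would give the trend data a reviewer is likely to ask for.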

These are network-level requirements enforced through acquirer relationships. If the processor can't produce this documentation, the acquirer takes on regulatory exposure. That's why acquiring banks, which bear ultimate liability for their sponsored processors' card network standing, are increasingly including fraud methodology documentation requirements in their ISO (Independent Sales Organization) and PayFac agreements.

The adverse action problem in automated fraud decisions

A separate explainability requirement comes from the Fair Credit Reporting Act (FCRA) and the Equal Credit Opportunity Act (ECOA) applied to certain fraud-adjacent decisions. When a payment processor declines a transaction or suspends a merchant's processing privileges based partly on credit-file data or data from a consumer reporting agency, adverse action notice requirements may apply. The CFPB has published guidance making clear that automated decision systems that use consumer report data must be able to produce the principal reasons for adverse action — not "the model said so," but specific, human-readable factors.

We're not saying that every declined payment transaction triggers an adverse action notice requirement — most card-not-present declines don't involve consumer report data and don't implicate FCRA. But account-level fraud decisions — suspending a consumer's access to their neobank account, declining a BNPL application, or freezing a merchant's funds — increasingly do involve consumer report data, and those decisions need explainable factor output, not black-box scores.

This is where the distinction between "explainable for regulator audit" and "explainable for consumer notice" matters. Regulator audit explainability can be documented at the model level (here is how the model works, here are the feature categories). Consumer adverse action explainability requires per-decision factor output (for this specific decision, the top factors were X, Y, Z) in language that a non-technical person can understand.
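Per-decision factor output for a consumer notice can be produced by mapping the model's strongest adverse factors to plain-language reasons. Below is a rough sketch, assuming per-decision contributions are already available as feature-name-to-number pairs. The feature names, the reason wording, and the cap of three reasons are illustrative assumptions; actual notice language would need legal review.

```python
# Hypothetical mapping from internal feature names to consumer-readable reasons.
REASON_TEXT = {
    "velocity_15min": "Unusually high number of transactions in a short period",
    "device_first_seen": "Activity from a device not previously associated with the account",
    "ip_subnet_card_count": "Many different cards used from the same network location",
    "account_age_days": "Length of time the account has been open",
}


def principal_reasons(contributions, max_reasons=3):
    """Return plain-language reasons for the factors that pushed the score up most."""
    adverse = [(name, value) for name, value in contributions.items() if value > 0]
    adverse.sort(key=lambda item: item[1], reverse=True)
    reasons = []
    for name, _ in adverse[:max_reasons]:
        # Fall back to a generic reason rather than exposing an internal feature name.
        reasons.append(REASON_TEXT.get(name, "Other account activity factors"))
    return reasons


example_contributions = {
    "velocity_15min": 23.1,
    "device_first_seen": 17.4,
    "ip_subnet_card_count": 11.8,
    "account_age_days": -5.2,
}
reasons = principal_reasons(example_contributions)
```

Only factors that raised the score are reported here, on the assumption that mitigating factors are not reasons for the adverse outcome.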

What model explainability actually requires technically

The technical implementation of explainability for a gradient-boosted ensemble model — the architecture most production fraud scoring systems use — centers on per-prediction feature attribution. The standard approaches are:

SHAP (SHapley Additive exPlanations). SHAP values decompose the model's prediction into the contribution of each feature for that specific prediction. For a fraud score of 78 on a specific transaction, SHAP output might show: velocity_15min: +22.3, device_first_seen: +18.1, amount_vs_median: +9.4, account_age: -4.2. The positive values increased the fraud score; the negative values decreased it. SHAP is mathematically principled (based on cooperative game theory), and the framework is model-agnostic. Its attributions are locally accurate, meaning the feature contributions, added to the model's base value (its average output over the background data), sum to the model output for that prediction. It's computationally heavier than alternatives, but TreeSHAP (the implementation for tree-based models like XGBoost and LightGBM) is fast enough for production use.
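The local-accuracy property can be checked on a toy model. The sketch below computes exact Shapley values by enumerating feature subsets rather than using the shap library or TreeSHAP, so it is only practical for a handful of features. The scoring function, feature values, and background rows are invented for illustration. Features outside a coalition are filled in from background rows, and the attributions plus the base value (the average score over the background) should reproduce the score.

```python
from itertools import combinations
from math import factorial


def toy_score(row):
    """Hypothetical fraud score with an interaction term, so attribution is non-trivial."""
    velocity, new_device, amount_ratio = row
    score = 5.0 + 0.8 * velocity + 12.0 * new_device + 3.0 * amount_ratio
    if new_device and velocity > 10:
        score += 10.0  # interaction: a new device combined with high velocity
    return score


def coalition_value(model, x, background, subset):
    """Average output when features in `subset` take x's values and the rest come from background rows."""
    total = 0.0
    for b in background:
        mixed = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley attribution by enumerating all coalitions (exponential in feature count)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


background = [(2, 0, 1.0), (4, 0, 0.5), (1, 1, 1.5), (3, 0, 1.0)]  # hypothetical typical traffic
x = (20, 1, 4.0)                                                  # transaction being explained

base_value = sum(toy_score(b) for b in background) / len(background)
phi = shapley_values(toy_score, x, background)
reconstructed = base_value + sum(phi)  # should equal toy_score(x) up to float error
```

The interaction bonus is split between the velocity and device features, which is the kind of credit assignment that a simple per-feature delta would not give.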

Feature importance ranking. A simpler approach that produces the top-N features driving a specific prediction, without the signed attribution magnitude. Less precise than SHAP but computationally cheaper and sufficient for "top reasons" adverse action notice output. Most production GBM frameworks (XGBoost, LightGBM, CatBoost) expose prediction-level feature contribution methods natively.

{
  "transaction_id": "txn_8f3k2p9x",
  "fraud_score": 82,
  "decision": "review",
  "top_signals": [
    {"feature": "velocity_15min", "contribution": "+23.1", "value": 47},
    {"feature": "device_first_seen", "contribution": "+17.4", "value": true},
    {"feature": "ip_subnet_card_count", "contribution": "+11.8", "value": 31},
    {"feature": "account_age_days", "contribution": "-5.2", "value": 312}
  ]
}

This response structure — returned by the fraud scoring API with every decision — is what regulators and acquiring banks are asking for. It allows a processor to answer the question "why was transaction X flagged?" with a specific, auditable answer rather than "our model scored it high."
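A response in this shape can be assembled from raw per-prediction contributions by ranking them on absolute contribution. Below is a minimal sketch, assuming the contributions are already expressed on the score scale. The function name, the review and decline thresholds, and the default of four signals are assumptions for illustration rather than a description of any particular API.

```python
import json


def build_decision_response(transaction_id, fraud_score, contributions, feature_values,
                            review_at=70, decline_at=90, top_n=4):
    """Assemble an explainable decision payload from a score and signed contributions."""
    if fraud_score >= decline_at:
        decision = "decline"
    elif fraud_score >= review_at:
        decision = "review"
    else:
        decision = "approve"

    # Rank by magnitude so strong mitigating factors are reported too.
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    top_signals = [
        {
            "feature": name,
            "contribution": f"{value:+.1f}",  # signed string, as in the sample response
            "value": feature_values[name],
        }
        for name, value in ranked[:top_n]
    ]
    return {
        "transaction_id": transaction_id,
        "fraud_score": fraud_score,
        "decision": decision,
        "top_signals": top_signals,
    }


response = build_decision_response(
    "txn_8f3k2p9x",
    82,
    {"account_age_days": -5.2, "velocity_15min": 23.1,
     "ip_subnet_card_count": 11.8, "device_first_seen": 17.4},
    {"account_age_days": 312, "velocity_15min": 47,
     "ip_subnet_card_count": 31, "device_first_seen": True},
)
payload = json.dumps(response, indent=2)
```

Ranking on absolute value keeps a large negative contribution such as account age in the output, which helps a reviewer see why a score was not higher.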

Model card documentation: what to include

A model card — a structured documentation artifact for a machine learning model, a practice originally proposed in Mitchell et al. 2019 and now widely used in production ML systems — provides the regulator-level documentation that network programs and acquiring banks require. The minimum content for a fraud detection model card that satisfies current documentation requirements:

Model type and architecture: e.g. "gradient boosting ensemble (LightGBM), 500 trees, max depth 7"
Training data description: what data was used to train, what the fraud/non-fraud label source was, the time window of training data, any known limitations in training data coverage
Feature categories: what types of signals the model uses — velocity, behavioral, device, identity — without necessarily disclosing specific feature names that could be gamed
Performance metrics on held-out evaluation data: precision, recall, F1 at the deployed threshold; AUC-ROC; and importantly, false positive rate at the deployed operating point
Known limitations: attack types the model handles less well, geographic or demographic segments where calibration is weaker, situations where the model should be supplemented by manual review
Retraining policy: how often the model is retrained, what triggers off-cycle retraining, how the champion-challenger framework works
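One way to keep these sections from drifting apart is to hold the model card as structured data and refuse to publish it while a section is empty. Below is a sketch using a Python dataclass. The field names follow the list above, and every value shown is a placeholder invented for illustration, not a real model's specification or measured performance.

```python
import json
from dataclasses import dataclass, asdict, fields, replace


@dataclass
class FraudModelCard:
    model_type: str
    training_data: str
    feature_categories: list
    performance: dict          # metrics on held-out data at the deployed threshold
    known_limitations: list
    retraining_policy: str

    def missing_sections(self):
        """Names of sections that are empty and would leave a documentation gap."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def to_json(self):
        missing = self.missing_sections()
        if missing:
            raise ValueError(f"model card incomplete: {missing}")
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; none of these are measured results.
card = FraudModelCard(
    model_type="gradient boosting ensemble (LightGBM), 500 trees, max depth 7",
    training_data="12 months of labelled transactions; labels from confirmed chargebacks",
    feature_categories=["velocity", "behavioral", "device", "identity"],
    performance={"threshold": 70, "precision": 0.62, "recall": 0.81, "f1": 0.70,
                 "auc_roc": 0.94, "false_positive_rate": 0.012},
    known_limitations=["weaker calibration on newly onboarded merchants"],
    retraining_policy="monthly retrain; off-cycle retrain on drift alert; champion-challenger rollout",
)
card_json = card.to_json()

# A draft with an empty section is reported as incomplete instead of being serialized.
draft = replace(card, known_limitations=[])
draft_gaps = draft.missing_sections()
```

Keeping the card next to the model artifact in version control gives each retrain a matching documentation revision.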

We're not claiming that publishing a model card makes a processor fully compliant with every applicable requirement — the regulatory landscape is evolving and specific compliance determinations require legal counsel. What a model card provides is a documented baseline that demonstrates the processor understands and can articulate their fraud detection methodology. That's the minimum bar most acquiring banks and card network reviewers are looking for.

The explainability gap at mid-market processors

The gap that most mid-market processors face is not that they lack fraud controls: they have rules engines, they have thresholds, they block transactions. The gap is documentation and per-decision attribution. A rule that blocks any card number appearing on a known-fraud list can explain itself easily ("this card was previously associated with a confirmed fraud event"). A gradient-boosted model that produces a score of 74 from 40 features can only be explained if per-prediction attribution is implemented and the output is logged.

The processors that are currently getting documentation requests from their acquiring banks are typically mid-market processors who adopted ML-based fraud scoring as a black-box vendor service, without requiring that the vendor expose per-prediction attribution in the API response. When the acquiring bank asks "why was this merchant's transaction declined?", the processor has a score but no explanation. That gap is what the current documentation pressure is surfacing.
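Closing that gap means storing the attribution at decision time so it can be retrieved later. Below is a minimal sketch of an append-only decision log, written as JSON Lines and searchable by transaction ID. The file layout, field names, and function names are assumptions for illustration, and a production system would more likely use a database with retention controls.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def log_decision(path, transaction_id, fraud_score, decision, top_signals):
    """Append one decision, with its attribution, as a single JSON line."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "transaction_id": transaction_id,
        "fraud_score": fraud_score,
        "decision": decision,
        "top_signals": top_signals,
    }
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def explain_transaction(path, transaction_id):
    """Look up the logged explanation for one transaction; None if it was never logged."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            if record["transaction_id"] == transaction_id:
                return record
    return None


# Hypothetical usage, with a temporary file standing in for durable storage.
log_path = os.path.join(tempfile.mkdtemp(), "decisions.jsonl")
log_decision(log_path, "txn_8f3k2p9x", 82, "review",
             [{"feature": "velocity_15min", "contribution": "+23.1", "value": 47}])
found = explain_transaction(log_path, "txn_8f3k2p9x")
not_found = explain_transaction(log_path, "txn_unknown")
```

With a log like this, the acquiring bank's question about a specific transaction can be answered from the record rather than by re-scoring against a model that may since have been retrained.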

For Fraudhalo's implementation of per-decision attribution and the documentation it produces, see the Model Card page for the full model specification and performance metrics. For processors evaluating how to respond to acquiring bank documentation requests, the How It Works page covers the full decision response structure including the top_signals attribution output.