Transparency & Explainability

Fraudhalo Model Card

We believe fraud detection should be explainable. This document describes how the Fraudhalo scoring model works, what features it uses, its performance characteristics, and its known limitations.

Model Overview

What the model does.

Model type: Gradient boosting ensemble (LightGBM)
Prediction target: Fraud probability score, 0–100. Higher = higher fraud risk.
Decision output: allow / review / block, with thresholds configurable per merchant profile
Training data: Transaction sequences, behavioral patterns, identity graph data. No raw PAN data. No biometric data.
Training scope: Card testing, account takeover, synthetic identity, BNPL fraud vectors
Retraining cadence: Weekly, on confirmed fraud labels and dispute data
Current model version: gbm_v3_2025w38

Feature Architecture

Feature categories used by the model.

Fraudhalo model feature categories: velocity, graph, and identity signals

Velocity Features

  • txn_count_1m
  • txn_count_5m
  • card_velocity_1h
  • amount_sum_15m
  • decline_rate_5m
  • bin_probe_rate_1h
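
The exact definitions of these features are not given here. As a rough sketch only, assuming txn_count_1m and txn_count_5m are counts of transactions in a trailing 1-minute and 5-minute window, a rolling counter could look like the following. The RollingCount class and the timestamps are hypothetical and exist only for illustration.

```python
from collections import deque


class RollingCount:
    """Counts events inside a trailing time window (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, ts):
        """Record an event at time ts (seconds, non-decreasing) and return the count."""
        self._timestamps.append(ts)
        # Drop events that have fallen out of the trailing window.
        while self._timestamps and self._timestamps[0] <= ts - self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps)


# Assumed meaning: transactions seen in the last 1 and 5 minutes.
txn_count_1m = RollingCount(60)
txn_count_5m = RollingCount(300)

counts = [(txn_count_1m.add(t), txn_count_5m.add(t)) for t in (0, 20, 50, 70, 200, 400)]
print(counts)
```

A burst of transactions raises the short-window count quickly, while the longer window keeps a memory of the burst after the short window has emptied.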

Graph Features

  • device_accounts_7d
  • device_graph_edges
  • ip_account_overlap
  • cross_device_card_fan
  • account_linkage_score
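
As an illustration of a graph-style signal, the sketch below assumes device_accounts_7d means the number of distinct accounts seen on one device in the past 7 days, which matches the label used in the example scoring response. The event tuple layout and the sample data are assumptions, not the production schema.

```python
from datetime import datetime, timedelta


def device_accounts_7d(events, device_id, now):
    """Distinct accounts seen on device_id in the 7 days before `now`.

    `events` is an iterable of (timestamp, device_id, account_id) tuples;
    this layout is an assumption made for illustration.
    """
    cutoff = now - timedelta(days=7)
    return len({
        account
        for ts, device, account in events
        if device == device_id and cutoff < ts <= now
    })


now = datetime(2025, 9, 20)
events = [
    (datetime(2025, 9, 19), "dev-1", "acct-a"),
    (datetime(2025, 9, 18), "dev-1", "acct-b"),
    (datetime(2025, 9, 18), "dev-1", "acct-a"),  # repeat account, counted once
    (datetime(2025, 9, 1), "dev-1", "acct-c"),   # older than 7 days, ignored
    (datetime(2025, 9, 19), "dev-2", "acct-d"),  # different device, ignored
]
result = device_accounts_7d(events, "dev-1", now)
print(result)
```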

Identity Features

  • addr_consistency
  • name_ssn_match
  • phone_linkage
  • thin_file_indicator
  • addr_velocity_30d

Performance Metrics

Precision, recall, and F1 at each score threshold.

Metrics represent internal evaluation on a held-out validation dataset. Live performance varies by merchant profile, fraud mix, and transaction volume.

Score Threshold | Decision | Precision | Recall | F1
≥ 80 | block | 0.91 | 0.74 | 0.82
50–79 | review | 0.76 | 0.88 | 0.82
< 50 | allow | 0.98 | 0.96 | 0.97
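
The F1 column can be checked from the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it for each row; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per decision band, copied from the table above.
rows = {"block": (0.91, 0.74), "review": (0.76, 0.88), "allow": (0.98, 0.96)}
f1_by_decision = {d: round(f1_score(p, r), 2) for d, (p, r) in rows.items()}
print(f1_by_decision)
```

The recomputed values agree with the published F1 figures to two decimal places.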

Evaluation dataset: 2.8M transactions, 38% fraud prevalence in flagged queue. Default thresholds shown; adjustable per merchant profile via pilot onboarding.
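
To make the threshold logic concrete, here is a minimal sketch of how a 0–100 score could map to a decision using the default thresholds above. The decide function and its parameter names are hypothetical; the actual per-merchant configuration mechanism is not described in this document.

```python
def decide(score, block_at=80, review_at=50):
    """Map a 0-100 fraud score to allow / review / block.

    The defaults mirror the default thresholds in the table; a merchant
    profile could pass different values (assumed mechanism).
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "allow"


print(decide(87))               # default thresholds
print(decide(87, block_at=90))  # a stricter block threshold for one merchant
```

With the defaults, a score of 87 falls in the block band; raising the block threshold to 90 moves the same score into review.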

Retraining Policy

Champion-challenger retraining framework.

Training data sources

  • Confirmed fraud labels from customer dispute data
  • Manual review outcomes from risk analyst queue
  • Chargeback reason codes (chargebacks disputed by cardholder)
  • Not used: raw PAN data (never collected or stored)
  • Not used: biometric data (not used in any signal)

Deployment process

  1. Challenger model trained on previous 90-day window
  2. Shadow scoring: challenger runs alongside champion without affecting decisions
  3. Challenger promoted when AUC-ROC improves by ≥ 0.5% on holdout
  4. Champion retired after 2-week overlap monitoring
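
The promotion rule in step 3 can be sketched as a simple gate. The text does not say whether the 0.5% improvement is absolute (0.005 AUC) or relative to the champion's AUC, so the sketch below supports both and defaults to the absolute reading as an assumption. The function name and the AUC values are illustrative.

```python
def should_promote(champion_auc, challenger_auc, min_gain=0.005, relative=False):
    """Return True when the challenger clears the promotion gate.

    Assumption: "improves by >= 0.5%" is read as an absolute gain of 0.005 AUC
    by default; pass relative=True to read it as a gain relative to the champion.
    """
    gain = challenger_auc - champion_auc
    if relative:
        gain = gain / champion_auc
    return gain >= min_gain


print(should_promote(0.940, 0.947))  # gain of 0.007 AUC
print(should_promote(0.940, 0.942))  # gain of 0.002 AUC
```

Under either reading, the gate only decides promotion; the shadow scoring and 2-week overlap monitoring steps still apply around it.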

Known Limitations

What this model does not do well.

Transparency about limitations is as important as performance claims. Risk engineering buyers should factor these into their evaluation.

Novel attack pattern lag

Fraud patterns that have not yet appeared in the training data may be detected less accurately for 1–2 weeks, until retraining picks them up. The weekly cadence is designed to minimize this window.

International transaction calibration

International transactions have fewer calibration signals in the training dataset. Detection accuracy on non-US issuer cards may be lower than on domestic US cards.

New merchant cold start

New merchant profiles require 48–72 hours of transaction history before merchant-specific baseline signals are calibrated. Default conservative thresholds apply during cold start.

High-velocity legitimate patterns

Some legitimate merchants have high-velocity patterns (subscription billing, micro-transactions) that resemble card testing signals. Merchant profile configuration mitigates this but may require tuning.

Explainability

Every decision includes its top 3 contributing signals.

Fraudhalo returns a human-readable signal breakdown with every scoring response, supporting regulator inquiry response and fraud analyst review.

# Example scoring response with explainability
{
  "score": 87,
  "decision": "block",
  "signals": [
    {
      "feature": "card_velocity_1h",
      "value": 23,
      "contribution": "high",
      "label": "23 distinct cards from this device in 1 hour"
    },
    {
      "feature": "device_accounts_7d",
      "value": 8,
      "contribution": "high",
      "label": "device seen on 8 accounts in past 7 days"
    },
    {
      "feature": "amount_pattern",
      "value": "probe_sequence",
      "contribution": "medium",
      "label": "amount sequence matches card probe pattern"
    }
  ],
  "model_version": "gbm_v3_2025w38",
  "latency_ms": 67
}
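
For analyst tooling, a response like the one above can be turned into a short readable summary. The sketch below parses a compacted copy of the example response using only the field names shown in it; the summarize helper is hypothetical and not part of any Fraudhalo client library.

```python
import json

# Compacted copy of the example response above.
RAW = """
{
  "score": 87,
  "decision": "block",
  "signals": [
    {"feature": "card_velocity_1h", "value": 23, "contribution": "high",
     "label": "23 distinct cards from this device in 1 hour"},
    {"feature": "device_accounts_7d", "value": 8, "contribution": "high",
     "label": "device seen on 8 accounts in past 7 days"},
    {"feature": "amount_pattern", "value": "probe_sequence", "contribution": "medium",
     "label": "amount sequence matches card probe pattern"}
  ],
  "model_version": "gbm_v3_2025w38",
  "latency_ms": 67
}
"""


def summarize(response):
    """Turn a scoring response into analyst-readable lines (hypothetical helper)."""
    header = f"{response['decision']} (score {response['score']}, model {response['model_version']})"
    lines = [header]
    for signal in response["signals"]:
        lines.append(f"  [{signal['contribution']}] {signal['feature']}: {signal['label']}")
    return lines


summary = summarize(json.loads(RAW))
print("\n".join(summary))
```

Keeping the model version in the header line makes it easier to tie a reviewed decision back to the model that produced it.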

Questions about our model?

Talk to our risk team. We can walk through model specifics, threshold configuration, and the explainability output format in a technical review call.
