AI-Powered Fraud Detection for Mid-Market Commercial Bank
Moving From Rules-Based Alerts to Behavioral Analytics That Catch What Static Thresholds Miss
The Client
Mid-Market Commercial Bank
This mid-market commercial bank serves approximately 1.6 million retail and business customers through a network of 280 branches. The bank processes roughly 9 million transactions daily across debit, credit, ACH, wire, and mobile payment channels. Their fraud operations team of 34 analysts had been losing ground to increasingly sophisticated fraud schemes, armed only with a rules-based detection system implemented in 2016.
The Challenge
The Problem
The legacy fraud detection system operated on approximately 1,200 static rules. If a transaction exceeded a dollar threshold, triggered a geographic anomaly, or matched a known fraud pattern, it generated an alert. The problem was twofold.
First, the rules were blunt instruments. A customer who traveled frequently for business triggered geographic alerts constantly, while a fraudster who knew the thresholds could structure transactions to stay just below detection limits. Second, at peak, the system generated 14,000 alerts per day. With 34 analysts working across two shifts, each analyst was expected to review roughly 200 alerts per shift. Analysts developed shortcuts, such as bulk-dismissing entire alert categories, that inevitably allowed real fraud through.
The bank needed a system that could learn transaction patterns at the individual customer level, score transactions in real time (before authorization), and be explainable for regulatory audits.
Our Approach
4 Phases. 22 weeks.
Built an ensemble AI model combining XGBoost, LSTM, and Graph Neural Networks with explainable AI (SHAP) for regulatory compliance. Deployed on real-time streaming infrastructure scoring transactions in 12ms — before authorization.
Data Assessment & Feature Engineering
4 weeks
Ingested 26 months of historical transaction records (5.8 billion rows), customer profiles, device fingerprints, and case management records. Constructed over 340 behavioral features across six categories: velocity, amount deviation, geographic, temporal, merchant, and channel features.
The most predictive fraud signals were not individual transaction attributes but changes in behavioral sequences — a pattern that rules-based systems fundamentally cannot detect.
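Two of the feature categories named above, velocity and amount deviation, can be sketched in a few lines. This is an illustrative simplification with a hypothetical `Txn` record and window size, not the production feature pipeline:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean

@dataclass
class Txn:
    """A minimal transaction record (illustrative, not the bank's schema)."""
    customer_id: str
    amount: float
    ts: datetime

def velocity(history: list[Txn], now: datetime, window_min: int = 12) -> int:
    """Velocity feature: how many transactions fall in the trailing window."""
    cutoff = now - timedelta(minutes=window_min)
    return sum(1 for t in history if t.ts >= cutoff)

def amount_deviation(history: list[Txn], amount: float) -> float:
    """Amount-deviation feature: current amount vs. the customer's average."""
    if not history:
        return 1.0  # no history: treat the first amount as the baseline
    avg = mean(t.amount for t in history)
    return amount / avg if avg else 0.0
```

Sequence-change signals like the one described below would layer on top of per-transaction features like these, comparing how such values evolve over a customer's recent history.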
Model Development & Validation
6 weeks
Developed an ensemble model: XGBoost for structured feature scoring, LSTM for transaction sequence patterns, and a Graph Neural Network for relationship topology analysis. Implemented SHAP explainability values for every flagged transaction.
Every flagged transaction comes with a human-readable explanation: 'This transaction was flagged because the purchase amount is 4.2x the customer average, the device is new, and 3 similar transactions occurred within 12 minutes.'
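An explanation like the one quoted above can be rendered mechanically from per-feature SHAP attributions. The sketch below is illustrative only: the feature names, templates, and SHAP scores are hypothetical, and in a real deployment the attribution values would come from a SHAP explainer run over the model's inputs:

```python
def explain(shap_values: dict[str, float],
            feature_values: dict[str, float],
            templates: dict[str, str],
            top_k: int = 3) -> str:
    """Render the top positive SHAP contributors as a plain-English reason."""
    top = sorted((f for f, v in shap_values.items() if v > 0),
                 key=lambda f: shap_values[f], reverse=True)[:top_k]
    reasons = [templates[f].format(feature_values[f]) for f in top]
    return "This transaction was flagged because " + ", ".join(reasons) + "."

# Hypothetical attributions for a single flagged transaction.
shap_values = {"amount_deviation": 0.31, "new_device": 0.22,
               "txn_velocity": 0.18, "geo_distance": -0.05}
feature_values = {"amount_deviation": 4.2, "new_device": 1, "txn_velocity": 3}
templates = {
    "amount_deviation": "the purchase amount is {:.1f}x the customer average",
    "new_device": "the device is new",
    "txn_velocity": "{:.0f} similar transactions occurred within 12 minutes",
}

print(explain(shap_values, feature_values, templates))
# This transaction was flagged because the purchase amount is 4.2x the
# customer average, the device is new, 3 similar transactions occurred
# within 12 minutes.
```

Keeping the templates in a reviewed lookup table, rather than free-form generation, is what makes the explanations auditable: every sentence maps to a named feature and its SHAP contribution.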
Real-Time Scoring Infrastructure
6 weeks
Built a Kafka-based streaming architecture with Flink processing and a Redis feature store. The ensemble model runs on GPU-accelerated NVIDIA Triton instances, scoring transactions end-to-end in 12ms on average, well within the 150ms authorization window.
Training-serving skew was eliminated by codifying feature definitions in a shared repository consumed by both batch training (Spark) and real-time serving (Flink) pipelines.
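The shared-definition idea can be illustrated with a tiny registry module. This is a sketch with hypothetical feature functions; in the actual system the same definitions feed both the batch training jobs and the real-time serving jobs:

```python
# features.py -- single source of truth for feature math. Both the batch
# training pipeline and the real-time serving pipeline import from here,
# so a feature definition cannot silently drift between the two.

def amount_zscore(amount: float, hist_mean: float, hist_std: float) -> float:
    """How many standard deviations this amount sits from the customer's norm."""
    return (amount - hist_mean) / hist_std if hist_std > 0 else 0.0

def is_new_device(device_id: str, known_devices: set[str]) -> float:
    """1.0 if the device has never been seen for this customer, else 0.0."""
    return 0.0 if device_id in known_devices else 1.0

# Name -> implementation. Training and serving both resolve features through
# this registry instead of re-implementing the formulas independently.
FEATURE_REGISTRY = {
    "amount_zscore": amount_zscore,
    "is_new_device": is_new_device,
}
```

The point is not the individual formulas but the single import path: any change to a feature definition reaches training and serving simultaneously, which is what eliminates skew.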
Deployment, Monitoring & Model Ops
6 weeks
Ran the new system in shadow mode for six weeks, comparing AI decisions against the legacy rules engine on real production traffic. Built model monitoring dashboards tracking precision, recall, score-distribution drift, and feature drift, with automated alerts.
The system passed its first OCC regulatory examination. The examiner specifically noted the quality of model explainability documentation.
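Score-distribution drift of the kind the dashboards track is commonly measured with the Population Stability Index (PSI). The sketch below assumes pre-binned score histograms and the conventional 0.2 alert threshold; both are illustrative choices, not necessarily the bank's exact configuration:

```python
import math

def psi(expected: list[float], actual: list[float], eps: float = 1e-6) -> float:
    """Population Stability Index between two binned score distributions.

    Each argument is a list of bin proportions summing to 1. A PSI near 0
    means the live score distribution matches the training baseline; values
    above roughly 0.2 are commonly treated as significant drift.
    """
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

def drift_alert(expected: list[float], actual: list[float],
                threshold: float = 0.2) -> bool:
    """True when drift crosses the alerting threshold."""
    return psi(expected, actual) > threshold
```

In a production dashboard this comparison would run per feature as well as on the final model score, with the baseline histogram frozen at training time.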
The Results
Performance That Speaks
| Metric | Before | After | Change |
|---|---|---|---|
| Fraud Detection Rate (True Positive) | 62% | 91% | +29 pts |
| False Positive Rate | 4.1% | 1.35% | -67% |
| Daily Alert Volume | 14,000 | 3,800 | -73% |
| Analyst Investigation Coverage | 40% of alerts | 97% of alerts | +57 pts |
| Avg. Alert Investigation Time | 8.2 minutes | 3.4 minutes | -59% |
| Fraud Losses (annualized) | $11.6M | $7.4M | -$4.2M |
| Authorization Latency Impact | N/A (post-auth) | 12ms avg. (pre-auth) | Scoring moved pre-auth |
| Model Explainability Audit | N/A | Passed OCC examination | n/a |
The $4.2 million reduction in annual fraud losses was the headline number for the board. But the operational transformation was equally significant — analysts went from investigating 40% of alerts to covering 97%, improving job satisfaction and reducing team turnover by 30%.
Technology
The Stack
Reflections
What This Project Taught Us
The model development was perhaps 30% of the total effort. The remaining 70% was infrastructure, integration, explainability, monitoring, regulatory documentation, and the organizational change management required to transition a 34-person team from rules-based to AI-augmented workflows.
The feature store architecture proved to be the single most important technical decision. By ensuring training and serving used identical feature computation logic, we eliminated an entire category of production bugs that plague ML systems.
Regulators are not opposed to AI in fraud detection. They are opposed to AI they cannot audit. Explainability is not a nice-to-have — it is a deployment prerequisite. We designed it from day one, not as a retrofit.
Ready to transform your digital experience?
Flynaut builds enterprise-grade digital experiences for brands that refuse to compromise.