Global financial fraud losses reached $485 billion in 2023. That number comes from Nasdaq's Global Financial Crime Report, and it includes payment fraud, identity fraud, money laundering, and account takeover. In the US alone, banks reported $10.3 billion in direct fraud losses in 2024, according to FinCEN.
Every major bank runs fraud detection models. Every one of them still bleeds billions. The models work on individual transactions. They check if this purchase, from this card, at this merchant, looks anomalous compared to this cardholder's history. And they catch a lot. Visa's AI systems block an estimated $25 billion in fraud annually.
But the fraud they miss is the fraud that matters most. Organized fraud rings. Synthetic identity schemes. Money mule networks. These attacks exploit relationships between entities that single-transaction models cannot see.
Why banking data is different
A typical bank's core data spans 30-50 interconnected tables. Customers. Accounts (checking, savings, credit, investment). Cards. Transactions. Merchants. Wire transfers. Counterparties. Loans. Branches. Call center interactions. Online sessions. Device fingerprints. IP addresses.
Every table is connected to multiple others through foreign keys. A customer has accounts. Accounts generate transactions. Transactions flow to merchants. Merchants are categorized. Other customers transact at the same merchants. Those customers have their own accounts, their own transaction patterns, their own risk profiles.
This is a graph. A dense, temporal, heterogeneous graph where nodes are entities of different types and edges are relationships with timestamps and dollar amounts attached. Traditional ML flattens this structure into a single row per entity, losing the network information that carries the strongest predictive signals.
transactions — sample banking data
| txn_id | account_id | merchant | amount | timestamp | device_id |
|---|---|---|---|---|---|
| T-4001 | A-110 | Gas Station #412 | $42.80 | 2025-03-01 08:14 | D-7701 |
| T-4002 | A-112 | Wire Transfer Out | $9,800 | 2025-03-01 09:30 | D-7701 |
| T-4003 | A-115 | Wire Transfer Out | $9,750 | 2025-03-01 09:45 | D-7703 |
| T-4004 | A-118 | Wire Transfer Out | $9,900 | 2025-03-01 10:02 | D-7701 |
| T-4005 | A-110 | Online Purchase | $189.00 | 2025-03-01 14:22 | D-7701 |
Highlighted: three wire transfers just under $10,000 within 32 minutes. Individually unremarkable. In the graph, accounts A-110, A-112, and A-118 share device D-7701, and the three wire senders (A-112, A-115, and A-118) all target the same offshore entity.
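A minimal stdlib sketch of the two graph signals described above: amounts structured just under the reporting threshold, and a device fingerprint shared across multiple accounts. The thresholds (`CTR_LIMIT`, the 10% "just under" band, the 3-account cutoff) are illustrative assumptions, not regulatory guidance, and the data mirrors the sample table.

```python
from collections import defaultdict

# Sample transactions from the table above:
# (txn_id, account_id, merchant, amount, timestamp, device_id)
transactions = [
    ("T-4001", "A-110", "Gas Station #412",   42.80, "2025-03-01 08:14", "D-7701"),
    ("T-4002", "A-112", "Wire Transfer Out", 9800.0, "2025-03-01 09:30", "D-7701"),
    ("T-4003", "A-115", "Wire Transfer Out", 9750.0, "2025-03-01 09:45", "D-7703"),
    ("T-4004", "A-118", "Wire Transfer Out", 9900.0, "2025-03-01 10:02", "D-7701"),
    ("T-4005", "A-110", "Online Purchase",    189.0, "2025-03-01 14:22", "D-7701"),
]

CTR_LIMIT = 10_000        # reporting threshold (assumed)
STRUCTURING_BAND = 0.90   # "just under" = within 10% of the limit (assumed)

# Graph signal: which accounts share a device fingerprint?
accounts_by_device = defaultdict(set)
for txn_id, acct, merchant, amount, ts, device in transactions:
    accounts_by_device[device].add(acct)

# Flag wires just under the threshold whose device links 3+ accounts
flags = []
for txn_id, acct, merchant, amount, ts, device in transactions:
    structured = CTR_LIMIT * STRUCTURING_BAND <= amount < CTR_LIMIT
    shared_device = len(accounts_by_device[device]) >= 3
    if merchant == "Wire Transfer Out" and structured and shared_device:
        flags.append(txn_id)

print(flags)  # ['T-4002', 'T-4004'] ride on the 3-account device D-7701
```

Note that T-4003 escapes this particular check (its device D-7703 links only one account); in the full graph it would be caught through the shared offshore target, an edge type this sketch does not model.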
fraud_detection_comparison — flat vs graph model
| Metric | Flat Model (LightGBM) | Graph Model (KumoRFM) | Impact |
|---|---|---|---|
| True Positive Rate | 72% | 89% | +17 points (24% more fraud caught) |
| False Positive Rate | 97% | 42% | 55-point drop in FPR |
| Investigation Cost/Month | $48.5M | $21.4M | $27.1M saved |
| Ring Detection Rate | < 5% | 68% | Rings visible in graph |
| Time to Ring Detection | 12-18 months (manual) | 2-5 days | 90%+ faster |
Graph models cut the false positive rate from 97% to 42% while catching 24% more fraud than the flat baseline. The biggest impact: detecting organized fraud rings that flat models structurally cannot see.
Fraud detection: from transactions to networks
Consider a fraud ring operating across a bank's customer base. Five accounts, opened at different branches over six months, with different names and identification documents. Individually, each account looks normal. Standard demographics, typical transaction patterns, unremarkable balances.
But in the graph, these accounts share a shipping address. Two share a device fingerprint. Three have received wire transfers from the same offshore entity. Their transaction patterns, when viewed as a network, form a distinctive hub-and-spoke topology that matches known laundering patterns.
A flat model scoring each transaction independently sees five normal accounts. A graph-based model sees a fraud ring.
The false positive problem
Current rule-based and flat ML fraud systems generate enormous numbers of false positives. The average bank's fraud detection system has a false positive rate of 95-98%, meaning 95-98 out of every 100 flagged transactions are legitimate. Each false positive requires human review, costing $15-50 per investigation.
For a bank processing 100 million transactions per month with a 2% flag rate, that is 2 million flagged transactions. At 97% false positive rate, 1.94 million of those are legitimate. At $25 per investigation, the bank spends $48.5 million per month reviewing transactions that should never have been flagged.
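The arithmetic in the paragraph above, made explicit. The $25 review cost is the midpoint of the $15-50 range; the 42% rate is the graph-model figure from the comparison table, so the second result lands close to (though not exactly on) the table's $21.4M, which presumably assumes slightly different inputs.

```python
# Monthly cost of reviewing false positives, flat vs. graph model
monthly_txns = 100_000_000
flag_rate = 0.02
cost_per_review = 25  # USD, midpoint of the $15-50 range

flagged = monthly_txns * flag_rate  # 2,000,000 flagged per month

def wasted_review_cost(false_positive_rate):
    # Every flagged transaction gets a human review;
    # reviews of legitimate transactions are wasted spend.
    false_positives = flagged * false_positive_rate
    return false_positives * cost_per_review

print(wasted_review_cost(0.97) / 1e6)  # flat model: ~$48.5M/month
print(wasted_review_cost(0.42) / 1e6)  # graph model: ~$21M/month
```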
Graph-based models reduce false positives by 40-60% because they can distinguish between genuinely suspicious network patterns and normal coincidences. A customer transacting at an unusual merchant might look suspicious in isolation. But if the graph shows that the customer recently moved to a new city (address change, branch visit, utility payment) and the merchant is near their new address, the context explains the behavior.
Credit risk: beyond the credit score
Credit scoring is the original ML use case in banking. FICO scores, logistic regression, gradient-boosted trees. The models are mature and well-validated. But they operate on flat features: payment history, credit utilization, length of credit history, credit mix, new credit inquiries.
borrowers — two applicants with identical flat features
| borrower_id | FICO | credit_util | payment_history | account_age_yrs |
|---|---|---|---|---|
| B-601 | 680 | 42% | 98% on-time | 6 |
| B-602 | 680 | 41% | 97% on-time | 5 |
Nearly identical credit profiles. A flat model scores them within 2 points of each other.
transaction_network — what the graph reveals
| borrower_id | employer_payroll | employer_trend | peer_default_rate | counterparty_risk |
|---|---|---|---|---|
| B-601 | TechCorp ($12K/mo) | Stable (3 years) | 4% (below average) | Low (utility + grocery) |
| B-602 | StartupXYZ ($11K/mo) | Declining (-22% in 6 months) | 18% (3x average) | High (cash advances + pawn) |
B-602's employer payroll is declining 22% over 6 months. Borrowers with similar transaction graphs (same merchant categories, same employer trajectory) default at 18%, 3x the average. The flat credit score sees none of this.
Relational models add this dimension: the borrower's financial network. Who do they transact with? What is the risk profile of their counterparties? Are their income sources connected to stable or volatile entities? Has the default rate among borrowers with similar transaction patterns increased recently?
These network signals are particularly powerful for thin-file borrowers (those with limited credit history). A borrower with a FICO score of 650 might have a very different risk profile depending on whether their transaction graph resembles that of borrowers who defaulted or borrowers who stayed current. For the 800 million people globally who are "credit invisible," network signals are often the only signals available.
Portfolio-level risk
Beyond individual scoring, graph-based models reveal concentration risks that traditional portfolio analytics miss. If 500 borrowers in your portfolio are all connected through a single employer, and that employer's transaction volume has declined 40% over three months, you have a correlated default risk that no individual credit score reflects.
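A sketch of the concentration check described above: group borrowers by employer and flag employers where many exposed borrowers coincide with a steep payroll decline. All names, trends, and thresholds here are hypothetical; the -0.40 trend mirrors the 40% decline in the example.

```python
from collections import defaultdict

# Hypothetical portfolio slice:
# (borrower_id, employer, employer_payroll_trend over 3 months)
portfolio = [
    ("B-601", "TechCorp",     0.01),
    ("B-602", "StartupXYZ",  -0.22),
    ("B-700", "AcmeFreight", -0.40),
    ("B-701", "AcmeFreight", -0.40),
    ("B-702", "AcmeFreight", -0.40),
]

by_employer = defaultdict(list)
for borrower, employer, trend in portfolio:
    by_employer[employer].append((borrower, trend))

# Alert when enough borrowers share one declining employer (assumed cutoffs)
CONCENTRATION_MIN = 3   # minimum borrowers sharing the employer
TREND_ALERT = -0.30     # payroll decline that triggers review

alerts = [
    employer
    for employer, rows in by_employer.items()
    if len(rows) >= CONCENTRATION_MIN and rows[0][1] <= TREND_ALERT
]
print(alerts)  # ['AcmeFreight'] — correlated default risk, invisible per-borrower
```

Each borrower in the flagged cluster might individually carry an acceptable credit score; the risk only appears at the employer node they share.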
This is precisely the kind of systemic risk that caused the 2008 financial crisis: correlated defaults driven by shared network exposures that were invisible to models scoring borrowers independently.
Anti-money laundering: the compliance burden
Banks spend an estimated $274 billion annually on financial crime compliance globally, according to LexisNexis. AML compliance alone accounts for $35-40 billion in the US. Most of this spending goes to armies of compliance analysts manually reviewing suspicious activity reports (SARs).
Current AML systems rely on rules: flag transactions over $10,000, flag structured transactions just below $10,000, flag transfers to high-risk jurisdictions. These rules are decades old and widely known to criminals. They generate massive false positive volumes (estimated at 95-99%) while missing sophisticated laundering schemes that operate within the rules.
wire_transfers — layering pattern
| transfer_id | from_account | to_account | amount | timestamp | jurisdiction |
|---|---|---|---|---|---|
| W-201 | Corp-A (US) | Shell-1 (Cayman) | $9,900 | Mar 1, 09:00 | Cayman Islands |
| W-202 | Shell-1 (Cayman) | Shell-2 (BVI) | $9,700 | Mar 1, 14:30 | British Virgin Islands |
| W-203 | Shell-2 (BVI) | Shell-3 (Panama) | $9,500 | Mar 2, 10:15 | Panama |
| W-204 | Shell-3 (Panama) | Corp-B (US) | $9,200 | Mar 3, 11:00 | US |
| W-205 | Corp-B (US) | Real Estate LLC | $9,100 | Mar 3, 15:30 | US |
Classic layering: $9,900 leaves Corp-A, passes through three offshore shells and a US pass-through in 3 days, shedding $800 in fees before funding a $9,100 real estate purchase. Every individual transfer is under $10,000 and passes standard AML rules.
aml_rule_based_vs_graph — detection comparison
| Transfer | Rule-Based Flag | Graph-Based Flag | Reason |
|---|---|---|---|
| W-201 | No (under $10K) | Yes (0.91) | Start of 4-hop layering chain |
| W-202 | No (under $10K) | Yes (0.88) | Shell-to-shell, rapid succession |
| W-203 | No (under $10K) | Yes (0.92) | 3rd jurisdiction in 2 days |
| W-204 | No (under $10K) | Yes (0.89) | Circular: same beneficial owner |
| W-205 | No (under $10K) | Yes (0.94) | Terminal node: real estate purchase |
Every transfer passes the $10,000 rule. The graph model detects the chain topology: five transfers across six entities and four jurisdictions, with steadily decreasing amounts, all inside three days. Classic layering pattern.
Graph-based AML detects laundering by analyzing the flow of money through the network. Layering (moving money through multiple accounts to obscure its origin) creates distinctive graph patterns: rapid sequential transfers, fan-out-fan-in topologies, and circular flows. These patterns are invisible when you examine transactions individually but obvious when you view the network.
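One of those patterns, sketched with the stdlib: follow outgoing transfers with monotonically shrinking amounts and flag long chains that span multiple jurisdictions. The data mirrors the wire_transfers table (with assumed two-letter jurisdiction codes); real systems would also score velocity, fan-out/fan-in, and beneficial-ownership links, none of which this toy models.

```python
# (transfer_id, from_account, to_account, amount, timestamp, jurisdiction)
transfers = [
    ("W-201", "Corp-A",  "Shell-1", 9900, "2025-03-01 09:00", "KY"),
    ("W-202", "Shell-1", "Shell-2", 9700, "2025-03-01 14:30", "VG"),
    ("W-203", "Shell-2", "Shell-3", 9500, "2025-03-02 10:15", "PA"),
    ("W-204", "Shell-3", "Corp-B",  9200, "2025-03-03 11:00", "US"),
    ("W-205", "Corp-B",  "RE-LLC",  9100, "2025-03-03 15:30", "US"),
]

out_edges = {}
for e in transfers:
    out_edges.setdefault(e[1], []).append(e)  # index by sending account

def extend_chain(chain):
    """Greedily extend a chain with an outgoing hop of smaller amount."""
    last = chain[-1]
    for edge in out_edges.get(last[2], []):   # edges leaving last receiver
        if edge[3] < last[3]:                 # amount must keep shrinking
            return extend_chain(chain + [edge])
    return chain

# Start chains only at accounts that receive no funds in this window
receivers = {e[2] for e in transfers}
chains = [extend_chain([e]) for e in transfers if e[1] not in receivers]

for chain in chains:
    jurisdictions = {e[5] for e in chain}
    if len(chain) >= 4 and len(jurisdictions) >= 3:  # assumed cutoffs
        print("layering suspect:", [e[0] for e in chain])
```

Scored transaction by transaction, every hop here is clean; scored as a path through the graph, the full W-201 → W-205 chain surfaces at once.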
Traditional financial AI
- Scores individual transactions or customers
- Flat feature tables with manual aggregations
- 95-98% false positive rate in fraud detection
- Rule-based AML with known circumventions
- Misses organized fraud rings and network patterns
Graph-based financial AI
- Analyzes the full entity-transaction-counterparty network
- Learns directly from relational graph structure
- 40-60% reduction in false positives
- Detects layering, structuring, and ring patterns
- Captures multi-hop signals invisible to flat models
Customer retention and cross-sell
Fraud and risk get the headlines, but customer retention may be the highest-ROI application of relational AI in banking. Acquiring a new banking customer costs 5-7x more than retaining an existing one. For large banks, a 1% improvement in retention translates to $100M-500M in preserved annual revenue.
Traditional churn models use account-level features: balance trends, transaction frequency, product holdings, support interactions. Graph-based models add the relational dimension: has the customer's transaction pattern shifted to a competitor bank? Have customers with similar financial profiles recently churned? Is the customer's employer showing signs of downsizing (reduced payroll deposits across connected accounts)?
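A sketch of one graph-derived churn feature mentioned above: the trend in payroll deposits from a customer's employer, aggregated across every account connected to that employer. The employer name and deposit series are invented; the resulting -22% matches the employer trend in the earlier credit example.

```python
# Hypothetical monthly payroll totals, summed across all accounts
# that receive deposits from each employer node in the graph.
payroll_deposits = {
    "StartupXYZ": [440_000, 410_000, 395_000, 360_000, 350_000, 343_000],
}

def payroll_trend(employer):
    # Relative change from the first to the last month of the window
    series = payroll_deposits[employer]
    return (series[-1] - series[0]) / series[0]

print(round(payroll_trend("StartupXYZ"), 2))  # -0.22: employer downsizing signal
```

A flat churn model built on one customer's own account never sees this feature, because it lives on the employer node, not the customer row.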
Cross-sell models benefit similarly. A customer who just received a large wire transfer, opened a brokerage account, and has transaction patterns similar to customers who purchased investment advisory services is a strong cross-sell candidate. That signal spans four tables and three relationship types. No flat model captures it.
PQL Query
```
PREDICT fraud_probability
FOR EACH transactions.txn_id
WHERE transactions.timestamp > '2025-03-01'
```
One query scores every transaction against the full account-merchant-device-counterparty graph. The model detects ring patterns, structuring, and shared-device clusters in a single pass.
Output
| txn_id | fraud_prob | risk_type | network_signal |
|---|---|---|---|
| T-4001 | 0.04 | Low | Normal merchant pattern |
| T-4002 | 0.92 | Ring | Shared device + structured amount + offshore target |
| T-4003 | 0.88 | Ring | Same target entity + timing cluster |
| T-4004 | 0.94 | Ring | 3-account hub via device D-7701 |
| T-4005 | 0.11 | Low | Consistent with account history |
The foundation model shift
Building custom graph models for each banking use case (fraud, credit risk, AML, churn, cross-sell) requires separate data pipelines, separate feature engineering, and separate model training. At a large bank, this means 5-10 separate ML teams working on 5-10 separate pipelines, each costing $1M-3M per year to maintain.
A relational foundation model changes this equation. KumoRFM connects directly to the bank's data warehouse, understands the full relational schema, and serves any prediction task without task-specific feature engineering or model training.
On the RelBench benchmark, which includes financial transaction data, KumoRFM zero-shot outperforms supervised GNNs that were specifically trained on each task. The same model that detects fraud also predicts churn, scores credit risk, and identifies cross-sell opportunities.
For a bank, this means: one platform, one data connection, one governance framework, and dozens of prediction tasks served from the same model. The time from business question to production prediction drops from 6-12 months to days. The cost per use case drops from millions to a fraction of a single data scientist's salary.
The banks that adopt this approach first will not just save money on ML infrastructure. They will be able to ask questions about their data that were previously too expensive to answer. And in an industry where every basis point of risk pricing and every percentage point of fraud detection translates directly to the bottom line, that speed advantage compounds.