Fraud Detection with Machine Learning: Rules, Trees, and Graphs

Fraud costs the global economy $485 billion per year. Detection has evolved through three eras, each finding fraud the previous one missed. Here's where we are now and what's next.

TL;DR

  • Global fraud losses reached $485 billion in 2023 (Nasdaq). Detection has evolved through three eras: rules engines (50-60% detection), tree-based ML (70-80%), and graph ML (85-95% on organized fraud).
  • Modern fraud is organized: synthetic identity rings, coordinated account takeovers, and multi-hop money laundering chains. These patterns are invisible at the transaction level.
  • In rules-based systems, 90-95% of flagged transactions turn out to be legitimate. Graph-based approaches cut false positives by 40-60% by using relational context across entities.
  • The feature engineering problem is worse for fraud: features must be computed in real time (50-100 ms), patterns change quarterly, and schemas span 8-15 interconnected tables.
  • Foundation models change the economics: instead of building graph infrastructure from scratch (a multi-year project), connect your transaction database and get graph-based fraud scores immediately.

Global fraud losses reached $485 billion in 2023, according to Nasdaq's Global Financial Crime Report. Payment fraud alone accounts for $32 billion in card-not-present losses. And these numbers are growing: digital transaction volumes are increasing 15% year-over-year while fraud techniques become more sophisticated.

The technology used to detect fraud has evolved through three distinct eras. Each era solved a category of fraud that the previous one could not see. Understanding this evolution is critical, because most enterprises are still stuck in era two while fraudsters have moved to era three tactics.

Era 1: Rules engines (1990s-2010s)

The first generation of fraud detection used hand-written rules. If-then logic encoded known fraud patterns:

  • If transaction amount > $5,000 and country is not home country, flag
  • If more than 3 transactions within 10 minutes, flag
  • If card-not-present and shipping address differs from billing, flag
  • If new account and transaction exceeds $1,000 within first 24 hours, flag
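Rules like these are simple enough to sketch directly. Below is a minimal, illustrative rules engine in Python; the field names (amount, country, card_present, and so on) are hypothetical placeholders, not a real payment schema:

```python
from datetime import datetime, timedelta

def apply_rules(txn, history, home_country="US"):
    """Evaluate one transaction against hand-written fraud rules.

    Sketch only: `txn` and `history` use an illustrative schema
    (amount, country, card_present, ship_addr, bill_addr,
    account_opened, timestamp) -- real schemas vary.
    """
    flags = []
    # Rule 1: large amount from outside the home country
    if txn["amount"] > 5000 and txn["country"] != home_country:
        flags.append("large_foreign")
    # Rule 2: more than 3 transactions within 10 minutes
    recent = [t for t in history
              if txn["timestamp"] - t["timestamp"] <= timedelta(minutes=10)]
    if len(recent) > 3:
        flags.append("velocity")
    # Rule 3: card-not-present with mismatched addresses
    if not txn["card_present"] and txn["ship_addr"] != txn["bill_addr"]:
        flags.append("address_mismatch")
    # Rule 4: new account spending over $1,000 in its first 24 hours
    if (txn["timestamp"] - txn["account_opened"] <= timedelta(hours=24)
            and txn["amount"] > 1000):
        flags.append("new_account_spend")
    return flags
```

The brittleness described below is visible in the code itself: every threshold (`5000`, `3`, `1000`) is a constant a fraudster can probe for and slide under.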

These rules work for the specific patterns they encode. A fraudster using a stolen card for a $10,000 purchase from an unusual country will get caught. The problem is that rules only detect patterns someone has already seen and thought to codify.

The limitations

Rules-based systems have three structural weaknesses:

  • Reactive, not proactive. Every rule was written in response to a fraud pattern that already succeeded. There is always a lag between a new technique and the rule that catches it.
  • High false positive rates. With simple threshold rules, 90-95% of flagged transactions are legitimate. A bank processing 10 million transactions per day might flag 500,000, of which 475,000 are false positives. At $15-25 per manual review, that adds up to millions per year in wasted investigation.
  • Brittle under adaptation. Fraudsters test rules by probing limits. If the threshold is $5,000, they run transactions at $4,999. If velocity is checked over 10 minutes, they space transactions 11 minutes apart. Rules are easy to reverse-engineer and circumvent.

Here is what suspicious transaction data looks like in practice. The fraud is invisible at the transaction level but obvious in the graph.

accounts

| account_id | holder | type | opened | branch |
|---|---|---|---|---|
| BA-1001 | Greenfield LLC | Business Checking | 2024-08-14 | Miami |
| BA-1002 | Sandra Keyes | Personal Checking | 2025-01-22 | Miami |
| BA-1003 | Oceanview Trading | Business Checking | 2025-02-03 | Tampa |
| BA-1004 | Marcus Avery | Personal Savings | 2022-06-10 | Atlanta |

Four accounts across three branches. Nothing suspicious in isolation.

transactions

| txn_id | from_account | to_account | amount | date | type |
|---|---|---|---|---|---|
| TX-7701 | BA-1004 | BA-1001 | $9,800 | 2025-10-01 | Wire |
| TX-7702 | BA-1001 | BA-1002 | $9,700 | 2025-10-02 | Wire |
| TX-7703 | BA-1002 | BA-1003 | $9,500 | 2025-10-03 | Wire |
| TX-7704 | BA-1003 | Offshore Corp | $9,200 | 2025-10-04 | Intl Wire |
| TX-7705 | BA-1004 | BA-1001 | $9,900 | 2025-10-08 | Wire |
| TX-7706 | BA-1001 | BA-1002 | $9,850 | 2025-10-09 | Wire |

Highlighted: a layering chain. Funds flow BA-1004 to BA-1001 to BA-1002 to BA-1003 to offshore, each just below the $10K reporting threshold. Each individual wire looks routine. The 4-hop path reveals money laundering.
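A chain like this can be recovered mechanically from the transaction table. The sketch below follows wires that sit just below a reporting threshold, chaining transfers where one account's inflow is soon forwarded on. The function name, threshold, margin, and record fields are illustrative assumptions; a production system would also constrain time windows and amount decay between hops:

```python
def find_layering_chains(transactions, threshold=10_000, margin=0.10, min_hops=3):
    """Return chains of near-threshold wires (structuring candidates).

    Sketch: keep wires within `margin` below `threshold`, then follow
    paths where the receiving account later sends another such wire.
    """
    near = [t for t in transactions
            if threshold * (1 - margin) <= t["amount"] < threshold]
    out_edges = {}
    for t in near:
        out_edges.setdefault(t["from"], []).append(t)

    chains = []
    def extend(chain):
        last = chain[-1]
        # Follow only forward-in-time transfers out of the receiving account
        nexts = [t for t in out_edges.get(last["to"], [])
                 if t["date"] >= last["date"]]
        if not nexts:
            if len(chain) >= min_hops:
                chains.append([t["txn_id"] for t in chain])
            return
        for t in nexts:
            extend(chain + [t])

    for t in near:
        extend([t])
    return chains

transactions = [
    {"txn_id": "TX-7701", "from": "BA-1004", "to": "BA-1001", "amount": 9800, "date": "2025-10-01"},
    {"txn_id": "TX-7702", "from": "BA-1001", "to": "BA-1002", "amount": 9700, "date": "2025-10-02"},
    {"txn_id": "TX-7703", "from": "BA-1002", "to": "BA-1003", "amount": 9500, "date": "2025-10-03"},
    {"txn_id": "TX-7704", "from": "BA-1003", "to": "Offshore Corp", "amount": 9200, "date": "2025-10-04"},
    {"txn_id": "TX-7705", "from": "BA-1004", "to": "BA-1001", "amount": 9900, "date": "2025-10-08"},
    {"txn_id": "TX-7706", "from": "BA-1001", "to": "BA-1002", "amount": 9850, "date": "2025-10-09"},
]
chains = find_layering_chains(transactions)
```

On the sample data this recovers the full TX-7701 → TX-7702 → TX-7703 → TX-7704 chain while discarding the two-hop TX-7705 → TX-7706 pair as below the minimum length.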

shared_attributes

| attribute | value | accounts |
|---|---|---|
| Phone number | (305) 555-0147 | BA-1001, BA-1002 |
| IP address | 198.51.100.42 | BA-1001, BA-1002, BA-1003 |
| Registered agent | CorpServ Inc | BA-1001, BA-1003 |

Highlighted: three 'independent' accounts share the same IP address. Combined with the transaction chain, this reveals a single actor operating a laundering network.
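Producing a table like this is a simple inversion of the raw attribute records. The sketch below groups accounts by attribute value and keeps only values shared by more than one account; record layout and field names are assumptions for illustration:

```python
from collections import defaultdict

def shared_attribute_links(records):
    """Invert (account, attribute, value) records to find values shared
    by multiple accounts -- the raw edges of a possible fraud ring."""
    by_value = defaultdict(set)
    for account, attr, value in records:
        by_value[(attr, value)].add(account)
    # Keep only attributes that connect two or more accounts
    return {k: sorted(v) for k, v in by_value.items() if len(v) > 1}

records = [
    ("BA-1001", "phone", "(305) 555-0147"),
    ("BA-1002", "phone", "(305) 555-0147"),
    ("BA-1001", "ip", "198.51.100.42"),
    ("BA-1002", "ip", "198.51.100.42"),
    ("BA-1003", "ip", "198.51.100.42"),
    ("BA-1001", "agent", "CorpServ Inc"),
    ("BA-1003", "agent", "CorpServ Inc"),
    ("BA-1004", "ip", "203.0.113.9"),  # unshared value, dropped
]
links = shared_attribute_links(records)
```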

Era 2: Tree-based ML (2010s-2020s)

The second era replaced hand-written rules with statistical models, primarily gradient boosted trees (XGBoost, LightGBM). Instead of codifying known patterns, these models learn patterns from historical transaction data.

How it works

A data scientist engineers features from transaction data:

  • Transaction amount, currency, merchant category
  • Time since last transaction, transaction frequency (1h, 24h, 7d)
  • Distance from last transaction location
  • Ratio of current amount to average amount for this customer
  • Device fingerprint, IP geolocation, browser metadata
  • Historical fraud rate for this merchant, this BIN, this country pair

These features go into a flat table (one row per transaction), and a gradient boosted tree learns the statistical boundary between fraud and non-fraud. The model can discover non-linear combinations that no human would write as a rule: transactions that are individually unremarkable but collectively anomalous given a customer's history.
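A slice of that feature pipeline looks like the sketch below: pure-Python velocity features computed from a customer's history, of the kind fed to a gradient boosted tree. Field names and window choices are illustrative, not a fixed schema:

```python
from datetime import datetime, timedelta

def velocity_features(txn, history):
    """Engineer per-transaction features for a tree-based model.

    Sketch with illustrative field names: each record has `ts`
    (datetime) and `amount`.
    """
    past = [t for t in history if t["ts"] < txn["ts"]]
    def count_within(window):
        return sum(1 for t in past if txn["ts"] - t["ts"] <= window)
    amounts = [t["amount"] for t in past]
    avg = sum(amounts) / len(amounts) if amounts else txn["amount"]
    return {
        "amount": txn["amount"],
        "txn_count_1h": count_within(timedelta(hours=1)),
        "txn_count_24h": count_within(timedelta(hours=24)),
        "txn_count_7d": count_within(timedelta(days=7)),
        "amount_vs_avg": round(txn["amount"] / avg, 2),
        "secs_since_last": ((txn["ts"] - max(t["ts"] for t in past)).total_seconds()
                            if past else None),
    }

history = [
    {"ts": datetime(2025, 10, 1, 10, 0), "amount": 100},
    {"ts": datetime(2025, 10, 1, 11, 30), "amount": 120},
]
txn = {"ts": datetime(2025, 10, 1, 12, 0), "amount": 9800}
feats = velocity_features(txn, history)
```

Note what is absent: every feature is computed from one customer's own rows. Nothing here can express "this account shares an IP with two others".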

What it improved

Tree-based models improved detection rates from 50-60% to 70-80% and reduced false positive rates by 30-50% compared to rules alone. They can detect novel patterns (not just codified ones) and adapt to new fraud techniques through retraining.

What it still misses

Tree-based models analyze each transaction (or each customer) in isolation. Here is what the era-2 model actually sees for the laundering chain shown above.

flat_feature_table (what XGBoost sees per transaction)

| txn_id | amount | txn_type | sender_age_days | sender_txn_count_30d | amount_vs_avg |
|---|---|---|---|---|---|
| TX-7701 | $9,800 | Wire | 1,204 | 3 | 1.2x |
| TX-7702 | $9,700 | Wire | 427 | 2 | 1.1x |
| TX-7703 | $9,500 | Wire | 310 | 2 | 1.0x |
| TX-7704 | $9,200 | Intl Wire | 310 | 1 | 1.0x |

Each transaction looks normal in isolation: amounts are just below $10K but within 1.2x of the sender's average. Transaction counts are low. The flat table gives no indication that these four wires form a chain (BA-1004 to BA-1001 to BA-1002 to BA-1003 to offshore), or that three of the four accounts share an IP address.

This is the critical blind spot. Modern fraud is organized. A synthetic identity fraud ring creates 50 fake accounts over 6 months, builds credit on each, then maxes them all out in a coordinated burst. Each individual transaction looks normal. Each individual account looks normal. The fraud is only visible in the relational structure: these accounts share devices, phone numbers, addresses, or behavioral patterns that connect them.

A tree-based model that processes one row per transaction cannot see these connections. It is analyzing pixels when the fraud is in the picture.

Transaction-level ML

  • One row per transaction or customer
  • Features engineered from single entity
  • Cannot see cross-entity relationships
  • Misses organized fraud rings
  • 70-80% detection rate

Graph-based ML

  • Entities and relationships as a graph
  • Patterns learned across the full network
  • Detects shared devices, addresses, behaviors
  • Catches coordinated fraud and synthetic IDs
  • 85-95% detection rate on organized fraud

Era 3: Graph-based ML (2020s-present)

The third era represents the fraud detection problem as a graph. Accounts, transactions, devices, IP addresses, phone numbers, addresses, and merchants become nodes. Edges represent relationships: "sent money to," "shares device with," "same billing address," "logged in from same IP."

Graph neural networks process this structure by passing messages along edges, learning which relational patterns distinguish fraud from legitimate activity.
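The core mechanic fits in a few lines. The sketch below runs one round of unweighted message passing, with a plain neighbor average standing in for a GNN's learned aggregation; watch how a flagged node's signal bleeds into its neighbors:

```python
def message_pass(features, edges, rounds=1):
    """One round of the message passing at the heart of a GNN.

    Sketch: each node's new representation mixes its own feature
    vector with the average of its neighbors'. Real GNNs replace
    the average and the mix with learned, weighted transformations.
    """
    neighbors = {n: [] for n in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    for _ in range(rounds):
        new = {}
        for node, feat in features.items():
            msgs = [features[nb] for nb in neighbors[node]]
            # Aggregate neighbor messages (element-wise mean), then mix
            agg = ([sum(vals) / len(vals) for vals in zip(*msgs)]
                   if msgs else feat)
            new[node] = [(f + m) / 2 for f, m in zip(feat, agg)]
        features = new
    return features

# A carries a fraud signal (1.0); B and C start clean.
out = message_pass({"A": [1.0], "B": [0.0], "C": [0.0]},
                   [("A", "B"), ("B", "C")])
```

After one round, B (adjacent to the flagged A) picks up part of the signal while C, two hops away, is still untouched; more rounds propagate it further. This is how relational suspicion spreads through a ring.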

What graphs reveal

Graph-based fraud detection finds patterns that are invisible at the transaction level:

  • Fraud rings. A cluster of accounts that share devices, IP addresses, or phone numbers and exhibit coordinated behavior. Each account looks independent. The graph shows they are connected.
  • Synthetic identities. Fake identities built from combinations of real and fabricated information. Graph analysis reveals that a "new" identity shares a phone number with a known fraud account, or that the Social Security number was created recently and is connected to multiple applications.
  • Money laundering paths. Funds flowing through multiple accounts in patterns designed to obscure the origin. Graph traversal reveals the full path even when individual transfers look routine.
  • Account takeover chains. A compromised account is used to compromise other accounts through shared credentials or social engineering. The graph shows the propagation pattern.
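Even before a neural model enters the picture, merging pairwise "shares attribute" links into full rings is a connected-components problem. A minimal union-find sketch (the account IDs and `min_size` cutoff are illustrative):

```python
def fraud_rings(links, min_size=3):
    """Union-find over 'shares attribute' edges: accounts connected
    through any chain of shared devices, IPs, or phones collapse
    into one ring. Sketch only."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)
    for a, b in links:
        union(a, b)
    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return [sorted(g) for g in groups.values() if len(g) >= min_size]

links = [
    ("BA-1001", "BA-1002"),  # shared IP
    ("BA-1001", "BA-1003"),  # shared IP
    ("BA-1001", "BA-1003"),  # shared registered agent (duplicate edge is fine)
    ("BA-2001", "BA-2002"),  # isolated pair, below min_size
]
rings = fraud_rings(links)
```

Each account pair looks innocuous; the component of three is what an investigator would call a ring.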

Production results

Graph-based approaches have shown significant improvements in production fraud detection systems:

  • PayPal reported detecting 40% more fraud using graph-based approaches compared to transaction-level models
  • Capital One's graph-based system reduced false positives by 50% while maintaining detection rates
  • Stripe's Radar uses network-level signals to block $35 billion in fraud annually across its platform

The feature engineering problem in fraud

Fraud detection has a particularly severe version of the feature engineering bottleneck. Transaction databases are complex: accounts, transactions, devices, sessions, merchants, IP addresses, phone numbers, and address histories. A typical fraud detection schema has 8 to 15 interconnected tables.

Building features from this schema is painstaking. A data scientist must decide which tables to join, which aggregations to compute, what time windows to use, and which entity-level features matter. The Stanford RelBench study measured 12.3 hours and 878 lines of code per prediction task on simpler schemas.

For fraud, the problem is worse because the features need to be computed in real time. A fraud decision happens in 50-100 milliseconds. Features like "number of transactions from this device in the last hour" must be computed on the fly, which requires low-latency feature serving infrastructure on top of the engineering effort.
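A common building block for such real-time features is a per-entity sliding window. The sketch below keeps an "events in the last hour" count per device in O(1) amortized time per event, the kind of structure that fits a 50-100 ms scoring budget; the class and field names are assumptions for illustration:

```python
from collections import deque

class SlidingWindowCounter:
    """Low-latency 'events per device in the last window' feature.

    Sketch: one deque of timestamps per device; stale entries are
    evicted as new events arrive, so each event is appended and
    popped at most once (O(1) amortized).
    """
    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.events = {}  # device_id -> deque of event timestamps

    def record_and_count(self, device_id, ts):
        q = self.events.setdefault(device_id, deque())
        q.append(ts)
        # Evict everything older than the window
        while q and ts - q[0] > self.window:
            q.popleft()
        return len(q)

counter = SlidingWindowCounter(window_seconds=3600)
counts = [counter.record_and_count("dev-1", ts) for ts in (0, 100, 4000)]
```

The third event arrives after the first two have aged out of the hour-long window, so the count drops back to 1.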

And the features go stale. Fraud patterns change quarterly as attackers adapt. A feature set that works in January may be ineffective by April. This means continuous re-engineering, not a one-time investment.

How foundation models change fraud detection

A relational foundation model like KumoRFM addresses all three of these challenges simultaneously.

No feature engineering

KumoRFM reads the raw transaction database directly, representing it as a temporal heterogeneous graph. No manual feature engineering is needed. The model discovers which patterns across accounts, transactions, devices, and merchants are predictive of fraud.

Graph-native architecture

Because the model represents data as a graph, it naturally captures the relational patterns that define organized fraud. Fraud rings, synthetic identity clusters, and money laundering paths are visible in the graph structure without anyone building explicit graph features.

Pre-trained pattern recognition

KumoRFM has been trained on thousands of diverse relational databases. It has seen fraud-like patterns (anomalous graph topology, velocity spikes, cross-entity propagation) across many different domains. This pre-training means it can detect fraud patterns zero-shot, without task-specific training data, which is critical for catching new fraud techniques before you have labeled examples.

PQL Query

PREDICT transactions.is_suspicious
FOR EACH transactions.txn_id

The model traverses the transaction graph: from each wire to the sending and receiving accounts, to shared attributes (IP, phone, registered agent), to other accounts in the cluster. It discovers the layering chain and shared-IP ring without anyone defining these as features.

Output

| txn_id | fraud_score | top_signal |
|---|---|---|
| TX-7701 | 0.89 | Source of layering chain, shared IP cluster |
| TX-7702 | 0.93 | Mid-chain transfer, shared phone with BA-1001 |
| TX-7703 | 0.95 | Chain continues, 3 shared-IP accounts |
| TX-7704 | 0.97 | Offshore destination, end of layering chain |

Adaptation speed

When fraud patterns shift, a traditional model requires re-engineering features, retraining, and redeploying. A foundation model can be fine-tuned on new data in minutes rather than rebuilt from scratch over weeks. This compresses the response time to new fraud techniques from months to hours.

Choosing the right era for your organization

Not every organization needs graph-based fraud detection today. The right approach depends on the type of fraud you face and the complexity of your data.

Rules engines are sufficient when

  • Fraud patterns are well-known and stable
  • Transaction volumes are low enough for high false positive rates to be manageable
  • Regulatory requirements mandate explainable, auditable rules

Tree-based ML is the right step up when

  • You have historical labeled fraud data for training
  • Fraud patterns are evolving and rules cannot keep up
  • False positive rates are too high and need statistical optimization

Graph-based approaches are necessary when

  • You face organized fraud (rings, synthetic identities, coordinated attacks)
  • Your fraud losses are concentrated in network-level patterns, not individual anomalies
  • You need to detect money laundering, account takeover chains, or collusion
  • Your current models miss fraud that is only visible through entity relationships

The $485 billion question

The fraud detection industry is in a transition. Most enterprises are running era-two systems (tree-based ML on engineered features) while facing era-three threats (organized networks, synthetic identities, coordinated attacks). The technology to close this gap exists. Graph-based approaches detect 40-85% more organized fraud than transaction-level models.

The barrier has been implementation complexity. Building a graph-based fraud system from scratch requires graph database infrastructure, GNN training pipelines, and real-time graph computation. This is a multi-year, multi-million-dollar engineering project.

Foundation models change the economics. Instead of building graph infrastructure from scratch, you connect your transaction database and get graph-based fraud scores. The model has already learned the graph patterns. Your team spends time on fraud strategy, not graph engineering.

At $485 billion in annual losses, even small improvements in detection rates translate to enormous value. Moving from era-two to era-three detection is not a marginal upgrade. It is a structural one.

Frequently asked questions

What are the three eras of fraud detection?

Rules-based systems (1990s-2010s) catch known patterns using if-then logic. Tree-based ML (2010s-2020s) uses gradient boosted models to catch statistical anomalies in transaction features. Graph-based ML (2020s-present) models relationships between entities to catch organized fraud rings and synthetic identity networks that single-transaction analysis misses.

Why can't rules-based systems detect modern fraud?

Rules-based systems only catch patterns that someone has already seen and codified. They require manual updates, generate high false positive rates (typically 90-95% of flagged transactions are legitimate), and cannot detect novel fraud patterns or organized fraud networks where individual transactions look normal but the relational pattern is suspicious.

How does graph-based fraud detection work?

Graph-based approaches represent transactions, accounts, devices, and other entities as nodes in a graph, with edges representing relationships (sent money to, shares device with, same address as). Graph neural networks then learn patterns across this structure, detecting organized fraud rings, synthetic identity clusters, and money laundering networks that are invisible when analyzing individual transactions.

What is the false positive problem in fraud detection?

In traditional fraud detection systems, 90-95% of flagged transactions turn out to be legitimate (false positives). Each false positive requires manual review, which costs $15-25 per case. For a large bank processing millions of transactions, this means millions of dollars per year spent investigating legitimate activity. Graph-based approaches reduce false positives by 40-60% because they use relational context, not just transaction features.

Can a foundation model improve fraud detection without retraining?

Yes. A relational foundation model like KumoRFM has learned fraud-relevant patterns (unusual graph topology, velocity anomalies, cross-entity propagation) from thousands of diverse databases. It can score transactions against these patterns zero-shot, without task-specific training, and deliver predictions in under a second. Fine-tuning on your specific fraud data further improves performance.

See it in action

KumoRFM delivers predictions on relational data in seconds. No feature engineering, no ML pipelines. Try it free.