Every churn model has the same dirty secret. It is very good at predicting customers who are already gone. The user who has not logged in for 60 days. The subscriber whose payment failed twice. The player who uninstalled the app. These predictions are correct and useless. By the time the model flags them, there is nothing left to save.
The customers who matter are the ambiguous ones. The subscriber who is still active but whose usage pattern just shifted. The B2B account where three of five seats have gone quiet. The retail customer whose order frequency dropped from weekly to monthly, but whose last order was their largest ever. These are the predictions that drive retention revenue. And most churn models miss them entirely.
What churn prediction is (and is not)
Churn prediction is a classification problem: for each customer, estimate the probability that they will stop being a customer within a defined time window (typically 30, 60, or 90 days). The output is a score between 0 and 1. The business sets a threshold and takes action on customers above it.
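The label in this setup can be constructed mechanically from event timestamps. A minimal sketch, assuming a simple per-customer activity log (the dictionary and column names here are illustrative, not any specific schema):

```python
from datetime import date, timedelta

# Hypothetical activity log: customer_id -> dates of activity events
activity = {
    "C-1": [date(2025, 10, 1), date(2025, 10, 20)],
    "C-2": [date(2025, 8, 2)],
}

def churned(dates, cutoff, window_days=30):
    """1 if the customer has no activity in [cutoff, cutoff + window)."""
    end = cutoff + timedelta(days=window_days)
    return int(not any(cutoff <= d < end for d in dates))

# Training labels are computed retrospectively at a past cutoff date
cutoff = date(2025, 10, 1)
labels = {cid: churned(dates, cutoff) for cid, dates in activity.items()}
# C-1 was active inside the window -> 0; C-2's last event predates it -> 1
```

The cutoff date matters: labels must be computed at a point in the past so that the model never sees features derived from events after the cutoff.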
The goal is not to predict churn. It is to predict preventable churn. A customer who leaves because they moved to a different country is not preventable. A customer who leaves because a competitor offered a 20% discount and your retention team never reached out is preventable. The model's value is proportional to the number of preventable churners it identifies before they leave.
The financial stakes are not small. Customer acquisition costs 5x to 25x more than retention across most industries. In gaming, publishers spend an estimated $15 billion annually on player acquisition. The industry average shows 75% of new players churn within 24 hours and 90% within 30 days. Every percentage point of early churn prevented has an outsized impact on lifetime value.
Here is what the data looks like for a B2B SaaS company. The churn signal hides across three tables.
accounts
| account_id | company | plan | mrr | contract_end |
|---|---|---|---|---|
| ACCT-201 | Pinnacle Health | Enterprise | $4,200 | 2026-03-01 |
| ACCT-202 | Vortex Media | Business | $890 | 2026-01-15 |
| ACCT-203 | Atlas Logistics | Enterprise | $7,800 | 2026-06-01 |
user_sessions
| session_id | account_id | user_email | date | duration | features_used |
|---|---|---|---|---|---|
| SS-01 | ACCT-201 | j.chen@pinnacle.com | 2025-11-01 | 42 min | 8 |
| SS-02 | ACCT-201 | m.wells@pinnacle.com | 2025-11-01 | 31 min | 5 |
| SS-03 | ACCT-201 | r.patel@pinnacle.com | 2025-10-15 | 4 min | 1 |
| SS-04 | ACCT-201 | k.davis@pinnacle.com | 2025-09-28 | 0 min | 0 |
| SS-05 | ACCT-201 | l.garcia@pinnacle.com | 2025-09-10 | 0 min | 0 |
| SS-06 | ACCT-202 | t.lee@vortex.com | 2025-11-10 | 18 min | 3 |
Highlighted: 2 of 5 Pinnacle Health users have gone completely inactive. A flat model averages this away: roughly 15 minutes per session across all five seats, or 25.7 minutes across the three active ones. The relational model sees that 40% of seats are dark.
support_tickets
| ticket_id | account_id | subject | priority | status | created |
|---|---|---|---|---|---|
| TK-301 | ACCT-201 | SSO integration broken | Critical | Open | 2025-10-28 |
| TK-302 | ACCT-201 | Export feature not working | High | Open | 2025-11-02 |
| TK-303 | ACCT-201 | Requesting contract review | Medium | Open | 2025-11-08 |
| TK-304 | ACCT-203 | Add new user seats | Low | Resolved | 2025-10-20 |
Highlighted: Pinnacle Health has 3 open tickets in 11 days, escalating from technical issues to a contract review request. This sequence is a classic pre-churn pattern.
Why flat features miss the hard cases
A typical churn model is trained on a flat feature table. One row per customer. Here is what the data scientist builds from the three tables above.
flat_feature_table (what the churn model sees)
| account_id | plan | mrr | active_users | avg_session_min | tickets_30d | days_to_renewal |
|---|---|---|---|---|---|---|
| ACCT-201 | Enterprise | $4,200 | 3 | 25.7 | 3 | 113 |
| ACCT-202 | Business | $890 | 1 | 18.0 | 0 | 36 |
| ACCT-203 | Enterprise | $7,800 | 5 | 41.2 | 0 | 173 |
Highlighted: ACCT-201 shows 3 active users and 25.7 min average session duration. Looks healthy. But the raw data shows 2 of 5 users are completely inactive (0 min sessions), and the 3 open tickets escalate from technical issues to a contract review. The flat table hides both the seat attrition pattern and the ticket escalation sequence.
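The compression is easy to reproduce. A quick pandas sketch using the Pinnacle Health session durations from the table above:

```python
import pandas as pd

# Pinnacle Health's five seats, from the user_sessions table
sessions = pd.DataFrame({
    "user_email": ["j.chen", "m.wells", "r.patel", "k.davis", "l.garcia"],
    "duration_min": [42, 31, 4, 0, 0],
})

# The flat feature: average duration over users with any activity
active = sessions[sessions.duration_min > 0]
avg_session = active.duration_min.mean()            # 25.7 -> "looks healthy"

# The relational signal the aggregate destroys
dark_seat_share = (sessions.duration_min == 0).mean()  # 0.4 -> 40% of seats dark
```

Both numbers come from the same six cells. The first is what the flat table keeps; the second is what it throws away.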
The problem is what the features cannot express.
Cross-table patterns
A retail customer's churn risk depends not just on their purchase history, but on the products they bought. If those products had high return rates from other customers, or if the brands they preferred just had a quality scandal, the churn risk goes up. But that signal lives 2-3 hops away in the database: customer → orders → products → other customers' returns. No standard aggregation captures this.
Team-level dynamics
In B2B SaaS, churn is rarely an individual decision. It is a team decision. If 3 of 5 users on an account stop logging in, the remaining 2 are at extreme risk, even if their own usage looks healthy. But a per-user feature table cannot express "percentage of my teammates who have gone inactive." That requires traversing the user → account → other users path.
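The feature itself is simple once the traversal is allowed. A sketch with hypothetical seat-activity flags for one account (names illustrative):

```python
# Hypothetical seat-activity flags for one account
seats = {
    "j.chen": True, "m.wells": True, "r.patel": True,
    "k.davis": False, "l.garcia": False,
}

def teammates_inactive_pct(user, seats):
    """Share of *other* seats on the same account that have gone dark.

    Computing this requires the user -> account -> other-users path;
    a per-user feature row has no access to the other rows."""
    others = [active for u, active in seats.items() if u != user]
    return sum(not a for a in others) / len(others)

teammates_inactive_pct("j.chen", seats)  # 2 of 4 teammates inactive -> 0.5
```

For an active user like j.chen, half their teammates are dark, which is exactly the risk signal their own healthy usage hides.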
Temporal sequences, not aggregates
"5 orders in 30 days" is a feature. But it does not tell you whether those orders were evenly spaced (healthy cadence) or compressed into the first week followed by three weeks of silence (pre-churn pattern). The aggregate is identical. The sequence is completely different. And the sequence is what predicts churn.
PQL Query
PREDICT accounts.plan = 'Cancelled' FOR EACH accounts.account_id
The model reads accounts, user_sessions, and support_tickets as a graph. It discovers that Pinnacle Health's 40% inactive seats + 3 escalating tickets + contract review request is a high-risk combination.
Output
| account_id | churn_probability | top_signal |
|---|---|---|
| ACCT-201 | 0.86 | 40% inactive seats, escalating tickets, contract review |
| ACCT-202 | 0.33 | Small account, steady single-user engagement |
| ACCT-203 | 0.07 | Adding seats, resolved tickets, high engagement |
The relational advantage
A relational approach to churn prediction represents the database as a graph. Customers, orders, products, interactions, subscriptions, and support tickets become nodes. Foreign keys become edges. Timestamps establish ordering. The model traverses this graph to build each customer's prediction.
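The construction step can be sketched in a few lines. This uses the three tables above with a plain dict-of-typed-neighbors representation; a real system would use a heterogeneous graph library with node features and timestamped edges, so treat this as a structural illustration only:

```python
# Rows from the accounts, user_sessions, and support_tickets tables above
accounts = ["ACCT-201", "ACCT-202", "ACCT-203"]
user_sessions = [("SS-01", "ACCT-201"), ("SS-02", "ACCT-201"),
                 ("SS-03", "ACCT-201"), ("SS-04", "ACCT-201"),
                 ("SS-05", "ACCT-201"), ("SS-06", "ACCT-202")]
support_tickets = [("TK-301", "ACCT-201"), ("TK-302", "ACCT-201"),
                   ("TK-303", "ACCT-201"), ("TK-304", "ACCT-203")]

# Foreign keys become edges; each account node keeps typed neighbor lists
graph = {a: {"sessions": [], "tickets": []} for a in accounts}
for sid, acct in user_sessions:
    graph[acct]["sessions"].append(sid)
for tid, acct in support_tickets:
    graph[acct]["tickets"].append(tid)

graph["ACCT-201"]["tickets"]  # ['TK-301', 'TK-302', 'TK-303']
```

No feature is computed here; the point is that nothing is aggregated away. Every session and ticket remains reachable from its account when the model traverses.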
This changes what the model can see.
Product affinity signals
In the H&M dataset, the relational model discovered that customers who subscribed to the fashion newsletter but whose recent purchases were concentrated in sale items had higher churn rates. This is a 3-table pattern (customers → orders → products + customers → newsletter_subscriptions) that aggregation cannot express as a single feature. The model found it automatically by traversing the graph.
Community churn propagation
In gaming and social platforms, churn spreads through social graphs. When a player's guild members leave, the remaining player is more likely to leave. When a user's closest connections on a social platform go inactive, the user follows. A relational model sees this propagation directly: player → guild → other players → activity status. A flat model would need someone to engineer a "percentage of friends who churned" feature, which requires knowing to look for it in the first place.
Behavioral sequence matching
The graph preserves the full temporal sequence of events for each customer. The model can learn that the pattern "3 support tickets in a week, followed by a billing inquiry, followed by a settings change to downgrade" is a churn precursor. It matches this pattern across the customer base without anyone manually defining it as a feature.
Flat feature churn model
- One row per customer, losing all relational structure
- Predicts obvious churners (inactive, failed payments)
- Misses cross-table patterns (product quality, team dynamics)
- Temporal sequences destroyed by aggregation
- H&M benchmark: 55.21 AUROC (LightGBM)
Relational churn model
- Full graph: customers, orders, products, interactions
- Finds ambiguous churners hiding in relational patterns
- Traverses 3-4 hop paths automatically
- Preserves temporal sequences with timestamped edges
- H&M benchmark: 69.88 AUROC (RDL)
Industry-specific churn dynamics
Churn is not one problem. It manifests differently across industries, and the relational signals that predict it vary accordingly.
Gaming
Gaming has the most extreme churn profile of any industry. Studies consistently show that 75% of new mobile game players churn within 24 hours and 90% within 30 days. Publishers spend an estimated $15 billion annually on player acquisition, making early retention the single highest-leverage prediction problem in the industry.
The relational signals that predict gaming churn include: session duration trajectories (declining vs. stable), social graph density (players with active friends retain better), progression velocity (too fast or too slow both predict churn), and monetization patterns (first-purchase timing and amount are strongly predictive of long-term retention).
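A session-duration trajectory is a concrete example of a signal that an average destroys. A minimal sketch, fitting a least-squares slope over session order (the durations and the "declining means risk" reading are illustrative):

```python
def slope(ys):
    """Least-squares slope of values against their index 0..n-1."""
    n = len(ys)
    xs = range(n)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

stable = [30, 28, 31, 29, 30]    # minutes per session: healthy
declining = [30, 24, 18, 11, 6]  # same ballpark average, churn trajectory

slope(stable)     # ~0.1: flat
slope(declining)  # ~-6.1 min/session: steep decline
```

Both players have a respectable average session length; the slope separates them.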
Retail and e-commerce
Retail churn is harder to define because there is no subscription to cancel. Churn is the absence of expected behavior: a customer who used to buy monthly has not bought in 90 days. The H&M RelBench task formalizes this as "will this customer make a purchase in the next 30 days?"
The relational signals include: product return rates of purchased items, category concentration (customers diversifying their purchases are more engaged than those narrowing), seasonal pattern alignment (a holiday shopper who misses Black Friday), and price sensitivity trends across orders.
B2B SaaS
SaaS churn is a team sport. The relational signals that matter most are at the account level, not the user level: seat utilization trends, feature adoption breadth, admin activity (declining admin logins are a leading indicator), integration usage (customers with 3+ active integrations churn at half the rate), and contract value trajectory (downgrades predict cancellation).
Financial services
Banking churn is predicted by cross-product relationships: a customer who moves their direct deposit is 6x more likely to close their account within 90 days. A customer who reduces their automatic bill pay relationships is signaling a shift to a competitor. These are relational signals that span the customer → accounts → transactions → payees path.
From prediction to intervention
A churn score without an explanation is a number that the retention team cannot act on. If the model says "this customer has an 82% probability of churning," the next question is always "why?" Without the why, the team cannot design an intervention.
Relational models provide the why automatically. The cell-level attribution traces the prediction back to specific data points: this customer is predicted to churn because their last 3 support tickets were unresolved (support_tickets table, rows 4521-4523), their team utilization dropped from 80% to 40% in the last 30 days (user_sessions table), and the product they rely on most was deprecated in the last release (product_changes table).
That is an intervention playbook, not a probability score. The customer success team knows exactly what to address: resolve the open tickets, set up a migration path for the deprecated feature, and reach out to the inactive team members. The specificity of the explanation determines the quality of the intervention.
The benchmark evidence
The RelBench benchmark includes churn-style tasks across multiple domains. The consistent finding: relational approaches outperform flat-table approaches by wide margins on tasks where cross-table patterns carry signal.
- H&M churn task: LightGBM 55.21 vs. RDL 69.88 AUROC (26.6% relative improvement)
- KumoRFM zero-shot on RelBench classification: 76.71 average AUROC vs. 62.44 for LightGBM with manual features
- KumoRFM fine-tuned: 81.14 average AUROC, a 30% relative improvement over the manual baseline
The gap is not from a better algorithm running on the same data. It is from the same algorithm seeing more data. The relational model consumes the full graph structure. The flat model consumes a shadow of it.
Getting started
If you have a relational database with customer data and you want churn predictions, the path is straightforward. Connect your data warehouse (Snowflake, BigQuery, Databricks, or Redshift). Write a PQL query: "For each customer, what is the probability of churn in the next 30 days?" The foundation model reads your schema, builds the graph, traverses it, and returns predictions with explanations.
There is no feature engineering step. No model training step. No pipeline to build. The model that scores 76.71 AUROC zero-shot on RelBench is the same model that runs on your data. If you need higher accuracy on your specific domain, fine-tuning pushes toward 81+ AUROC and takes minutes, not months.
Your database already contains the signals that predict which customers will leave. The question is whether your churn model can see them.