
Best Algorithm for Churn Prediction: A Ranked Guide

The honest answer: XGBoost wins on flat tables. But churn signals live across 4+ tables, and flattening them destroys the patterns that matter most. Here is how each algorithm performs, when to use it, and what it actually takes to break past the 70% accuracy ceiling.

TL;DR

  • Algorithms ranked by churn prediction capability: (1) Logistic Regression - baseline, (2) Random Forest - solid, (3) XGBoost/LightGBM - best on flat tables, (4) Deep Learning - marginal gains on flat data, (5) Graph Neural Networks / KumoRFM - best when data spans multiple tables.
  • XGBoost is the right choice IF your data fits in one table. Most real churn data does not. Customer churn signals come from 4+ hops across customer, order, product, support, and usage tables.
  • On the SAP SALT enterprise benchmark, KumoRFM scores 91% accuracy vs 75% for PhD data scientists with XGBoost and 63% for LLM+AutoML. On RelBench, KumoRFM zero-shot achieves 76.71 AUROC vs 62.44 for LightGBM with manual features.
  • Most churn models plateau at 65-70% because they flatten relational data into aggregate columns. Reducing a customer's behavior to order_count=23 and avg_value=$142 is like reducing an org chart to a headcount - you lose the structure that predicts what happens next.
  • KumoRFM replaces the entire churn modeling pipeline (SQL joins, feature engineering, model selection) with a one-line PQL query against your raw relational tables.

7 churn prediction algorithms compared

Here is every algorithm worth considering for churn prediction, from simplest to most capable. Each one has a sweet spot.

1. Logistic Regression

The simplest supervised classifier and a solid starting point. Logistic regression fits a linear decision boundary between churners and non-churners. It is fast to train, easy to explain to business teams, and works well as a baseline to measure other models against.

  • Best for: establishing a baseline, regulated environments where every coefficient must be explainable, very small datasets where complex models overfit.
  • Watch out for: cannot capture non-linear relationships or feature interactions. Accuracy typically caps at 55-65% on real churn data.
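
As a concrete starting point, here is a minimal baseline sketch with scikit-learn on synthetic data. The feature meanings in the comments are illustrative assumptions, not a real schema:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic flat table; pretend the columns are tenure, ticket count, order rate
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
# Toy churn label loosely driven by low tenure and high ticket count
y = (X[:, 1] - X[:, 0] + rng.normal(scale=0.5, size=1000) > 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"baseline AUROC: {auc:.2f}")
```

The value of this model is less its accuracy than its coefficients: each one is a direct, explainable effect size you can show a business team, and a yardstick for every model that follows.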

2. Decision Trees

A single decision tree splits customers into groups based on feature thresholds (e.g., "if tenure < 6 months AND support_tickets > 2, predict churn"). Easy to visualize and explain. Marketing teams often use it to derive segment-based churn rules.

  • Best for: interpretability (you can draw the tree), small datasets, business rule extraction, teams without ML expertise.
  • Watch out for: single trees overfit easily and have lower accuracy (58-64%) than ensemble methods. Rarely used in production without bagging or boosting.
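
Rule extraction is the main reason to reach for a single tree. Here is a sketch using scikit-learn's `export_text` on a toy dataset whose churn rule is known by construction (feature names are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Toy features: tenure_months (1-47) and support_tickets (0-5)
X = np.column_stack([rng.integers(1, 48, 500), rng.integers(0, 6, 500)])
# Churn rule baked into the labels: short tenure AND many tickets
y = ((X[:, 0] < 6) & (X[:, 1] > 2)).astype(int)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# Prints human-readable if/else thresholds you can hand to a marketing team
print(export_text(tree, feature_names=["tenure_months", "support_tickets"]))
```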

3. Random Forest

An ensemble of hundreds of decision trees, each trained on a random subset of data and features. The ensemble averages out the overfitting problem of single trees and handles non-linear relationships automatically.

  • Best for: quick iteration with minimal tuning, solid accuracy (60-68%) without much feature engineering, robust to noisy data and outliers.
  • Watch out for: slower to train than XGBoost on large datasets. Does not capture sequential or temporal patterns. Accuracy ceiling is lower than gradient boosting.
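
A quick sketch of the low-tuning workflow, including the impurity-based feature importances teams often use to sanity-check a forest. The data is synthetic and the feature names are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
# Toy flat table: [recency_days, order_count, ticket_count]
X = rng.normal(size=(1000, 3))
# Labels built so recency carries most of the signal
y = (X[:, 0] + 0.1 * X[:, 2] + rng.normal(scale=0.3, size=1000) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Importances sum to 1; recency should dominate in this toy setup
for name, imp in zip(["recency_days", "order_count", "ticket_count"],
                     forest.feature_importances_):
    print(f"{name}: {imp:.2f}")
```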

4. XGBoost / LightGBM / CatBoost

Gradient boosted trees are the gold standard for flat tabular data. They sequentially build trees that correct the errors of previous trees, producing highly accurate models with good feature engineering. XGBoost has won more Kaggle competitions than any other algorithm.

  • Best for: maximum accuracy on a single flat table (65-75% on churn data), production deployment (fast inference, small model files), teams with feature engineering expertise.
  • Watch out for: requires significant feature engineering to reach peak accuracy. Performance depends heavily on the quality of hand-crafted features. Needs hyperparameter tuning (learning rate, depth, regularization) to avoid overfitting.

5. Support Vector Machines (SVMs)

SVMs find the optimal separating boundary between churners and non-churners. They work well in high-dimensional spaces and can capture non-linear patterns with kernel functions.

  • Best for: small-to-medium datasets with many features, problems where the decision boundary is clear, academic benchmarks.
  • Watch out for: scales poorly to large datasets (training time grows quadratically). Difficult to tune kernel parameters. Accuracy (60-68%) is comparable to random forest but with more effort. Largely superseded by XGBoost in practice.

6. Deep Learning (MLPs, LSTMs, Transformers)

Neural networks can capture complex non-linear patterns. LSTMs are useful for sequential event data (login sequences, purchase timelines). On flat churn tables, they rarely beat XGBoost by more than 1-2 points.

  • Best for: large datasets (100K+ rows), sequential event data where order matters, teams with deep learning infrastructure already in place.
  • Watch out for: requires more data, more tuning, and more compute than XGBoost. Accuracy on flat tables: 66-76%. Not worth the added complexity unless you have sequential event streams or very large data.

7. Graph Neural Networks / KumoRFM

GNNs operate on the relational structure of your data, not a flattened version of it. They read patterns across customers, orders, products, support tickets, and usage logs without requiring manual joins or feature engineering. KumoRFM is a foundation model built on GNN architecture, pre-trained on thousands of relational databases.

  • Best for: data spanning multiple tables (which is most real-world churn data), maximum accuracy (76-91% on enterprise benchmarks), teams that want to skip feature engineering entirely.
  • Watch out for: requires relational data with foreign keys between tables. If your data truly fits in one flat table with no cross-table signals, XGBoost is simpler and competitive. KumoRFM is enterprise SaaS (not open-source), though the free tier at kumorfm.ai covers basic use cases.

Churn algorithm comparison

| Algorithm | Typical accuracy | Data requirement | Tuning effort | Best for |
|---|---|---|---|---|
| Logistic Regression | 55-65% | Single flat table | Minimal | Baseline, compliance, interpretability |
| Decision Trees | 58-64% | Single flat table | Minimal | Visualization, business rules, small data |
| Random Forest | 60-68% | Single flat table | Low | Quick iteration, robust to noise |
| SVM | 60-68% | Single flat table | High (kernels) | Small data with many features |
| XGBoost / LightGBM | 65-75% | Flat table + engineered features | Medium-High | Best flat-table accuracy |
| Deep Learning | 66-76% | Large flat table or event sequences | High | Sequential data, very large datasets |
| GNN / KumoRFM | 76-91% | Multiple relational tables | None (zero-shot) | Multi-table data, maximum accuracy |

7 algorithms compared. Accuracy ranges based on published benchmarks and practitioner experience. Ranges for individual algorithms are typical estimates; GNN/KumoRFM range is from published RelBench (76.71) and SAP SALT (91%) results. All other models operate on a single flat table.

The honest take: when XGBoost is enough

XGBoost and LightGBM are excellent algorithms. If these two conditions are true, use them and stop reading:

  1. Your churn data fits in a single, well-engineered table with all the features you need already computed.
  2. You have a data scientist who can build strong features - recency, frequency, monetary value, engagement trends, support interactions - and maintain the pipeline over time.

On clean, flat, well-featured data, XGBoost is hard to beat. The problem is that most real-world churn data does not meet those conditions.

Why churn signals live across multiple tables

A customer's churn risk is not a property of the customer row alone. It depends on what they bought, how they used it, what support interactions they had, whether similar customers churned, and how those patterns changed over time. That information lives across 4-6 tables in a typical enterprise database:

  • Customers - demographics, segment, tenure, plan tier
  • Orders / Transactions - purchase history, frequency, recency, value
  • Products / SKUs - what they bought, category, margin, return rate
  • Support tickets - issue types, resolution time, escalations, sentiment
  • Usage / Engagement logs - login frequency, feature adoption, session depth
  • Payments - failed charges, downgrades, billing disputes

To train an XGBoost model, someone has to join all of these tables, compute aggregate features, and flatten everything into one row per customer. That flattening step destroys information.
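
The join-and-aggregate step looks like this in pandas. The tables and column names are toy examples; the point is that the groupby collapses a time series of orders into two numbers per customer:

```python
import pandas as pd

# Toy relational tables (illustrative schema, two customers)
customers = pd.DataFrame({"customer_id": [1, 2], "segment": ["smb", "enterprise"]})
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "order_value": [120, 80, 95, 300, 280],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-03-01", "2024-01-20", "2024-03-15"]
    ),
})

# The flattening step: this is where the trajectory disappears
features = orders.groupby("customer_id").agg(
    order_count=("order_value", "size"),
    avg_value=("order_value", "mean"),
).reset_index()

flat = customers.merge(features, on="customer_id", how="left")
print(flat)
```

After the merge, customer 1 is just `order_count=3, avg_value=98.33` - the dates, the ordering, and the gap since the last purchase are gone unless someone hand-engineers a column for each.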

What flattening destroys (churn example)

| Signal | Flat table sees | Relational model sees |
|---|---|---|
| Purchase pattern | order_count=23, avg_value=$142 | 5 orders last month, 0 this month - declining trajectory |
| Product mix shift | num_categories=3 | Shifted from premium to budget products over 8 weeks |
| Support escalation | tickets=3, avg_resolve=4.2d | 3 tickets in 2 weeks, last one escalated, still open |
| Peer churn pattern | Not visible | 4 of 7 customers on same plan with same rep churned last quarter |
| Cross-product engagement | active=true | Stopped using 2 of 3 product modules, only logs in for billing |
| Account health trajectory | Static snapshot only | NPS dropped 40 points over 3 surveys while usage fell 60% |

Each row shows a real churn signal. The flat table captures a number. The relational model captures the pattern, sequence, and cross-entity context that actually predicts whether this customer will leave.

How to build a churn prediction model in 5 steps

If you are building a traditional churn model, here is the standard approach. This is the right way to do it - and understanding why it plateaus will clarify when you need something different.

  1. Collect and join your data. Write SQL to join your customer table with orders, support, usage, and payments. Apply temporal filters so you do not leak future data into training features. For 5 tables with point-in-time correctness, expect 100-200 lines of SQL.
  2. Engineer features. Compute RFM (recency, frequency, monetary) features, engagement metrics, support interaction counts, trend features (change in order frequency over 30/60/90 days), and any domain-specific signals. This typically produces 50-200 features and takes 40-60% of total project time.
  3. Split with temporal awareness. Use a time-based split, not a random split. Train on data before a cutoff date, validate on the period after. Random splits cause data leakage that inflates accuracy by 5-15 points and produces models that fail in production.
  4. Train XGBoost or LightGBM. Use cross-validation to tune hyperparameters (learning rate, max depth, min child weight, subsample ratio). Evaluate with precision-recall curves, not just accuracy - because churn datasets are imbalanced and accuracy is misleading when 95% of customers do not churn.
  5. Deploy and maintain. Set up a feature pipeline that recomputes features on a schedule, retrain the model monthly or quarterly, monitor for data drift, and maintain the SQL joins as source schemas change. Budget 20-30% of initial build time for ongoing maintenance per quarter.
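
Step 3 is where pipelines most often go silently wrong, so here is a minimal sketch of a time-based split in pandas (toy snapshot table; the cutoff date is arbitrary):

```python
import pandas as pd

# Toy labeled table: one snapshot per customer with a churn label
df = pd.DataFrame({
    "customer_id": range(8),
    "snapshot_date": pd.to_datetime([
        "2024-01-31", "2024-02-29", "2024-03-31", "2024-04-30",
        "2024-05-31", "2024-06-30", "2024-07-31", "2024-08-31",
    ]),
    "churned": [0, 1, 0, 0, 1, 0, 1, 0],
})

cutoff = pd.Timestamp("2024-06-30")
train = df[df["snapshot_date"] <= cutoff]   # fit on the past
test = df[df["snapshot_date"] > cutoff]     # evaluate on the future

# Invariant a random split would violate: no training row postdates any test row
assert train["snapshot_date"].max() <= test["snapshot_date"].min()
print(len(train), len(test))
```

A random split would scatter July and August rows into training, letting the model peek at the future and inflating offline accuracy in exactly the way the text describes.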

Traditional churn model pipeline

  • Write SQL to join 4-6 tables (100-200 lines)
  • Engineer 50-200 features manually (40-60% of project time)
  • Handle temporal leakage and point-in-time correctness
  • Train XGBoost, tune hyperparameters via cross-validation
  • Deploy feature pipeline + model retraining on schedule
  • Maintain SQL joins when source schemas change
  • Total timeline: 3-8 weeks to first production model

KumoRFM approach

  • Connect Kumo to your data warehouse (one-time setup)
  • Write one PQL query defining the prediction target
  • KumoRFM reads raw relational tables directly
  • No feature engineering, no joins, no model selection
  • Predictions in ~1 second (zero-shot) or minutes (fine-tuned)
  • No feature pipeline to maintain
  • Total timeline: minutes to first prediction

Why most churn models plateau at 65-70% accuracy

If you follow the 5 steps above with XGBoost, you will likely land between 65% and 70% accuracy. That is not a failure of execution - it is a ceiling imposed by the data representation.

The flat table cannot contain:

  • Multi-hop relationships. A customer's churn risk depends on what similar customers did. "Similar" means customers who bought the same products, used the same features, or share an account manager. That is a 3-hop pattern: customer -> orders -> products -> other customers' outcomes. No amount of feature engineering on a flat table captures this.
  • Temporal sequences across tables. A customer whose support tickets are increasing while their order frequency is decreasing and their usage of premium features has stopped - that three-table temporal pattern is a strong churn signal. A flat table collapses each into a single number.
  • Network effects. When a key user at a B2B account leaves, the other users on that account often follow. When a product line gets negative reviews from multiple customers, churn accelerates across that cohort. These are graph patterns that exist in the relationships between entities, not in any single row.

How KumoRFM reaches 91% accuracy on enterprise churn data

KumoRFM is a relational foundation model - pre-trained on thousands of relational databases to understand the patterns that exist across connected tables. When it predicts churn, it does three things that flat-table algorithms cannot:

  1. Reads raw relational tables directly. No joins, no flattening, no feature engineering. It takes your customers, orders, products, support, and usage tables as-is and constructs a graph that preserves every relationship and temporal sequence.
  2. Discovers multi-hop patterns automatically. It finds signals like "customers who bought products in this category and then contacted support about shipping issues churned at 3x the base rate." These patterns span 3-4 tables and would require a data scientist to hypothesize and manually encode them as features.
  3. Transfers knowledge from pre-training. Because KumoRFM was pre-trained on thousands of relational databases, it already understands common patterns (declining engagement predicts churn, support escalations predict churn, peer behavior predicts churn). It applies that knowledge zero-shot to your data, which is why it scores 91% on the SAP SALT benchmark without any task-specific training.

SAP SALT churn benchmark

| Approach | Accuracy | Feature engineering time | Lines of code |
|---|---|---|---|
| LLM + AutoML | 63% | Hours (LLM-generated) | LLM-generated |
| PhD Data Scientist + XGBoost | 75% | Weeks | 878+ lines |
| KumoRFM (zero-shot) | 91% | 0 | 0 |

SAP SALT benchmark on enterprise data. KumoRFM outperforms expert data scientists with hand-tuned XGBoost by 16 percentage points - with no feature engineering and no training time.

RelBench benchmark (churn-relevant tasks)

| Approach | AUROC | Feature engineering | Code |
|---|---|---|---|
| LightGBM + manual features | 62.44 | 12.3 hours/task | 878 lines |
| AutoML + manual features | ~64-66 | Reduced hours/task | 878 lines |
| KumoRFM zero-shot | 76.71 | ~1 second | 0 lines |
| KumoRFM fine-tuned | 81.14 | Minutes | 0 lines |

RelBench benchmark (7 databases, 30 tasks, 103M rows). The 14+ AUROC point gap between LightGBM and KumoRFM zero-shot comes entirely from cross-table patterns the flat table never contains.

Predicting churn with PQL: one query, no pipeline

PQL (Predictive Query Language) is how you tell KumoRFM what to predict. Instead of building a feature pipeline, writing joins, and training a model, you write a query that looks like SQL but defines a prediction target:

PQL Query

```
PREDICT churn_90d
FOR EACH customers.customer_id
WHERE customers.segment = 'enterprise'
AND customers.tenure_months > 3
```

This single query replaces the entire traditional pipeline: the SQL joins across 4-6 tables, the feature engineering code, the model selection, and the training loop. KumoRFM reads raw customers, orders, products, support_tickets, and usage_logs tables to generate predictions.

Output

| customer_id | churn_probability | confidence | top_signal |
|---|---|---|---|
| C-9201 | 0.89 | high | Support escalation + declining order frequency |
| C-9202 | 0.14 | high | Stable multi-product engagement |
| C-9203 | 0.72 | medium | Peer accounts on same plan churning |
| C-9204 | 0.05 | high | Expanding usage across 3 product lines |
| C-9205 | 0.91 | high | Payment failure + zero logins in 30 days |

When to use each algorithm: a decision framework

Do not pick an algorithm based on what performed best in someone else's blog post. Pick it based on your data and your team:

  • Use Logistic Regression when you need a fast baseline, full interpretability, or regulatory compliance that requires transparent coefficients. It is also the right sanity check before trying anything more complex.
  • Use Random Forest when you want a quick improvement over logistic regression without extensive hyperparameter tuning. Good for prototyping and for teams without deep ML experience.
  • Use XGBoost / LightGBM when your data is in a single flat table with well-engineered features and you want maximum accuracy on that table. This is the workhorse of production churn models on flat data.
  • Use Deep Learning (MLP/LSTM) only if you have sequential event data (clickstreams, session logs) and enough volume to justify the complexity. On flat churn tables, it rarely beats XGBoost.
  • Use KumoRFM when your data spans multiple relational tables, when you do not have weeks for feature engineering, or when you need to break past the 70% accuracy ceiling. It is the only approach that reads raw relational structure without manual joins.

Frequently asked questions

Which algorithm should I use to predict customer churn?

It depends on your data structure. If your customer data fits in a single flat table, XGBoost or LightGBM will give you the best accuracy-to-effort ratio - they consistently outperform logistic regression and random forest on tabular churn data. But if your data spans multiple tables (customers, orders, products, support tickets, usage logs), a graph neural network approach like KumoRFM will significantly outperform XGBoost because it reads cross-table patterns that flat-table models never see. On the SAP SALT benchmark, KumoRFM scored 91% vs 75% for expert-tuned XGBoost.

How do I build a churn prediction model on our customer data?

The traditional approach has 5 steps: (1) collect and join your customer data into a single table, (2) engineer features like recency, frequency, and monetary value, (3) split into train/test sets with temporal awareness, (4) train an XGBoost or LightGBM model, (5) evaluate with precision-recall and deploy. This works but plateaus around 65-70% accuracy because flattening your data loses cross-table signals. With KumoRFM, you skip steps 1-4 entirely: connect your data warehouse, write a one-line PQL query, and get predictions in seconds against your raw relational tables.

Why does my churn model plateau at 65-70% accuracy?

Most churn models plateau because they are trained on a single flat table that compresses rich customer behavior into aggregate numbers. A column like order_count=23 or avg_support_time=4.2min throws away the sequence, timing, and cross-entity relationships that actually predict churn. The customer who placed 5 orders last month but zero this month looks the same as the customer who places 2 orders every month - both have order_count=23 over a year. Breaking past 70% requires features from multiple hops across your relational data, which flat-table algorithms cannot access.
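
To make the ambiguity concrete, here is a toy example of two customers whose annual totals are identical while their trajectories are opposites:

```python
import pandas as pd

# Monthly order counts for two toy customers over one year
monthly_orders = pd.DataFrame({
    "steady":    [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1],   # 23 orders, flat
    "declining": [5, 5, 4, 3, 2, 2, 1, 1, 0, 0, 0, 0],   # 23 orders, collapsing
})

totals = monthly_orders.sum()          # the flat-table feature: order_count=23
recent = monthly_orders.tail(3).sum()  # the last-quarter signal the count hides

print(totals.to_dict())   # both customers show 23
print(recent.to_dict())   # steady: 5, declining: 0
```

A model that only sees `order_count=23` scores these two customers identically; the churn signal lives entirely in the shape of the series.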

Is XGBoost good enough for churn prediction?

XGBoost is the right choice if two conditions are true: your data fits in one well-engineered table, and you have a data scientist who can build strong features. On flat tabular data, XGBoost and LightGBM are hard to beat. But for most real-world churn problems, the data that predicts churn lives across 4-6 tables, and the feature engineering to flatten it takes weeks. If you are spending more time on feature engineering than on modeling, the bottleneck is not your algorithm - it is your data representation.

How much data do I need to build a churn prediction model?

For a traditional XGBoost model, you typically need at least 5,000-10,000 customer records with labeled churn outcomes and 6-12 months of behavioral history to capture seasonal patterns. For KumoRFM, the data requirements are lower because the foundation model transfers knowledge from pre-training on thousands of relational datasets. It can produce useful zero-shot predictions even on smaller datasets, though accuracy improves with more data and more connected tables.

What features matter most for churn prediction?

The most predictive churn features fall into four categories: (1) engagement decline - dropping login frequency, fewer orders, shorter sessions, (2) support friction - increasing ticket volume, escalations, unresolved issues, (3) payment signals - failed charges, downgrade requests, billing disputes, and (4) peer behavior - when customers who bought similar products or share an account manager also churn. Categories 1-3 can be captured in a flat table with effort. Category 4 requires multi-table graph patterns that only relational models can discover automatically.

How is KumoRFM different from building a churn model in Python?

A Python churn model (scikit-learn, XGBoost, etc.) requires you to write SQL joins, compute features, handle temporal leakage, train the model, and maintain the pipeline. This takes 2-6 weeks per model and 878 lines of feature engineering code on average, according to the Stanford RelBench study. KumoRFM replaces that entire pipeline with a one-line PQL query. It reads your raw relational tables, discovers cross-table features automatically, and returns predictions in seconds - with 91% accuracy on enterprise benchmarks vs 75% for expert-built XGBoost models.

Can I use deep learning for churn prediction?

Standard deep learning (feed-forward networks, LSTMs) on flat churn tables rarely outperforms XGBoost. Neural networks need large datasets to shine, and most churn datasets are modest in size. The exception is graph neural networks, which operate on the relational structure of your data rather than a flat table. GNNs can capture multi-hop patterns (customer -> orders -> products -> other customers' behavior) that no flat-table algorithm can see. KumoRFM is a foundation model built on graph neural network architecture, which is why it reaches 91% on enterprise churn benchmarks.

See it in action

KumoRFM delivers predictions on relational data in seconds. No feature engineering, no ML pipelines. Try it free.