A new category of foundation models is emerging for structured data. TabPFN (from PriorLabs, published in Nature) and Nexus (from Fundamental, a company valued at $1.2B with $255M in funding) call themselves tabular foundation models. KumoRFM (from Kumo.ai) calls itself a relational foundation model.
The naming difference is not marketing. It describes a deep architectural divide: tabular models operate on a single flat table, while relational models operate on multiple connected tables. This distinction determines what these models can and cannot learn from enterprise data.
Enterprise databases are not flat. A typical CRM has leads, contacts, activities, opportunities, and accounts linked by foreign keys. A banking system has customers, accounts, transactions, merchants, and products across dozens of tables. When you flatten this structure into a single table for a tabular model, you lose the multi-hop relationships that drive prediction accuracy.
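A minimal sketch of what flattening does, using a hypothetical two-table CRM slice in stdlib sqlite3 (table and column names are illustrative, not from any real schema). The activities table records an ordered sequence of touchpoints; the flattened row keeps only a count.

```python
import sqlite3

# Hypothetical CRM slice: leads linked to activities by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (lead_id TEXT PRIMARY KEY, company_size INTEGER);
CREATE TABLE activities (
    activity_id INTEGER PRIMARY KEY,
    lead_id TEXT REFERENCES leads(lead_id),
    page TEXT,
    ts TEXT
);
INSERT INTO leads VALUES ('L-302', 200);
INSERT INTO activities (lead_id, page, ts) VALUES
    ('L-302', 'blog',       '2024-01-01'),
    ('L-302', 'case-study', '2024-01-08'),
    ('L-302', 'api-docs',   '2024-01-15'),
    ('L-302', 'demo',       '2024-01-22');
""")

# Flattening: one row per lead, the activity history collapsed to a count.
row = conn.execute("""
    SELECT l.lead_id, l.company_size, COUNT(a.activity_id) AS pages_viewed
    FROM leads l LEFT JOIN activities a ON a.lead_id = l.lead_id
    GROUP BY l.lead_id
""").fetchone()
print(row)  # ('L-302', 200, 4) -- the blog -> demo sequence is gone
```

The aggregate preserves "4 activities" but destroys the awareness-to-purchase ordering of those activities, which is exactly the kind of signal the rest of this article argues matters.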
foundation_model_comparison
| capability | Relational FM (KumoRFM) | Tabular FM (TabPFN / Nexus) | AutoML (DataRobot / H2O) | Manual ML (XGBoost / LightGBM) |
|---|---|---|---|---|
| Input format | Multiple relational tables | Single flat table | Single flat table | Single flat table |
| Feature engineering required | None | Flattening required | Full manual pipeline | Full manual pipeline |
| Multi-hop pattern discovery | Native (graph transformer) | Not possible | Not possible | Manual joins only |
| Temporal sequence handling | Native (temporal graph) | Static snapshot only | Static snapshot only | Manual windowing |
| Max data scale | 103M+ rows across 51 tables | ~50K samples (TabPFN) | Varies by platform | Unlimited (manual) |
| Zero-shot prediction | Yes (~1 second) | Yes (~2.8 seconds for TabPFN) | No | No |
| Pre-training data | 10,000s of heterogeneous databases | Synthetic + single-table datasets | N/A | N/A |
| AUROC on RelBench (avg) | 76.71 (zero-shot), 81.14 (fine-tuned) | N/A (single-table only) | ~64-66 (estimated) | 62.44 (LightGBM) |
The key differentiators are input format, multi-hop pattern discovery, and temporal handling. These are architectural differences, not tuning differences.
What flattening actually destroys
The core argument for relational foundation models is that flattening relational data into a single table loses predictive signal. This is not theoretical. Here is a concrete example.
Consider lead scoring in a CRM database with four tables: leads, contacts, activities, and opportunities. Lead L-302 is a real candidate for conversion.
what_tabular_models_receive (flattened row for L-302)
| lead_id | emails_opened | pages_viewed | days_since_signup | company_size |
|---|---|---|---|---|
| L-302 | 4 | 22 | 30 | 200 |
After flattening to a single row, L-302 looks like a mediocre lead: few emails opened, moderate page views, small company. A tabular foundation model or XGBoost sees only this.
what_relational_models_read (raw multi-table signals for L-302)
| table | data_for_L-302 | signal_invisible_after_flattening |
|---|---|---|
| contacts | 4 contacts from 3 departments active | Multi-threaded buying committee (3 departments engaged) |
| activities | Blog → Case study → API docs → Demo (in sequence) | Buying-stage content progression (awareness → evaluation → technical → purchase) |
| opportunities | Similar account closed $210K last quarter | Account similarity to past closed-won deals |
| accounts | Company raised Series B 30 days ago | Firmographic momentum (fresh funding = budget available) |
The relational model reads all four tables directly. L-302 has a multi-threaded buying committee, a textbook content progression, account similarity to a $210K closed deal, and fresh Series B funding. None of these signals survive flattening into emails_opened=4, pages_viewed=22.
The feature space explosion
The information loss from flattening is not a matter of laziness. It is a matter of combinatorial scale. For a database with 5 tables and 50 columns:
- 1,200 first-order features: column-aggregation combinations (sum, mean, count, max, min across time windows for each numeric column, per join path)
- 719,400 pairwise interactions: combinations of first-order features (ratios, products, differences)
- ~8,000 multi-hop features: patterns that span 3+ table joins (customer → orders → products → return rates)
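The first two counts are consistent with a simple combinatorial decomposition. The breakdown below is an assumption (the article states only the totals), but it shows where numbers of this magnitude come from:

```python
from math import comb

# Assumed breakdown; the source gives only the totals (1,200 and 719,400).
numeric_columns = 48   # of the 50 columns across 5 tables, say 48 are numeric
aggregations = 5       # sum, mean, count, max, min
time_windows = 5       # e.g. 7d, 30d, 90d, 180d, all-time

first_order = numeric_columns * aggregations * time_windows
pairwise = comb(first_order, 2)  # unordered pairs of first-order features

print(first_order)  # 1200
print(pairwise)     # 719400
```

The pairwise count grows quadratically in the first-order count, which is why no manual process ever explores it.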
A Stanford study found that human data scientists explore only 4-17% of the first-order feature space. They never touch the pairwise or multi-hop spaces because the combinatorics are manually intractable. On average, feature engineering takes 12.3 hours and 878 lines of code per task.
Tabular foundation models skip this problem entirely - not by solving it, but by requiring someone else to solve it first. They receive a pre-flattened table and operate within whatever feature space a human decided to create. Relational foundation models search the full feature space automatically because they read the raw relational structure.
Tabular Foundation Models (TabPFN, Nexus)
- Operate on a single flat table
- Require pre-flattened input (someone must join and aggregate)
- Cannot discover multi-hop relationships across tables
- Cannot preserve temporal event sequences
- Limited to ~50K samples (TabPFN) or single-table scale
- Search only the feature space that humans pre-built
Relational Foundation Model (KumoRFM)
- Operate on multiple connected tables natively
- Read raw relational databases with foreign keys directly
- Discover multi-hop patterns (customer → orders → products → returns)
- Preserve temporal dynamics as first-class graph structure
- Tested on 103M+ rows across 51 tables (RelBench)
- Search the full combinatorial feature space automatically
Benchmark results: RelBench
RelBench is the standard benchmark for relational prediction tasks: 7 databases, 30 tasks, 103 million+ rows across 51 tables. It tests whether models can extract predictive signal from multi-table relational data.
AUROC (Area Under the Receiver Operating Characteristic curve) measures how well a model distinguishes between positive and negative outcomes. An AUROC of 50 means random guessing, 100 means perfect prediction. Moving from 65 to 77 AUROC means the model correctly ranks a true positive above a true negative 77% of the time instead of 65%.
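The ranking interpretation above can be computed directly: AUROC is the fraction of (positive, negative) score pairs where the positive is ranked higher, with ties counted as half. A self-contained sketch with toy scores (illustrative, not from the benchmark):

```python
from itertools import product

def auroc(pos_scores, neg_scores):
    """AUROC on a 0-100 scale: probability that a random positive is
    scored above a random negative (ties count as half a win)."""
    pairs = list(product(pos_scores, neg_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return 100 * wins / len(pairs)

# Toy model scores for 4 true positives and 5 true negatives:
positives = [0.9, 0.8, 0.6, 0.4]
negatives = [0.7, 0.5, 0.3, 0.2, 0.1]

print(auroc(positives, negatives))  # 85.0
```

A model that assigns every example the same score lands at exactly 50, the random-guessing baseline the paragraph describes.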
relbench_classification_benchmarks (avg across 12 tasks)
| approach | AUROC | human_hours_per_task | notes |
|---|---|---|---|
| LightGBM + manual features | 62.44 | 12.3 | 878 lines of feature code per task |
| LLM baseline (Llama 3.2 3B) | 68.06 | ~0.5 | Prompt-based, no relational reasoning |
| KumoRFM zero-shot | 76.71 | ~0.001 | ~1 second, no feature engineering |
| KumoRFM fine-tuned | 81.14 | ~0.1 | 10-30% improvement over zero-shot |
The headline result: KumoRFM zero-shot outperforms manually engineered features by more than 14 AUROC points while requiring roughly 1 second instead of 12.3 hours. Fine-tuning adds another 4.4 points.
sap_salt_benchmark
| approach | accuracy | context |
|---|---|---|
| LLM + AutoML | 63% | Language model with automated model selection |
| PhD + XGBoost | 75% | Domain expert with manual feature engineering |
| KumoRFM | 91% | Zero-shot relational foundation model |
On the SAP SALT enterprise benchmark, KumoRFM outperforms both automated and expert-manual approaches by 16-28 percentage points.
The gap between LightGBM with manual features (62.44) and KumoRFM zero-shot (76.71) is not about model architecture. LightGBM is a strong algorithm. The gap exists because KumoRFM sees the full relational structure - the multi-hop patterns, the temporal sequences, the graph topology - while LightGBM sees only the features that a human decided to build from 4-17% of the feature space.
How tabular foundation models work
To understand the architectural divide, it helps to know what tabular foundation models actually do.
TabPFN (PriorLabs)
TabPFN, published in Nature, is a Prior-Data Fitted Network: it is pre-trained on millions of synthetic datasets that mimic the statistical properties of real-world tabular data. At inference time, it takes a single flat table (up to ~50K samples), treats the training rows as in-context examples, and predicts the target column in approximately 2.8 seconds. It performs well on small, single-table classification tasks.
Nexus (Fundamental)
Nexus, developed by Fundamental ($255M in funding, $1.2B valuation), calls itself a “Large Tabular Model.” Like TabPFN, it operates on a single flat table. It is pre-trained on large collections of real-world tabular datasets and uses in-context learning to generate predictions without task-specific training.
The shared limitation
Both TabPFN and Nexus assume their input is a single table with one row per entity and one column per feature. They cannot read a relational database with foreign key relationships. They cannot discover that a customer’s churn risk depends on the return rates of products they bought from merchants in a specific category. That pattern spans four tables and three join hops - it does not exist in any single table.
How the relational foundation model works
KumoRFM takes a structurally different approach. It represents your database as a temporal heterogeneous graph: each row in each table becomes a node, each foreign key relationship becomes an edge, and timestamps are preserved as temporal attributes.
A graph transformer processes this structure by passing messages along edges (foreign key relationships), learning which cross-table patterns are predictive. Multi-hop patterns propagate naturally through the graph layer by layer. KumoRFM is pre-trained on tens of thousands of heterogeneous databases, so it has already learned the universal relational patterns: recency effects, frequency dynamics, temporal decay, graph topology signals.
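A minimal sketch of the first step, building the graph from tables, using hypothetical rows (the actual KumoRFM encoder is a pre-trained graph transformer and is not shown; this only illustrates how foreign keys become edges and how multi-hop neighborhoods arise):

```python
from collections import defaultdict

# Hypothetical rows; each (table, primary_key) pair becomes a node.
orders = [
    {"order_id": "O1", "customer_id": "C1", "product_id": "P1", "ts": "2024-03-01"},
    {"order_id": "O2", "customer_id": "C1", "product_id": "P2", "ts": "2024-03-09"},
]

edges = defaultdict(set)  # undirected adjacency over (table, key) nodes

def link(a, b):
    edges[a].add(b)
    edges[b].add(a)

# Foreign keys become edges; timestamps stay on order nodes as attributes.
for o in orders:
    link(("orders", o["order_id"]), ("customers", o["customer_id"]))
    link(("orders", o["order_id"]), ("products", o["product_id"]))

def k_hop(node, k):
    """All nodes reachable within k hops. Message passing propagates
    information over these same neighborhoods, one layer per hop."""
    frontier, seen = {node}, {node}
    for _ in range(k):
        frontier = {n for f in frontier for n in edges[f]} - seen
        seen |= frontier
    return seen - {node}

# Two hops from customer C1 reach the products it bought -- a pattern
# that exists in no single flat table.
print(sorted(k_hop(("customers", "C1"), 2)))
```

Each additional transformer layer extends the reachable neighborhood by one hop, which is how customer → orders → products → returns patterns propagate without any manual joins.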
PQL Query
PREDICT conversion FOR EACH leads.lead_id WHERE leads.status = 'open'
One Predictive Query Language (PQL) statement replaces the entire pipeline: data extraction, table joins, feature engineering, model selection, and training. The relational foundation model reads raw CRM tables - leads, contacts, activities, opportunities - and generates predictions in seconds.
Output
| lead_id | conversion_prob | vs_flat_table_model | signal_source |
|---|---|---|---|
| L-301 | 0.42 | 0.39 (similar) | Single-table signals sufficient |
| L-302 | 0.89 | 0.34 (flat model misses) | Multi-threaded buying committee + content progression |
| L-303 | 0.12 | 0.18 (flat model overestimates) | CTO title inflates flat score, but no activity signal |
| L-304 | 0.76 | 0.41 (flat model misses) | Account similarity to $210K closed deal |
KumoRFM 2.0: tabular and relational
An important development: KumoRFM 2.0 supports both tabular (single-table) and relational (multi-table) data. This means it is not limited to relational problems. For single-table classification tasks, it performs competitively with tabular foundation models. For multi-table problems, it dramatically outperforms them.
This matters because real enterprise ML involves both types of problems. Some tasks genuinely are single-table (a clean CSV export, a feature store snapshot). Most tasks involve relational data. With KumoRFM 2.0, you do not need separate tools for separate problem types. One model handles both, and it automatically takes advantage of relational structure when it exists.
cost_and_time_comparison
| dimension | Manual ML (LightGBM) | Tabular FM (TabPFN / Nexus) | Relational FM (KumoRFM) |
|---|---|---|---|
| Feature engineering time | 12.3 hours (878 lines of code) | 12.3 hours (still need to flatten) | 0 hours (reads tables directly) |
| Model training time | 1-4 hours | ~2.8 seconds (TabPFN) | ~1 second (zero-shot) |
| Multi-table signal captured | 4-17% of feature space | Only what humans pre-built | Full relational feature space |
| Cost at 20 tasks (annual) | $650K-$900K | $500K-$750K (saves model tuning, not features) | $80K-$120K |
| Time to new prediction task | 2-4 weeks | 1-3 weeks (still need feature pipeline) | Minutes |
| Data scientist headcount | 3-4 FTEs | 2-3 FTEs (still need feature engineers) | 0.5 FTE |
The decisive rows are feature engineering time and annual cost. Tabular foundation models save time on model training but not on the 12.3-hour feature engineering bottleneck; the cost savings of relational FMs come from eliminating that bottleneck entirely.
When tabular foundation models make sense
Tabular foundation models are not useless. They solve a real problem for a specific subset of ML tasks:
- Genuinely single-table problems. If your data is already in one clean table with no relational context needed (a Kaggle dataset, a pre-built feature store export), tabular FMs provide fast, competitive predictions without model training.
- Small datasets. TabPFN excels on small datasets (under 50K samples) where traditional models struggle. Its prior-data fitting approach is particularly effective when data is scarce.
- Rapid prototyping on flat data. For quick directional answers on pre-aggregated data, tabular FMs give you a prediction in seconds.
The limitation is that most enterprise prediction tasks are not single-table problems. Enterprise data lives in relational databases with 5-50 connected tables. The moment your task requires signals from more than one table, a tabular foundation model requires someone to build the flattening pipeline - and you are back to the 12.3-hour, 878-line feature engineering bottleneck.
The architectural divide
The difference between tabular and relational foundation models is not incremental. It is architectural. Tabular models ask: “Given this flat table, what is the best prediction?” Relational models ask: “Given this database, what are the best predictions?”
The first question assumes someone has already flattened the relational data and selected the features. The second question starts from raw tables with foreign keys. The first question searches within a human-defined feature space. The second searches the full combinatorial space of 1,200+ first-order features, 719,400 pairwise interactions, and 8,000+ multi-hop patterns.
Tabular foundation models are a meaningful advance over manual model tuning for single-table problems. But they do not address the core bottleneck of enterprise ML: converting relational data into features. Relational foundation models eliminate that bottleneck by reading the relational structure directly.
KumoRFM was built by the team behind the ML systems at Pinterest, Airbnb, and LinkedIn: Vanja Josifovski (CEO, former CTO at Airbnb and Pinterest), Jure Leskovec (Chief Scientist, Stanford professor, co-creator of GraphSAGE), and Hema Raghavan (Head of Engineering, former Sr. Director at LinkedIn). Backed by Sequoia Capital.