Open any ML textbook, any online course, any Kaggle competition. The data arrives as a single CSV file. One row per sample, one column per feature, one target variable. The model trains on this table and produces predictions. Clean, simple, well-understood.
Now look at where enterprise data actually lives. A PostgreSQL database with 15 tables. A Snowflake warehouse with 40 tables across 6 schemas. A data lake with hundreds of Parquet files organized by domain. The data that predicts customer churn, credit default, or next purchase is spread across all of these tables, connected by foreign keys and temporal relationships.
The standard approach is to flatten this relational structure into a single table through feature engineering: write SQL joins, compute aggregations, create derived features. This produces the familiar CSV that models expect. And it destroys information in the process.
The question is: how much information is destroyed, and does it matter?
The three types of information destruction
1. Multi-hop relationships
A customer's churn risk depends not just on their own behavior but on the behavior of the products they bought, the other customers who bought those products, and the churn rates of those customers. That is a 4-hop path through the relational graph: customer → orders → products → orders (other customers) → customer outcomes.
No data scientist writes this feature. It is not that the SQL is hard. It is that no one thinks to look for it. The feature space of possible multi-hop aggregations is combinatorially large, and humans explore a tiny fraction. A typical feature engineering effort covers 1-hop relationships (direct aggregates from immediately joined tables) and occasionally 2-hop relationships. Patterns at 3-4 hops are systematically invisible.
On the RelBench benchmark, the tasks where relational models most dramatically outperform flat-table models are those where multi-hop patterns carry significant signal. The Amazon product recommendation task, which involves a customer-review-product graph with 3-hop patterns, shows a 15+ point AUROC gap between flat and relational approaches.
2. Temporal sequences
When you aggregate a customer's transaction history into "total orders in last 30 days," you destroy the sequence. Consider two SaaS customers who both logged 20 sessions in the past month:
sessions: User A (disengaging)
| session_id | date | duration_min | features_used |
|---|---|---|---|
| S-101 | Mar 1 | 42 | 5 |
| S-102 | Mar 2 | 38 | 4 |
| S-103 | Mar 3 | 35 | 4 |
| S-104 | Mar 4 | 28 | 3 |
| S-105 | Mar 5 | 22 | 2 |
20 sessions crammed into week 1 (the table shows the first 5), then nothing for 3 weeks. Duration and feature usage decline with each session. This user is abandoning the product.
sessions: User B (deepening)
| session_id | date | duration_min | features_used |
|---|---|---|---|
| S-201 | Mar 1 | 15 | 2 |
| S-202 | Mar 8 | 22 | 3 |
| S-203 | Mar 15 | 31 | 4 |
| S-204 | Mar 22 | 40 | 5 |
| S-205 | Mar 29 | 48 | 7 |
Sessions spread evenly across all 4 weeks (one representative session per week shown), with a steady cadence and increasing duration and feature adoption. This user is deepening engagement.
flat_feature_table (what the model sees)
| user | sessions_30d | avg_duration | avg_features_used | reality |
|---|---|---|---|---|
| User A | 20 | 33 min | 3.6 | Disengaging (churn in 2 weeks) |
| User B | 20 | 31 min | 4.2 | Deepening (expansion candidate) |
Both users show 20 sessions and similar averages. User A crammed 20 declining sessions into week 1, then disappeared. User B steadily increased engagement over 4 weeks. The flat table erased the trajectory.
These temporal patterns carry strong predictive signal. Accelerating purchase frequency predicts expansion. Decelerating frequency predicts churn. Category migration predicts lifetime value growth. Payment timing drift predicts credit default. All of these patterns exist in the raw transaction data. All of them are erased by aggregation.
The flat-table workaround is to create more granular time windows: instead of "orders last 30 days," compute "orders in days 1-7," "orders in days 8-14," "orders in days 15-21," "orders in days 22-30." This captures some temporal structure but at the cost of feature explosion. Four time windows for each of 10 aggregations across 5 tables produces 200 features. And the sequence within each window is still lost.
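A tiny sketch makes the erasure measurable. Using the session durations from the two tables above, the flat aggregate (the mean) is nearly identical for both users, while a least-squares slope against session index, which is exactly the kind of trend aggregation throws away, points in opposite directions:

```python
# Durations from the two session tables above.
user_a = [42, 38, 35, 28, 22]   # User A: declining
user_b = [15, 22, 31, 40, 48]   # User B: increasing

def mean(xs):
    return sum(xs) / len(xs)

def slope(xs):
    """Ordinary least-squares slope of xs against session index."""
    t = list(range(len(xs)))
    tbar, xbar = mean(t), mean(xs)
    num = sum((ti - tbar) * (xi - xbar) for ti, xi in zip(t, xs))
    den = sum((ti - tbar) ** 2 for ti in t)
    return num / den

# Flat aggregates are nearly identical...
print(mean(user_a), mean(user_b))    # 33.0 vs 31.2
# ...but the trend points in opposite directions: -5.0 vs ~8.4 min/session.
print(slope(user_a), slope(user_b))
```

A slope feature like this is trivial to compute once someone thinks of it; the relational approach matters because the model sees the raw sequence and can learn such trends, and subtler ones, without anyone enumerating them in advance.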
3. Graph topology
The structure of connections around an entity carries information that cannot be captured in a flat row. Consider two sellers on a marketplace platform, both with identical flat-table metrics:
transactions: Seller X (embedded in community)
| txn_id | seller | buyer | amount | repeat_buyer |
|---|---|---|---|---|
| T-401 | Seller X | Buyer A | $85 | Yes (3rd purchase) |
| T-402 | Seller X | Buyer B | $120 | Yes (2nd purchase) |
| T-403 | Seller X | Buyer C | $65 | Yes (5th purchase) |
| T-404 | Seller X | Buyer D | $90 | No |
Seller X has repeat buyers. Buyers A, B, and C also buy from Seller Y and Seller Z, forming a tightly connected community of trusted sellers.
transactions: Seller W (isolated)
| txn_id | seller | buyer | amount | repeat_buyer |
|---|---|---|---|---|
| T-501 | Seller W | Buyer E | $95 | No |
| T-502 | Seller W | Buyer F | $110 | No |
| T-503 | Seller W | Buyer G | $70 | No |
| T-504 | Seller W | Buyer H | $85 | No |
Seller W has no repeat buyers. None of Seller W's buyers buy from any other seller on the platform. No community embedding.
flat_feature_table (what the model sees)
| seller | txn_count | total_revenue | avg_order | unique_buyers | reality |
|---|---|---|---|---|---|
| Seller X | 4 | $360 | $90 | 4 | Trusted, community-embedded, high LTV |
| Seller W | 4 | $360 | $90 | 4 | Isolated, no repeat buyers, high churn risk |
Identical flat features: same count, same revenue, same average, same buyer count. The graph reveals Seller X is embedded in a community of repeat buyers while Seller W's buyers are one-time, isolated transactions.
Graph topology goes deeper than degree counts. The clustering coefficient (do the sellers a buyer purchases from also share other buyers?) indicates whether an entity is embedded in a community or operating in isolation. The path length to high-value nodes (how many hops to reach a VIP customer through shared product connections?) indicates growth potential. These are structural properties of the graph that flat features cannot represent.
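As a sketch of how such a structural feature is computed, the snippet below builds a one-mode seller projection (two sellers are linked if they share a buyer) from hypothetical purchase sets modeled on the example above, then computes each seller's clustering coefficient with plain Python:

```python
# Seller -> buyers, modeled on the transaction tables above (hypothetical IDs).
purchases = {
    "X": {"A", "B", "C", "D"},
    "Y": {"A", "B"},
    "Z": {"B", "C"},
    "W": {"E", "F", "G", "H"},
}

# One-mode projection: two sellers are linked if they share at least one buyer.
sellers = list(purchases)
adj = {s: set() for s in sellers}
for i, s in enumerate(sellers):
    for t in sellers[i + 1:]:
        if purchases[s] & purchases[t]:
            adj[s].add(t)
            adj[t].add(s)

def clustering(adj, node):
    """Fraction of the node's neighbor pairs that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        nbrs[j] in adj[nbrs[i]]
        for i in range(k) for j in range(i + 1, k)
    )
    return 2 * links / (k * (k - 1))

print(clustering(adj, "X"))  # 1.0: X's co-sellers also share buyers with each other
print(clustering(adj, "W"))  # 0.0: W shares no buyers with anyone
```

Seller X and Seller W have identical flat rows, but the clustering coefficient separates them completely: 1.0 (fully embedded) versus 0.0 (isolated).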
Single-table ML
- One row per entity, manually engineered features
- 1-2 hop relationships captured at best
- Temporal sequences aggregated into counts and averages
- Graph topology reduced to simple degree counts
- 62.44 AUROC on RelBench benchmark
Relational ML
- Full multi-table structure preserved as a graph
- 3-4 hop patterns discovered automatically via message passing
- Temporal sequences processed in raw form
- Full graph topology captured: clustering, path lengths, community
- 75.83-81.14 AUROC on RelBench benchmark
raw relational data — customers table
| customer_id | name | signup_date | segment | region |
|---|---|---|---|---|
| C-201 | Aisha Patel | 2023-08-12 | Premium | West |
| C-202 | Tom Nguyen | 2024-01-05 | Basic | East |
| C-203 | Sarah Klein | 2023-03-22 | Premium | Central |
| C-204 | Marcus Lee | 2024-06-18 | Basic | West |
flattened_feature_table — what XGBoost sees
| customer_id | orders_90d | avg_value | support_tickets | days_inactive | churned |
|---|---|---|---|---|---|
| C-201 | 8 | $72.40 | 1 | 5 | No |
| C-202 | 1 | $45.00 | 4 | 62 | Yes |
| C-203 | 12 | $110.50 | 0 | 2 | No |
| C-204 | 2 | $38.20 | 2 | 41 | ? |
After flattening: C-202 has 4 support tickets, but the model cannot see that 3 of them were 'cancellation' tickets filed in the last 2 weeks. C-204's 2 tickets were routine billing inquiries. Same count, completely different signal.
A concrete example: predicting customer churn
Consider an e-commerce database with five tables: customers, orders, products, reviews, and support tickets. You want to predict which customers will churn in the next 90 days.
The flat-table approach
A data scientist writes SQL to produce features like: total orders (last 30/60/90 days), average order value, number of distinct product categories, total returns, number of support tickets, average review score, days since last order, days since last support ticket. After 10-15 hours, the result is a table with 50-100 features per customer. LightGBM trains in minutes and produces a decent model.
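The flattening step itself is ordinary SQL. Here is a compact sketch against a hypothetical single orders table in SQLite (dates, values, and the cutoff are invented; a real pipeline would join several tables and emit dozens of such aggregates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id TEXT, order_date TEXT, value REAL);
INSERT INTO orders VALUES
  ('C-201', '2025-01-10', 80.0),
  ('C-201', '2025-02-20', 64.8),
  ('C-202', '2024-12-01', 45.0);
""")

# The flattening step: one row per customer, history collapsed to aggregates.
flat = conn.execute("""
SELECT customer_id,
       COUNT(*)             AS orders_90d,
       ROUND(AVG(value), 2) AS avg_value,
       MAX(order_date)      AS last_order_date
FROM orders
WHERE order_date >= '2024-12-01'   -- hypothetical 90-day cutoff
GROUP BY customer_id
ORDER BY customer_id
""").fetchall()
print(flat)
```

Every `GROUP BY` in this query is a point of information loss: the count survives, but the dates, the ordering, and the gaps between orders do not.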
What the flat model misses
The support-then-purchase sequence. Customers who file a support ticket and then purchase within 7 days are satisfied with the resolution and unlikely to churn. Customers who file a support ticket and do not purchase within 21 days are dissatisfied and highly likely to churn. The flat model sees "1 support ticket" and "2 orders in 30 days" as separate features. The sequence and timing between them is lost.
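A minimal sketch of recovering that sequence from raw event logs (customer IDs and dates are hypothetical; the 7-day threshold is the one from the paragraph above):

```python
from datetime import date

# Hypothetical per-customer event logs: support tickets and subsequent orders.
tickets = {"C-201": [date(2025, 3, 1)], "C-202": [date(2025, 3, 1)]}
orders  = {"C-201": [date(2025, 3, 4)], "C-202": [date(2025, 4, 10)]}

def days_to_next_purchase(customer):
    """Days from the latest ticket to the first order on or after it, or None."""
    t = max(tickets.get(customer, []), default=None)
    if t is None:
        return None
    later = [o for o in orders.get(customer, []) if o >= t]
    return (min(later) - t).days if later else None

for c in ("C-201", "C-202"):
    gap = days_to_next_purchase(c)
    resolved = gap is not None and gap <= 7
    print(c, gap, "resolved-and-retained" if resolved else "churn-risk")
```

Both customers have "1 support ticket" and at least one order in their flat rows; only the gap between the two events (3 days versus 40) separates a resolved issue from a churn signal.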
Product quality propagation. A customer who purchased a product with a 2.1-star average review is at higher churn risk than a customer who purchased a product with a 4.5-star average, even if neither customer has left a review themselves. This is a 2-hop pattern (customer → orders → products → reviews) that the flat model would only capture if a data scientist explicitly computed "average review score of purchased products." Most do not.
Cohort behavior. If 30% of customers who bought the same product in the same week churned within 60 days, the remaining customers from that cohort face elevated risk. This is a graph-level pattern: customer → order → product → order (same product, same time) → customer outcome. The flat model cannot see it.
The relational approach
A relational model represents all five tables as a graph. Customer nodes connect to order nodes, which connect to product nodes, which connect to review nodes. Support ticket nodes connect to customer nodes. Timestamps create temporal ordering.
The graph neural network propagates information along all these connections. After 3-4 rounds of message passing, each customer node's representation contains: their own purchase history and support interactions (1-hop), the quality and review scores of their purchased products (2-hop), the behavior of other customers who bought the same products (3-hop), and the aggregate outcomes of those customers (4-hop).
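The mechanics can be sketched with a drastically simplified, weight-free version of message passing: each round replaces a node's feature with a mix of its own value and the mean of its neighbors' values. Real relational GNNs learn these transformations per edge type; the graph and numbers here are invented purely to show how signal travels across hops:

```python
# Minimal mean-aggregation message passing on a tiny customer-product graph.
edges = {                      # undirected neighbor lists (hypothetical graph)
    "cust1": ["prodA"],
    "cust2": ["prodA", "prodB"],
    "prodA": ["cust1", "cust2"],
    "prodB": ["cust2"],
}
h = {"cust1": 1.0, "cust2": 0.0, "prodA": 0.0, "prodB": 0.0}  # initial features

for _ in range(3):             # 3 rounds => information travels up to 3 hops
    h = {
        node: 0.5 * h[node] + 0.5 * sum(h[n] for n in nbrs) / len(nbrs)
        for node, nbrs in edges.items()
    }

# After 3 rounds, cust2's representation carries signal that originated at
# cust1, two hops away, even though the two customers are never joined directly.
print(h["cust2"])
```

The flat-table analogue of this would require a data scientist to anticipate and hand-write every such cross-entity path as a feature; message passing covers all of them in one pass.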
The model discovers which of these patterns are predictive. No human specifies features. The accuracy improvement comes from the multi-hop and temporal patterns that flattening destroys.
When single-table ML is enough
Relational ML is not always necessary. Single-table ML is sufficient when:
- The data is genuinely single-table (sensor readings from one device, survey responses, image metadata)
- The features have been pre-engineered by domain experts who captured the key cross-table patterns
- The prediction task depends primarily on entity-level attributes rather than relational context (e.g., predicting a product's weight from its description)
- The relational structure is shallow (2 tables with a simple one-to-many relationship) and 1-hop aggregates capture most of the signal
In enterprise settings, these conditions rarely hold. The databases have 5-50 tables. The predictive patterns span multiple hops. The temporal dynamics matter. And the data science team has already tried flat-table ML and hit an accuracy ceiling.
The foundation model bridge
KumoRFM is a foundation model pre-trained on relational patterns across thousands of databases. It represents the relational database as a temporal heterogeneous graph and generates predictions directly from the raw structure.
For the churn prediction example above, the workflow is:
PQL Query
PREDICT churn_90d FOR EACH customers.customer_id
One line of PQL replaces the entire flattening process. The model reads the raw relational tables and preserves multi-hop patterns, temporal sequences, and graph topology.
Output
| customer_id | churn_90d | confidence | top_signal |
|---|---|---|---|
| C-201 | 0.09 | 0.95 | Consistent engagement, single resolved ticket |
| C-202 | 0.93 | 0.92 | 3 cancellation tickets in 14 days + declining orders |
| C-203 | 0.04 | 0.97 | High frequency, premium segment, zero tickets |
| C-204 | 0.41 | 0.86 | Moderate inactivity, routine ticket history |
The model returns a churn probability for every customer, incorporating the full relational context: multi-hop patterns, temporal sequences, and graph topology. No flattening, no feature engineering, no information destruction.
The 13-19 point accuracy gap between flat and relational ML is not a theoretical curiosity. It is revenue left on the table. Every churn prediction that the flat model gets wrong is a customer who could have been retained with the right intervention. Every credit risk prediction that the flat model gets wrong is a default that could have been avoided or a creditworthy borrower who was denied. The information destroyed by flattening has a dollar value. Relational ML recovers it.