
Use case

Lead scoring powered by a relational foundation model

Which leads will convert is a million-dollar question - and most companies answer it badly. Hire an ML team, and they spend months hand-crafting features from flat tables, capturing a fraction of the signal. Try LLMs, and they tokenize your CRM data without understanding the relationships between tables. KumoRFM is the world's first foundation model for relational data. It connects directly to your data warehouse, learns from the relationships across CRM, product usage, support, and billing tables, and delivers 85%+ AUROC - with zero feature engineering, in under an hour.

Book a demo and get a free trial of the full platform: data science agent, fine-tune capabilities, and forward-deployed engineer support.


The world's first relational foundation model

Traditional ML teams spend months on feature engineering. LLMs treat your data as tokens and miss the relationships. Kumo is neither - it's a new category.

Works on raw relational data

Connects to your database - no ETL, no flat tables

Zero feature engineering

Graph transformer discovers signals across tables automatically

Continuously learning

Models retrain as your data changes

One platform, every prediction

Churn, LTV, fraud, recommendations - same architecture

Lower infrastructure cost

Replace per-use-case pipelines with a single platform

Stanford RelBench Benchmark

Superhuman accuracy - out of the box

On Stanford's RelBench benchmark - across 30 prediction tasks on 11 relational databases - KumoRFM outperforms both LLM-based approaches and expert PhD data scientists with hand-crafted features. Before any fine-tuning.

  • LLM (GPT-4 + AutoML): 63%
  • PhD Data Scientist (manual feature engineering): 75%
  • Kumo (Relational Foundation Model): 81% - Superhuman

Average AUROC across 30 prediction tasks on 11 relational databases. Source: RelBench by Stanford (NeurIPS 2024).
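For readers less familiar with the metric: AUROC is the probability that a randomly chosen positive example (say, a lead that converted) is scored above a randomly chosen negative one, with ties counted as half. A minimal pure-Python sketch of that definition follows. The function name `auroc` and the toy labels and scores are illustrative assumptions, not RelBench data or Kumo code.

```python
from itertools import product

def auroc(labels, scores):
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Toy data (made up): 1 = lead converted, 0 = lead did not convert.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
print(auroc(labels, scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the 63%, 75%, and 81% figures above sit.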

In practice

A B2B company connects their CRM, product usage, support, and billing tables to Kumo. Within hours, KumoRFM delivers lead scores that already outperform their existing model - with zero feature engineering. Then they fine-tune on their Kumo Enterprise instance, incorporating proprietary signals specific to their sales cycle, pushing accuracy even further - because every percentage point matters.

No separate pipelines. No months of iteration. No waiting.

Built by pioneers in AI

Vanja Josifovski

CEO and Co-Founder

Former CTO at Airbnb and Pinterest

Jure Leskovec

Co-Founder & Chief Scientist

Stanford Professor · Co-creator of RDL and GNN

Hema Raghavan

Co-Founder & Head of Engineering

Former Sr. Director of Engineering at LinkedIn

How Kumo is different

A foundation model for predictions - no training required

You know how ChatGPT can answer questions about language without you training it first? KumoRFM works the same way - but for predictions on your business data. Connect your database, and ask questions like:

  • Which customers will churn next month?
  • What should we recommend to user #4,821?
  • Which leads will convert this quarter?
  • What will demand look like next week?

Kumo gives you accurate answers instantly - without feature engineering, without building a separate model for each question, and without a months-long ML project. Here's why the existing approaches fall short:

LLMs (ChatGPT, etc.)
[Diagram: a purchase record (user, bought, item_42, $29.99, 2024-01) read by an LLM as flat tokens, with no relations]

Treats your data as flat tokens

  • Numbers, dates, and IDs are just text tokens - no numeric understanding
  • Cannot represent one-to-one, one-to-many, or many-to-many relationships
  • Has no concept of which relationships are stronger or weaker
  • Cannot reliably answer predictive questions about your business data
Not built for prediction
Traditional ML
[Diagram: Users, Orders, Items, and Events tables merged through feature engineering (3-6 months, large ML team) into 4 separate models for churn, recs, LTV, and fraud; each takes 3-6 months and $50K-$1M+]

Requires months of engineering per model

  • Hire a large team of ML engineers and data scientists
  • Manually merge tables into flat feature tables - thousands of possible combinations
  • Build and maintain a separate model for each prediction task
  • 3–6 months and $50K–$1M+ per use case, with no guarantee of accuracy
Slow, expensive, doesn't scale
Kumo (Recommended)
[Diagram: Users, Orders, Items, and Events tables connected to a Graph Transformer; 1 model outputs churn (0.03), recs (top 5), LTV ($12K), and fraud (safe)]

A foundation model for your relational data

  • Connect your database - Kumo automatically discovers every relationship
  • Graph transformers understand 1:1, 1:many, and many:many structures natively
  • One model delivers instant predictions across all your tasks - no feature engineering
  • Fine-tune with the enterprise platform to push accuracy even further, 20× faster
Instant, accurate, one platform

In practice: a retail company connects their customers, orders, products, and events tables to Kumo. Within hours, they get churn predictions, personalized recommendations, and demand forecasts - all from the same model, all more accurate than what their data science team built over six months. Then they fine-tune on their enterprise instance to push accuracy even further, because every percentage point matters. No separate models. No feature engineering. No waiting.

3.2x

More accurate

Than traditional ML on lead conversion

<1 hr

To production

From raw tables to deployed scores

85%+

AUROC

On real-world B2B lead conversion

0

Manual features

Discovered automatically from relational data

Trusted by innovative companies

In production today

Kumo in action

Leading companies use Kumo to drive measurable business outcomes from their existing relational data.

5.4x conversion lift

Lead-scoring models built on Kumo's relational deep learning delivered 5.4x improvement in lead conversion.

1.8% engagement lift · 30M+ users

Notification customization models fueling MAU growth across 30M+ users with graph transformers.

Read story
+10% conversion rate

Enhanced recommendation accuracy across 80M monthly predictions, driving measurable revenue growth.

UK #1 retailer

Powering personalized product recommendations for one of the UK's largest and most established supermarket chains.

Expansion revenue

Personalized solutions recommendations driving expansion revenue growth through relational data intelligence.

+15% ad clickthroughs

New recommendation model delivering measurable uplift in ad targeting and clickthrough performance.

The problem

Two approaches. Neither works.

Hire an ML team: They spend months hand-crafting features from flat tables - company size, industry, last email open. This captures a fraction of the signal. It costs $50K to $1M+ and 6–12 weeks per model. Sales reps learn to ignore the scores within weeks. And 53–88% of these models never even reach production (Gartner, IDC).

Try LLMs: They tokenize your CRM data - treating rows and columns as text. They have no concept of primary keys, foreign keys, or the relationships between contacts, accounts, and opportunities. They miss the structural signal that actually predicts conversion.

The problem is not the algorithm. It is the data representation. Neither approach understands relational data.

The approach

KumoRFM: built for relational data.

KumoRFM connects directly to your data warehouse and operates on your relational database as-is. It learns from the relationships between contacts, accounts, opportunities, product usage events, and support interactions - no feature engineering required.

The Graph Transformer attends across multiple columns, multiple tables, and multiple hops - discovering signals that no feature engineer would think to build and no LLM could infer from tokenized text.

KumoRFM already beats human data scientists out of the box - without any training on your data. Fine-tune on Kumo Enterprise to push accuracy even further. The result: 85%+ AUROC on real-world lead conversion, with interpretable attribution for every score.

Setup takes less than an hour. No ETL. No feature store. No hiring.

How it works

01

Connect your tables

Point Kumo at your CRM, product usage, marketing, support, and billing tables. Kumo infers the relational schema automatically - connecting tables via primary key and foreign key relationships.
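To make the idea of linking tables through keys concrete, here is a minimal sketch that assumes a simple naming convention: a column is treated as a foreign key when its name matches another table's primary key. This heuristic, the `SCHEMA` dictionary, and the `infer_links` helper are all hypothetical illustrations; the page does not describe how Kumo actually infers the schema.

```python
# Hypothetical schema: table name -> (primary key column, all columns).
SCHEMA = {
    "accounts": ("account_id", ["account_id", "industry", "employee_count"]),
    "contacts": ("contact_id", ["contact_id", "account_id", "job_title"]),
    "opportunities": ("opportunity_id", ["opportunity_id", "contact_id", "stage"]),
    "usage_events": ("event_id", ["event_id", "contact_id", "event_type", "timestamp"]),
}

def infer_links(schema):
    """Return (child_table, column, parent_table) for columns matching another table's primary key."""
    pk_owner = {pk: table for table, (pk, _) in schema.items()}
    links = []
    for table, (_, columns) in schema.items():
        for col in columns:
            parent = pk_owner.get(col)
            # Skip a table's own primary key; only cross-table matches are links.
            if parent is not None and parent != table:
                links.append((table, col, parent))
    return sorted(links)

for child, col, parent in infer_links(SCHEMA):
    print(f"{child}.{col} -> {parent}.{col}")
```

Each resulting link is an edge type in the graph that a relational model can traverse, for example from a usage event to its contact and on to the contact's account.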

02

Define the prediction in PQL

Predictive Query Language reads like SQL but describes future outcomes. No ML expertise required.

PREDICT COUNT(orders.*, 0, 7) > 0
FOR EACH users.user_id
IN (0, 1, 2)

“Which users will place an order in the next 7 days?”
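One way to read the query: for each listed user and a given anchor time, the target is true if that user has at least one order in the 7 days after the anchor. The sketch below computes that label on historical data in plain Python. It is an interpretation of the query's meaning, not Kumo's implementation; the window boundaries (exclusive start, inclusive end), the `will_order` helper, and the table contents are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Made-up orders table: (user_id, order timestamp).
ORDERS = [
    (0, datetime(2024, 1, 3)),
    (0, datetime(2024, 1, 20)),
    (1, datetime(2024, 1, 9)),
    (2, datetime(2023, 12, 30)),
]

def will_order(orders, user_ids, anchor, start_days=0, end_days=7):
    """Per-user label: is there any order in (anchor + start_days, anchor + end_days]?"""
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    counts = {uid: 0 for uid in user_ids}
    for uid, ts in orders:
        if uid in counts and lo < ts <= hi:
            counts[uid] += 1
    return {uid: n > 0 for uid, n in counts.items()}

labels = will_order(ORDERS, user_ids=(0, 1, 2), anchor=datetime(2024, 1, 1))
print(labels)
```

Sliding the anchor across past dates yields many labeled examples from the same tables, which is why a single declarative query can stand in for a hand-built training set.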

03

Deploy with explanations

Kumo trains a relational deep learning model, evaluates accuracy, and deploys to production. Scores update continuously. Every prediction includes human-readable explanations rooted in your actual data.

Backed by peer-reviewed research

What makes Kumo different

Not incremental tuning. A fundamentally different data representation - validated across 40+ peer-reviewed papers at NeurIPS, ICML, and KDD by the team behind PyTorch Geometric.

NeurIPS 2024 · ICML 2024 · KDD 2024 · 40+ papers
[Diagram: 1-hop, 2-hop, and 3-hop neighbors around a lead]
NeurIPS 2024 · Relational Deep Learning
3+ hops of relational depth per prediction

Multi-hop graph attention captures signals no feature table can

The Graph Transformer attends across multiple tables and multiple hops simultaneously - learning from a contact's account history, their colleagues' product usage, support ticket patterns, and billing events. Fey, Hu, Huang & Leskovec showed this produces fundamentally more accurate predictions than any flat-table approach.

Read the paper
[Diagram: CRM, usage, support, and billing tables feeding auto-learned features f1-f4]
NeurIPS 2024 · Relational Deep Learning
0 manual features required

Features are discovered automatically from your relational data

Traditional scoring requires hand-crafting 50-100 features from joined tables - 80% of project time and $50K-$1M+ per model (McKinsey). The RDL paper proved that learning directly from relational structure eliminates this. Kumo discovers filters, correlations, and aggregations no engineer would think to build.

Read the paper
[Diagram: KumoRFM pre-trained on retail, finance, health, and SaaS data, then applied zero-shot (no training), few-shot (minutes), or fine-tuned (hours)]
ICML 2024 · KumoRFM
Zero-shot predictions on unseen schemas

A pre-trained foundation model that transfers across schemas

KumoRFM is pre-trained on diverse relational databases - retail, finance, healthcare, SaaS. Just as GPT transfers language understanding to new tasks, KumoRFM transfers relational understanding to your data. Huang, Fey & Leskovec showed this enables accurate predictions even with limited historical data.

Read the paper
[Chart: average AUROC across RelBench (11 databases, 30 tasks): LLM 63%, PhD + features 75%, Kumo 81% (best)]
NeurIPS 2024 - Datasets Track · RelBench
81% AUROC vs 75% PhD vs 63% LLM

Proven on 30 tasks across 11 databases - outperforms PhD experts

Stanford's RelBench benchmark tested predictions across 11 real-world relational databases and 30 tasks. Kumo achieved 81% average AUROC - compared to 75% from PhD data scientists with hand-crafted features and 63% from LLM-based approaches.

Read the paper

What this means in practice: a contact who works at an account where three colleagues already use your product, who opened a pricing page after a support ticket was resolved, scores differently than a cold lead with the same job title. Traditional models cannot see this. Kumo can.
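The first half of that example is a multi-hop path: contact → account → other contacts → product usage. A small sketch of walking that path over toy tables is below. The table contents and the `colleagues_using_product` helper are hypothetical; the code only illustrates what the path contains, whereas Kumo is described above as learning such signals rather than having them written by hand.

```python
from collections import defaultdict

# Toy tables (made up). contacts: contact_id -> account_id.
CONTACTS = {"c1": "acme", "c2": "acme", "c3": "acme", "c4": "acme", "c5": "globex"}
# usage_events: one entry per product usage event, keyed by contact_id.
USAGE_EVENTS = ["c2", "c3", "c3", "c4"]

def colleagues_using_product(contact_id, contacts, usage_events):
    """Count distinct other contacts at the same account who have usage events."""
    by_account = defaultdict(set)
    for cid, account in contacts.items():
        by_account[account].add(cid)  # hop 1: contact -> account
    # hop 2: account -> the contact's colleagues
    colleagues = by_account[contacts[contact_id]] - {contact_id}
    # hop 3: colleagues -> usage events
    active = set(usage_events)
    return len(colleagues & active)

print(colleagues_using_product("c1", CONTACTS, USAGE_EVENTS))
print(colleagues_using_product("c5", CONTACTS, USAGE_EVENTS))
```

Two contacts with identical job titles can differ sharply on a signal like this, and a flat table only captures it if someone thought to build the join in advance.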

Technical walkthrough

From connecting Snowflake tables to deployed, explainable lead scores - in a single session.

Proven at scale

Lead scoring is one of 55+ proven use cases

The same platform powers predictions across acquisition, retention, personalization, fraud, and more.

New customer acquisition

Databricks · Ro · Allstate

Customer loyalty & retention

NVIDIA · Eventbrite · Stone

Product personalization

Reddit · Sotheby's · BET Networks · PayPal

Next best action

Snowflake · iFood · Discord

Optimizing growth

Modern Treasury · Cleveland Clinic · DoorDash

Entity resolution

Sainsbury's · Pendulum · Yieldmo

Fraud detection

Coinbase · Chime · Instacart

Forecasting

Catalina · VEG

The real cost of traditional lead scoring

Industry data from McKinsey, Gartner, and Dimension Research shows why most lead scoring projects fail.

Input signals
  • Traditional: 50–100 hand-crafted features
  • With Kumo: entire relational database

Setup time
  • Traditional: 6–12 weeks of engineering
  • With Kumo: under 1 hour

Cost per model
  • Traditional: $50K – $1M+ (McKinsey)
  • With Kumo: a fraction of that, on a single platform

AUROC
  • Traditional: 0.65–0.75
  • With Kumo: 0.85+ on real-world B2B lead conversion (0.81 average on RelBench)

Explainability
  • Traditional: black box or basic SHAP
  • With Kumo: human-readable relational explanations

Maintenance
  • Traditional: ~30% annual cost (Dimension Research)
  • With Kumo: continuous automatic retraining

Peer-reviewed

Open research your team can evaluate

Kumo is built on 40+ peer-reviewed papers at NeurIPS, ICML, and KDD. The methodology is public and reproducible.

ICML 2024

KumoRFM: A Relational Foundation Model for Predictive Analytics

K. Huang, M. Fey, J. Leskovec et al.

A foundation model for relational data - pre-trained across schemas, it delivers accurate predictions out of the box and improves with fine-tuning on your specific data.

Read paper
NeurIPS 2024

Relational Deep Learning: Graph Representation Learning on Relational Databases

M. Fey, W. Hu, K. Huang, J. Leskovec et al.

Introduces learning predictive models directly on relational databases, eliminating the feature engineering pipeline that has historically bottlenecked enterprise ML.

Read paper
[Chart: Kumo vs. baseline across 30 tasks]
NeurIPS 2024 · Datasets Track

RelBench: A Benchmark for Deep Learning on Relational Databases

J. Robinson, R. Miao, K. Huang et al.

An open benchmark for evaluating relational prediction methods across 11 databases and 30 tasks. Kumo consistently outperforms traditional ML baselines.

Read paper

Your pipeline data already contains the conversion signal.

See what Kumo can predict from your existing relational database.