How to Add ML Predictions to Your Databricks Lakehouse

You have Delta tables, Unity Catalog, and a data team. Now you want predictions. Here are all seven ways to get there, what each requires, and why most of them still leave you building feature pipelines by hand.

TL;DR

  • On the SAP SALT enterprise benchmark, KumoRFM scores 91% accuracy vs 75% for PhD data scientists with XGBoost and 63% for LLM+AutoML, with zero feature engineering and zero training time.
  • Databricks gives you world-class data infrastructure but does not solve the ML prediction problem out of the box. You still need to choose how predictions get built, and most options require manual feature engineering across your Delta tables.
  • Databricks AutoML and Genie Code automate different layers. AutoML automates model selection on a flat table. Genie Code automates writing the notebook code. Neither automates the hard part: discovering predictive patterns across multiple related Delta tables.
  • Kumo.ai runs as a Lakehouse App that connects to Unity Catalog, reads multiple Delta tables natively, and generates predictions with a single PQL query. No feature engineering, no notebooks, no flat-table step.
  • The critical distinction: Genie Code automates the workflow (writes code for you). Kumo automates the prediction (understands your relational data). These are different layers of the stack, and the prediction layer is where 80% of the effort lives.

If you are on Databricks, you already have the hardest part of data infrastructure figured out. Your data lands in Delta Lake. Unity Catalog governs access. Spark handles compute. Notebooks let your team explore and transform data.

But when it comes time to add predictive ML, the options multiply and the complexity returns. Do you use Databricks AutoML? Write custom models with MLflow? Try the new Genie Code agent? Bring in DataRobot or H2O? Build a Feature Store pipeline?

Each approach makes different trade-offs on the same fundamental question: who builds the features? The answer to that question determines whether your first prediction takes minutes or months.

The headline result: SAP SALT benchmark

The SAP SALT benchmark is an enterprise-grade evaluation where real business analysts and data scientists attempt prediction tasks on SAP enterprise data. It measures how accurately different approaches predict real business outcomes on production-quality enterprise databases with multiple related tables.

| Approach | Accuracy | What it means |
| --- | --- | --- |
| LLM + AutoML | 63% | Language model generates features, AutoML selects model |
| PhD Data Scientist + XGBoost | 75% | Expert spends weeks hand-crafting features, tunes XGBoost |
| KumoRFM (zero-shot) | 91% | No feature engineering, no training, reads relational tables directly |

SAP SALT benchmark: KumoRFM outperforms expert data scientists by 16 percentage points and LLM+AutoML by 28 percentage points on real enterprise prediction tasks.

KumoRFM scores 91% where PhD-level data scientists with weeks of feature engineering and hand-tuned XGBoost score 75%. The 16 percentage point gap is the value of reading relational data natively instead of flattening it into a single table.

Databricks ML options compared

| Option | Reads Delta tables | Feature engineering required | Multi-table native | Autonomous | Time to first prediction | Best for |
| --- | --- | --- | --- | --- | --- | --- |
| Kumo.ai (Lakehouse App) | Yes, natively via Unity Catalog | None | Yes | Yes | Minutes | Multi-table predictions at scale |
| Databricks AutoML | Single table only | Full (joins + aggregations) | No | Partial (model selection only) | Days to weeks | Single-table problems with existing features |
| Databricks Genie Code | Yes (generates code to read them) | AI-generated (still flat-table) | No | Workflow only | Hours to days | Accelerating notebook-based workflows |
| MLflow + custom code | Yes (manual Spark reads) | Full (manual pipelines) | No | No | Weeks to months | Full control with experienced ML team |
| DataRobot on Databricks | Via connector | Full (requires flat features) | No | Partial (model selection only) | Days to weeks | Enterprise AutoML with governance |
| H2O Sparkling Water | Via Spark integration | Full (manual pipelines) | No | Partial | Weeks | Spark-native distributed training |
| Feature Store + AutoML | Feature Store reads Delta | Full (most complex setup) | No | Partial | Weeks to months | Mature orgs with dedicated ML platform team |

Kumo.ai is the only option that reads multiple Delta tables natively and generates predictions without feature engineering. Every other approach requires building a flat feature table first.

The feature engineering divide

Every option in the table above falls into one of two categories: approaches that require you to build a flat feature table from your Delta tables, and approaches that read your relational Delta tables directly.

Six of the seven options require a flat feature table. That means someone on your team has to write the Spark SQL or PySpark to join your customers table with your orders table with your products table, compute aggregations like avg_order_value_last_90d and count_support_tickets_last_30d, handle temporal leakage, and produce one row per entity. This is the step that consumes 80% of the effort in every ML project.
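Concretely, the flat-table step looks like the sketch below. It uses pandas as a compact stand-in for the Spark SQL or PySpark that would actually run on Databricks; the table contents, the 90-day window, and the cutoff date are all illustrative, and the feature names mirror the examples in the paragraph above:

```python
import pandas as pd

# Hypothetical toy data standing in for Delta tables.
customers = pd.DataFrame({"customer_id": ["C-1", "C-2"]})
orders = pd.DataFrame({
    "customer_id": ["C-1", "C-1", "C-2"],
    "order_ts": pd.to_datetime(["2026-01-10", "2026-02-20", "2025-11-01"]),
    "amount": [40.0, 60.0, 25.0],
})
cutoff = pd.Timestamp("2026-03-01")

# Guard against temporal leakage: only events strictly before the
# prediction cutoff, restricted to the trailing 90-day window.
recent = orders[(orders.order_ts < cutoff) &
                (orders.order_ts >= cutoff - pd.Timedelta(days=90))]

# Aggregate to one row per entity -- the flat table every
# AutoML-style tool needs as input.
feats = (recent.groupby("customer_id")
               .agg(avg_order_value_last_90d=("amount", "mean"),
                    order_count_last_90d=("amount", "size"))
               .reset_index())
flat = customers.merge(feats, on="customer_id", how="left").fillna(0)
```

Every column in `flat` is a decision someone had to make: which window, which aggregation, which cutoff. Multiply by dozens of candidate features and several source tables and the 80% figure becomes easy to believe.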

Option 1: Kumo.ai as a Lakehouse App

Kumo.ai is available in the Databricks Marketplace as a Lakehouse App. The integration path is: install from marketplace, connect to Unity Catalog, write a PQL query, get predictions back as a Delta table.

What makes Kumo different from every other option is what happens under the hood. Kumo's relational foundation model reads your Delta tables as a temporal heterogeneous graph. Each row in each table becomes a node. Each foreign key becomes an edge. Timestamps are preserved. The model discovers predictive patterns across tables, time windows, and relationship hops without any feature engineering.
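To make the graph framing concrete, here is a toy Python sketch of the idea (not Kumo's implementation): rows become nodes keyed by table and primary key, foreign keys become edges, and event timestamps ride along on the edges. All data here is hypothetical:

```python
from collections import defaultdict

# Toy rows standing in for two Delta tables (hypothetical data).
customers = [{"customer_id": "C-1", "signup": "2025-01-05"}]
orders = [
    {"order_id": "O-1", "customer_id": "C-1", "ts": "2025-02-01", "amount": 40.0},
    {"order_id": "O-2", "customer_id": "C-1", "ts": "2025-03-10", "amount": 55.0},
]

# Nodes: one per row, keyed by (table, primary key); attributes kept as-is.
nodes = {("customers", c["customer_id"]): c for c in customers}
nodes.update({("orders", o["order_id"]): o for o in orders})

# Edges: one per foreign-key reference, with the event timestamp
# preserved so temporal patterns are not lost.
edges = defaultdict(list)
for o in orders:
    edges[("customers", o["customer_id"])].append(
        (("orders", o["order_id"]), o["ts"]))
```

In this representation nothing is flattened or aggregated away: a model that consumes the graph can still see that C-1's two orders are five weeks apart, which a single `order_count` feature would erase.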

PQL Query

PREDICT churn
FOR EACH unity_catalog.sales.customers.customer_id
WHERE customers.status = 'active'

This PQL query reads directly from Delta tables registered in Unity Catalog. Kumo's foundation model traverses the relational structure (customers, orders, products, support_tickets) and generates churn predictions without any feature engineering, joins, or aggregations.

Output

| customer_id | churn_probability | key_signals | confidence |
| --- | --- | --- | --- |
| C-4401 | 0.89 | Order frequency declining + support tickets rising | High |
| C-4402 | 0.12 | Stable purchase pattern + no support issues | High |
| C-4403 | 0.67 | Category shift + payment method changed | Medium |
| C-4404 | 0.03 | Increasing order value + new product adoption | High |

No notebooks. No feature tables. No Spark jobs to maintain. The predictions land in a Delta table that any downstream process (dashboards, reverse ETL, operational systems) can consume directly.

Option 2: Databricks AutoML

Databricks AutoML is built into the workspace. You point it at a single table, it tries multiple algorithms (LightGBM, XGBoost, sklearn, Prophet), tunes hyperparameters, and produces a notebook with the winning model. It is genuinely good at model selection.

The limitation is the input requirement: a single flat table. If your prediction depends on patterns across customers, orders, and products, you must join and aggregate those tables yourself before AutoML sees the data. AutoML automates the last 20% of the pipeline. The first 80% (feature engineering) remains manual.

Option 3: Databricks Genie Code

Genie Code is Databricks' new AI agent that generates notebook code. You describe what you want in natural language, and Genie writes PySpark, SQL, and ML code to accomplish it. It can generate feature engineering code, training scripts, and evaluation logic.

This is a genuine productivity improvement. Instead of writing feature pipelines by hand, you describe them and Genie writes the code. But the underlying approach is unchanged: Genie still produces a flat feature table and trains a single-table model. It automates the workflow (writing code, running notebooks). It does not automate the prediction (understanding relational structure).

Genie Code (automates the workflow)

  • Generates PySpark code to join tables
  • Writes feature engineering logic
  • Produces a flat feature table
  • Trains a single-table model
  • Still requires human review of generated features

Kumo.ai (automates the prediction)

  • Reads Delta tables as relational graph
  • Discovers cross-table patterns automatically
  • No flat feature table needed
  • Foundation model understands relational structure
  • Predictions in minutes with zero code review

Option 4: MLflow + custom models

MLflow is the backbone of ML operations on Databricks. It tracks experiments, versions models, manages artifacts, and handles deployment. If you have a strong ML team that wants full control, MLflow + custom PySpark/sklearn/PyTorch code gives you maximum flexibility.

The trade-off is effort. Your team writes the feature pipelines, selects the algorithms, tunes hyperparameters, and maintains everything. MLflow tracks all of this beautifully. But the 80% of time spent on feature engineering happens before MLflow enters the picture. MLflow tracks what you built. It does not build it for you.

Where the MLflow time actually goes

| Stage | Hours per task | % of total | MLflow helps? |
| --- | --- | --- | --- |
| Delta table joins & prep | 2.5 hours | 17% | No |
| Feature computation (Spark) | 5.0 hours | 34% | No |
| Feature iteration & selection | 4.2 hours | 29% | Tracks experiments only |
| Model training & tuning | 1.8 hours | 12% | Yes (full tracking) |
| Evaluation & deployment | 1.2 hours | 8% | Yes (model registry) |

80% of the work happens before MLflow's tracking capabilities become relevant. MLflow is excellent infrastructure for the last 20%. It does not address the first 80%.
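The 80% figure can be checked directly from the stage hours in the table. A quick sanity check using the article's own numbers:

```python
# Stage hours from the table above, with whether MLflow meaningfully
# helps at that stage (experiment tracking alone counted as "no help"
# for building the features).
stages = {
    "joins_and_prep":      (2.5, False),
    "feature_computation": (5.0, False),
    "feature_iteration":   (4.2, False),  # MLflow only tracks experiments here
    "training_and_tuning": (1.8, True),
    "eval_and_deploy":     (1.2, True),
}

total = sum(hours for hours, _ in stages.values())          # 14.7 hours
pre_mlflow = sum(h for h, helps in stages.values() if not helps)  # 11.7 hours
share = round(100 * pre_mlflow / total)                     # ~80%
```

11.7 of 14.7 hours, about 80%, is spent before any stage where MLflow's strengths apply.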

Option 5: DataRobot on Databricks

DataRobot integrates with Databricks via Spark connectors and can read from Unity Catalog. It brings enterprise AutoML with strong governance, explainability, and deployment features. Like Databricks AutoML, it requires a flat feature table as input.

DataRobot adds value over native Databricks AutoML in model governance, monitoring, and compliance documentation. But the core limitation is the same: it optimizes over a pre-engineered feature table. Cross-table patterns that were not manually encoded as features are invisible to DataRobot.

Option 6: H2O Sparkling Water

H2O Sparkling Water runs H2O's algorithms directly on Spark clusters. This gives you distributed training at scale without moving data out of Databricks. The integration is mature and well-tested.

Like every other option except Kumo, H2O requires a flat feature table. You write PySpark to join and aggregate your Delta tables, then H2O trains models on the result. The feature engineering bottleneck remains fully manual.

Option 7: Feature Store + AutoML

Databricks Feature Store (now part of Unity Catalog) lets you define, compute, and serve features as managed tables. Combined with AutoML, this is the most "Databricks-native" approach to production ML.

It is also the most complex. You define feature tables, write compute functions, schedule refresh jobs, manage point-in-time correctness, handle feature serving, and then feed the feature table to AutoML. This is the right approach for organizations with dedicated ML platform teams and dozens of models in production. For teams trying to get their first prediction live, it is months of infrastructure work before the first model trains.
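Point-in-time correctness is the subtle part of the Feature Store route: each training label may only see feature values computed at or before its own timestamp. A minimal pandas sketch of that guarantee using an as-of join; the snapshot data and column names are hypothetical:

```python
import pandas as pd

# Hypothetical label rows: when each prediction is made.
labels = pd.DataFrame({
    "customer_id": ["C-1", "C-1"],
    "as_of": pd.to_datetime(["2026-01-15", "2026-03-15"]),
}).sort_values("as_of")

# Feature snapshots refreshed on a schedule, each valid from `computed_at`.
features = pd.DataFrame({
    "customer_id": ["C-1", "C-1"],
    "computed_at": pd.to_datetime(["2026-01-01", "2026-03-01"]),
    "order_count_90d": [3, 7],
}).sort_values("computed_at")

# As-of join: each label gets the latest snapshot computed at or before
# its own timestamp -- the point-in-time guarantee a feature store
# has to enforce for every feature table it serves.
joined = pd.merge_asof(labels, features,
                       left_on="as_of", right_on="computed_at",
                       by="customer_id")
```

Getting this right for every feature, refresh schedule, and backfill is a large share of why the Feature Store route takes months rather than days.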

The real question: who builds the features?

Every approach in this guide answers a slightly different question. But they all come back to the same bottleneck: converting your multi-table Delta Lake data into a flat feature table that a model can consume.

| Approach | Who writes the feature code? | Feature engineering effort |
| --- | --- | --- |
| Kumo.ai | Nobody (foundation model reads raw tables) | Zero |
| Databricks AutoML | Your data scientists | Full manual effort |
| Genie Code | AI generates code, humans review | Reduced but not eliminated |
| MLflow + custom | Your ML engineers | Full manual effort |
| DataRobot | Your data scientists | Full manual effort |
| H2O Sparkling Water | Your ML engineers | Full manual effort |
| Feature Store + AutoML | Your ML platform team | Full manual effort (most structured) |

Kumo is the only option where nobody writes feature engineering code. The foundation model discovers predictive patterns directly from your relational Delta tables.

How Kumo reads your lakehouse differently

To understand why Kumo eliminates the feature engineering step, consider what the other tools see versus what Kumo sees when pointed at the same Unity Catalog tables.

What AutoML sees vs what Kumo sees

| Delta table | What AutoML/MLflow/DataRobot see | What Kumo's foundation model sees |
| --- | --- | --- |
| customers | Source table for manual joins | Entity nodes with temporal attributes |
| orders | Source table for aggregation SQL | Event nodes linked to customers and products |
| products | Source for one-hot encoding | Attribute nodes with category relationships |
| support_tickets | Source for count/recency features | Signal nodes with temporal patterns |
| Relationships between tables | Invisible (lost in flattening) | Graph edges preserving full relational structure |

Every other tool requires you to flatten the relational structure into a single table, losing cross-table patterns in the process. Kumo preserves the full relational structure as a temporal graph.

When to use each option

The right choice depends on your team, your data, and your timeline:

  • Kumo.ai Lakehouse App: You have multi-table Delta data and want predictions without building feature pipelines. You want your first prediction in minutes, not months. Your team's time is better spent on business problems than feature engineering.
  • Databricks AutoML: You already have a flat feature table or single-table data. You want a quick baseline model with minimal setup. Your data does not require multi-table joins.
  • Genie Code: You want AI assistance writing notebook code. Your team is comfortable reviewing generated code. You want to accelerate existing notebook-based workflows.
  • MLflow + custom: You have a strong ML team that wants full control. You need custom model architectures or domain-specific feature engineering. You already have feature pipelines in production.
  • DataRobot: You need enterprise governance and compliance documentation on top of AutoML. Your organization has regulatory requirements for model explainability.
  • H2O Sparkling Water: You need distributed training at scale on Spark. Your team has H2O expertise.
  • Feature Store + AutoML: You have a dedicated ML platform team, dozens of models in production, and the resources to build and maintain feature infrastructure.

PQL Query

PREDICT fraud_flag
FOR EACH unity_catalog.payments.transactions.txn_id
WHERE transactions.timestamp > '2026-03-01'

Fraud detection on Delta tables with a single PQL query. Kumo's foundation model reads transactions, accounts, merchants, and device tables from Unity Catalog, discovers cross-table anomaly patterns, and returns fraud probabilities. No Spark feature pipeline required.

Output

| txn_id | fraud_probability | risk_tier | tables_used |
| --- | --- | --- | --- |
| T-88201 | 0.94 | Critical | transactions, accounts, merchants, devices |
| T-88202 | 0.07 | Low | transactions, accounts |
| T-88203 | 0.71 | High | transactions, accounts |
| T-88204 | 0.02 | Low | transactions, accounts |
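If you want to derive tiers like these yourself from raw probabilities, a small post-processing step downstream of the predictions table is enough. The thresholds below are hypothetical and should be tuned to your fraud-review capacity and cost of false positives:

```python
import pandas as pd

# Fraud probabilities as in the example output above.
scores = pd.DataFrame({
    "txn_id": ["T-88201", "T-88202", "T-88203", "T-88204"],
    "fraud_probability": [0.94, 0.07, 0.71, 0.02],
})

# Hypothetical tier boundaries; each (low, high] interval maps to a tier.
bins = [0.0, 0.30, 0.60, 0.85, 1.0]
tiers = ["Low", "Medium", "High", "Critical"]
scores["risk_tier"] = pd.cut(scores.fraud_probability, bins=bins,
                             labels=tiers, include_lowest=True)
```

Because the predictions already land in a Delta table, the same binning could just as easily run as a scheduled SQL view feeding a case-management queue.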

The bottom line

Databricks has built the best data lakehouse platform in the industry. But a data platform is not a prediction platform. Adding ML predictions still requires choosing who builds the features and how models get trained.

Six of the seven options on this page require you to solve the feature engineering problem yourself (manually, with AI code generation, or through Feature Store infrastructure). One option eliminates it entirely by reading your relational Delta tables as they are.

If your team has been spending weeks or months building feature pipelines before any model trains, the issue is not which AutoML tool you use on the flat table at the end. The issue is that you are building the flat table at all.

Frequently asked questions

Can Kumo.ai read Delta tables directly without exporting data?

Yes. Kumo.ai is available as a Databricks Lakehouse App that connects to Unity Catalog and reads Delta tables natively. You point PQL queries at your catalog tables, and Kumo reads the relational structure (multiple tables, foreign keys, timestamps) without any data export, CSV conversion, or feature pipeline. Predictions are written back to Delta tables in your lakehouse.

What is the difference between Databricks Genie Code and Kumo.ai?

Genie Code automates the workflow: it generates notebook code, writes feature engineering logic, and orchestrates training runs. Kumo automates the prediction itself: it understands relational structure across multiple Delta tables, discovers cross-table patterns, and generates predictions without any feature code. Genie automates writing the pipeline. Kumo eliminates the need for the pipeline.

Do I need to build a feature table before using Databricks AutoML?

Yes. Databricks AutoML requires a single flat table as input. You must join your Delta tables, compute aggregations, encode categorical variables, and produce one row per entity before AutoML can run. AutoML automates model selection and hyperparameter tuning on that table, but the feature engineering step (which typically consumes 80% of the total effort) remains manual.

How does Kumo.ai integrate with Unity Catalog?

Kumo registers as a Lakehouse App in your Databricks workspace. It reads table metadata and data through Unity Catalog, respecting your existing access controls and governance policies. You reference tables by their Unity Catalog names in PQL queries (e.g., catalog.schema.customers). Prediction outputs are written back as Delta tables registered in Unity Catalog.

Can I use MLflow to track experiments if I use Kumo.ai?

Yes. Kumo produces predictions and model metadata that can be logged to MLflow for experiment tracking, model registry, and deployment management. The difference is that with Kumo, you skip the months of feature engineering that typically precede the MLflow tracking step. You go from PQL query to tracked, versioned predictions in minutes instead of weeks.

What types of predictions can I run on Databricks with Kumo?

Any predictive task that can be expressed over relational Delta tables: churn prediction, fraud detection, lead scoring, demand forecasting, recommendation, credit risk, next-best-action, and customer lifetime value. PQL supports binary classification, multi-class classification, regression, and ranking tasks. Each task is a single query against your Unity Catalog tables.

See it in action

KumoRFM delivers predictions on relational data in seconds. No feature engineering, no ML pipelines. Try it free.