Build vs Buy for ML Predictions: The Full Cost Analysis

Real TCO comparison. Building: hiring 3-5 data scientists, 6-12 months per model, ongoing maintenance. Buying: foundation model platform, days to first prediction. Here is the full math.

TL;DR

  • True 3-year cost of building one ML model: $400K-1.5M (team $150K-450K + infrastructure $60K-300K/year + maintenance 20-30%/year + opportunity cost $300K+/month of delay). Most analyses capture only 30-40% of the total.
  • The cost gap widens with scale: 1 model saves 42% by buying ($480K build vs. $280K buy), 5 models save 83% ($2.1M vs. $350K), 10 models save 88% ($3.6M vs. $420K). The inflection point is 3 prediction tasks.
  • Opportunity cost is the largest hidden factor. A churn model retaining 15% more customers at $24M annual churn costs $300K per month of delayed deployment. Five months faster = $1.5M retained revenue.
  • Foundation models are not just cheaper -- they are more accurate. KumoRFM zero-shot achieves 76.71 AUROC on RelBench vs. 62.44 for manual LightGBM. Cost advantage and accuracy advantage point in the same direction.
  • Building makes sense for 1-2 high-stakes models where ML is a competitive moat or regulations demand full code-level auditability. For 3+ tasks on relational data, buying delivers equal or better accuracy at 3-10x lower cost.

Every VP of Data Science faces this decision: build ML prediction models in-house or buy a platform that delivers them. The standard analysis compares software license costs to team salaries and picks the cheaper option. That analysis is wrong because it misses the four cost categories that actually dominate: engineering time, opportunity cost, maintenance burden, and scaling economics.

This guide provides the complete cost framework with real numbers from enterprise deployments. No vendor-specific pricing. Just the structural economics of building versus buying ML predictions.

3-year TCO summary: build vs. buy

| Metric | Build (Custom ML) | Buy (Foundation Model) | Difference |
|---|---|---|---|
| 1 model (3-year TCO) | $480K | $280K | Buy saves 42% |
| 5 models (3-year TCO) | $2.1M | $350K | Buy saves 83% |
| 10 models (3-year TCO) | $3.6M | $420K | Buy saves 88% |
| Time to first prediction | 3-6 months | Minutes | 100-1000x faster |
| Team required per model | 2-3 data scientists | SQL-literate analyst | Lower hiring bar |
| Annual maintenance | 20-30% of build cost | Included in platform | Zero incremental |

The cost gap widens with every additional prediction task. At 10 tasks, building costs 8.5x more than buying.

The cost of building

Building a custom ML prediction model has five cost components. Most organizations account for the first two and underestimate the last three by 2x to 5x.

1. Team cost

A production ML model requires a team of 2 to 3 data scientists for 3 to 6 months. At $200K to $300K fully loaded cost per data scientist (salary, benefits, equipment, management overhead), the labor cost per model is:

  • Small model (2 people, 3 months): $100K to $150K
  • Medium model (3 people, 4 months): $200K to $300K
  • Complex model (3 people, 6 months): $300K to $450K

These numbers assume you already have the team. Recruiting data scientists takes 3 to 6 months in competitive markets and costs $30K to $50K per hire in recruiter fees. If you need to build the team first, add 6 months and $100K to $150K to the timeline.

2. Infrastructure cost

ML training and serving requires compute infrastructure: GPU instances for training, CPU/GPU instances for serving, data storage, and experiment tracking tools.

  • Training compute: $2K to $10K per model training run
  • Serving infrastructure: $3K to $15K per month per model
  • Data storage and processing: $1K to $5K per month
  • MLOps tooling (experiment tracking, model registry): $1K to $5K per month

Annual infrastructure cost per model: $60K to $300K depending on scale and serving requirements.

3. Feature engineering cost (the hidden dominant)

Feature engineering consumes 80% of data science time. The Stanford RelBench study measured this precisely: 12.3 hours and 878 lines of code per prediction task for experienced data scientists working on relational databases.

For a model that uses 200 features derived from 5 to 10 tables, feature engineering takes 4 to 8 weeks of a data scientist's time. At $150K to $225K annualized cost for that period, feature engineering alone costs $12K to $35K per model. But the real cost is not the time spent. It is the time wasted: the Stanford study showed that data scientists explore fewer than 5% of the possible feature space, missing the multi-hop and temporal patterns that carry the strongest signal.
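The labor figure follows directly from the numbers above; a quick sketch, using the annualized cost and week ranges from this paragraph:

```python
# Feature engineering labor cost: 4-8 weeks of one data scientist's time
# at a $150K-225K annualized cost (figures from the paragraph above).
def fe_cost(annual_cost: float, weeks: float) -> float:
    """Prorate an annualized labor cost over a number of weeks."""
    return annual_cost * weeks / 52

low = round(fe_cost(150_000, 4) / 1000)   # ~12 (thousand dollars)
high = round(fe_cost(225_000, 8) / 1000)  # ~35 (thousand dollars)
print(f"Feature engineering labor: ${low}K to ${high}K per model")
```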

4. Maintenance cost (the compounding hidden cost)

Models in production degrade. Data distributions shift, business rules change, upstream data pipelines break. Maintaining a production model costs 20% to 30% of the initial build cost per year:

  • Retraining every 3 to 6 months: 2 to 4 weeks of data scientist time
  • Feature pipeline monitoring and repair: 5 to 10 hours per month
  • Data quality investigations: 2 to 5 hours per incident, 1 to 3 incidents per month
  • Infrastructure updates and dependency management: ongoing

Over 3 years, maintenance alone approaches the initial build cost. A $300K model costs $180K to $270K to maintain over 3 years, for a total lifetime cost of $480K to $570K.
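The lifetime arithmetic is simple to model; this sketch uses the $300K build and 20-30%/year maintenance figures from above:

```python
# Lifetime cost of a built model: initial build plus annual maintenance
# at 20-30% of the build cost per year (figures from this section).
def lifetime_cost(build_cost: float, maint_rate: float, years: int) -> float:
    return build_cost + build_cost * maint_rate * years

low = round(lifetime_cost(300_000, 0.20, 3))   # 480_000
high = round(lifetime_cost(300_000, 0.30, 3))  # 570_000
print(f"3-year lifetime cost: ${low:,} to ${high:,}")
```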

5. Opportunity cost (the largest hidden cost)

Every month without a working prediction model is revenue left on the table. If a churn model can retain 15% more customers and your annual churn costs $24M in lost revenue, each month of delay costs $300K in preventable losses.

A 5-month build timeline means $1.5M in opportunity cost. A foundation model that delivers predictions in the first week eliminates nearly all of that.
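The opportunity-cost math above, worked through in code with the same churn-model inputs:

```python
# Opportunity cost of a delayed churn model (figures from this section).
annual_churn_loss = 24_000_000  # revenue lost to churn per year
retention_lift = 0.15           # share of that churn the model prevents
monthly_value = round(annual_churn_loss * retention_lift / 12)
delay_months = 5                # custom build vs. week-one foundation model
opportunity_cost = monthly_value * delay_months
print(f"${monthly_value:,}/month -> ${opportunity_cost:,} over {delay_months} months")
```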

Build vs. buy timeline (churn model)

| Week | Build (Custom ML) | Buy (Foundation Model) | Revenue Impact |
|---|---|---|---|
| Week 1 | Kick off, data access requests | Connect database, run zero-shot | Buy: first predictions live |
| Week 2-4 | Data exploration, schema mapping | Validate predictions, fine-tune | Buy: model in production |
| Week 5-8 | Feature engineering (SQL joins) | Monitoring, iterate on thresholds | $300K/mo retained (buy) |
| Week 9-12 | More feature engineering, iteration | Running, retaining customers | $300K/mo retained (buy) |
| Week 13-16 | Model training, validation | Running | $300K/mo retained (buy) |
| Week 17-20 | Deployment, integration | Running | $300K/mo retained (buy) |
| Week 21-24 | Model goes live | Running (5 months ahead) | Build: first predictions live |

The foundation model is in production by week 2. The custom build takes until week 21-24. At $300K/month in retained revenue from churn reduction, the 5-month gap represents $1.5M in lost impact.

The cost of buying

"Buying" in this context means using a foundation model platform that delivers predictions on your relational data without custom model building.

Platform cost

Foundation model platforms typically price on one of two models:

  • Subscription: $50K to $200K per year for a given data volume and number of prediction tasks. Includes zero-shot predictions, fine-tuning, and API access.
  • Usage-based: Pay per prediction or per compute hour. Costs scale with usage but start lower.

Integration cost

Connecting the platform to your data warehouse and integrating predictions into your workflows. This is primarily engineering time:

  • Database connection: 1 to 2 days
  • First prediction task validation: 1 to 2 weeks
  • Workflow integration (CRM, marketing automation, app): 2 to 4 weeks

Total integration cost: $20K to $60K in team time, a one-time investment that applies to all subsequent prediction tasks.

Marginal cost per additional task

This is where buying structurally differs from building. With a foundation model, each new prediction task is a new PQL query. No new feature engineering, no new model training, no new infrastructure. The marginal cost per additional task is:

  • Writing and validating the PQL query: 1 to 3 days of analyst time
  • Optional fine-tuning: 2 to 8 hours of compute
  • Integration: often reuses existing workflows

Marginal cost per task: $2K to $10K. Compare this to $150K to $500K per task when building.
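To make that marginal cost concrete: adding a second task is a single new query in the same PREDICT / FOR EACH / WHERE style as the churn example later in this article. The task name, table, and column below are hypothetical, chosen only to illustrate the pattern.

```
PREDICT fraud_risk_7d
FOR EACH transactions.transaction_id
WHERE transactions.amount > 1000
```

No new feature pipeline, model training, or serving infrastructure accompanies the query; an analyst writes and validates it in days.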

Build: cost per model

  • Team: $150K-450K (2-3 data scientists, 3-6 months)
  • Infrastructure: $60K-300K/year
  • Feature engineering: 80% of team time
  • Maintenance: 20-30% of build cost per year
  • Opportunity cost: $300K+ per month of delay

Buy: cost per model

  • Platform: $50K-200K/year (covers all tasks)
  • Integration: $20K-60K one-time
  • Feature engineering: $0 (eliminated)
  • Maintenance: included in platform
  • Opportunity cost: near-zero (minutes to prediction)

Hidden costs of building

| Cost Category | Year 1 | Year 2 | Year 3 | 3-Year Total |
|---|---|---|---|---|
| Team (2-3 DS x 4 months) | $200K-300K | -- | -- | $200K-300K |
| Infrastructure | $60K-120K | $60K-120K | $60K-120K | $180K-360K |
| Feature engineering (labor) | $12K-35K | $6K-15K | $6K-15K | $24K-65K |
| Maintenance (20-30%/yr) | -- | $60K-90K | $60K-90K | $120K-180K |
| Opportunity cost (5-month delay) | $1.5M | -- | -- | $1.5M |
| Recruiting (if needed) | $100K-150K | -- | -- | $100K-150K |

The maintenance, opportunity cost, and recruiting rows are the costs most build-vs-buy analyses miss entirely. Opportunity cost alone can exceed the total platform cost.

The scaling math: why the gap widens

The build-vs-buy decision becomes clearer when you model the cost across multiple prediction tasks. Here is the math for a 3-year period.

Scenario: 1 prediction task

| Period | Build | Buy |
|---|---|---|
| Year 1 (build + deploy) | $300K | $120K |
| Year 2 (maintain + iterate) | $90K | $80K |
| Year 3 (maintain + iterate) | $90K | $80K |
| 3-year total | $480K | $280K |

At 1 task, buying saves roughly 40%. Meaningful, but not decisive on its own.

Scenario: 5 prediction tasks

| Period | Build | Buy |
|---|---|---|
| Year 1 (build 2-3 models) | $750K | $150K |
| Year 2 (build 2-3 more + maintain) | $900K | $100K |
| Year 3 (maintain all 5) | $450K | $100K |
| 3-year total | $2.1M | $350K |

At 5 tasks, buying is 6x cheaper. The gap comes from zero marginal feature engineering cost per additional task.

Scenario: 10 prediction tasks

| Period | Build | Buy |
|---|---|---|
| Year 1 | $1.2M | $180K |
| Year 2 | $1.5M | $120K |
| Year 3 | $900K | $120K |
| 3-year total | $3.6M | $420K |

At 10 tasks, buying is 8.5x cheaper. And the build number assumes you can even hire enough data scientists to staff 10 concurrent model builds, which most organizations cannot.

The accuracy dimension

Cost is only half the equation. What about accuracy?

On the RelBench benchmark, which evaluates all approaches on the same data with the same temporal splits:

  • Manual ML (LightGBM + feature engineering): 62.44 average AUROC
  • Foundation model zero-shot: 76.71 average AUROC
  • Foundation model fine-tuned: 81.14 average AUROC

The foundation model is not just cheaper. It is more accurate, because it explores the full relational feature space that human engineers cannot enumerate. The cost advantage and accuracy advantage point in the same direction, which is unusual in enterprise software decisions.

A simple cost framework

Use this framework to calculate the build-vs-buy economics for your specific situation:

Step 1: Count your prediction tasks

List every prediction your organization needs or wants: churn, fraud, demand, recommendations, lead scoring, lifetime value, next-best-action, credit risk. Include both existing models and models you have not built yet because they are too expensive.

Step 2: Calculate build cost per task

(Number of data scientists × monthly fully loaded cost × months to build) + (monthly infrastructure cost × 36 months) + (30% of build cost per year × 3 years of maintenance). Multiply by the number of tasks.

Step 3: Calculate buy cost

(Annual platform cost x 3 years) + (one-time integration cost) + (marginal cost per additional task x number of tasks).

Step 4: Add opportunity cost to the build scenario

For each task, estimate the monthly revenue impact of the prediction (retained revenue from churn, prevented fraud losses, incremental conversion revenue). Multiply by the months of delay in the build scenario compared to the buy scenario. Add this to the build cost.

Step 5: Compare

For most enterprises with 5 or more prediction tasks on relational data, buying is 3x to 10x cheaper with equal or better accuracy. The rare exception: organizations with single high-stakes models where regulatory requirements demand full code-level auditability of every model component.
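Steps 2 through 4 can be sketched as a small calculator. Every input below is an illustrative midpoint of the ranges in this article, not vendor pricing; substitute your own figures.

```python
# Minimal build-vs-buy calculator for Steps 2-4 of the framework above.
# All numeric inputs are illustrative assumptions, not vendor pricing.

def build_cost_per_task(n_ds, monthly_loaded_cost, build_months,
                        infra_monthly, maint_rate=0.30, years=3):
    # Step 2: labor + 3 years of infrastructure + annual maintenance.
    labor = n_ds * monthly_loaded_cost * build_months
    infra = infra_monthly * 12 * years
    maintenance = maint_rate * labor * years
    return labor + infra + maintenance

def buy_cost(annual_platform, integration_once, marginal_per_task,
             n_tasks, years=3):
    # Step 3: subscription + one-time integration + per-task marginal cost.
    return annual_platform * years + integration_once + marginal_per_task * n_tasks

def opportunity_cost(monthly_impact, delay_months):
    # Step 4: revenue impact forgone while the custom build is in progress.
    return monthly_impact * delay_months

n_tasks = 5
build = n_tasks * build_cost_per_task(3, 25_000, 4, 5_000)  # Step 2 x tasks
buy = buy_cost(100_000, 40_000, 5_000, n_tasks)
print(f"Build: ${build:,.0f}  Buy: ${buy:,.0f}  Ratio: {build / buy:.1f}x")
```

Step 4 is then added to the build total for each task (e.g. $300K/month × 5 months of delay), which only widens the gap.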

When building makes sense

  • 1-2 high-stakes models justifying deep investment
  • Genuinely unique data (proprietary sensors, classified)
  • Regulatory requirement for full code-level auditability
  • ML system is a core competitive differentiator
  • Established team with excess capacity

When buying makes sense

  • 3+ prediction tasks on relational data
  • Time to value matters (every month is $300K+ in lost impact)
  • Data science team is at capacity or hard to hire
  • Predictions are a means to an end, not the product
  • Need to test many predictions before committing

The real question

Build-vs-buy is often framed as a technology decision. It is actually a resource allocation decision. Every data scientist hour spent on feature engineering is an hour not spent on interpreting predictions, building decision systems, or identifying new use cases.

The organizations getting the most value from ML predictions are not the ones with the biggest data science teams. They are the ones that have eliminated the repetitive engineering work (feature engineering, pipeline maintenance, model retraining) and redirected their talent toward the strategic work: deciding what to predict, evaluating what the predictions mean, and building the organizational systems that turn predictions into revenue.

If your data scientists spend 80% of their time on feature engineering and pipeline maintenance, the build-vs-buy question is already answered. The only remaining question is when.

PQL Query

PREDICT churn_30d
FOR EACH customers.customer_id
WHERE customers.arr > 50000

This query delivers enterprise churn predictions in seconds. Building the equivalent custom model takes 3-6 months, costs $300K-450K, and requires a team of 3 data scientists. The marginal cost of the PQL query is effectively zero.

Output

| customer_id | churn_risk | arr | retention_action |
|---|---|---|---|
| ENT-1001 | 0.82 | $120K | Executive outreach recommended |
| ENT-1002 | 0.15 | $95K | No action needed |
| ENT-1003 | 0.67 | $210K | CSM escalation triggered |
| ENT-1004 | 0.04 | $78K | Expansion opportunity detected |

KumoRFM was built by the team behind the ML systems at Pinterest, Airbnb, and LinkedIn: Vanja Josifovski (CEO, former CTO at Airbnb and Pinterest), Jure Leskovec (Chief Scientist, Stanford professor, co-creator of GraphSAGE), and Hema Raghavan (Head of Engineering, former Sr. Director at LinkedIn). Backed by Sequoia Capital.

Frequently asked questions

How much does it cost to build a custom ML prediction model?

A single custom ML model costs $150K to $500K when you include team time (2-3 data scientists at $200K-300K fully loaded, working 3-6 months), infrastructure ($5K-20K/month for compute), and opportunity cost. A portfolio of 10 models costs $1.5M to $5M over 2-3 years. The dominant cost is feature engineering labor, not infrastructure.

What does a foundation model platform cost?

Foundation model platforms use pay-per-prediction or subscription pricing. Typical cost: $50K-150K per year for a mid-size deployment (10-20 prediction tasks, millions of predictions per month). The marginal cost per additional task is near zero because the same model handles all tasks. Total 3-year cost for 10 models: $150K-450K, compared to $1.5M-5M for building.

What hidden costs do most build-vs-buy analyses miss?

Four hidden costs: (1) Maintenance: models degrade and need retraining every 3-6 months, costing 20-30% of initial build cost per year. (2) Hiring: data scientists take 3-6 months to recruit in competitive markets. (3) Opportunity cost: every month without predictions is revenue left on the table. (4) Model debt: pipelines become fragile over time, requiring increasing engineering effort to maintain.

When is building custom ML the right choice?

Building makes sense when: you have a single high-stakes model that justifies deep investment, your data is genuinely unique (proprietary sensors, classified data), regulatory requirements demand full code-level auditability, or you need the ML system to be a competitive moat. For most enterprise prediction tasks (churn, fraud, demand), these conditions do not apply.

How quickly can I get predictions from a foundation model vs. building?

Building: first prediction in 3 to 6 months after project kickoff. Foundation model: first zero-shot prediction in minutes after database connection. Fine-tuned prediction in hours to days. The time difference is 100x to 1000x. For a company losing $2M per month to preventable churn, 5 months of faster deployment represents $10M in retained revenue.

See it in action

KumoRFM delivers predictions on relational data in seconds. No feature engineering, no ML pipelines. Try it free.