
Kumo can run fully inside your Virtual Private Cloud (VPC/VNet) when you need customer-controlled infrastructure, private network access, enterprise identity, and strong separation between teams. This deployment runs the Kumo UI, API, authentication, data engine, and model execution services directly on Kubernetes in your AWS, Azure, or Google Cloud account.

The architecture is self-contained: Kumo services, compute, intermediate storage, model artifacts, backups, and operational data remain in your cloud environment. Compared with earlier VPC patterns, the deployment has fewer moving parts and minimal external dependencies: a Kubernetes cluster, customer cloud storage, private network access, identity provider integration, and the source data systems you choose to connect.

This page is intended for security, infrastructure, and data platform teams who want to understand how the VPC deployment behaves technically: what runs in your boundary, how users and data systems connect, and what your team must provision to install and operate it.

1. Architecture and isolation

Each customer receives a dedicated Kumo environment that runs inside their own cloud network:
  • Kumo application services run as containers in your Kubernetes cluster, such as EKS, AKS, or GKE.
  • The cluster runs multiple pods for the UI, API, authentication layer, data engine, and model execution services.
  • Model execution uses Kubernetes-managed compute, including GPU capacity for training and prediction workloads.
  • Customer cloud storage, such as S3, Google Cloud Storage, or Azure Data Lake Storage, holds intermediate artifacts, model binaries, prediction outputs, embeddings, logs, and automatic backups.
  • Users access Kumo through your private network path, typically corporate VPN, private DNS, and a customer-managed gateway or load balancer.
No customer data or metadata leaves your VPC/VNet unless you explicitly configure a connection to an external system such as your identity provider or source warehouse.

[Figure: High Level Architecture]

2. Identity, RBAC, and console access

Access to the Kumo console and APIs is integrated with your identity provider:
  • OIDC identity provider integration is supported for enterprise login.
  • MFA, device posture checks, and other access policies remain enforced by your identity provider.
  • Admin-configurable RBAC maps users and groups to Kumo projects and permissions.
  • User groups can be restricted to their own projects, and each project can be configured to access only specific datasets in the source warehouse or storage location.
This allows separate teams to use the same VPC deployment while maintaining strong separation between projects, source datasets, models, predictions, and artifacts.
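As an illustration, project-level RBAC of this kind can be modeled as a mapping from identity-provider group claims to Kumo projects. The group names, project names, and mapping below are hypothetical; the actual assignments are configured by your Kumo administrators through your OIDC integration.

```python
# Hypothetical sketch of group-claim-to-project RBAC.
# Group and project names are illustrative only, not Kumo's schema.

GROUP_TO_PROJECTS = {
    "fraud-analytics": {"fraud_detection"},
    "growth-ds": {"churn_prediction", "recommendations"},
    "platform-admins": {"*"},  # wildcard: admins may access every project
}

def allowed_projects(oidc_groups: list[str]) -> set[str]:
    """Union of projects granted by each group claim on the user's token."""
    projects: set[str] = set()
    for group in oidc_groups:
        projects |= GROUP_TO_PROJECTS.get(group, set())
    return projects

def can_access(oidc_groups: list[str], project: str) -> bool:
    """True if any of the user's groups grants the project (or wildcard)."""
    granted = allowed_projects(oidc_groups)
    return "*" in granted or project in granted
```

A user carrying only the `fraud-analytics` group claim would reach `fraud_detection` but not `churn_prediction`, which is the separation property the deployment relies on.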

3. Data sources and protection

The design principle is simple: your primary data stays in your platforms, and Kumo reads only through the access paths you configure. Supported data sources include:
  • Parquet files in customer cloud storage.
  • Direct upload, if enabled by your administrators.
  • Standard cloud data warehouses and lakehouse platforms, including BigQuery, Snowflake, and Databricks.
Kumo connects to each source using customer-owned permissions, such as service accounts, warehouse roles, storage roles, or managed identities. For team-separated deployments, those permissions can be scoped by project so different user groups can only access the datasets assigned to them.

Training, evaluation, prediction, and embedding jobs run inside your VPC/VNet. Outputs are written back to destinations you configure, such as warehouse tables or customer cloud storage. Data in transit is protected with TLS, and secrets, model artifacts, logs, backups, and intermediate data are stored using your cloud storage and encryption controls.
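The per-project scoping described above amounts to an allowlist check applied before any read against a source: each project carries the set of datasets its credentials may touch. The dataset identifiers and schema below are assumptions for illustration, not Kumo's actual configuration format.

```python
# Illustrative per-project dataset allowlist; names and structure are
# assumptions, not Kumo's real configuration schema.

PROJECT_DATASETS = {
    "fraud_detection": {"warehouse.transactions", "warehouse.accounts"},
    "churn_prediction": {"warehouse.subscriptions", "s3://bucket/events/"},
}

def dataset_allowed(project: str, dataset: str) -> bool:
    """True only if the dataset is in the project's configured scope."""
    return dataset in PROJECT_DATASETS.get(project, set())
```

In practice the enforcement lives in the customer-owned permissions themselves (warehouse roles, storage roles, managed identities); this sketch just shows the access decision those scoped credentials produce.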

4. Network and dependencies

The VPC deployment is self-contained inside your cloud account. It does not require a Kumo-hosted control plane or Kumo-managed data plane outside your VPC/VNet. Runtime dependencies are limited to the Kubernetes cluster, customer cloud storage, private employee access, OIDC login, and the data platforms you choose to connect. Container images and updates can be delivered through the registry or image distribution process approved by your security team. Kubernetes resources can be managed with Kumo-provided Helm charts or through GitOps tooling; Flux configurations are available.

5. Installation flow and customer responsibilities

The deployment follows an infrastructure-as-code approach. Typical prerequisites:
  • A Kubernetes cluster with the node pools, storage classes, and ingress components required for the expected workload.
  • Permissions for Kumo workloads to request or launch appropriate GPU resources on demand through your Kubernetes autoscaling pattern.
  • An object storage bucket or prefix for intermediate data, model artifacts, predictions, embeddings, logs, and backups.
  • Network rules, DNS, and gateway or load balancer configuration that allow employees to reach the Kumo UI and API from the corporate network.
  • OIDC configuration for your identity provider, including the user and group claims Kumo should use for RBAC.
  • Service accounts, roles, managed identities, or warehouse permissions for each data source and destination the deployment should access.
Install steps are typically:
  • Provision the Kubernetes cluster, storage locations, network access, and identity provider application.
  • Install Kumo into a dedicated namespace using Kumo-provided Helm charts or GitOps configuration, with the required service accounts, secrets, and deployment values.
  • Configure OIDC, project-level RBAC, data source permissions, and output destinations.
  • Smoke test data access, training, prediction, and writeback paths for each team or project.
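For orientation, a Helm-based install of this shape typically reduces to a values file plus a namespaced release. Every key below is hypothetical, written only to show the kinds of settings the prerequisites above feed into; consult the Kumo-provided charts for the real schema.

```yaml
# Hypothetical values sketch; key names are illustrative, not the real
# chart schema. Replace example hosts, buckets, and secrets with your own.
oidc:
  issuerUrl: https://idp.example.com
  clientSecretRef: kumo-oidc-secret   # Kubernetes secret provisioned by you
storage:
  bucket: s3://example-bucket/kumo/   # intermediate data, artifacts, backups
ingress:
  host: kumo.internal.example.com     # reachable only on the corporate network
gpu:
  nodeSelector:
    node.kubernetes.io/instance-type: g4dn.2xlarge
```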

6. Estimated infrastructure cost

The Kubernetes deployment separates always-on application services from GPU resources used for jobs. The estimates below use AWS us-east-1 on-demand Linux pricing as a reference point and assume the control plane runs on an r6g.2xlarge node at about $0.4032/hr (~$294/month). GPU nodes are launched only when training or prediction jobs need them, so GPU capacity is consumed only while jobs are running. Fixed overhead is the estimated always-on non-GPU baseline for the deployment: the control-plane node, EKS cluster management fee, load balancer, estimated cloud storage cost, and routine logs or metrics.
| Usage pattern | Dataset size | GPU instance | Jobs/month | GPU hours/job | Est. GPU cost/month | Fixed overhead/month | Est. total/month |
|---|---|---|---|---|---|---|---|
| Low usage | < 10 GB | g4dn.2xlarge | 15 | 1 | ~$11 | ~$450 | ~$461 |
| Medium usage | < 300 GB | g4dn.8xlarge | 30 | 4 | ~$261 | ~$550 | ~$811 |
| High usage | > 1 TB | g7e.12xlarge | 60 | 10 | ~$4,972 | ~$750 | ~$5,722 |
Actual costs depend on region, selected GPU family, job duration, storage volume, retention policy, network transfer, warehouse compute, and whether your organization uses savings plans, reserved instances, or spot capacity.
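The table's arithmetic is straightforward: monthly GPU cost is jobs per month × GPU hours per job × the instance's hourly rate, and the total adds the fixed overhead. A quick sketch using published AWS us-east-1 on-demand rates for the g4dn instances (reference figures, not a quote):

```python
def monthly_estimate(jobs_per_month: int, gpu_hours_per_job: float,
                     gpu_hourly_rate: float, fixed_overhead: float) -> float:
    """Estimated monthly cost: on-demand GPU hours plus always-on baseline."""
    gpu_cost = jobs_per_month * gpu_hours_per_job * gpu_hourly_rate
    return gpu_cost + fixed_overhead

# Low usage: 15 jobs x 1 GPU-hour on g4dn.2xlarge (~$0.752/hr) + ~$450 baseline
low = monthly_estimate(15, 1, 0.752, 450)     # ~ $461/month
# Medium usage: 30 jobs x 4 GPU-hours on g4dn.8xlarge (~$2.176/hr) + ~$550 baseline
medium = monthly_estimate(30, 4, 2.176, 550)  # ~ $811/month
```

Because GPU nodes exist only while jobs run, the GPU term scales linearly with job count and duration, while the fixed overhead is what you pay even in an idle month.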

7. Operations and lifecycle

  • Upgrades are coordinated through your change management process and deployed into your Kubernetes cluster using the approved image delivery path.
  • Observability can be connected to your logging and metrics stack, such as CloudWatch, Cloud Logging, Azure Monitor, Datadog, or Splunk.
  • Support can be delivered through your approved access pattern, such as VDI or a time-bounded customer-controlled account; no persistent Kumo access is required.
  • Kubernetes lets the deployment scale operational components and run GPU capacity on demand, which can improve reliability and cost efficiency for production automation and larger datasets.
If you want to evaluate the Virtual Private Cloud deployment, your Kumo representative can share a tailored bill of materials for your cloud, identity provider, network pattern, data sources, and expected workload size.