1. What Private Link solves
- Eliminates public endpoints for the Kumo UI/API; users and automation connect through a Private Link or Private Service Connect (PSC) endpoint inside your network.
- Keeps training and graph processing off the public internet: data is streamed over Private Link/PSC to Kumo-managed compute, processed in memory or on ephemeral storage, and discarded when the job finishes.
- Simplifies operations compared to fully air-gapped installs: Kumo hosts and maintains the control plane and compute services that you consume privately.
2. Architecture and isolation
Each customer receives a dedicated environment with two clear zones on AWS, GCP, or Azure:
- Customer zone (your VPC/VNet): Your data sources/destinations (warehouses, lakes, buckets) remain in your account. You own service accounts/roles and can restrict access paths to Private Link/PSC endpoints only.
- Kumo-managed zone: Control plane (UI, API, scheduler) and data plane (Spark, GPUs for training, graph engine) are hosted by Kumo. These services are reachable only via Private Link/Private Service Connect endpoints you create in your account/subscription. Data pulled for processing is kept in ephemeral storage with 0-day retention.
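As an illustration of restricting access paths to a private endpoint in the customer zone, the sketch below builds an S3 bucket policy that denies requests not arriving through a given VPC endpoint, using the AWS `aws:SourceVpce` condition key. The bucket name, endpoint ID, and the function `build_endpoint_only_policy` are hypothetical, and which endpoint ID applies in a given Kumo deployment is an assumption to confirm against the connector guides.

```python
import json


def build_endpoint_only_policy(bucket: str, vpce_id: str) -> dict:
    """Return a bucket policy denying S3 access unless it comes via vpce_id.

    Hypothetical helper for illustration; not part of any Kumo tooling.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsidePrivateEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # The deny applies whenever the request did NOT use this endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket name and endpoint ID.
    policy = build_endpoint_only_policy("example-training-data", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

A blanket deny like this also blocks console and other non-endpoint access to the bucket, so it is usually tested on a non-production bucket first.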

3. Identity and console access
- SSO via SAML or OIDC is required; MFA and device posture remain enforced by your IdP.
- Roles in Kumo are mapped to IdP groups; SCIM/JIT provisioning is available for lifecycle automation.
- Administrative access for Kumo support is optional and can be restricted to your VDI or vendor laptops during onboarding.
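The group-to-role mapping described above can be pictured with a minimal sketch. The group names, role names, and the precedence order are all hypothetical; the actual roles and mapping mechanism are defined in Kumo and your IdP, not by this code.

```python
from typing import Iterable, Optional

# Hypothetical IdP group names mapped to hypothetical role names.
GROUP_TO_ROLE = {
    "kumo-admins": "admin",
    "kumo-builders": "editor",
    "kumo-viewers": "viewer",
}

# Higher number means more privilege; used to pick one role per user.
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


def resolve_role(groups: Iterable[str]) -> Optional[str]:
    """Return the most privileged mapped role, or None if no group matches."""
    roles = [GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE]
    if not roles:
        return None  # Users with no mapped group get no access.
    return max(roles, key=lambda r: ROLE_RANK[r])
```

For example, a user in both `kumo-viewers` and `kumo-admins` resolves to `admin`, while a user in no mapped group resolves to `None`.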
4. Data flow and protection
The principle remains: your primary data stays in your platforms.
- Kumo connects to your warehouses/lakes (Snowflake, Databricks, BigQuery, S3, others) using least-privilege service accounts you own. Connector guides list the exact permissions.
- Training and scoring execute on Kumo-managed compute; data is streamed over Private Link/PSC, processed in memory or on ephemeral volumes, and dropped after the job completes (0-day retention). Outputs are written back to destinations you configure (tables, buckets, or downstream apps).
- All control plane ↔ data plane communication uses TLS over Private Link/PSC. Secrets and artifacts are encrypted at rest in your storage/KMS when written back; Customer-Managed Keys are supported if you want to bring your own KMS keys. Direct file upload can be disabled if you want data to come only from your own systems.
5. Connectivity and dependencies
- Required endpoints: Private Link/PSC endpoints for the Kumo control plane.
- No egress to the public internet: Kumo reads your data only through Private Link/PSC endpoints, and your users access Kumo only through private endpoints as well.
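One way to spot-check that a hostname is served through a private endpoint is to confirm that it resolves only to private addresses from inside your network. The sketch below uses Python's standard `socket` and `ipaddress` modules; the hostname in the usage comment is a placeholder, and a private address is only a hint, since it does not prove the route uses Private Link/PSC.

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve(hostname: str, port: int = 443) -> List[str]:
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: Iterable[str]) -> bool:
    """True only if there is at least one address and every one is private."""
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    parsed = [ipaddress.ip_address(a.split("%")[0]) for a in addresses]
    return bool(parsed) and all(ip.is_private for ip in parsed)


# Usage from a host inside your network (placeholder hostname):
#   addrs = resolve("kumo.example.internal")
#   print(addrs, all_private(addrs))
```

An empty result is treated as a failure so that a DNS misconfiguration is not mistaken for a passing check.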
6. Installation flow and customer responsibilities
- Create Private Link/PSC endpoints and network rules so that: (1) Kumo can reach your data warehouses/lakes that hold the relational data for model building, and (2) your employees can access the Kumo UI/API (including SDK) via your corporate network/VDI.
- Configure SSO, network policies, and logging/metrics destinations. Service accounts/roles for each data platform should follow the connector guides.
- Validate end-to-end with a smoke test: ingest → train → score, confirming that traffic stays on Private Link/PSC paths and that job data is not retained after execution.
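The smoke test can be organized as an ordered list of checks that stops at the first failure. The runner below is a minimal sketch: the step names follow the ingest → train → score sequence, but the check callables are hypothetical placeholders that you would replace with real calls to your own tooling or the Kumo SDK.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_smoke_test(steps: List[Step]) -> List[Tuple[str, bool]]:
    """Run steps in order and stop after the first one that fails.

    A step fails if its check returns a falsy value or raises an exception.
    """
    results: List[Tuple[str, bool]] = []
    for name, check in steps:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        results.append((name, ok))
        if not ok:
            break  # Later stages depend on earlier ones, so do not continue.
    return results


if __name__ == "__main__":
    # Placeholder checks; each lambda stands in for a real validation.
    steps = [
        ("ingest", lambda: True),
        ("train", lambda: True),
        ("score", lambda: True),
    ]
    for name, ok in run_smoke_test(steps):
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Network-path and retention checks, such as confirming private resolution or that no job data remains after the run, can be appended as further steps in the same list.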
7. Operations and lifecycle
- Upgrades are delivered by Kumo to the hosted control plane and data plane; compute nodes pull signed images over Private Link/PSC. No public egress is needed for updates, and no job data is retained after runs.
- Support access is time-bounded and can be provided via your VDI/vendor laptop pattern if you want installation assistance or ongoing maintenance.