Yobitel NeoCloud

TL;DR

Yobitel NeoCloud is the sovereign Tier III+ GPU cloud that sits underneath Yobibyte and is also available as a standalone capacity service — multi-tenant on-demand, dedicated single-tenant, and confidential-compute tenancies, with a per-second pricing model in USD.
Sovereignty leads: UK NCSC OFFICIAL alignment in primary regions, EU Data Boundary regions for EU workloads, and US FedRAMP-equivalent partner regions for US-regulated workloads. Admission refuses to spill workloads across boundaries regardless of capacity or price.
Accelerator coverage spans NVIDIA B300/B200/H200/H100/A100/L40S/L4 and AMD MI300X, with high-bandwidth AI fabric (NVLink within node, InfiniBand within rack, RoCEv2 across racks) and Tier III+ facility design at sub-1.1 PUE.
Customer-facing surfaces: a console for reservation, spot, and on-demand provisioning; a per-region capacity catalogue; DCGM-emitted metrics into customer Prometheus; FOCUS 1.1 billing export to customer-owned object storage; the standard Yobibyte workspace boundary for tenancy.
Yobitel operates the facility, the fleet, the fabric, and the platform end-to-end; the customer owns the data, the workload, and the consumption decisions. A 99.99% uptime SLA backs the production tier.

Overview

GPU capacity has become the gating constraint for production AI, and the constraint is not uniform. A UK trust cannot place patient data in a US region. A European bank cannot leave the EU Data Boundary. A US federal customer cannot consume capacity that has not cleared a moderate-baseline assessment. A FinOps team cannot defend a procurement decision that does not pass an open price test. Each of those constraints has historically forced a separate cloud relationship; the platform team's job has been to keep the pieces aligned by hand.

Yobitel NeoCloud is the sovereign Tier III+ GPU cloud Yobitel runs to solve the constraint set rather than the capacity question alone. Primary regions in the UK align to NCSC Cloud Security Principles and the OFFICIAL classification; EU regions sit inside the EU Data Boundary; US regions, delivered through FedRAMP-equivalent partner facilities, align to the moderate-baseline control set. A workload pinned to a sovereignty region never spills across the boundary, regardless of capacity or price — admission refuses non-eligible placement at the platform layer.

The capacity layer covers the SKUs production teams actually need. NVIDIA B300, B200, H200, H100, A100, L40S, and L4 cover the bulk of training and inference; AMD MI300X covers the open-accelerator path; AI fabric (NVLink within node, InfiniBand within rack, RoCEv2 across racks) supports up to 2,048-node distributed training; Tier III+ facility design at sub-1.1 PUE keeps the energy cost of compute defensible. The customer-facing surface is a console for reservation, spot, and on-demand consumption, a per-region capacity catalogue, and the standard Yobibyte workspace boundary for tenancy and identity.

Yobitel Communications, the UK-headquartered AI infrastructure company that operates NeoCloud, runs the facility, the fleet, the fabric, and the platform end-to-end. The customer owns the workload, the data, the encryption keys, and the consumption decisions. The 99.99% uptime SLA on the production tier carries financial credits. Per-second billing is exported via the FOCUS 1.1 specification into customer-owned object storage. Observability is delivered through DCGM and OpenTelemetry on stable metric names, so the customer's existing Prometheus and Grafana dashboards keep working without changes.

Quick start

Provisioning capacity on NeoCloud is a console workflow today. Sign in to the Yobitel console with your corporate identity provider, open NeoCloud under the main navigation, and choose three things: the region, the accelerator, and the term. The console quote is live against current capacity and shows sovereignty eligibility next to every result.

The customer first picks a region from the per-region catalogue. Each region carries a sovereignty tag (UK NCSC OFFICIAL, EU Data Boundary, US FedRAMP-equivalent), a facility tier (Tier III+ is the production default), and a current capacity heat map by accelerator. Next, the customer picks an accelerator (B300, B200, H200, H100, A100, L40S, L4, or MI300X) and a tenancy model (`shared` multi-tenant, `dedicated` single-tenant, or `confidential` with TEE attestation). Finally, the customer picks a term — on-demand by the second, reservation by the month or year, or spot for fault-tolerant workloads — and a USD spend cap.

Once provisioned, the workload runs inside the customer's Yobibyte workspace (or a standalone NeoCloud workspace if the customer is consuming capacity without adopting Yobibyte). Identity federates from any OIDC IdP, telemetry flows into customer Prometheus through the DCGM exporter, and billing lands hourly in the customer's FOCUS 1.1 export bucket.

Spot capacity on NeoCloud is per-region rather than per-account; the spot floor moves with regional supply and demand. Set a spend cap on every spot reservation — when the floor moves above the cap, the reservation simply rolls off rather than running at an unexpected price.

Concepts

NeoCloud exposes a small set of customer-facing concepts that map to how platform and procurement teams think about sovereign GPU capacity. The mental model is region (and its sovereignty pin) at the top, with tenancy and term as the two key dimensions for any consumption decision underneath.

Region — the geographic placement of capacity, bound to a sovereignty tag. UK regions align to NCSC Cloud Security Principles and the OFFICIAL classification; EU regions sit inside the EU Data Boundary; US regions, delivered through FedRAMP-equivalent partner facilities, align to the moderate-baseline control set.
Tenancy — the isolation model for a workload. `shared` runs on multi-tenant nodes with cgroup and MIG isolation (the right default for most workloads). `dedicated` pins to single-tenant nodes for regulatory or noisy-neighbour reasons. `confidential` runs on NVIDIA confidential-compute mode with TEE attestation so encryption keys never leave the GPU.
Reservation — committed capacity for a defined term (monthly, 1-year, or 3-year) at a discounted rate. Used for steady production inference and known-shape training reservations.
Spot — opportunistic capacity available at a discount, reclaimable on short notice. Used for fault-tolerant fine-tunes, batch scoring, and shadow training. Per-region spot floors move with supply and demand.
On-demand — uncommitted per-second capacity at the headline rate. Used for bursty workloads and pre-reservation prototyping.
AI Fabric — the high-bandwidth interconnect that ties accelerators together. NVLink-fabric within a single node (typical for 8-GPU single-host workloads), InfiniBand-rack within one rack (typical for distributed training up to 64 nodes), and RoCEv2-DC across racks (for cluster sizes beyond a single rack). The customer picks the topology requirement; the platform satisfies it from the regional fabric.
Spend Cap — a hard USD budget set per reservation, per workspace, or per organisation. When reached, on-demand consumption pauses and a P2 alert fires; reservations continue to bill until the term ends.
Compliance Tag — the regulatory posture the workload is pinned to. NCSC OFFICIAL, EU Data Boundary, FedRAMP-equivalent, HIPAA-eligible, ISO 27001-certified, SOC 2 Type II attested. Admission refuses placement outside the tagged set.
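
The Spend Cap rule above can be sketched as a small model. This is an illustration under stated assumptions, not NeoCloud code: the `SpendCap` class and its method names are hypothetical, and treating every non-reserved term like on-demand is an assumption, since the text only states that on-demand pauses and reservations keep billing.

```python
from dataclasses import dataclass


@dataclass
class SpendCap:
    """Hypothetical model of a hard USD budget for one budget window."""
    amount_usd: float
    spent_usd: float = 0.0

    def record(self, charge_usd: float) -> None:
        # Every charge counts toward the total for the current window.
        self.spent_usd += charge_usd

    @property
    def reached(self) -> bool:
        return self.spent_usd >= self.amount_usd

    def may_run(self, term: str) -> bool:
        # Reservations bill until the term ends, so the cap never pauses them.
        if term.startswith("reserved-"):
            return True
        # Assumption: any other term is paused like on-demand once the cap is reached.
        return not self.reached


cap = SpendCap(amount_usd=100.0)
cap.record(60.0)
print(cap.may_run("on-demand"))      # under the cap, so consumption continues
cap.record(40.0)
print(cap.may_run("on-demand"))      # cap reached, so on-demand pauses
print(cap.may_run("reserved-1y"))    # the reservation is unaffected
```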

Reference: customer-facing API and console fields

Customers who prefer a declarative consumption flow can submit reservations through the NeoCloud API. The fields below are the full customer-facing surface; the underlying scheduler, admission engine, and fleet are operated by Yobitel and are not customer-tunable.

Field	Type	Description
reservation.region	string	Sovereignty region (e.g. `uk-london`, `uk-manchester`, `eu-frankfurt`, `eu-paris`, `us-ashburn`).
reservation.accelerator	enum	`b300`, `b200`, `h200`, `h100-sxm5`, `h100-pcie`, `a100`, `l40s`, `l4`, `mi300x`.
reservation.count	integer	Number of accelerators. For distributed training, the count is expressed in nodes (8-GPU each).
reservation.tenancy	enum	`shared`, `dedicated`, or `confidential`.
reservation.term	enum	`on-demand`, `spot`, `reserved-monthly`, `reserved-1y`, `reserved-3y`.
reservation.fabric	enum	`nvlink-node`, `infiniband-rack`, `infiniband-row`, `rocev2-dc`, or `none` for single-accelerator workloads.
reservation.complianceTags	string[]	Required tags (e.g. `ncsc-official`, `eu-data-boundary`, `hipaa`).
reservation.spendCap	object	USD budget with `amount` and `window` (`hourly`, `daily`, or `monthly`).
reservation.startTime	datetime	Requested start. Reservations queue if the region does not have capacity at the requested time.
reservation.endTime	datetime	Requested end (optional for on-demand and reserved terms).
identity.oidcIssuer	string	Workspace OIDC issuer for identity federation.
identity.scimEndpoint	string	Optional SCIM 2.0 endpoint for user and group provisioning.
observability.prometheusScrape	boolean	Exposes the standard `yobitel_*` metric set on a workspace-scoped scrape endpoint.
observability.dcgmExporter	boolean	DCGM exporter enabled per accelerator.
observability.otelEndpoint	string	Customer-owned OTel collector endpoint.
billing.focusExportBucket	string	Customer-owned object-storage bucket for FOCUS 1.1 hourly export.
billing.focusKmsKey	string	Customer-managed KMS key the export is encrypted with at rest.
network.cidrAllowList	string[]	Optional inbound CIDR allow-list at the workspace gateway.
network.privateConnectivity	enum	`internet`, `aws-directconnect`, `azure-expressroute`, `gcp-interconnect`, or `dedicated-cross-connect`.
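
A client-side check of a reservation document against the enums in the table can catch typos before submission. The sketch below is hedged: the nested JSON shape (dotted field names mapped to nested objects), the choice of which fields are mandatory, and every value in `example` are assumptions for illustration, and no API endpoint is shown because the source does not name one.

```python
import json

# Enum values copied from the field table above.
ACCELERATORS = {"b300", "b200", "h200", "h100-sxm5", "h100-pcie",
                "a100", "l40s", "l4", "mi300x"}
TENANCIES = {"shared", "dedicated", "confidential"}
TERMS = {"on-demand", "spot", "reserved-monthly", "reserved-1y", "reserved-3y"}
FABRICS = {"nvlink-node", "infiniband-rack", "infiniband-row", "rocev2-dc", "none"}
CAP_WINDOWS = {"hourly", "daily", "monthly"}


def validate_reservation(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields look well-formed."""
    r = doc.get("reservation") or {}
    problems = []
    for key, allowed in [("accelerator", ACCELERATORS), ("tenancy", TENANCIES),
                         ("term", TERMS), ("fabric", FABRICS)]:
        if r.get(key) not in allowed:
            problems.append(f"reservation.{key}: {r.get(key)!r} is not a listed value")
    count = r.get("count")
    if not isinstance(count, int) or count < 1:
        problems.append("reservation.count must be a positive integer")
    cap = r.get("spendCap") or {}
    if cap.get("window") not in CAP_WINDOWS:
        problems.append("reservation.spendCap.window must be hourly, daily, or monthly")
    amount = cap.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        problems.append("reservation.spendCap.amount must be a positive USD amount")
    return problems


# Illustrative values only.
example = {
    "reservation": {
        "region": "uk-london",
        "accelerator": "h200",
        "count": 16,
        "tenancy": "shared",
        "term": "reserved-1y",
        "fabric": "infiniband-rack",
        "complianceTags": ["ncsc-official"],
        "spendCap": {"amount": 50000, "window": "monthly"},
    }
}

print(validate_reservation(example))
print(json.dumps(example, indent=2))
```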

Workload patterns

Three patterns cover most customers. The patterns differ on term type, tenancy, and topology; the customer-facing console and API are the same across all three.

Pattern A — training-cluster reservation. A research team reserves a 64-node H100 cluster on InfiniBand fabric in `uk-london` for a 3-week training run, with confidential tenancy and an immutable FOCUS export to the team's S3 bucket. The reservation queue surfaces the start time once the rack is allocated; the team's workload runs inside its Yobibyte workspace.
Pattern B — production inference tenancy. A SaaS product reserves a steady pool of 16 H200 cards on `shared` tenancy in `uk-london` and `eu-frankfurt`, with autoscaling spot capacity layered on top for traffic peaks. The Yobibyte workspace draws on the reservation first and bursts to spot when steady capacity is saturated.
Pattern C — air-gapped sovereign deployment. A regulated UK customer takes a dedicated tenancy block in a Tier III+ enclave with no internet egress, NCSC OFFICIAL-SENSITIVE alignment, and on-premise audit export. The customer's workspace runs inside the enclave; Yobitel operates the facility under a NeoCloud Operations engagement.

Sizing and capacity tiers

Per-region capacity is grouped into tiers so customers can plan reservations against published bands. The tier list below covers the most common request shapes; capacity beyond these tiers is available through the account team and a reservation lead-time.

Tier	Typical use	Reservation scale	Term
Pilot	Single-team prototype, evaluation	1 - 8 accelerators	On-demand or monthly reservation
Production-small	Steady inference for a single product	8 - 64 accelerators	Monthly or 1-year reservation
Production-medium	Multi-product platform, mixed inference and fine-tune	64 - 256 accelerators	1-year or 3-year reservation
Production-large	Org-wide AI platform, multi-region	256 - 1,024 accelerators	1-year or 3-year reservation with multi-region split
Training-cluster	Distributed training on InfiniBand fabric	8 - 512 nodes (64 - 4,096 GPUs)	Reserved 2-12 week training block
Reserved-frontier	Frontier model training, multi-thousand-GPU clusters	512 - 2,048 nodes (4,096 - 16,384 GPUs)	Reserved 3-month to 1-year training block; lead time required
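
The node-based bands translate to GPU counts at 8 GPUs per node. A minimal sketch of that arithmetic follows; the function names are hypothetical, and because the two bands share the 512-node boundary in the table, the lookup returns every band that matches.

```python
GPUS_PER_NODE = 8  # nodes are 8-GPU each, per the reservation.count description

# (tier, min_nodes, max_nodes) for the two node-based tiers in the table above.
NODE_TIERS = [
    ("Training-cluster", 8, 512),
    ("Reserved-frontier", 512, 2048),
]


def gpus_for_nodes(nodes: int) -> int:
    """Convert a node count into a GPU count."""
    return nodes * GPUS_PER_NODE


def node_tiers(nodes: int) -> list[str]:
    """Return the names of every published band that contains this node count."""
    return [name for name, low, high in NODE_TIERS if low <= nodes <= high]


for n in (64, 512, 2048):
    print(n, "nodes =", gpus_for_nodes(n), "GPUs:", node_tiers(n))
```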

Limits and quotas

Default limits keep pilot and onboarding workloads predictable; almost every ceiling is raisable through self-service or a support request once the customer moves past pilot. The hard ceilings exist where the underlying primitive imposes them (e.g. per-region capacity, fabric fan-out).

Resource	Default	Enterprise ceiling	How to raise
GPUs per workspace	32	16,384	Self-service up to 256; ticket beyond.
Workspaces per organisation	10	200	Self-service.
Reservations per workspace	20	500	Self-service.
Concurrent regions per workspace	3	12	Self-service.
Spot reservations per workspace	10	200	Self-service; per-region spot floors apply.
Reservation lead time (pilot tier)	Immediate	Immediate	On-demand capacity is provisioned in seconds when available.
Reservation lead time (frontier tier)	30 days	30 - 90 days	Frontier reservations require lead time and account-team coordination.
Network egress to internet	10 TB/month per workspace included; metered beyond	1 PB/month	Self-service; FOCUS export shows egress per workspace.
Egress between NeoCloud regions	Unmetered	Unmetered	Hard floor; no inter-region egress charge.
Confidential-tenancy quota	8 GPUs	1,024 GPUs	Support request; confirms TEE attestation supply.
Spend cap precision	USD 1	USD 1	Hard floor.
FOCUS export retention	13 months	7 years	Configurable in console; customer storage policy governs.
DCGM scrape interval	10 seconds	1 second	Configurable per workspace.

Observability

NeoCloud emits three telemetry streams for every customer workload. DCGM (NVIDIA Data Center GPU Manager) metrics cover GPU SM occupancy, HBM usage, power draw, and NVLink throughput. Yobitel-namespaced metrics under `yobitel_neocloud_*` cover reservation state, fabric health, and per-region capacity. OpenTelemetry traces cover the workspace gateway and the platform control plane.

All three streams are scrape-compatible with customer Prometheus and OTel collectors. Most customers add the alert below within the first week — it catches the 'reservation is allocated but the fabric is degraded' failure mode that simple liveness probes miss, and pages on it before a distributed training run produces silent corruption.

```yaml
groups:
- name: neocloud-fabric
  interval: 30s
  rules:
  # Page when any link in a reservation's fabric reports an elevated error rate.
  - alert: FabricDegraded
    expr: |
      max by (workspace, reservation, fabric) (
        yobitel_neocloud_fabric_link_error_rate
      ) > 1e-9
    for: 5m
    labels: { severity: page }
    annotations:
      summary: "{{ $labels.reservation }} fabric error rate above 1e-9"
      runbook: https://docs.yobitel.com/neocloud/runbooks/fabric-degraded

  # Page when a reservation has been short of its committed capacity for 10 minutes.
  - alert: ReservationCapacityShort
    expr: |
      yobitel_neocloud_reservation_capacity_short_seconds > 0
    for: 10m
    labels: { severity: page }
    annotations:
      summary: "{{ $labels.reservation }} short of committed capacity"

  # Open a ticket when compliance-related admission denials persist for 30 minutes.
  - alert: SovereigntyAdmissionDenied
    expr: |
      rate(yobitel_neocloud_admission_denied_total{reason="compliance"}[15m]) > 0
    for: 30m
    labels: { severity: ticket }
```

DCGM is NVIDIA's Data Center GPU Manager, and its open-source exporter is supported on every major Kubernetes distribution. NeoCloud emits DCGM on the same metric names as a self-managed deployment, so customers who already have Prometheus and Grafana setups can keep their dashboards.
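
The combination of a threshold and a hold period in the FabricDegraded rule can be reasoned about offline with a small model. The sketch below approximates the `for: 5m` behaviour on a list of samples; it is not how Prometheus evaluates rules internally, and the `should_fire` helper and its sample format are hypothetical.

```python
THRESHOLD = 1e-9  # same threshold as the FabricDegraded expression


def should_fire(samples, hold_seconds=300):
    """Return True if the error rate has stayed above THRESHOLD for at least
    hold_seconds, measured up to the most recent sample.

    samples: list of (unix_timestamp, error_rate) tuples sorted by time.
    """
    breach_start = None
    for ts, rate in samples:
        if rate > THRESHOLD:
            # Remember when the current unbroken breach began.
            if breach_start is None:
                breach_start = ts
        else:
            # Any healthy sample resets the hold period.
            breach_start = None
    return breach_start is not None and samples[-1][0] - breach_start >= hold_seconds


# One sample every 30 seconds for five minutes, all above the threshold.
steady = [(t, 2e-9) for t in range(0, 301, 30)]
# The same series with one healthy sample in the middle.
dipped = [(t, 0.0 if t == 150 else 2e-9) for t in range(0, 301, 30)]

print(should_fire(steady), should_fire(dipped))
```

A single healthy sample resets the hold period, which is why a flapping link may not page under this rule even though it is intermittently degraded.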

Cost and FinOps

Pricing is per-second for GPUs on the on-demand and spot terms, per-month for reservations, and unmetered for inter-region egress between NeoCloud regions. The table below is indicative for mid-2026 UK regions; the account team confirms current rates and any multi-year or volume discounts before contract. All numbers are USD; the FOCUS 1.1 billing export carries the standard column set (BilledCost, EffectiveCost, ListCost, ChargePeriod*, ServiceCategory, SubAccountId, Tags) so the data drops directly into a Cloudability, Apptio, Vantage, or customer-built lakehouse pipeline.

SKU	On-demand $/GPU/hr	1-yr reserved $/GPU/hr	3-yr reserved $/GPU/hr	Spot floor
NVIDIA B200 192GB	$6.00	$4.50	$3.60	n/a
NVIDIA H200 141GB	$4.25	$3.20	$2.55	$1.75
NVIDIA H100 SXM5 80GB	$3.25	$2.45	$1.95	$1.20
NVIDIA H100 PCIe 80GB	$2.95	$2.20	$1.75	$1.05
NVIDIA A100 80GB	$2.25	$1.70	$1.40	$0.80
NVIDIA L40S 48GB	$1.20	$0.90	$0.70	$0.45
NVIDIA L4 24GB	$0.50	$0.40	$0.30	$0.22
AMD MI300X 192GB	$4.00	$3.00	$2.45	n/a
Object storage (per GB-month)	$0.022	—	—	—
Egress to internet (per GB)	$0.075	—	—	—
Egress between NeoCloud regions (per GB)	$0.00	—	—	—
Confidential-tenancy surcharge (per GPU/hr)	$0.40	$0.30	$0.24	n/a
Dedicated-tenancy surcharge (per GPU/hr)	$0.30	$0.22	$0.18	n/a
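
Once the FOCUS export lands in the customer bucket, a per-workspace roll-up takes only a few lines. The sketch below assumes the export is CSV and that each workspace appears as a `SubAccountId`; neither is confirmed by the source, and the sample rows are invented for illustration. Only the column names come from the FOCUS column set listed above.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Invented sample rows using FOCUS column names.
SAMPLE = """SubAccountId,ServiceCategory,BilledCost,EffectiveCost
ws-train,Compute,65.00,49.00
ws-train,Storage,2.20,2.20
ws-infer,Compute,34.00,25.60
"""


def effective_cost_by_subaccount(focus_csv: str) -> dict:
    """Sum EffectiveCost per SubAccountId, using Decimal to avoid float drift."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(focus_csv)):
        totals[row["SubAccountId"]] += Decimal(row["EffectiveCost"])
    return dict(totals)


for account, total in effective_cost_by_subaccount(SAMPLE).items():
    print(account, total)
```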

Reservations are billed for the full term regardless of utilisation; on-demand consumption pauses when a spend cap is reached. Customers running steady-state inference often combine a smaller reservation footprint with on-demand burst to keep the reservation utilisation high while keeping headroom available.
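
The trade-off between reservation and on-demand burst can be worked through with the indicative H100 SXM5 rates from the table. This is a sketch, not a pricing tool: the function names and the 24-hour demand profile are hypothetical, and surcharges, spot, and volume discounts are ignored.

```python
ON_DEMAND = 3.25    # H100 SXM5 on-demand, USD per GPU-hour (indicative table rate)
RESERVED_1Y = 2.45  # H100 SXM5 1-year reserved, USD per GPU-hour (indicative table rate)


def per_second_cost(rate_per_hour, seconds, gpus=1):
    """Cost of per-second billed usage at an hourly list rate."""
    return rate_per_hour / 3600 * seconds * gpus


def break_even_utilisation(reserved_rate, on_demand_rate):
    """Fraction of the term a GPU must be busy before reserving beats on-demand."""
    return reserved_rate / on_demand_rate


def blended_cost(hourly_demand, reserved_gpus,
                 reserved_rate=RESERVED_1Y, on_demand_rate=ON_DEMAND):
    """Reserved GPUs bill every hour regardless of use; demand above them bursts to on-demand."""
    reserved = reserved_gpus * reserved_rate * len(hourly_demand)
    burst = sum(max(0, need - reserved_gpus) for need in hourly_demand) * on_demand_rate
    return reserved + burst


# Hypothetical day: 8 GPUs needed for 18 hours, 16 GPUs for a 6-hour peak.
demand = [8] * 18 + [16] * 6

print(f"90 s on 8 GPUs on-demand: ${per_second_cost(ON_DEMAND, 90, gpus=8):.2f}")
print(f"break-even utilisation: {break_even_utilisation(RESERVED_1Y, ON_DEMAND):.1%}")
for reserved in (0, 8, 16):
    print(f"reserve {reserved:2d} GPUs: ${blended_cost(demand, reserved):.2f} per day")
```

In this profile, reserving the steady 8 GPUs and bursting for the peak costs less than either reserving for the peak or running everything on-demand, which is the pattern the paragraph above describes.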

Security and compliance

Sovereignty is enforced at admission. A workload labelled `compliance=ncsc-official` is rejected if it targets a non-UK region; a workload labelled `compliance=eu-data-boundary` is rejected if it targets outside the EU; a workload labelled `compliance=fedramp-equiv` is rejected if it targets outside the US partner regions. NeoCloud does not silently spill workloads across compliance boundaries regardless of capacity or price.
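
The three rejection rules above amount to a lookup from compliance label to permitted regions. The sketch below is a simplified illustration and not the platform's admission engine: it assumes region names carry their boundary as a prefix (`uk-`, `eu-`, `us-`), which is inferred from the example region names in this document, and the `admit` function is hypothetical.

```python
# Region prefixes permitted for each sovereignty label named in the text above.
# Assumption: the prefix of a region name identifies its sovereignty boundary.
ALLOWED_PREFIXES = {
    "ncsc-official": ("uk-",),
    "eu-data-boundary": ("eu-",),
    "fedramp-equiv": ("us-",),
}


def admit(region: str, compliance_tags: list[str]) -> bool:
    """Return False if any sovereignty label forbids the target region."""
    for tag in compliance_tags:
        prefixes = ALLOWED_PREFIXES.get(tag)
        if prefixes is None:
            # Labels without a geographic rule in this sketch do not restrict placement.
            continue
        if not region.startswith(prefixes):
            return False
    return True


print(admit("uk-london", ["ncsc-official"]))      # UK label, UK region
print(admit("eu-frankfurt", ["ncsc-official"]))   # UK label, EU region
print(admit("us-ashburn", ["hipaa"]))             # no geographic rule in this sketch
```

Note that the check never consults capacity or price, which mirrors the refusal to spill described above.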

UK frameworks lead the compliance posture — NCSC Cloud Security Principles, G-Cloud, OFFICIAL and OFFICIAL-SENSITIVE classifications, Cyber Essentials Plus — before EU frameworks (GDPR, EU Data Boundary, DORA-aligned where customers require it) and US frameworks (FedRAMP-equivalent, HIPAA BAA, HITRUST alignment where applicable). The full attestation set is available under NDA through the account team.

NCSC Cloud Security Principles — controls mapped per principle for UK primary regions; OFFICIAL-tier audited annually.
G-Cloud — listed under Cloud Hosting (Lot 1) and Cloud Software (Lot 2); orderable through the Crown Commercial Service framework.
Cyber Essentials Plus — current certificate.
ISO 27001:2022, ISO 27017, ISO 27018 — current certificates.
SOC 2 Type II — annual third-party audit covering security, availability, confidentiality.
GDPR / UK DPA 2018 — DPA, sub-processor list, EU SCCs available; data residency enforced at admission.
EU Data Boundary — EU regions sit inside the boundary; admission refuses spill.
DORA — operational-resilience evidence available for financial-services customers.
FedRAMP-equivalent — moderate-baseline-aligned controls available via US partner regions.
HIPAA — BAA available for healthcare workloads.
Confidential tenancy — NVIDIA confidential-compute mode with TEE attestation; encryption keys never leave the GPU.

Migration and alternatives

NeoCloud is one option in a small set of credible sovereign GPU cloud providers. The comparison below is an honest read on when NeoCloud, a US-anchored neocloud (CoreWeave or Lambda), or a UK-historic sovereign cloud (UKCloud) fits best.

Concern	Yobitel NeoCloud	CoreWeave	Lambda	UKCloud
UK NCSC OFFICIAL primary regions	Yes, multiple	No (US-anchored)	No (US-anchored)	Yes, historic
EU Data Boundary regions	Yes	Limited	No	No
US FedRAMP-equivalent regions	Yes via partner	Yes	Limited	No
NVIDIA B200/H200/H100 coverage	B300, B200, H200, H100	H100, H200, GB200	H100, H200	Limited
AMD MI300X coverage	Yes	Limited	Yes	No
Confidential tenancy	NVIDIA TEE attestation	Limited	No	No
Per-second billing in USD	Yes	Yes (USD)	Yes (USD)	GBP-anchored
FOCUS 1.1 billing export	Yes	Proprietary	Proprietary	Proprietary
Sovereignty enforcement at admission	Yes	Region-only	Region-only	Yes
AI fabric (InfiniBand-rack, RoCEv2-DC)	Yes	Yes	Yes (NVLink-fabric, InfiniBand)	Limited
Spot floor	Yes (per-region floor)	Yes	Yes	No
Yobibyte managed surface available	Native	No	No	No

Troubleshooting

The errors below are the most common during onboarding and the first weeks of production. The full runbook library is at docs.yobitel.com/neocloud/runbooks.

Error	Cause	Fix
ReservationProvisioningFailed: capacity-short	Requested region does not currently have the requested SKU and term available.	Either accept the reservation queue (NeoCloud auto-provisions when capacity frees), pre-purchase via the Capacity tab, or move to a sibling region inside the same sovereignty boundary.
RegionCapacityUnavailable: uk-london-1	Region is at capacity for the requested SKU.	Same fixes as above; the console surfaces nearest sibling regions inside the workspace's sovereignty boundary.
FabricDegraded: link error rate above threshold	An InfiniBand or RoCEv2 link in the reservation's fabric is experiencing elevated error rates.	NeoCloud's NOC auto-pages on this alert; for customer-side action, pause distributed training until the runbook clears (typically minutes) to avoid silent corruption in NCCL collectives.
AdmissionDenied: complianceMismatch	Workload labelled with a sovereignty tag that does not match the region the reservation targets.	Either move the workload to a region inside the workspace's sovereignty boundary, or remove the compliance label if the constraint no longer applies.
BillingExportEmpty: FOCUS bucket policy denies write	Customer-owned FOCUS export bucket policy does not allow the NeoCloud writer role to put objects.	Apply the bucket policy snippet in the workspace's Billing tab and wait one export window; exports retry hourly.
SpotReclaimWarning: 60-second notice	Spot capacity is being reclaimed by NeoCloud for a higher-priority reservation.	Workload receives a 60-second reclaim notice on the standard NeoCloud webhook; checkpoint state and exit. Spot reservations retry on the next eligible window.
IdentityFederationFailed: OIDC discovery 401	Workspace OIDC issuer URL or audience misconfigured.	Re-enter issuer URL and audience in the workspace Identity tab; confirm the IdP's discovery document resolves over the public internet.
SpendCapExceeded: on-demand paused	Workspace USD spend cap reached on the configured window.	Raise the cap in the Billing tab or wait for the next budget window; reservations continue, on-demand resumes when the cap allows.
KmsDecryptDenied	Customer-managed KMS key policy is missing the NeoCloud data-plane role.	Add the role ARN shown in the workspace's Setup tab to the KMS key policy.
NetworkEgressOverrun	Workspace egress to internet exceeded the included quota for the month.	FOCUS export shows per-workspace egress; either raise the quota, route through dedicated cross-connect, or move egress-heavy workloads to a region with private connectivity.
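
For the SpotReclaimWarning row, the workload-side reaction is to checkpoint and exit inside the 60-second notice. The sketch below covers only that reaction: the webhook payload is not documented here, so the receiver is represented by a `threading.Event` that some handler would set, and `train`, `run_step`, and `save_checkpoint` are hypothetical names.

```python
import threading


def train(total_steps, run_step, save_checkpoint, reclaim_notice):
    """Run training steps until finished or until a reclaim notice arrives.

    reclaim_notice is a threading.Event set by whatever receives the webhook.
    Returns the number of completed steps, which is also the checkpointed step.
    """
    step = 0
    while step < total_steps:
        # Check between steps so a checkpoint never captures a half-applied update.
        if reclaim_notice.is_set():
            save_checkpoint(step)
            return step
        run_step(step)
        step += 1
    save_checkpoint(step)
    return step


# Simulated run: the reclaim notice arrives while step 2 is executing.
notice = threading.Event()
checkpoints = []


def simulated_step(step):
    if step == 2:
        notice.set()


completed = train(10, simulated_step, checkpoints.append, notice)
print(completed, checkpoints)
```

Each step must be short relative to the notice period for this pattern to work, so long-running steps would need a finer-grained check.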

Where NeoCloud fits in the Yobitel stack

NeoCloud is the sovereign capacity layer that sits underneath every Yobitel-operated surface. Yobibyte runs on NeoCloud — when a customer deploys an Inference inside a Yobibyte workspace, the capacity that backs the deployment is NeoCloud capacity (selected by the platform from the customer's sovereignty region). Omniscient Compute indexes NeoCloud capacity alongside every other provider in its vendor-neutral catalogue, with no preferential ranking. MediQuery and the rest of the Yobitel AI Applications suite run on Yobibyte and therefore consume NeoCloud capacity transparently.

Practically, a customer can adopt NeoCloud at three levels. A platform team that wants raw sovereign capacity adopts NeoCloud directly and runs its own workloads in the workspace boundary; pricing is per-second USD with reservation and spot options. A team that wants the managed inference surface adopts Yobibyte (which uses NeoCloud underneath) and never sees the capacity layer. A team building or operating its own sovereign GPU cloud adopts NeoCloud Operations (the consulting and operating engagement) to stand up a partner-branded neocloud that participates in the broader Omniscient Compute index.

The boundaries are deliberate. NeoCloud is the sovereign-capacity contract — facility, fleet, fabric, sovereignty enforcement, billing. Yobibyte is the managed-inference contract. NeoCloud Operations is the partner-build contract. A customer can buy any of the three independently; most customers buy two or three together because the boundaries map to how procurement is actually organised.

TL;DR

Yobitel NeoCloud is the sovereign Tier III+ GPU cloud that sits underneath Yobibyte and is also available as a standalone capacity service — multi-tenant on-demand, dedicated single-tenant, and confidential-compute tenancies, with a per-second pricing model in USD.
Sovereignty leads: UK NCSC OFFICIAL alignment in primary regions, EU Data Boundary regions for EU workloads, and US FedRAMP-equivalent partner regions for US-regulated workloads. Admission refuses to spill workloads across boundaries regardless of capacity or price.
Accelerator coverage spans NVIDIA B300/B200/H200/H100/A100/L40S/L4 and AMD MI300X, with high-bandwidth AI fabric (NVLink within node, InfiniBand within rack, RoCEv2 across racks) and Tier III+ facility design at sub-1.1 PUE.
Customer-facing surfaces: a console for reservation, spot, and on-demand provisioning; a per-region capacity catalogue; DCGM-emitted metrics into customer Prometheus; FOCUS 1.1 billing export to customer-owned object storage; the standard Yobibyte workspace boundary for tenancy.
Yobitel operates the facility, the fleet, the fabric, and the platform end-to-end; the customer owns the data, the workload, and the consumption decisions. A 99.99% uptime SLA backs the production tier.

Overview#

Quick start#

Concepts#

Region — the geographic placement of capacity, bound to a sovereignty tag. UK regions align to NCSC Cloud Security Principles and the OFFICIAL classification; EU regions sit inside the EU Data Boundary; US regions, delivered through FedRAMP-equivalent partner facilities, align to the moderate-baseline control set.
Tenancy — the isolation model for a workload. `shared` runs on multi-tenant nodes with cgroup and MIG isolation (the right default for most workloads). `dedicated` pins to single-tenant nodes for regulatory or noisy-neighbour reasons. `confidential` runs on NVIDIA confidential-compute mode with TEE attestation so encryption keys never leave the GPU.
Reservation — committed capacity for a defined term (monthly, 1-year, or 3-year) at a discounted rate. Used for steady production inference and known-shape training reservations.
Spot — opportunistic capacity available at a discount, reclaimable on short notice. Used for fault-tolerant fine-tunes, batch scoring, and shadow training. Per-region spot floors move with supply and demand.
On-demand — uncommitted per-second capacity at the headline rate. Used for bursty workloads and pre-reservation prototyping.
AI Fabric — the high-bandwidth interconnect that ties accelerators together. NVLink-fabric within a single node (typical for 8-GPU single-host workloads), InfiniBand-rack within one rack (typical for distributed training up to 64 nodes), and RoCEv2-DC across racks (for cluster sizes beyond a single rack). The customer picks the topology requirement; the platform satisfies it from the regional fabric.
Spend Cap — a hard USD budget set per reservation, per workspace, or per organisation. When reached, on-demand consumption pauses and a P2 alert fires; reservations continue to bill until the term ends.
Compliance Tag — the regulatory posture the workload is pinned to. NCSC OFFICIAL, EU Data Boundary, FedRAMP-equivalent, HIPAA-eligible, ISO 27001-certified, SOC 2 Type II attested. Admission refuses placement outside the tagged set.

Reference — customer-facing API and console fields#

Field	Type	Description
reservation.region	string	Sovereignty region (e.g. `uk-london`, `uk-manchester`, `eu-frankfurt`, `eu-paris`, `us-ashburn`).
reservation.accelerator	enum	`b300`, `b200`, `h200`, `h100-sxm5`, `h100-pcie`, `a100`, `l40s`, `l4`, `mi300x`.
reservation.count	integer	Number of accelerators. For distributed training, the count is expressed in nodes (8-GPU each).
reservation.tenancy	enum	`shared`, `dedicated`, or `confidential`.
reservation.term	enum	`on-demand`, `spot`, `reserved-monthly`, `reserved-1y`, `reserved-3y`.
reservation.fabric	enum	`nvlink-node`, `infiniband-rack`, `infiniband-row`, `rocev2-dc`, or `none` for single-accelerator workloads.
reservation.complianceTags	string[]	Required tags (e.g. `ncsc-official`, `eu-data-boundary`, `hipaa`).
reservation.spendCap	object	USD budget with `amount` and `window` (`hourly`, `daily`, or `monthly`).
reservation.startTime	datetime	Requested start. Reservations queue if the region does not have capacity at the requested time.
reservation.endTime	datetime	Requested end (optional for on-demand and reserved terms).
identity.oidcIssuer	string	Workspace OIDC issuer for identity federation.
identity.scimEndpoint	string	Optional SCIM 2.0 endpoint for user and group provisioning.
observability.prometheusScrape	boolean	Exposes the standard `yobitel_*` metric set on a workspace-scoped scrape endpoint.
observability.dcgmExporter	boolean	DCGM exporter enabled per accelerator.
observability.otelEndpoint	string	Customer-owned OTel collector endpoint.
billing.focusExportBucket	string	Customer-owned object-storage bucket for FOCUS 1.1 hourly export.
billing.focusKmsKey	string	Customer-managed KMS key the export is encrypted with at rest.
network.cidrAllowList	string[]	Optional inbound CIDR allow-list at the workspace gateway.
network.privateConnectivity	enum	`internet`, `aws-directconnect`, `azure-expressroute`, `gcp-interconnect`, or `dedicated-cross-connect`.

Workload patterns#

Three patterns cover most customers. The patterns differ on term type, tenancy, and topology; the customer-facing console and API are the same across all three.

Pattern A — training-cluster reservation. A research team reserves a 64-node H100 cluster on InfiniBand fabric in `uk-london` for a 3-week training run, with confidential tenancy and an immutable FOCUS export to the team's S3 bucket. The reservation queue surfaces the start time once the rack is allocated; the team's workload runs inside its Yobibyte workspace.
Pattern B — production inference tenancy. A SaaS product reserves a steady pool of 16 H200 cards on `shared` tenancy in `uk-london` and `eu-frankfurt`, with autoscaling spot capacity layered on top for traffic peaks. The Yobibyte workspace draws on the reservation first and bursts to spot when steady capacity is saturated.
Pattern C — air-gapped sovereign deployment. A regulated UK customer takes a dedicated tenancy block in a Tier III+ enclave with no internet egress, NCSC OFFICIAL-SENSITIVE alignment, and on-premise audit export. The customer's workspace runs inside the enclave; Yobitel operates the facility under a NeoCloud Operations engagement.

Sizing and capacity tiers#

Tier	Typical use	Reservation scale	Term
Pilot	Single-team prototype, evaluation	1 - 8 accelerators	On-demand or monthly reservation
Production-small	Steady inference for a single product	8 - 64 accelerators	Monthly or 1-year reservation
Production-medium	Multi-product platform, mixed inference and fine-tune	64 - 256 accelerators	1-year or 3-year reservation
Production-large	Org-wide AI platform, multi-region	256 - 1,024 accelerators	1-year or 3-year reservation with multi-region split
Training-cluster	Distributed training on InfiniBand fabric	8 - 512 nodes (64 - 4,096 GPUs)	Reserved 2-12 week training block
Reserved-frontier	Frontier model training, multi-thousand-GPU clusters	512 - 2,048 nodes (4,096 - 16,384 GPUs)	Reserved 3-month to 1-year training block; lead time required

Limits and quotas#

Resource	Default	Enterprise ceiling	How to raise
GPUs per workspace	32	16,384	Self-service up to 256; ticket beyond.
Workspaces per organisation	10	200	Self-service.
Reservations per workspace	20	500	Self-service.
Concurrent regions per workspace	3	12	Self-service.
Spot reservations per workspace	10	200	Self-service; per-region spot floors apply.
Reservation lead time (pilot tier)	Immediate	Immediate	On-demand capacity is provisioned in seconds when available.
Reservation lead time (frontier tier)	30 days	30 - 90 days	Frontier reservations require lead time and account-team coordination.
Network egress to internet	10 TB/month per workspace included; metered beyond	1 PB/month	Self-service; FOCUS export shows egress per workspace.
Egress between NeoCloud regions	Unmetered	Unmetered	Not adjustable; there is no inter-region egress charge.
Confidential-tenancy quota	8 GPUs	1,024 GPUs	Support request; confirms TEE attestation supply.
Spend cap precision	USD 1	USD 1	Not adjustable.
FOCUS export retention	13 months	7 years	Configurable in console; customer storage policy governs.
DCGM scrape interval	10 seconds	1 second	Configurable per workspace.
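
The "GPUs per workspace" row implies three thresholds: the default of 32, self-service up to 256, and a ticket up to the 16,384 enterprise ceiling. A minimal sketch of that decision follows. The function name and the returned labels are hypothetical and are not console or API values.

```python
DEFAULT_GPUS = 32            # default quota per workspace
SELF_SERVICE_MAX = 256       # self-service raises stop here
ENTERPRISE_CEILING = 16_384  # maximum quota available via ticket


def gpu_quota_path(requested_gpus: int) -> str:
    """Classify how a requested per-workspace GPU quota would be obtained."""
    if requested_gpus < 1:
        raise ValueError("requested_gpus must be at least 1")
    if requested_gpus <= DEFAULT_GPUS:
        return "within-default"
    if requested_gpus <= SELF_SERVICE_MAX:
        return "self-service"
    if requested_gpus <= ENTERPRISE_CEILING:
        return "ticket"
    return "exceeds-ceiling"


# Pattern A's 64-node cluster at 8 GPUs per node needs 512 GPUs.
print(gpu_quota_path(64 * 8))
```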

Observability#

```yaml
groups:
- name: neocloud-fabric
  interval: 30s
  rules:
  - alert: FabricDegraded
    expr: |
      max by (workspace, reservation, fabric) (
        yobitel_neocloud_fabric_link_error_rate
      ) > 1e-9
    for: 5m
    labels: { severity: page }
    annotations:
      summary: "{{ $labels.reservation }} fabric error rate above 1e-9"
      runbook: https://docs.yobitel.com/neocloud/runbooks/fabric-degraded

  - alert: ReservationCapacityShort
    expr: |
      yobitel_neocloud_reservation_capacity_short_seconds > 0
    for: 10m
    labels: { severity: page }
    annotations:
      summary: "{{ $labels.reservation }} short of committed capacity"

  - alert: SovereigntyAdmissionDenied
    expr: |
      rate(yobitel_neocloud_admission_denied_total{reason="compliance"}[15m]) > 0
    for: 30m
    labels: { severity: ticket }
```
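
Each rule above combines an expression with a `for:` hold period, so a brief spike does not page anyone. The sketch below is a simplified model of how Prometheus moves an alert from pending to firing, using the `FabricDegraded` values (30-second evaluation interval, 5-minute hold, threshold 1e-9). It ignores staleness, missing samples, and `keep_firing_for`, and the function name is hypothetical.

```python
from typing import List, Sequence


def alert_states(samples: Sequence[float], threshold: float,
                 interval_s: int, for_s: int) -> List[str]:
    """Return the alert state after each evaluation.

    The alert is 'pending' while the condition has held for less than
    for_s seconds, 'firing' once it has held for at least for_s seconds,
    and resets to 'inactive' on the first evaluation where it is false.
    """
    states = []
    active_since = None
    for i, value in enumerate(samples):
        now = i * interval_s
        if value > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_s else "pending")
        else:
            active_since = None
            states.append("inactive")
    return states


# Error rate above threshold for 11 consecutive 30-second evaluations.
states = alert_states([2e-9] * 11, threshold=1e-9, interval_s=30, for_s=300)
print(states[0], states[-1])
```

Under this model a single clean evaluation restarts the five-minute clock, which is why an intermittently flapping link can stay in pending without ever paging.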

Cost and FinOps#

SKU	On-demand $/GPU/hr	1-yr reserved $/GPU/hr	3-yr reserved $/GPU/hr	Spot floor
NVIDIA B200 192GB	$6.00	$4.50	$3.60	n/a
NVIDIA H200 141GB	$4.25	$3.20	$2.55	$1.75
NVIDIA H100 SXM5 80GB	$3.25	$2.45	$1.95	$1.20
NVIDIA H100 PCIe 80GB	$2.95	$2.20	$1.75	$1.05
NVIDIA A100 80GB	$2.25	$1.70	$1.40	$0.80
NVIDIA L40S 48GB	$1.20	$0.90	$0.70	$0.45
NVIDIA L4 24GB	$0.50	$0.40	$0.30	$0.22
AMD MI300X 192GB	$4.00	$3.00	$2.45	n/a
Object storage (per GB-month)	$0.022	—	—	—
Egress to internet (per GB)	$0.075	—	—	—
Egress between NeoCloud regions (per GB)	$0.00	—	—	—
Confidential-tenancy surcharge (per GPU/hr)	$0.40	$0.30	$0.24	n/a
Dedicated-tenancy surcharge (per GPU/hr)	$0.30	$0.22	$0.18	n/a
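
The table rates can be turned into a quick estimate. The sketch below is a back-of-envelope calculator, not a billing implementation. It assumes per-second billing prorates the hourly rate linearly, that the confidential-tenancy surcharge simply adds to the base rate, and that the 10 TB of included internet egress means 10,000 GB. All function and constant names are hypothetical, and only two SKUs are copied in for brevity.

```python
# Hourly USD rates per GPU, copied from the price table above.
RATES = {
    "H200": {"on_demand": 4.25, "reserved_1y": 3.20, "reserved_3y": 2.55},
    "H100-SXM5": {"on_demand": 3.25, "reserved_1y": 2.45, "reserved_3y": 1.95},
}
CONFIDENTIAL_SURCHARGE = {"on_demand": 0.40, "reserved_1y": 0.30, "reserved_3y": 0.24}
EGRESS_USD_PER_GB = 0.075
INCLUDED_EGRESS_GB = 10_000  # assumption: 10 TB read as decimal terabytes


def gpu_cost(sku: str, term: str, gpus: int, seconds: int,
             confidential: bool = False) -> float:
    """USD cost for `gpus` accelerators running `seconds`, prorated per second."""
    hourly = RATES[sku][term]
    if confidential:
        hourly += CONFIDENTIAL_SURCHARGE[term]
    return round(hourly * gpus * seconds / 3600, 2)


def egress_cost(gb_this_month: float) -> float:
    """USD charge for internet egress beyond the included monthly quota."""
    billable = max(0.0, gb_this_month - INCLUDED_EGRESS_GB)
    return round(billable * EGRESS_USD_PER_GB, 2)


def break_even_utilisation(sku: str, term: str = "reserved_1y") -> float:
    """Fraction of hours a GPU must be busy for a reservation to beat on-demand."""
    return RATES[sku][term] / RATES[sku]["on_demand"]


# Sixteen H200 cards on demand for 90 minutes.
print(gpu_cost("H200", "on_demand", gpus=16, seconds=90 * 60))
```

On these numbers a 1-year H200 reservation costs roughly three quarters of the on-demand rate, so it pays off only when the cards are busy for about 75% of the term or more.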

Security and compliance#

NCSC Cloud Security Principles — controls mapped per principle for UK primary regions; OFFICIAL-tier audited annually.
G-Cloud — listed under Cloud Hosting (Lot 1) and Cloud Software (Lot 2); orderable through the Crown Commercial Service framework.
Cyber Essentials Plus — current certificate.
ISO 27001:2022, ISO 27017, ISO 27018 — current certificates.
SOC 2 Type II — annual third-party audit covering security, availability, confidentiality.
GDPR / UK DPA 2018 — DPA, sub-processor list, EU SCCs available; data residency enforced at admission.
EU Data Boundary — EU regions sit inside the boundary; admission refuses spill.
DORA — operational-resilience evidence available for financial-services customers.
FedRAMP-equivalent — moderate-baseline-aligned controls available via US partner regions.
HIPAA — BAA available for healthcare workloads.
Confidential tenancy — NVIDIA confidential-compute mode with TEE attestation; encryption keys never leave the GPU.

Migration and alternatives#

Concern	Yobitel NeoCloud	CoreWeave	Lambda	UKCloud
UK NCSC OFFICIAL primary regions	Yes, multiple	No (US-anchored)	No (US-anchored)	Yes, historic
EU Data Boundary regions	Yes	Limited	No	No
US FedRAMP-equivalent regions	Yes via partner	Yes	Limited	No
NVIDIA B200/H200/H100 coverage	B300, B200, H200, H100	H100, H200, GB200	H100, H200	Limited
AMD MI300X coverage	Yes	Limited	Yes	No
Confidential tenancy	NVIDIA TEE attestation	Limited	No	No
Per-second billing in USD	Yes	Yes (USD)	Yes (USD)	GBP-anchored
FOCUS 1.1 billing export	Yes	Proprietary	Proprietary	Proprietary
Sovereignty enforcement at admission	Yes	Region-only	Region-only	Yes
AI fabric (InfiniBand-rack, RoCEv2-DC)	Yes	Yes	Yes (NVLink-fabric, InfiniBand)	Limited
Spot floor	Yes (per-region floor)	Yes	Yes	No
Yobibyte managed surface available	Native	No	No	No

Troubleshooting#

The errors below are the most common during onboarding and the first weeks of production. The full runbook library is at docs.yobitel.com/neocloud/runbooks.

Error	Cause	Fix
ReservationProvisioningFailed: capacity-short	Requested region does not currently have the requested SKU and term available.	Accept the reservation queue (NeoCloud auto-provisions when capacity frees), pre-purchase via the Capacity tab, or move to a sibling region inside the same sovereignty boundary.
RegionCapacityUnavailable: uk-london-1	Region is at capacity for the requested SKU.	Same fixes as above; the console surfaces nearest sibling regions inside the workspace's sovereignty boundary.
FabricDegraded: link error rate above threshold	An InfiniBand or RoCEv2 link in the reservation's fabric is experiencing elevated error rates.	NeoCloud's NOC auto-pages on this alert; for customer-side action, pause distributed training until the runbook clears (typically minutes) to avoid silent corruption in NCCL collectives.
AdmissionDenied: complianceMismatch	Workload labelled with a sovereignty tag that does not match the region the reservation targets.	Either move the workload to a region inside the workspace's sovereignty boundary, or remove the compliance label if the constraint no longer applies.
BillingExportEmpty: FOCUS bucket policy denies write	Customer-owned FOCUS export bucket policy does not allow the NeoCloud writer role to put objects.	Apply the bucket policy snippet in the workspace's Billing tab and wait one export window; exports retry hourly.
SpotReclaimWarning: 60-second notice	Spot capacity is being reclaimed by NeoCloud for a higher-priority reservation.	Workload receives a 60-second reclaim notice on the standard NeoCloud webhook; checkpoint state and exit. Spot reservations retry on the next eligible window.
IdentityFederationFailed: OIDC discovery 401	Workspace OIDC issuer URL or audience misconfigured.	Re-enter issuer URL and audience in the workspace Identity tab; confirm the IdP's discovery document resolves over the public internet.
SpendCapExceeded: on-demand paused	Workspace USD spend cap reached on the configured window.	Raise the cap in the Billing tab or wait for the next budget window; reservations continue, on-demand resumes when the cap allows.
KmsDecryptDenied	Customer-managed KMS key policy is missing the NeoCloud data-plane role.	Add the role ARN shown in the workspace's Setup tab to the KMS key policy.
NetworkEgressOverrun	Workspace egress to internet exceeded the included quota for the month.	FOCUS export shows per-workspace egress; raise the quota, route through a dedicated cross-connect, or move egress-heavy workloads to a region with private connectivity.
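
For the `SpotReclaimWarning` row, the workload has 60 seconds between the notice and reclamation to checkpoint and exit. The sketch below shows one way to budget that window. The webhook payload format is not documented here, so the handler takes only the time the notice arrived; `handle_reclaim`, the `checkpoint` callback, and the 10-second safety margin are all hypothetical choices.

```python
import time
from typing import Callable

RECLAIM_NOTICE_S = 60.0  # notice period stated in the table above


def handle_reclaim(checkpoint: Callable[[], None],
                   notice_received_at: float,
                   safety_margin_s: float = 10.0,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Run `checkpoint` if there is still time before the reclaim deadline.

    Returns True when the checkpoint finished inside the window (minus the
    safety margin), False when it was skipped or overran.
    """
    deadline = notice_received_at + RECLAIM_NOTICE_S - safety_margin_s
    if clock() >= deadline:
        return False  # too late to start a checkpoint safely
    checkpoint()
    return clock() <= deadline


# Example with a checkpoint that returns immediately.
received = time.monotonic()
print(handle_reclaim(lambda: None, notice_received_at=received))
```

Passing the clock as a parameter keeps the handler testable without waiting out real time. A caller that gets `False` should treat the last completed checkpoint as the resume point.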

Where NeoCloud fits in the Yobitel stack#

References

Yobitel NeoCloud page · Yobitel
Yobibyte platform · Yobitel
Omniscient Compute · Yobitel
NCSC Cloud Security Principles · NCSC
FOCUS — FinOps Open Cost and Usage Specification · FinOps Foundation
EU Data Boundary · European Commission
FedRAMP · FedRAMP PMO
