TL;DR
- Yobitel Edge AI is the managed edge-inference capability inside the Yobitel Infrastructure pillar — on-device inference, OTA model updates, attested rollouts, and fleet observability for workloads where central-cloud latency, bandwidth, or sovereignty rules out cloud-resident inference.
- Customers register an edge fleet (a logical group of edge sites running a shared application), push a model image through the workspace, and observe rollout, attestation, and telemetry through the same Yobibyte console used for cloud workloads.
- Supported edge hardware spans NVIDIA Jetson Orin Nano / Orin NX / AGX Orin / Thor, NXP i.MX series, Hailo-8/10, AMD Versal AI Edge, and named partner hardware. The runtime profile is optimised per platform; sub-10 ms inference latency is the typical target.
- Distinct from cloud Yobibyte (which is cloud-resident inference) and from Yobitel GPU Cloud (which is data-centre GPU capacity); Edge AI is the on-device and near-edge surface, with the same identity, audit, and FOCUS billing inherited from the workspace.
- Sovereignty extends to the edge: fleets bound to UK NCSC OFFICIAL accept only signed model images attested against the workspace's KMS-anchored chain and ship telemetry to a UK-resident audit destination.
Overview
Some workloads cannot wait for the cloud. A robot's grasp decision needs to happen in under 5 ms; a livestock-monitoring fleet runs in pastures with intermittent connectivity; a defence sensor must keep operating after the satellite uplink drops; a manufacturing-vision QC pipeline needs to inspect every part at line speed without exporting the image stream. Each of these is a physical AI or edge IoT workload, and each needs a runtime that is on-device, fleet-managed, and operated under the same identity, audit, and sovereignty surface as the customer's cloud-resident AI.
Yobitel Edge AI is the managed edge-inference capability inside the Yobitel Infrastructure pillar. It is the edge counterpart to cloud Yobibyte: customers register a fleet (a logical group of edge sites running a shared application), push a model image, and observe rollout, attestation, and telemetry through the same Yobibyte console used for cloud workloads. The runtime on each device is industry-standard (per platform) and operated by Yobitel; the customer's interaction is the fleet surface, not the per-device runtime.
Compared with AWS IoT Greengrass and Azure IoT Edge, the differentiator is two-fold: first, the platform surface above is the same Yobibyte workspace that operates the customer's cloud inference, so identity, audit, billing, and policy do not fragment between cloud and edge; second, the sovereignty discipline that runs in cloud regions extends to edge fleets — signed model images, attested rollouts, KMS-anchored update chains, and UK-resident audit destinations for UK-pinned workspaces. Compared with running an edge stack in-house on commodity orchestration, the difference is that Yobitel operates the fleet's runtime end-to-end; the customer configures the fleet, not the per-device update mechanics.
Yobitel Communications, a UK-headquartered AI infrastructure company and NVIDIA Inception partner, operates the edge capability. Fleets can be UK-only (NCSC OFFICIAL eligible), EU-only (EU Data Boundary), or US-only (FedRAMP-equivalent) depending on the host workspace. Pricing is per device per month, in USD throughout, with predictable rollover for adds and removals.
Quick start
Edge deployment is a workspace activity. Sign in to the Yobibyte console with your corporate identity provider and open the Edge AI view; the surface is scoped to your workspace's sovereignty pin (a UK-bound workspace's fleets and audit destinations stay UK-resident).
Register a fleet. The dialog asks for fleet name, target hardware platform (Jetson Orin NX, AMD Versal AI Edge, etc.), expected fleet size envelope, and the OTA channel (`stable`, `staged`, or `canary`). Yobitel produces a fleet identity (a signing identity bound to the workspace's KMS chain) and an enrolment token that field-deployed devices use to join the fleet. The enrolment workflow is documented per hardware platform and is typically embedded in the customer's existing device-imaging pipeline.
Push a model image. From the marketplace or from a customer-private upload, pick the model and select the fleet as the deployment target. Yobitel produces a fleet-signed, attested model image for the fleet's hardware platform; the OTA channel determines whether the rollout is staged or pushed to all devices. The console surfaces per-device acceptance, attestation success, and the first inference's telemetry sample for each device as the rollout progresses.
Observe the rollout. The fleet dashboard surfaces rollout progress (percentage of devices on the new image), health (each device's last-seen, inference rate, telemetry sample), and any attestation failures. Customers running this for the first time typically watch a canary rollout to a single site, then promote to a staged channel, then to the full fleet.
Treat the OTA channel as the lever for risk control. Canary rollouts on 1–2 sites catch the silly bugs (a quantisation regression on a specific Jetson revision); staged rollouts on 10% of the fleet catch the operational ones (a runtime memory regression that only shows up at scale); full rollouts close the loop. Skipping channels saves an afternoon and costs a week.
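The channel progression above can be sketched as a small promotion gate. This is an illustration only: `RolloutStats` and `next_channel` are hypothetical names rather than part of any Yobitel API, the 90% acceptance threshold is borrowed from the troubleshooting example later on this page, and the sketch assumes `stable` is the channel that reaches the full fleet.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed promotion order: canary -> staged -> stable (full fleet).
CHANNEL_ORDER = ["canary", "staged", "stable"]


@dataclass
class RolloutStats:
    """Hypothetical per-channel rollout summary (not a Yobitel API type)."""
    channel: str
    devices_targeted: int
    devices_accepted: int
    attestation_failures: int


def next_channel(stats: RolloutStats, min_acceptance: float = 0.9) -> Optional[str]:
    """Return the channel to promote to, or None if the rollout should hold.

    Holds on any attestation failure, on an empty rollout, or when the
    acceptance ratio is below the threshold.
    """
    if stats.devices_targeted == 0 or stats.attestation_failures > 0:
        return None
    acceptance = stats.devices_accepted / stats.devices_targeted
    if acceptance < min_acceptance:
        return None
    idx = CHANNEL_ORDER.index(stats.channel)
    # The last channel has nowhere further to promote to.
    return CHANNEL_ORDER[idx + 1] if idx + 1 < len(CHANNEL_ORDER) else None


if __name__ == "__main__":
    print(next_channel(RolloutStats("canary", 2, 2, 0)))    # staged
    print(next_channel(RolloutStats("staged", 80, 70, 0)))  # None
```

Holding on a single attestation failure is deliberately strict in this sketch; a real gate would take its thresholds from the fleet policy.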
Concepts
Edge AI introduces a small set of concepts that sit alongside the workspace primitives. The mental model is: a fleet is a managed group of edge sites; an OTA channel is the rollout lever; a model image is the unit of deployment; attestation is the trust mechanism; telemetry is the observability surface.
- Edge Site — a single physical device or grouped set of devices at one location (a robot, a camera, a vehicle, a kiosk, a sensor cluster). Sites enrol against a fleet identity and inherit the fleet's policy.
- Fleet — a logical group of edge sites running a shared application or model. Fleets are scoped to a workspace, inherit its sovereignty pin and audit destination, and are the unit at which OTA, rollback, and policy apply.
- OTA Update — the over-the-air update path for fleet model images. Updates are signed by the fleet identity, attested on-device, and applied via the customer's chosen channel (`stable`, `staged`, `canary`).
- Local Inference Runtime — the per-platform managed runtime that loads and serves models on the device. Customers do not select or operate the runtime; Yobitel ships the platform-appropriate runtime as part of the fleet's pinned profile.
- Model Sync — the mechanism by which model images reach devices: pull from the workspace's UK/EU/US-resident object store, with delta updates and resumable transfers for low-bandwidth or intermittent links.
- Telemetry Stream — the per-device telemetry the fleet emits back to the workspace: inference rate, model output sample (subject to policy), device health (CPU, memory, thermals), and observability signals. Subject to the customer's redaction policy and sovereignty rules.
- Attestation Chain — the signature and verification chain that proves a model image came from the workspace's KMS-anchored identity. Devices refuse images that fail attestation.
- Fleet Policy — the customer-configurable rules that govern what a fleet can run: model size ceiling, OTA channel, telemetry sampling rate, redaction rules, audit destination, and the kill-switch.
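One way to picture how these concepts relate is a small data model in which sites carry no policy of their own and every decision is taken at fleet level. The sketch below is illustrative: the class and field names are assumptions, only a subset of the policy fields is shown, and the kill-switch is assumed to block all deployments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class FleetPolicy:
    """Illustrative subset of the fleet policy fields listed above."""
    model_size_ceiling_gb: float
    ota_channel: str               # "stable", "staged", or "canary"
    telemetry_sampling_rate: float
    audit_destination: str
    kill_switch: bool = False      # assumed semantics: blocks all deployments


@dataclass(frozen=True)
class EdgeSite:
    site_id: str
    declared_location: str


@dataclass
class Fleet:
    name: str
    workspace: str
    sovereignty_pin: str
    policy: FleetPolicy
    sites: List[EdgeSite] = field(default_factory=list)

    def enrol(self, site: EdgeSite) -> None:
        """Sites join the fleet and inherit its policy."""
        self.sites.append(site)

    def may_deploy(self, image_size_gb: float) -> bool:
        """Policy is evaluated per fleet, so every site gets the same answer."""
        if self.policy.kill_switch:
            return False
        return image_size_gb <= self.policy.model_size_ceiling_gb
```

Keeping the policy on the fleet rather than the site mirrors the text: OTA, rollback, and policy all apply at fleet granularity.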
Reference: supported edge hardware platforms
The table below is the supported hardware index. Per-platform runtime profiles (quantisation defaults, supported model families, memory and compute envelopes) are published in each platform's edge-hardware listing.
| Platform | Class | Compute envelope | Typical workloads | OTA support |
|---|---|---|---|---|
| NVIDIA Jetson Orin Nano | Entry tier | 40 TOPS, 8 GB shared | Light vision, IoT sensors, smart cameras | Stable / staged / canary |
| NVIDIA Jetson Orin NX | Mid tier | 70–100 TOPS, 8–16 GB shared | Mid-complexity vision, multi-stream cameras, robotics PoC | Stable / staged / canary |
| NVIDIA Jetson AGX Orin | High tier | Up to 275 TOPS, 32–64 GB shared | Multi-model robotics, autonomous platforms, defence-grade vision | Stable / staged / canary |
| NVIDIA Jetson Thor | Physical AI tier | Targeting 2000+ TFLOPS FP4, 128 GB | Humanoid robotics, autonomous vehicles, multi-modal physical AI | Stable / staged / canary |
| NXP i.MX 8M Plus / 9 series | Low-power industrial | ~2.3 TOPS NPU, ARM cores | Industrial vision, predictive maintenance, building automation | Stable / staged |
| Hailo-8 | Inference accelerator | 26 TOPS at 2.5 W typical | Vision pipelines, retail analytics, edge surveillance | Stable / staged / canary |
| Hailo-10 | Inference accelerator (gen-2) | 40 TOPS, expanded model support | Edge LLM inference, multi-stream vision, agricultural AI | Stable / staged / canary |
| AMD Versal AI Edge | FPGA + AI Engine | Configurable; 27 INT8 TOPS typical | Industrial vision, defence sensors, telco edge | Stable / staged |
| Raspberry Pi 5 + AI HAT | Hobbyist / education | 13 TOPS via accelerator | PoC, education, ultra-low-cost pilots | Stable / staged |
| Partner-integrated hardware | Bespoke | Per partner spec | Industry-specific form factors | Per partner integration |
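Not every platform in the table supports the `canary` channel, so it can be worth checking a planned channel against the platform before registering a fleet. The lookup below transcribes the OTA column of the table; the platform slugs are hypothetical (only `jetson-orin-nano` appears elsewhere on this page), and partner hardware is left out because its OTA support is defined per integration.

```python
# OTA channel support per platform, transcribed from the table above.
# Platform slugs are illustrative, not confirmed Yobitel identifiers.
ALL_CHANNELS = frozenset({"stable", "staged", "canary"})
NO_CANARY = frozenset({"stable", "staged"})

OTA_SUPPORT = {
    "jetson-orin-nano": ALL_CHANNELS,
    "jetson-orin-nx": ALL_CHANNELS,
    "jetson-agx-orin": ALL_CHANNELS,
    "jetson-thor": ALL_CHANNELS,
    "nxp-imx": NO_CANARY,
    "hailo-8": ALL_CHANNELS,
    "hailo-10": ALL_CHANNELS,
    "amd-versal-ai-edge": NO_CANARY,
    "raspberry-pi-5-ai-hat": NO_CANARY,
}


def channel_supported(platform: str, channel: str) -> bool:
    """True if the table lists `channel` for `platform`; unknown platforms are rejected."""
    return channel in OTA_SUPPORT.get(platform, frozenset())


if __name__ == "__main__":
    print(channel_supported("hailo-10", "canary"))            # True
    print(channel_supported("amd-versal-ai-edge", "canary"))  # False
```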
Workload patterns
Three deployment shapes cover most of what customers do with Edge AI — a livestock-monitoring fleet running with intermittent connectivity, an industrial-vision QC pipeline on a manufacturing line, and a sovereign defence edge with attested updates only.
- Livestock fleet monitor — an agricultural customer deploys a fleet of 800 Jetson Orin Nano units across rural sites. The fleet's OTA channel is `staged` with a 7-day window; telemetry sampling is 1% to conserve bandwidth; the fleet's audit destination is the customer's EU-resident SIEM. Model images are signed by the fleet identity; each site verifies the signature against the workspace's KMS chain before applying. Sites operate offline for up to 72 hours; telemetry buffers locally and syncs on reconnect.
- Industrial vision QC — a manufacturer deploys a fleet of 60 Hailo-10 inference accelerators across a production-line vision system. The fleet's OTA channel is `canary` to one line for new model versions; promotion to `staged` and full rollout requires the platform team's approval. Telemetry includes per-part inference results streamed to the manufacturer's MES; the policy redacts image content from telemetry by default to keep the stream small.
- Sovereign defence edge — a defence-adjacent supplier deploys an AMD Versal AI Edge fleet with attestation-only updates and a UK-resident audit destination. The fleet's policy requires attestation success before any device accepts an image; failed attestations alert immediately. The fleet is in a workspace bound to UK NCSC OFFICIAL with the OFFICIAL-SENSITIVE handling option enabled.
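The livestock pattern's combination of sampled telemetry and an offline buffer can be sketched as follows. This is a minimal model under stated assumptions: `TelemetryBuffer` is a hypothetical class, sampling is treated as an independent per-event coin flip, and events older than the retention window are simply dropped at sync time. The actual on-device behaviour is not specified on this page.

```python
import random
from collections import deque
from typing import Deque, List, Optional, Tuple


class TelemetryBuffer:
    """Hypothetical on-device buffer: sample events, hold them offline, flush on reconnect."""

    def __init__(self, sampling_rate: float, retention_hours: float,
                 rng: Optional[random.Random] = None) -> None:
        self.sampling_rate = sampling_rate
        self.retention_s = retention_hours * 3600
        self._rng = rng or random.Random()
        self._events: Deque[Tuple[float, str]] = deque()  # (timestamp_s, payload)

    def record(self, now_s: float, payload: str) -> bool:
        """Keep the event with probability `sampling_rate`; return True if kept."""
        if self._rng.random() >= self.sampling_rate:
            return False
        self._events.append((now_s, payload))
        return True

    def flush(self, now_s: float) -> List[str]:
        """On reconnect: discard events older than the retention window, return the rest."""
        cutoff = now_s - self.retention_s
        fresh = [payload for (t, payload) in self._events if t >= cutoff]
        self._events.clear()
        return fresh


if __name__ == "__main__":
    buf = TelemetryBuffer(sampling_rate=0.01, retention_hours=72)
    kept = sum(buf.record(float(i), f"event-{i}") for i in range(10_000))
    print(f"kept {kept} of 10000 events")  # roughly 1% at this sampling rate
```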
```yaml
# PREVIEW - the EdgeFleet declarative shape is in active development;
# this is the planned shape and is not runnable today. Edge fleets
# are registered today through the Yobibyte console.
#
# apiVersion: yobibyte.yobitel.com/v1
# kind: EdgeFleet
# metadata:
#   name: livestock-uk
#   workspace: agri-customer-uk
# spec:
#   hardwarePlatform: jetson-orin-nano
#   expectedSize: 800
#   sovereignty: uk-ncsc-official
#   otaChannel: staged
#   model:
#     marketplaceEntry: livestock-vision-2026.06
#     signedBy: fleet-identity-livestock-uk
#   policy:
#     telemetrySamplingRate: 0.01
#     redactImageContent: true
#     offlineBufferHours: 72
#     auditDestination: eu-siem-prod
#   spendCap:
#     amount: 12000
#     currency: USD
#     window: monthly
```
Fleet size and edge SKU pairings
The table below pairs typical fleet sizes with the edge SKU most often chosen for the workload. Fleet size is an envelope, not a hard limit; specific deployments may run much larger fleets on the same SKU.
| Workload shape | Typical fleet size | Recommended SKU(s) | Rationale |
|---|---|---|---|
| Smart camera / retail analytics | 10–500 sites | Hailo-8, Jetson Orin Nano | Low-power, ample inference for vision, low unit cost. |
| Agricultural / rural monitoring | 100–5,000 sites | Jetson Orin Nano, Hailo-10 | Intermittent connectivity, telemetry-sparse, on-device decisioning. |
| Industrial vision QC | 10–200 lines | Hailo-10, Jetson Orin NX | Low latency, repeatable inference, attested updates required. |
| Robotics development | 1–50 robots | Jetson AGX Orin, Jetson Thor | Multi-modal models, high compute envelope, physical AI. |
| Autonomous vehicles / drones | 10–500 platforms | Jetson Thor, Jetson AGX Orin | Sub-5 ms decisioning, high-bandwidth sensor fusion. |
| Defence sensor networks | 10–500 sites | AMD Versal AI Edge, Jetson AGX Orin | Attestation-only updates, sovereignty critical. |
| Telco network edge | 100–2,000 sites | AMD Versal AI Edge, Hailo-10 | FPGA flexibility, telco lifecycle, partner integrations. |
| Embedded IoT / predictive maintenance | 1,000–50,000 sites | NXP i.MX 8M Plus, Raspberry Pi 5 + AI HAT | Ultra-low cost, ARM ecosystem, low compute envelope. |
Limits and quotas
Default per-fleet and per-workspace limits exist to protect the OTA fabric and the customer's bandwidth. Almost every limit is raisable on request.
| Resource | Default | Enterprise ceiling | How to raise |
|---|---|---|---|
| Fleets per workspace | 20 | 200 | Self-service in console. |
| Sites per fleet | 5,000 | 100,000 | Self-service up to 10,000; ticket beyond. |
| Model image size | 2 GB | 20 GB | Self-service; bound by device storage. |
| OTA rollout concurrent sites | 200 | 5,000 | Self-service; rate-limited to protect bandwidth. |
| Telemetry events per site per minute | 60 | 600 | Self-service; bandwidth-aware. |
| Offline buffer retention per site | 72 hours | 30 days | Self-service; subject to on-device storage. |
| Attestation failures retained per fleet | All | All | Hard-retained for compliance. |
| Custom hardware profiles per workspace | 10 | 50 | Per partner integration. |
| Fleet identities per workspace | 50 | 500 | Self-service. |
| OTA channels per fleet | 3 | 6 | Stable, staged, canary by default; custom channels available. |
| Audit log retention | 90 days | 7 years | Enterprise tier. |
| Per-fleet spend cap precision | USD 1 | USD 1 | Hard floor. |
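When sizing a fleet it helps to know early which limits will need raising. The sketch below classifies a requested value against the default and enterprise ceiling for a few rows of the table; the dictionary keys and the `classify` function are hypothetical names, and the 30-day buffer ceiling is expressed in hours for comparison.

```python
# (default, enterprise ceiling) for a subset of the limits in the table above.
# Keys are illustrative names, not Yobitel API fields.
LIMITS = {
    "fleets_per_workspace": (20, 200),
    "sites_per_fleet": (5_000, 100_000),
    "model_image_gb": (2, 20),
    "ota_concurrent_sites": (200, 5_000),
    "telemetry_events_per_site_per_min": (60, 600),
    "offline_buffer_hours": (72, 30 * 24),
}


def classify(resource: str, requested: float) -> str:
    """Return 'within-default', 'needs-raise', or 'above-ceiling' for a requested value."""
    default, ceiling = LIMITS[resource]
    if requested <= default:
        return "within-default"
    if requested <= ceiling:
        return "needs-raise"
    return "above-ceiling"


if __name__ == "__main__":
    plan = {"sites_per_fleet": 800, "model_image_gb": 8, "offline_buffer_hours": 72}
    for resource, value in plan.items():
        print(resource, value, classify(resource, value))
```

For the 800-site livestock example this reports the site count and offline buffer as within defaults, while an 8 GB model image would need the image-size limit raised.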
Observability
Every fleet emits a stable `yobitel_edge_*` metric set: fleet health (online site count, last-seen age, model version distribution), rollout health (rollout percentage, acceptance rate, attestation success rate), and inference health (per-site inference rate, telemetry sample, on-device thermals and power). The hosted Grafana dashboard surfaces these by default; customers ship the same stream to their own Prometheus or OpenTelemetry collector.
Three signals matter most for edge fleets: fleet online ratio (are sites reachable?), model rollout success rate (is the new image actually landing?), and attestation failure rate (is the trust chain intact?). The Prometheus alerting rules below define one alert per signal and are the set most fleet operators add first.
```yaml
groups:
  - name: yobitel-edge
    interval: 30s
    rules:
      # Fewer than 85% of a fleet's sites reachable for 30 minutes.
      - alert: EdgeFleetOnlineRatioDegraded
        expr: |
          avg by (workspace, fleet) (
            yobitel_edge_fleet_online_ratio
          ) < 0.85
        for: 30m
        labels: { severity: page }
        annotations:
          summary: "{{ $labels.fleet }} online ratio below 85%"
      # An active rollout whose acceptance ratio stays under 90% for an hour.
      - alert: EdgeOtaRolloutStuck
        expr: |
          yobitel_edge_ota_rollout_acceptance_ratio < 0.9
          and on (workspace, fleet)
          yobitel_edge_ota_rollout_active == 1
        for: 1h
        labels: { severity: warn }
      # Any attestation failure in the last 15 minutes.
      - alert: EdgeAttestationFailures
        expr: increase(yobitel_edge_attestation_failed_total[15m]) > 0
        for: 1m
        labels: { severity: page }
```
Cost and FinOps
Edge AI is priced per-device per-month in USD, with a tier ladder that covers the on-device runtime, OTA bandwidth envelope, telemetry ingestion, and the workspace's audit pipeline. Hardware itself is customer-supplied; the Yobitel pricing covers the operating envelope around it. Spend caps can be set per fleet and per workspace.
| Tier | USD per device per month | Covers | Notes |
|---|---|---|---|
| Edge Lite | $3 / device / mo | Stable OTA channel, low-rate telemetry, monthly attestation audit | Best for ultra-low-cost fleets (NXP i.MX, Raspberry Pi 5 + AI HAT). |
| Edge Standard | $12 / device / mo | Stable + staged OTA, mid-rate telemetry, weekly attestation audit, fleet support | Default for most production fleets. |
| Edge Pro | $28 / device / mo | All OTA channels (stable, staged, canary), high-rate telemetry, daily attestation audit, named CSM | For attested-critical fleets (defence, regulated industrial). |
| Edge Enterprise | Custom | All channels + custom OTA channels, unlimited telemetry, on-call support, dedicated runtime envelope | Negotiated for large fleets (10,000+ sites). |
| OTA bandwidth overage | $0.02 / GB | Beyond tier envelope | Predictable per-device. |
| Telemetry overage | $0.50 / 1M events | Beyond tier envelope | Customer-controlled via sampling rate. |
| Audit retention beyond 90 days | $0.05 / GB-month | Beyond default | Enterprise tier includes 7-year retention. |
Edge pricing is per-device per-month rather than per-inference, so cost is predictable at fleet scale. Treat tier choice as a policy decision (channel coverage, attestation cadence, support) rather than a cost-optimisation lever.
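As a worked example of the arithmetic, the sketch below estimates a monthly bill from the list prices in the table. The function name and tier keys are illustrative, Edge Enterprise is omitted because its price is negotiated, and overage quantities are passed in directly because the size of each tier's included envelope is not stated here.

```python
# List prices from the table above, in USD.
TIER_USD_PER_DEVICE = {"lite": 3.0, "standard": 12.0, "pro": 28.0}
OTA_OVERAGE_USD_PER_GB = 0.02
TELEMETRY_OVERAGE_USD_PER_MILLION = 0.50


def monthly_cost(tier: str, devices: int,
                 ota_overage_gb: float = 0.0,
                 telemetry_overage_events: int = 0) -> float:
    """Estimate one month's bill: per-device base plus any overage beyond the tier envelope."""
    base = TIER_USD_PER_DEVICE[tier] * devices
    ota = OTA_OVERAGE_USD_PER_GB * ota_overage_gb
    telemetry = TELEMETRY_OVERAGE_USD_PER_MILLION * telemetry_overage_events / 1_000_000
    return round(base + ota + telemetry, 2)


if __name__ == "__main__":
    # 800 devices on Edge Standard with no overage: 800 * 12 = 9600.0
    print(monthly_cost("standard", 800))
    # Same fleet with 500 GB of OTA overage and 4M extra telemetry events: 9612.0
    print(monthly_cost("standard", 800, ota_overage_gb=500, telemetry_overage_events=4_000_000))
```

The overage terms are small next to the per-device base in this example, which is consistent with treating tier choice as a policy decision rather than a cost lever.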
Security and compliance
Signed model images anchor the trust model. Every model image is signed by the fleet's identity, which is itself bound to the workspace's KMS chain; devices refuse images that fail attestation against the workspace's published public-key chain. Attestation failures alert immediately and are retained in the audit stream.
Sovereignty extends to the edge. A fleet in a workspace bound to UK NCSC OFFICIAL ships telemetry to a UK-resident audit destination, refuses to enrol sites whose declared location is outside the UK, and uses UK-resident object storage for model image distribution. The same discipline applies to EU Data Boundary and US FedRAMP-equivalent fleets in their respective workspaces.
- Signed model images — every image signed by the fleet identity, anchored in the workspace's KMS chain.
- On-device attestation — devices verify image signatures before applying; failures are alertable and audited.
- Hardware root of trust — supported on Jetson and AMD Versal AI Edge; required for OFFICIAL-SENSITIVE fleets.
- NCSC Cloud Security Principles — applies to the workspace; edge fleet configurations inherit the workspace's mapping.
- G-Cloud framework — sovereign edge fleets procurable via the framework alongside the rest of the Yobitel stack.
- ISO 27001 / SOC 2 Type II — current certificates available under NDA.
- Cyber Essentials Plus — annual third-party assessment maintained.
- GDPR / UK DPA 2018 — DPA, sub-processor list, EU SCCs available; edge telemetry residency enforced.
- EU AI Act — edge-deployed models declared per fleet; risk classification inherited from the model entry.
- Sovereign edge attestation — UK NCSC OFFICIAL fleets require attestation success on every update.
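The verify-before-apply step behind on-device attestation can be illustrated with a toy check. Important caveat: this sketch uses HMAC-SHA256 with a shared key purely as a stand-in so that it runs on the Python standard library; a KMS-anchored chain as described above would use asymmetric signatures verified against a published public key, and none of the function names here come from a Yobitel API.

```python
import hashlib
import hmac


def sign_image(image: bytes, fleet_key: bytes) -> bytes:
    """Stand-in for fleet-identity signing: HMAC-SHA256 over the image's SHA-256 digest."""
    digest = hashlib.sha256(image).digest()
    return hmac.new(fleet_key, digest, hashlib.sha256).digest()


def verify_before_apply(image: bytes, signature: bytes, fleet_key: bytes) -> bool:
    """Device-side check: recompute the expected signature and compare in constant time."""
    expected = sign_image(image, fleet_key)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"demo-fleet-key"           # placeholder secret for the demo only
    image = b"model-image-bytes"
    sig = sign_image(image, key)
    print(verify_before_apply(image, sig, key))                # True
    print(verify_before_apply(image + b"tampered", sig, key))  # False
```

The point of the sketch is the control flow: the device refuses to apply any image whose signature does not verify, whether the image was altered or the signer was not the fleet identity.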
Alternatives and customer-owned baseline
Without Yobitel Edge AI, customers either stitch together open edge stacks (Greengrass, IoT Edge, k3s + manual signing) or run a single-vendor edge SaaS that does not extend their cloud workspace surface. The comparison below positions the offering.
| Concern | Yobitel Edge AI | AWS IoT Greengrass | Azure IoT Edge | In-house edge stack (k3s + signing) |
|---|---|---|---|---|
| Same workspace as cloud inference | Yes — Yobibyte unified | No — separate AWS service | No — separate Azure service | DIY |
| Sovereignty enforcement to edge | Admission-gated | DIY | DIY | DIY |
| Hardware breadth | NVIDIA + NXP + Hailo + AMD Versal + partner | AWS-curated | Azure-curated | Whatever you support |
| OTA channels (stable / staged / canary) | First-class | Custom | Custom | DIY |
| Attestation chain | KMS-anchored, audited | DIY | DIY | DIY |
| Pricing model | Per-device per-month USD | Per-component AWS billing | Per-component Azure billing | Operational cost |
| Fleet observability | Yobitel-namespaced metrics | AWS CloudWatch | Azure Monitor | DIY |
| Time to first attested rollout | Hours | Days | Days | Weeks |
The in-house edge stack is the right answer when the customer already operates a large fleet with bespoke hardware and a mature signing/rollout process. For new edge programmes or for customers who want the cloud and edge AI surface unified, the Yobitel offering is materially faster to land.
Troubleshooting
The errors below cover the failure modes seen most often during fleet onboarding and the first rollout cycles. The full runbook library is at docs.yobitel.com/runbooks.
| Error | Cause | Fix |
|---|---|---|
| EdgeSiteUnreachable | Site missed its last-seen window (typical for intermittent connectivity). | Verify the site's network and power; check the offline buffer is within the configured window; the site re-syncs on reconnect. |
| ModelSyncFailed: bandwidth | Model image is too large for the site's effective bandwidth within the OTA window. | Either reduce the model image size (more aggressive quantisation, smaller variant) or extend the OTA window in the fleet policy. |
| AttestationFailed: signature | Device received a model image whose signature does not verify against the fleet identity's public-key chain. | Confirm the model image was produced by the workspace's fleet pipeline; rotate the fleet identity if compromise is suspected (workspace owner action). |
| AttestationFailed: hardwareRoot | Device's hardware root of trust does not match the fleet's expected anchor (common after a device replacement that was not re-enrolled). | Re-enrol the device through the fleet's enrolment flow; the workspace's audit stream records the re-enrolment. |
| FleetRolloutStuck | OTA rollout is paused because acceptance ratio dropped below the channel's threshold (e.g. canary acceptance under 90%). | Review the canary site's telemetry for the regression; either roll back the image or relax the channel threshold if the regression is expected. |
| EnrolmentTokenExpired | Site attempted to enrol after the fleet enrolment token's expiry window. | Mint a new enrolment token in the workspace's fleet management tab and re-enrol affected sites. |
| TelemetrySamplingViolation | Site is emitting telemetry above its policy sample rate (typically a misconfigured custom telemetry hook). | Review the on-device telemetry config against the fleet policy; surface in the runbook tab for the offending model image. |
| FleetPolicyConflict | Policy update created an internal conflict (e.g. requiring attestation on a hardware platform without root of trust). | The console surfaces the conflict in the policy diff view; either change the policy or pick a hardware platform that supports the required control. |
| SovereigntyMismatchEdge | Site enrolment declared a location outside the workspace's sovereignty pin. | Move the site enrolment to a workspace whose pin matches the site's location, or correct the site's declared location. |
| SpendCapExceeded: fleet paused | Fleet spend cap reached; OTA rollouts pause but devices continue to serve cached models. | Either raise the cap or wait for the next budget window; serving continues uninterrupted. |
Where Yobitel Edge AI fits in the Yobitel stack
Edge AI is the edge counterpart to cloud Yobibyte. The same workspace, the same marketplace, the same identity and audit surface, the same FOCUS billing pipeline — all extend to the edge fleet. Customers who already use Yobibyte for cloud inference adopt Edge AI as an additional surface, not a new platform; customers who are edge-first can still benefit from the workspace's cloud features when the workload pattern needs them (training, fleet-level analytics, knowledge-base maintenance).
The AI Applications Suite is a frequent consumer: applications like Livestock Monitor publish to fleet targets out of the box, and the workspace's policy bindings extend to the edge fleets they deploy onto. Yobitel GPU Cloud is the cloud counterpart for any training or evaluation the customer needs to support the edge workload — the cloud workspace trains the model, the edge fleet runs it.
Omniscient Compute indexes a near-edge tier (regional inference points of presence) for workloads that fit between on-device and central-cloud. Edge AI is the on-device surface; near-edge is Omniscient territory. Customers picking between them choose by latency, bandwidth, and sovereignty; the Yobibyte workspace is the same on both sides of the line.