TL;DR
- Savings Plans are commitment-based pricing instruments that grant a discount in exchange for an hourly dollar commitment to compute, rather than to a specific instance SKU.
- AWS introduced the model in 2019; Azure followed with Azure Savings Plans for Compute. GCP's equivalent is Flexible Committed Use Discounts (Flex CUDs).
- They sit between on-demand and Reserved Instances in flexibility, with discount levels comparable to RIs at the same term length.
- Best fit for workloads whose total compute spend is predictable, but whose mix of instance types, regions, or services may shift over the commitment term.
How They Work
A Savings Plan is a commitment to spend a specified number of dollars per hour on eligible compute services for a term of either 1 or 3 years. You are charged the committed amount every hour whether or not you use it. Eligible usage up to the commitment is billed at the discounted Savings Plan rate; usage beyond the commitment is billed at on-demand rates.
The provider applies the discount automatically across eligible usage; you do not have to attach the plan to a specific instance. If the same dollars-per-hour spend moves from one instance family to another mid-term, the discount follows the spend.
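The billing mechanics above can be sketched for a single hour. This is a simplified model, not a provider's billing engine: it assumes one flat discount across all eligible usage (real plan rates vary by instance type and region) and a commitment expressed in discounted dollars. The function name `hourly_bill` and its parameters are hypothetical.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> dict:
    """Split one hour of eligible usage into plan-covered and on-demand parts.

    on_demand_usage: the hour's eligible usage, valued at on-demand prices.
    commitment: committed spend per hour, in discounted dollars (assumption).
    discount: assumed flat plan discount as a fraction, e.g. 0.25 for 25%.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    if on_demand_usage < 0 or commitment < 0:
        raise ValueError("usage and commitment must be non-negative")

    # On-demand value of usage that the commitment can absorb.
    coverable = commitment / (1 - discount)
    covered = min(on_demand_usage, coverable)
    overflow = on_demand_usage - covered             # billed at on-demand rates
    unused = commitment - covered * (1 - discount)   # paid for but not used

    return {
        "covered_usage": covered,
        "on_demand_charge": overflow,
        "unused_commitment": unused,
        # The commitment is charged every hour regardless of usage.
        "total_charge": commitment + overflow,
    }


busy = hourly_bill(on_demand_usage=20.0, commitment=6.0, discount=0.25)
quiet = hourly_bill(on_demand_usage=4.0, commitment=6.0, discount=0.25)
print(busy["total_charge"])   # 18.0, versus 20.0 fully on-demand
print(quiet["total_charge"])  # 6.0, more than the 4.0 on-demand value
```

The quiet hour shows why the commitment level matters: when usage drops below the commitment, the plan costs more than on-demand would have for that hour.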
Plan Variants
| Plan | Provider | Flexibility | Typical discount |
|---|---|---|---|
| Compute Savings Plans | AWS | All regions, all EC2 families, Fargate, Lambda | ~30-45 % (1yr) / ~50-60 % (3yr) |
| EC2 Instance Savings Plans | AWS | A single instance family in a single region | Slightly higher than Compute SP |
| Azure Savings Plan for Compute | Azure | Across VM families and regions | ~30-65 % depending on term and commit |
| Flexible Committed Use Discounts | GCP | Across regions, instance families | Up to ~28 % (1yr) / ~46 % (3yr) |
Choosing the Commitment Level
The commitment level should sit at or just below your sustained minimum compute spend: the floor of your spend curve over the last 30-90 days, less a safety margin. Tools in each provider's console recommend a commitment level based on historical usage; treat those recommendations as a starting point, not a final answer.
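One way to turn "floor of the spend curve, with a safety margin" into a number is sketched below. It is an illustration, not a provider's recommendation algorithm: the low-percentile floor, the 10% default margin, the flat discount, and the name `suggest_commitment` are all assumptions.

```python
import math


def suggest_commitment(
    hourly_on_demand_spend: list[float],
    discount: float,
    floor_percentile: float = 5.0,
    safety_margin: float = 0.10,
) -> float:
    """Suggest an hourly commitment from historical hourly spend.

    hourly_on_demand_spend: eligible spend per hour at on-demand prices,
        ideally covering the last 30-90 days.
    discount: assumed flat plan discount as a fraction.
    floor_percentile: low percentile used as the "floor" so that a single
        outlier hour does not drive the result (assumption).
    safety_margin: fraction shaved off the floor.
    """
    if not hourly_on_demand_spend:
        raise ValueError("need at least one hour of history")

    ordered = sorted(hourly_on_demand_spend)
    # Nearest-rank percentile: the smallest value with at least
    # floor_percentile percent of observations at or below it.
    rank = max(1, math.ceil(floor_percentile / 100 * len(ordered)))
    floor = ordered[rank - 1]

    # Convert the on-demand floor into discounted commitment dollars.
    return floor * (1 - safety_margin) * (1 - discount)


history = [10, 12, 8, 20, 9, 11, 8, 15, 30, 10]
print(round(suggest_commitment(history, discount=0.25), 2))  # 5.4
```

With only ten samples the 5th percentile is simply the minimum (8); on a real 30-90 day history it would skip the few quietest hours.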
Over-committing wastes money on commitments you do not use. Under-committing leaves savings on the table, because on-demand spend that could have been covered stays at full price. Most teams iterate: start with a conservative commitment, observe how much of the on-demand bill it leaves uncovered, and then increase.
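The over- versus under-commitment trade-off can be made concrete by replaying a usage history against several candidate commitments. The sketch below assumes a flat discount and a commitment in discounted dollars; `evaluate` and `best_commitment` are hypothetical helper names.

```python
def evaluate(commitment: float, hourly_usage: list[float], discount: float) -> dict:
    """Replay hourly usage (at on-demand prices) against one commitment."""
    coverable = commitment / (1 - discount)  # on-demand value the plan absorbs
    total = unused = uncovered = 0.0
    for usage in hourly_usage:
        covered = min(usage, coverable)
        uncovered += usage - covered                      # still at on-demand rates
        unused += commitment - covered * (1 - discount)   # wasted commitment
        total += commitment + (usage - covered)
    return {
        "total_cost": total,
        "unused_commitment": unused,
        "uncovered_on_demand": uncovered,
    }


def best_commitment(candidates: list[float], hourly_usage: list[float], discount: float) -> float:
    """Return the candidate commitment with the lowest replayed total cost."""
    return min(candidates, key=lambda c: evaluate(c, hourly_usage, discount)["total_cost"])


usage = [8.0, 8.0, 16.0, 6.0]
for c in (0.0, 3.0, 6.0, 12.0):
    print(c, evaluate(c, usage, discount=0.25))
print(best_commitment([0.0, 3.0, 6.0, 12.0], usage, discount=0.25))  # 6.0
```

In this toy history, a commitment of 3 leaves 22 of on-demand spend uncovered, while 12 wastes 19.5 of commitment; 6 sits between the two and gives the lowest total.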
Layer commitments rather than concentrating them. Multiple smaller plans bought at different times are easier to evolve than one large plan that locks in the assumptions of a single quarter.
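Layering can be pictured as a set of plans with staggered start and end dates, where the effective commitment on any day is the sum of the plans active that day. The `Plan` class, the dates, and the dollar figures below are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Plan:
    hourly_commitment: float  # dollars per hour
    start: date               # first day the plan is active
    end: date                 # first day the plan is no longer active


def active_commitment(plans: list[Plan], on: date) -> float:
    """Total hourly commitment from all plans active on the given day."""
    return sum(p.hourly_commitment for p in plans if p.start <= on < p.end)


# Three 1-year plans bought a quarter apart instead of one 4.50/hour plan.
layers = [
    Plan(2.0, date(2024, 1, 1), date(2025, 1, 1)),
    Plan(1.5, date(2024, 4, 1), date(2025, 4, 1)),
    Plan(1.0, date(2024, 7, 1), date(2025, 7, 1)),
]

print(active_commitment(layers, date(2024, 8, 1)))  # 4.5
print(active_commitment(layers, date(2025, 2, 1)))  # 2.5
print(active_commitment(layers, date(2025, 5, 1)))  # 1.0
```

Because the layers expire at different times, each expiry is a chance to renew, resize, or drop that slice of the commitment without touching the rest.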
Savings Plans vs Reserved Instances
Both grant similar discounts at similar terms. The differences are about what you commit to and how rigidly the discount applies.
- Reserved Instances commit to a specific instance family, size, region, and tenancy.
- Savings Plans commit to dollars-per-hour of compute spend across an eligible scope.
- RIs can offer marginally higher discounts when you are certain about the instance type.
- Savings Plans are more forgiving when workloads evolve.
- Both can coexist — RIs apply first, then Savings Plans cover remaining eligible usage.
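The coexistence rule in the last bullet can be sketched for one hour of a single instance type. This is a deliberately narrow model: real accounts mix many instance types and the provider decides which usage each instrument covers. The function name `split_coverage` and the rates are hypothetical.

```python
def split_coverage(running: float, reserved: float, sp_commitment: float, sp_rate: float) -> dict:
    """Split one hour of instance usage across RIs, a Savings Plan, and on-demand.

    running: instance-hours used this hour (for example, 10 instances).
    reserved: instance-hours covered by matching Reserved Instances.
    sp_commitment: Savings Plan commitment in dollars per hour.
    sp_rate: discounted Savings Plan price per instance-hour (assumed).
    """
    if sp_rate <= 0:
        raise ValueError("sp_rate must be positive")

    # Step 1: Reserved Instances absorb matching usage first.
    ri_covered = min(running, reserved)
    remaining = running - ri_covered

    # Step 2: the Savings Plan covers what is left, up to its commitment.
    sp_covered = min(remaining, sp_commitment / sp_rate)

    # Step 3: anything still uncovered is billed on-demand.
    on_demand = remaining - sp_covered
    return {"ri_covered": ri_covered, "sp_covered": sp_covered, "on_demand": on_demand}


print(split_coverage(running=10, reserved=4, sp_commitment=3.0, sp_rate=0.75))
# {'ri_covered': 4, 'sp_covered': 4.0, 'on_demand': 2.0}
```

If fewer instances run than are reserved, the Savings Plan has nothing left to cover for this instance type and its commitment is free to apply to other eligible usage.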
GPU and AI Coverage
AWS Compute Savings Plans cover EC2 GPU instances (P5, P4, G6, etc.) on the same terms as other instances. Azure Savings Plans cover NC and ND-series GPU VMs. On GCP, confirm against the current eligibility list whether Flex CUDs cover accelerator-optimized families such as A3 and A2, since GCP also offers separate resource-based commitments for GPUs. Coverage of AI managed services (SageMaker, Bedrock, Azure OpenAI, Vertex AI) varies and should be checked per service.
Yobitel Reserved Capacity
Yobitel's reserved-capacity model is conceptually closer to Savings Plans than to per-SKU RIs: customers commit to a compute envelope across GPU generations rather than to specific GPU types. As workloads migrate between H100, H200 and B200 capacity, the commitment continues to apply, which matches how AI infrastructure actually evolves over a multi-year horizon.