Reservations are the link between capacity pools and teams. They guarantee a specific amount of GPU capacity for a team.

What is a Reservation?

A reservation:
  • Allocates GPUs from a pool to a team
  • Guarantees capacity for reserved workloads
  • Can be adjusted or removed as needs change
  • Supports the many-to-many relationship between pools and teams

Many-to-Many Relationships

Teams and pools have a many-to-many relationship through reservations:

One Team, Multiple Pools

A team can have reservations in multiple pools. This enables:
  • Geographic distribution of workloads
  • Access to different GPU types
  • Redundancy across clusters

One Pool, Multiple Teams

A pool can have reservations from multiple teams, each guaranteed its own share of the pool's GPUs.
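
The many-to-many relationship can be pictured as a list of link records, one per team and pool pair. The sketch below is only an illustration: the `Reservation` class, its field names, and the team and pool names are assumptions, not the product's actual data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record linking one team to one pool."""
    team: str
    pool: str
    gpus: int


# Illustrative data only; the team and pool names are made up.
reservations = [
    Reservation(team="research", pool="us-east-h100", gpus=16),
    Reservation(team="research", pool="eu-west-a100", gpus=8),
    Reservation(team="inference", pool="us-east-h100", gpus=24),
]


def pools_by_team(items):
    """Group reserved GPUs as team -> {pool: gpus}."""
    result = defaultdict(dict)
    for r in items:
        result[r.team][r.pool] = result[r.team].get(r.pool, 0) + r.gpus
    return dict(result)


def teams_by_pool(items):
    """Group reserved GPUs as pool -> {team: gpus}."""
    result = defaultdict(dict)
    for r in items:
        result[r.pool][r.team] = result[r.pool].get(r.team, 0) + r.gpus
    return dict(result)
```

Reading the same records from either side gives both views: `pools_by_team(reservations)["research"]` lists the two pools the team draws from, and `teams_by_pool(reservations)["us-east-h100"]` lists the two teams sharing that pool.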

Capacity Guarantees

Reserved Capacity

When a team has a reservation:
  • Reserved workloads from that team are guaranteed those GPUs
  • Workloads can start immediately if capacity is available within the reservation
  • Other teams cannot use this capacity for reserved workloads
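
As a rough sketch of the guarantee described above, a reserved workload can start immediately when it fits within what remains of its own team's reservation. The function name and parameters are hypothetical, and the real scheduler may apply additional rules.

```python
def can_start_reserved(reserved_gpus: int, in_use_by_team: int, requested: int) -> bool:
    """Return True if the request fits in the team's remaining reserved capacity.

    reserved_gpus:  size of the team's reservation in this pool
    in_use_by_team: GPUs already held by the team's reserved workloads
    requested:      GPUs the new reserved workload asks for
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    return in_use_by_team + requested <= reserved_gpus
```

The check looks only at the requesting team's reservation, which mirrors the rule that one team's reserved capacity is not available to another team's reserved workloads.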

Elastic Usage

Unreserved pool capacity is available for elastic workloads from any team.
Elastic workloads can also use idle reserved capacity, but they are preempted if a reserved workload needs those resources.
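
The preemption rule can be illustrated with a small calculation: when a reserved workload arrives, free GPUs are used first and any shortfall is reclaimed from elastic workloads. This is a simplified sketch under the assumption that the request has already been checked against the team's reservation; the function and parameter names are hypothetical.

```python
def gpus_to_preempt(total_gpus: int, reserved_in_use: int,
                    elastic_in_use: int, reserved_request: int) -> int:
    """Return how many elastic GPUs must be reclaimed for a reserved request."""
    free = total_gpus - reserved_in_use - elastic_in_use
    if free < 0:
        raise ValueError("usage exceeds pool size")
    # Use free GPUs first; only the remainder comes from elastic workloads.
    shortfall = max(0, reserved_request - free)
    if shortfall > elastic_in_use:
        raise ValueError("request does not fit even after preempting all elastic work")
    return shortfall
```

For example, in a 32-GPU pool with 8 GPUs running reserved work and 20 running elastic work, a reserved request for 8 GPUs takes the 4 free GPUs and preempts 4 elastic GPUs.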

Best Practices

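Reserve what teams actually need, not what they might need. Elastic workloads from other teams can run on idle reserved capacity only until the reserving team reclaims it, so over-reserving shrinks the unreserved capacity they can rely on.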
Teams with unpredictable demand should use smaller reservations plus elastic capacity rather than large reservations.
Audit reservations quarterly. Teams’ needs change; reservations should too.
Configure alerts for low utilization (below 30%) and high queue depth to catch misconfigurations.
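
A minimal sketch of the alert checks, assuming utilization is measured as average GPUs in use divided by reserved GPUs. The 30% threshold comes from the guidance above; the queue depth threshold of 10 is an assumed placeholder, since no number is given, and the function name is hypothetical.

```python
LOW_UTILIZATION_THRESHOLD = 0.30  # from the guidance above
QUEUE_DEPTH_THRESHOLD = 10        # assumed value; tune for your environment


def reservation_alerts(reserved_gpus: int, avg_gpus_in_use: float,
                       queued_workloads: int) -> list[str]:
    """Return the names of alerts that should fire for one reservation."""
    alerts = []
    # Low utilization suggests the reservation is larger than needed.
    if reserved_gpus > 0 and avg_gpus_in_use / reserved_gpus < LOW_UTILIZATION_THRESHOLD:
        alerts.append("low_utilization")
    # A deep queue suggests the reservation is too small or misconfigured.
    if queued_workloads > QUEUE_DEPTH_THRESHOLD:
        alerts.append("high_queue_depth")
    return alerts
```

A reservation of 40 GPUs averaging 8 in use sits at 20% utilization and would raise `low_utilization`, a candidate for shrinking at the next quarterly audit.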