What is a Reservation?
A reservation:
- Allocates GPUs from a pool to a team
- Guarantees capacity for reserved workloads
- Can be adjusted or removed as needs change
- Supports the many-to-many relationship between pools and teams
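As a rough illustration of the points above, a reservation can be thought of as a record linking one team to one pool with a GPU count. The sketch below is not the product's actual schema: the `Reservation` class, its field names, the pool and team names, and the rule that reservations may not exceed pool size are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record: `gpus` GPUs from `pool` are guaranteed to `team`."""
    team: str
    pool: str
    gpus: int


def unreserved_gpus(pool: str, pool_size: int, reservations: list[Reservation]) -> int:
    """Return how many GPUs in `pool` are not covered by any reservation."""
    reserved = sum(r.gpus for r in reservations if r.pool == pool)
    # Assumption: total reservations in a pool may not exceed the pool's size.
    if reserved > pool_size:
        raise ValueError(f"pool {pool!r} is over-reserved: {reserved} > {pool_size}")
    return pool_size - reserved


# Illustrative data only; these names do not come from the documentation.
example_reservations = [
    Reservation(team="ml-research", pool="us-east-a100", gpus=16),
    Reservation(team="inference", pool="us-east-a100", gpus=8),
]
remaining = unreserved_gpus("us-east-a100", 32, example_reservations)
```

Adjusting a reservation then amounts to replacing the record with a new GPU count, and removing it returns its GPUs to the pool's unreserved capacity.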
Many-to-Many Relationships
Teams and pools have a many-to-many relationship through reservations.
One Team, Multiple Pools
A team can have reservations in multiple pools. This enables:
- Geographic distribution of workloads
- Access to different GPU types
- Redundancy across clusters
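One way to picture this is a flat list of reservation rows that is indexed by team. The sketch below uses hypothetical team names, pool names, and GPU counts; the same rows can also be grouped by pool, which is what makes the relationship many-to-many.

```python
from collections import defaultdict

# Each row is (team, pool, gpus). All values are illustrative assumptions.
reservations = [
    ("ml-research", "us-east-a100", 16),
    ("ml-research", "eu-west-h100", 8),
    ("inference", "us-east-a100", 8),
]


def index_reservations(rows):
    """Build team -> {pool: gpus} and pool -> {team: gpus} lookups."""
    by_team = defaultdict(dict)
    by_pool = defaultdict(dict)
    for team, pool, gpus in rows:
        by_team[team][pool] = gpus
        by_pool[pool][team] = gpus
    return dict(by_team), dict(by_pool)


by_team, by_pool = index_reservations(reservations)

# Pools in which the hypothetical "ml-research" team holds a reservation.
research_pools = sorted(by_team["ml-research"])
```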
One Pool, Multiple Teams
A pool can have reservations from multiple teams.
Capacity Guarantees
Reserved Capacity
When a team has a reservation:
- Reserved workloads from that team are guaranteed those GPUs
- Workloads can start immediately if capacity is available within the reservation
- Other teams cannot use this capacity for reserved workloads
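The "can start immediately" rule above can be read as a simple admission check against the team's own reservation. This is a minimal sketch under that reading; the function name and parameters are hypothetical and do not describe the scheduler's real interface.

```python
def can_start_reserved(reserved_gpus: int, in_use_gpus: int, requested_gpus: int) -> bool:
    """Return True if a reserved workload fits in the team's remaining reservation.

    reserved_gpus:  size of the team's reservation in this pool
    in_use_gpus:    GPUs already used by the team's reserved workloads
    requested_gpus: GPUs the new reserved workload asks for
    """
    if requested_gpus <= 0:
        raise ValueError("requested_gpus must be positive")
    return in_use_gpus + requested_gpus <= reserved_gpus
```

For example, with a 16-GPU reservation and 12 GPUs already in use, a 4-GPU workload fits, while an 8-GPU workload has to wait or run as elastic.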
Elastic Usage
Unreserved pool capacity is available for elastic workloads from any team.
Elastic workloads can use idle reserved capacity, but will be preempted if a reserved workload needs those resources.
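The preemption behavior can be sketched as follows: when a reserved workload needs more GPUs than are currently free, elastic workloads are evicted until the shortfall is covered. The `ElasticWorkload` type, the function name, and the eviction order (list order) are assumptions for illustration; the actual preemption policy is not described here.

```python
from dataclasses import dataclass


@dataclass
class ElasticWorkload:
    """Hypothetical running elastic workload occupying `gpus` GPUs."""
    name: str
    gpus: int


def select_preemptions(free_gpus: int, needed_gpus: int,
                       elastic_running: list[ElasticWorkload]) -> list[ElasticWorkload]:
    """Pick elastic workloads to preempt so `needed_gpus` become available.

    Assumption: candidates are preempted in the order given; a real
    scheduler may order them by priority, age, or size instead.
    """
    shortfall = needed_gpus - free_gpus
    victims = []
    for workload in elastic_running:
        if shortfall <= 0:
            break
        victims.append(workload)
        shortfall -= workload.gpus
    if shortfall > 0:
        # Not enough elastic usage to reclaim; the request exceeds what can be freed.
        raise RuntimeError("cannot free enough GPUs by preempting elastic workloads")
    return victims
```

If enough GPUs are already free, the function returns an empty list and nothing is preempted.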
Best Practices
Don't over-reserve
Reserve what teams actually need, not what they might need. Idle reserved capacity is available to other teams only as preemptible elastic capacity, so over-reserving shrinks the unreserved capacity they can use.
Use elastic for variable workloads
Teams with unpredictable demand should use smaller reservations plus elastic capacity rather than large reservations.
Review quarterly
Audit reservations quarterly. Teams’ needs change; reservations should too.
Set up alerts
Configure alerts for low utilization (below 30%) and high queue depth to catch misconfigurations.
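A minimal sketch of such an alert check is shown below. The 30% utilization threshold comes from the guidance above; the queue-depth threshold of 10, the function name, and the alert labels are assumptions chosen for the example, not product defaults.

```python
def reservation_alerts(reserved_gpus: int, used_gpus: int, queue_depth: int,
                       min_utilization: float = 0.30,
                       max_queue_depth: int = 10) -> list[str]:
    """Return alert labels for one reservation.

    min_utilization: alert when used/reserved falls below this fraction.
    max_queue_depth: assumed threshold; alert when more workloads are queued.
    """
    alerts = []
    # Skip the utilization check for empty reservations to avoid dividing by zero.
    if reserved_gpus > 0 and used_gpus / reserved_gpus < min_utilization:
        alerts.append("low_utilization")
    if queue_depth > max_queue_depth:
        alerts.append("high_queue_depth")
    return alerts
```

Low utilization suggests the reservation is larger than needed, while a deep queue suggests it is too small or that workloads are being submitted against the wrong pool.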

