How a Workload Gets Scheduled
Submit
You submit a workload via the API, CLI, or dashboard. Chamber records it and marks it Pending.
Allocate
The scheduler assigns an allocation — picking the best fit based on your team’s reservations and available capacity. The workload moves to Queued.
Admit
The in-cluster scheduler admits the workload based on your team’s quota and current demand. When GPUs are available, it places the pods. The workload moves to Starting, then Running.
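The three steps above trace a linear happy path through the workload statuses. A minimal sketch (the transition map is illustrative, not a Chamber API):

```python
# Happy-path status transitions for a workload, as described above.
TRANSITIONS = {
    "Pending": "Queued",     # Submit: Chamber records it, an allocation is assigned
    "Queued": "Starting",    # Admit: the in-cluster scheduler places the pods
    "Starting": "Running",   # all pods are up
    "Running": "Completed",  # the workload finishes
}

def lifecycle(start: str = "Pending") -> list:
    """Walk the happy path from an initial status to a terminal one."""
    path = [start]
    while path[-1] in TRANSITIONS:
        path.append(TRANSITIONS[path[-1]])
    return path
```

Preemption, failure, and cancellation branch off this path into the Preempted, Failed, and Cancelled statuses described later.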
Reserved vs Elastic Workloads
Every workload you submit is either Reserved or Elastic. This determines how it accesses GPU capacity and whether it can be preempted.

| | Reserved | Elastic |
|---|---|---|
| What it means | Guaranteed capacity backed by your team’s reservation | Uses spare GPUs when available |
| Can it be preempted? | Never | Yes — when reserved work needs the GPUs |
| Priority level | Non-preemptible (priority ≥ 100) | Preemptible (priority < 100) |
| Best for | Production inference, critical training runs | Experiments, hyperparameter sweeps, batch jobs |
How Scheduling Decisions Are Made
When you submit a workload:

- Chamber picks the best allocation for your workload based on available capacity across your team’s reservations.
- Reserved workloads are admitted first. If your team’s allocation has room, the workload starts. Reserved capacity is guaranteed and protected.
- Elastic workloads use surplus capacity. They run on idle GPUs — either from your team’s unused reservation or from spare capacity elsewhere in the pool. Elastic workloads can be reclaimed at any time when reserved work needs the GPUs.
- The scheduler evaluates continuously. As workloads complete and GPUs free up, waiting workloads are admitted automatically.
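The admission rule above can be sketched as a single check. This is a simplification with hypothetical field names, not Chamber's actual logic; burst limits (covered below) would add a second constraint for elastic work:

```python
from dataclasses import dataclass

@dataclass
class Team:
    reservation: int       # GPUs guaranteed to this team
    reserved_in_use: int   # GPUs held by its reserved workloads

def can_admit(team: Team, gpus: int, kind: str, pool_surplus: int) -> bool:
    """Sketch of the admission rule: reserved work checks the team's
    own reservation; elastic work checks for idle surplus GPUs."""
    if kind == "reserved":
        # Room inside the reservation is enough; elastic occupants
        # can be preempted to make that room.
        return team.reserved_in_use + gpus <= team.reservation
    # Elastic: needs actual idle GPUs somewhere in the pool right now.
    return gpus <= pool_surplus
```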
Key Terms
| Term | What It Means |
|---|---|
| Bursting | When a team’s elastic workloads use GPUs beyond their reserved allocation. Your team “bursts” into idle capacity from the pool. |
| Burst limit | The maximum amount a team can burst beyond its reservation. For example, a 50% burst limit on a 32-GPU reservation means the team can use up to 48 GPUs total (32 reserved + 16 burst). Set to 0 to disable bursting entirely. |
| Bursting priority | A weight (1–10) that controls how surplus GPUs are divided when multiple teams are bursting at the same time. Higher priority means a larger share of the available surplus. |
| Preemption | When the scheduler stops an elastic workload to free GPUs for higher-priority work. Only elastic workloads can be preempted — reserved workloads are always protected. |
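The burst-limit arithmetic from the table can be checked with a one-liner (a sketch; Chamber's rounding behavior for fractional allowances is an assumption here):

```python
def burst_cap(reservation: int, burst_limit_pct: int) -> int:
    """Maximum GPUs a team may use at once: its reservation plus
    the burst allowance (burst limit as a percentage of reservation)."""
    return reservation + reservation * burst_limit_pct // 100
```

For the example in the table, `burst_cap(32, 50)` gives 48 (32 reserved + 16 burst), and `burst_cap(32, 0)` gives 32, disabling bursting entirely.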
Hierarchical Bursting
Your organization’s teams are arranged in a tree. Chamber respects this hierarchy when distributing surplus capacity:

- Each team has a bursting priority that controls its share of surplus GPUs when multiple teams are bursting.
- Surplus capacity flows down the tree proportionally. For example, if Research has bursting priority 3 and Production has priority 2, Research gets 60% of the excess and Production gets 40%.
- Within a subtree the same rule applies: if NLP has priority 2 and Vision has priority 1 under Research, NLP gets twice Vision’s share of the surplus.
- Teams don’t compete in a flat pool. The hierarchy ensures organizational priorities are respected even when capacity is tight.
- Each team can also have a burst limit that caps how far beyond its reservation it can go — preventing any single team from consuming the entire pool.
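The proportional split can be sketched as a weighted division applied at each level of the tree. This is a simplified model: integer shares are assumed, and how Chamber distributes remainder GPUs is not specified here:

```python
def split_surplus(surplus: int, priorities: dict) -> dict:
    """Divide surplus GPUs among sibling teams in proportion to
    their bursting priorities (integer shares; remainders ignored)."""
    total = sum(priorities.values())
    return {team: surplus * p // total for team, p in priorities.items()}
```

Applying it top-down with the priorities from the text: `split_surplus(50, {"Research": 3, "Production": 2})` gives Research 30 and Production 20; splitting Research's 30 between `{"NLP": 2, "Vision": 1}` gives NLP 20 and Vision 10.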
Example: Hierarchical Bursting in Action
A pool has 100 GPUs. Research has 48 reserved, Production has 32 reserved. Research is only using 20.

| Team | Reservation | In Use | Idle | What Happens |
|---|---|---|---|---|
| Research | 48 | 20 | 28 | 28 idle GPUs available for bursting |
| Production | 32 | 32 | 0 | Elastic workloads can burst into Research’s idle capacity |
| Pool total | 100 | 52 | 48 | 48 surplus GPUs (28 idle reserved + 20 unreserved) distributed by bursting priority |
Production’s elastic workloads can temporarily burst into Research’s idle GPUs. But the moment Research submits reserved work that needs those GPUs back, Production’s elastic workloads are preempted.
Preemption
Preemption is how Chamber reclaims GPUs for higher-priority work. It’s designed to be predictable and to minimize disruption.

Rules
- Reserved workloads are never preempted. The scheduler will not reclaim them under any circumstances.
- Elastic workloads are always reclaimable. The scheduler can reclaim their resources when reserved workloads need capacity.
- Higher-priority elastic workloads displace lower-priority ones. Among elastic workloads, priority determines who stays when the pool is full.
- Cheapest workloads are reclaimed first. The scheduler picks victims in this order:
  1. Queued workloads (haven’t started — zero cost)
  2. Starting workloads (pods being created — minimal cost)
  3. Running workloads, lowest priority first (most expensive, last resort)
- Only enough workloads are reclaimed to free the capacity needed — no more.
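The victim-selection order above can be sketched as a sort followed by a greedy take. One simplification: this sketch breaks ties by priority within every status, while the rules above only specify priority order among Running workloads:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    status: str    # "Queued", "Starting", or "Running"
    priority: int  # elastic workloads have priority < 100
    gpus: int

# Cheapest statuses first: Queued, then Starting, then Running.
STATUS_COST = {"Queued": 0, "Starting": 1, "Running": 2}

def pick_victims(elastic: list, gpus_needed: int) -> list:
    """Reclaim the cheapest elastic workloads first, stopping as soon
    as enough GPUs are freed."""
    order = sorted(elastic, key=lambda w: (STATUS_COST[w.status], w.priority))
    victims, freed = [], 0
    for w in order:
        if freed >= gpus_needed:
            break
        victims.append(w.name)
        freed += w.gpus
    return victims
```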
Preemption Examples
Reserved reclaims from elastic
NLP Team has a 32-GPU reservation. They’re running 20 GPUs of reserved inference and 12 GPUs of elastic experiments — fully utilizing their allocation. A researcher submits a 12-GPU reserved training workload.

The 12 elastic experiments are preempted to make room. Reserved always wins over elastic within the same allocation.
High-priority elastic displaces low-priority elastic
The pool is fully utilized. Team A is running an 8-GPU elastic sweep at priority 50. Team B submits an 8-GPU elastic workload at priority 75.

Both are elastic, but priority determines who stays. The lower-priority sweep is reclaimed.
No preemption needed
Vision Team has a 16-GPU reservation but is only using 4 GPUs. A researcher submits an 8-GPU elastic workload.

Plenty of idle capacity — the elastic workload starts immediately with no preemption. If reserved work later needs those GPUs, this elastic workload would be the first to go.
Cross-team reclaim
Research has 48 GPUs reserved but is only using 20. Production’s elastic workloads have expanded into 28 of Research’s idle GPUs. A Research team member submits a 24-GPU reserved workload.

Research’s reservation can support the new workload (48 − 20 = 28 free slots ≥ 24 needed). The scheduler preempts Production’s elastic workloads occupying 24 of those 28 GPUs — just enough to free the physical capacity. Production’s remaining 4 burst GPUs and all of its reserved workloads are untouched.
Fractional GPUs
Not every workload needs a full GPU. You can request fractions in 0.25 increments:

| Request | Use Case |
|---|---|
| 0.25 GPU | Lightweight inference, notebooks, debugging |
| 0.5 GPU | Moderate inference, small model fine-tuning |
| 0.75 GPU | Heavier single-GPU tasks |
| 1.0+ GPU | Standard training and inference |
You can specify either a fractional GPU request (e.g., 0.5) or an explicit GPU memory limit in MiB.
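The 0.25-increment rule is easy to validate client-side. A minimal sketch (the function name and validation approach are illustrative, not part of Chamber's API):

```python
def valid_gpu_fraction(request: float) -> bool:
    """A fractional GPU request must be a positive multiple of 0.25."""
    quarters = request * 4
    return request > 0 and quarters == int(quarters)
```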
Distributed Training
For multi-GPU workloads that span multiple pods, Chamber supports two scheduling modes.

Gang Scheduling
All pods must start together or none do. No partial starts, no stragglers. This is the default behavior.

Example: You submit an 8-node distributed training workload. The cluster only has 6 GPUs free. Chamber waits until all 8 are available, then starts them simultaneously. Your training framework sees all workers from the first step.
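The all-or-nothing rule reduces to one comparison. A sketch of the example above, assuming one GPU per node:

```python
def gang_admit(pods_needed: int, free_gpus: int, gpus_per_pod: int = 1) -> bool:
    """All-or-nothing admission: start every pod together, or none.
    Returns True only when the whole gang fits at once."""
    return free_gpus >= pods_needed * gpus_per_pod
```

With 8 pods and only 6 free GPUs, `gang_admit(8, 6)` is False, so Chamber keeps the workload waiting rather than starting a partial gang.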
Topology-Aware Placement
For latency-sensitive distributed workloads, Chamber co-locates pods within network topology boundaries:

- Required placement — All pods must land within a specific topology boundary (e.g., same zone). If placement isn’t possible, the workload waits.
- Preferred placement — Pods are placed within a tighter boundary (e.g., same rack) when possible, but the constraint is relaxed if needed.
Workload Status Reference
| Status | What It Means | What You Should Do |
|---|---|---|
| Pending | Submitted, waiting for allocation assignment | Wait — Chamber is finding the best allocation |
| Queued | Dispatched to cluster, waiting for GPUs | Wait — the scheduler will admit when capacity is available |
| Starting | Admitted by the scheduler, pods being created | Wait — images are pulling, almost there |
| Running | All pods are running | Monitor your workload |
| Completed | Finished successfully | Collect results |
| Failed | Something went wrong | Check the failure reason |
| Preempted | Reclaimed for a higher-priority workload | Resubmit — or switch to Reserved if this keeps happening |
| Cancelled | You cancelled it | No action needed |

