
Auto-Containerize with chamber run

One command. That’s all it takes. Point chamber run at your training code and it handles everything: containerization, registry authentication, image building, and workload submission.
chamber run ./my-training-project --gpus 4 --team my-team
No Docker knowledge required. Chamber auto-detects your project structure, generates optimized Dockerfiles, and guides you through any missing prerequisites.

What Chamber Does For You

1. Detects your project: scans for PyTorch, TensorFlow, or JAX; finds train.py, main.py, or your entrypoint; reads requirements.txt or pyproject.toml.
2. Generates an optimized Dockerfile: creates a GPU-optimized container with the right CUDA version, cuDNN, and your dependencies pre-installed.
3. Handles authentication: auto-authenticates with Google Artifact Registry or AWS ECR using your existing cloud credentials.
4. Builds and pushes: uses docker buildx build --push to build and push in a single efficient step. Automatically pulls the :latest tag to seed the layer cache so teammates get fast rebuilds, and uses content-addressed image tags, so if the image already exists, build and push are skipped entirely.
5. Submits the workload: creates and submits a Kubernetes Job to Chamber with your GPU and resource requirements.

Quick Start

# Preview what will be generated (recommended first step)
chamber run ./my-training-project --gpus 4 --team <team-id> --dry-run

# Build, push, and submit the workload
chamber run ./my-training-project --gpus 4 --team <team-id>
Always use --dry-run first to preview the generated Dockerfile and Kubernetes manifest before building.

Interactive Setup (First-Time Users)

Chamber guides you through setup. Missing a prerequisite? It will walk you through fixing it:

No Registry Configured?

$ chamber run ./my-project --gpus 4 --team abc123

No container registry configured

Chamber needs a container registry to store your Docker images.
You can use Google Artifact Registry, AWS ECR, or any Docker-compatible registry.

Select your registry type:

  [1] Google Artifact Registry (recommended for GCP users)
      Example: us-central1-docker.pkg.dev/my-project/ml-images

  [2] AWS ECR (recommended for AWS users)
      Example: 123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-images

  [3] Other Docker registry

Select an option [1]: 1

Google Artifact Registry Setup

Enter your GAR registry URL: us-central1-docker.pkg.dev/my-project/ml-images

Save as default registry? (y/n) [y]: y
Default registry saved to ~/.chamber/config.json

Docker Not Installed?

⚠ Docker is not installed
  Docker is required to build and push container images.

Installation options:

  [1] Quick install (recommended)
      brew install --cask docker
  [2] Open installation guide in browser
  [3] Show manual installation instructions
  [4] Skip (continue anyway)

Select an option [1]:

AWS CLI Not Configured?

⚠ AWS CLI is not authenticated

You need AWS credentials to push images to ECR.

Options:
  [1] Run 'aws configure' interactively (recommended)
  [2] Set environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
  [3] Show me how to get credentials
  [4] Skip (I'll handle authentication myself)

Select an option [1]:
One-time setup. After your first successful run, these settings are saved. Future runs work instantly without prompts.
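
After setup, the saved defaults live in ~/.chamber/config.json. Based on the prompts above, the file might look roughly like this (illustrative; the actual schema may include additional fields):

```json
{
  "default_registry": "us-central1-docker.pkg.dev/my-project/ml-images"
}
```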

Supported Container Registries

Chamber automatically handles authentication for major cloud registries:
| Registry | URL Pattern | Auto-Auth |
| --- | --- | --- |
| Google Artifact Registry | {region}-docker.pkg.dev/{project}/{repo} | gcloud CLI |
| AWS ECR | {account}.dkr.ecr.{region}.amazonaws.com | AWS CLI |
| Other registries | Any Docker-compatible registry | Via docker login |
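
The auto-auth decision amounts to pattern matching on the registry URL. A minimal sketch of that classification (an illustration, not Chamber's actual implementation):

```python
import re

def detect_registry_type(registry_url: str) -> str:
    """Classify a registry URL so the right auth helper can be invoked."""
    # Google Artifact Registry: {region}-docker.pkg.dev/{project}/{repo}
    if re.match(r"^[a-z0-9-]+-docker\.pkg\.dev/", registry_url):
        return "gar"    # authenticate via the gcloud CLI
    # AWS ECR: {account}.dkr.ecr.{region}.amazonaws.com
    if re.match(r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com", registry_url):
        return "ecr"    # authenticate via the AWS CLI
    return "other"      # fall back to `docker login`

print(detect_registry_type("us-central1-docker.pkg.dev/my-project/ml-images"))  # gar
print(detect_registry_type("123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-images"))  # ecr
```

Anything that matches neither cloud pattern is treated as a generic Docker registry and relies on your existing docker login session.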

Project Detection

Chamber automatically detects your project configuration:
| What | How |
| --- | --- |
| Framework | Scans requirements.txt for PyTorch, TensorFlow, or JAX |
| Entrypoint | Looks for train.py, main.py, run.py, or app.py |
| Python version | Checks .python-version or pyproject.toml |
| Distributed training | Detects Accelerate, DeepSpeed, Ray, or Horovod |
| Requirements | Uses requirements.txt, pyproject.toml, or setup.py |

Project Configuration

Create a .chamber.yaml file for persistent settings (optional):
# .chamber.yaml
name: "llm-finetune"
entrypoint: "train.py"
entrypoint_args: "--config config.yaml --epochs 100"

# Resources
gpu_type: "H100"
gpus: 4
job_class: "ELASTIC"

# Environment variables injected into the container
env:
  WANDB_PROJECT: "my-project"
  NCCL_DEBUG: "INFO"

# Forward these local env vars to container (if set)
forward_env:
  - WANDB_API_KEY
  - HF_TOKEN

# Files to exclude from Docker build context
ignore:
  - "output/"
  - "checkpoints/"
  - "*.bin"
  - "*.safetensors"
CLI flags always override .chamber.yaml values, so you can set defaults while still customizing per-run.
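
The precedence rule (CLI flag over .chamber.yaml over built-in default) can be sketched as a simple layered merge; this is an illustration, not the actual implementation:

```python
# Built-in defaults taken from the flag reference below
BUILTIN_DEFAULTS = {"gpus": 1, "gpu_type": "H100", "job_class": "ELASTIC"}

def resolve_settings(yaml_config: dict, cli_flags: dict) -> dict:
    """Merge settings so explicitly passed CLI flags win over .chamber.yaml,
    which in turn wins over built-in defaults. None means 'not provided'."""
    resolved = dict(BUILTIN_DEFAULTS)
    resolved.update({k: v for k, v in yaml_config.items() if v is not None})
    resolved.update({k: v for k, v in cli_flags.items() if v is not None})
    return resolved
```

For example, with gpus: 4 in .chamber.yaml and --gpus 8 on the command line, the run uses 8 GPUs while still picking up the file's other settings.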

Command Reference

chamber run <directory> [flags]

Required Flags

| Flag | Description |
| --- | --- |
| --team, -t | Team ID for the workload |

Resource Flags

| Flag | Default | Description |
| --- | --- | --- |
| --gpus, -g | From config, or 1 | Number of GPUs |
| --gpu-type | From config, or H100 | GPU type (H100, A100, L40S, etc.) |
| --class, -c | ELASTIC | Workload class: RESERVED or ELASTIC |

Container Flags

| Flag | Description |
| --- | --- |
| --registry | Container registry URL (or use the default_registry config) |
| --base-image | Override the auto-detected base image |
| --dockerfile | Use an existing Dockerfile instead of generating one |
| --no-cache | Force a Docker rebuild without cache |

Entrypoint Flags

| Flag | Description |
| --- | --- |
| --name, -n | Workload name (default: from config or the directory name) |
| --entrypoint | Override the detected Python entrypoint |
| --entrypoint-args | Arguments to pass to the entrypoint |
| --env | Environment variable as KEY=VALUE (repeatable) |

Workflow Flags

| Flag | Description |
| --- | --- |
| --dry-run | Preview generated artifacts without executing |
| --save-dockerfile | Write the generated Dockerfile to the project directory |
| --save-manifest | Write the generated K8s manifest to the project directory |

Examples

Basic Training Workload

chamber run ./my-project --gpus 4 --gpu-type H100 --team abc123

Custom Entrypoint with Arguments

chamber run ./my-project \
  --gpus 8 \
  --gpu-type H100 \
  --team abc123 \
  --entrypoint train.py \
  --entrypoint-args "--config config.yaml --lr 1e-4 --batch-size 32"

With Environment Variables

chamber run ./my-project \
  --gpus 4 \
  --gpu-type H100 \
  --team abc123 \
  --env WANDB_API_KEY=$WANDB_API_KEY \
  --env HF_TOKEN=$HF_TOKEN

Reserved (Non-Preemptible) Capacity

chamber run ./my-project \
  --gpus 4 \
  --gpu-type H100 \
  --team abc123 \
  --class RESERVED

Using Your Own Dockerfile

chamber run ./my-project \
  --gpus 4 \
  --gpu-type H100 \
  --team abc123 \
  --dockerfile Dockerfile.custom

Generated Artifacts

Example Dockerfile

For a PyTorch project with Accelerate, Chamber generates:
# syntax=docker/dockerfile:1
# Auto-generated by Chamber CLI
FROM nvcr.io/nvidia/pytorch:24.04-py3

WORKDIR /workspace

COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt

COPY . .

CMD ["accelerate", "launch", "train.py", "--config", "config.yaml"]
Chamber uses BuildKit with pip cache mounts, so repeated builds reuse cached Python packages instead of re-downloading them.

Example Kubernetes Manifest

apiVersion: batch/v1
kind: Job
metadata:
  name: llm-finetune-a1b2c3d4
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: training
          image: us-central1-docker.pkg.dev/my-project/ml-images:abc123
          resources:
            requests:
              nvidia.com/gpu: "4"
            limits:
              nvidia.com/gpu: "4"
          env:
            - name: "WANDB_PROJECT"
              value: "my-project"

Troubleshooting

Docker Not Installed

Chamber will prompt you with installation options:
⚠ Docker is not installed

Installation options:
  [1] Quick install (recommended)
  [2] Open installation guide in browser
  [3] Show manual installation instructions

Select an option and follow the prompts.

Registry Authentication Fails

If auto-authentication fails, authenticate manually.

For Google Artifact Registry:
gcloud auth login
gcloud auth configure-docker {region}-docker.pkg.dev

For AWS ECR:
aws configure
# Or set environment variables:
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret

Build Fails or Dependencies Are Missing

Check that your requirements.txt is complete, and use --dry-run to preview the generated Dockerfile:
chamber run ./my-project --dry-run
If needed, use --save-dockerfile to inspect and modify the Dockerfile before building:
chamber run ./my-project --dry-run --save-dockerfile
# Edit Dockerfile.chamber
chamber run ./my-project --dockerfile Dockerfile.chamber

Docker Build Optimizations

Chamber automatically applies several optimizations to make builds fast:
| Optimization | What it does |
| --- | --- |
| Single build+push | Uses docker buildx build --push to build and push in one step. This is significantly faster than a separate build and push because BuildKit pushes layers as they complete, avoiding Docker daemon/containerd manifest duplication issues. |
| BuildKit | Enabled by default (DOCKER_BUILDKIT=1) for parallel build stages and advanced caching. |
| Pip cache mounts | Uses --mount=type=cache,target=/root/.cache/pip so pip packages are cached across builds instead of re-downloaded. |
| Remote layer caching | Pulls the :latest tag from your registry before building to seed the local layer cache. After a successful build, tags the image as :latest using docker buildx imagetools create (fast manifest aliasing, no layer re-upload) so future builds benefit. |
| Content-addressed tags | Tags images with a content hash of your project. If the image already exists in the registry, build and push are skipped entirely. |
| Platform targeting | Explicitly builds for linux/amd64 to ensure consistent images regardless of your local architecture. |
| Reduced metadata | Uses --provenance=false --sbom=false to skip metadata generation that slows down builds. |
| Context size warnings | Reports the build context size and warns if it exceeds 500 MB, helping you identify large files that belong in .dockerignore. |
| Streaming output | Streams build output to your terminal in real time so you can follow progress. |
Why single build+push matters: with Docker Desktop’s containerd image store, a separate docker push can push all manifests from multi-platform base images (e.g., six manifests for NVIDIA images), so each layer is checked six times. The buildx build --push approach pushes only what it built, making pushes dramatically faster, especially for large ML images.
Use --no-cache to force a full Docker rebuild (this does not force re-push if the image already exists):
chamber run ./my-project --gpus 4 --team abc123 --no-cache

Best Practices

Start with --dry-run

Always preview generated artifacts before building to catch issues early.

Use .chamber.yaml

Store project-specific settings to avoid repeating flags on every run.

Set default_registry

Configure your registry once: chamber config set default_registry <url>

Use forward_env

Securely inject API keys without hardcoding them in your config.
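
The difference between env and forward_env: env values are written into the workload spec verbatim, while forward_env reads values from your local shell at submit time and only injects variables that are actually set. A rough sketch of that behavior (hypothetical helper, not Chamber's code):

```python
import os

def build_container_env(env: dict, forward_env: list) -> dict:
    """Combine literal env vars with locally forwarded ones.
    Forwarded variables are skipped when unset locally, so a missing
    WANDB_API_KEY simply isn't injected rather than becoming empty."""
    resolved = dict(env)
    for name in forward_env:
        value = os.environ.get(name)
        if value is not None:
            resolved[name] = value
    return resolved
```

This keeps secrets out of .chamber.yaml: the file names only the variable, and the value never leaves your shell until submission.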