This guide covers common issues with the Chamber agent and how to resolve them.

Quick Checks

# Check agent pod status
kubectl get pods -l app.kubernetes.io/name=chamber-agent

# View recent logs
kubectl logs -l app.kubernetes.io/name=chamber-agent --tail=50
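If the pod is not in a Running state, describing it surfaces recent events such as scheduling failures and image-pull errors:
# Show pod details and recent events
kubectl describe pods -l app.kubernetes.io/name=chamber-agent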

Cluster Not Appearing in Dashboard

Symptoms: Agent pod is running but cluster doesn’t appear in Chamber
Verify the token is correct and not expired:
kubectl logs -l app.kubernetes.io/name=chamber-agent | grep -i "token\|auth"
If you see authentication errors, generate a new token from Settings > Cluster Tokens in the Chamber dashboard.
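If your install passes the token as a Helm value, you can rotate it in place. The value name token below is an assumption; confirm the actual key with helm get values chamber-agent:
# "token" is a hypothetical value name; confirm it with: helm get values chamber-agent
helm upgrade chamber-agent oci://public.ecr.aws/q4a1a5s3/chamber-agent-chart \
  --reuse-values --set token=<new-token>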
Next, confirm the agent has outbound HTTPS access to Chamber by checking the logs for connection errors:
kubectl logs -l app.kubernetes.io/name=chamber-agent | grep -i "connect\|websocket"
If behind a corporate firewall, you may need to configure proxy settings.
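You can also test outbound HTTPS from inside the cluster with a one-off pod; the Chamber endpoint below is a placeholder, so substitute the URL your agent is configured to reach:
# One-off curl pod; replace https://app.chamber.io with your Chamber endpoint
kubectl run chamber-connectivity-test --rm -it --restart=Never \
  --image=curlimages/curl -- curl -sv https://app.chamber.io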
Verify the cluster name was set correctly during installation:
kubectl logs -l app.kubernetes.io/name=chamber-agent | grep -i "cluster"
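To see the values the release was installed with, including the cluster name, inspect the Helm release (this assumes the release is named chamber-agent, as in the upgrade command below):
# Show user-supplied values for the release
helm get values chamber-agent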

GPUs Not Detected

Symptoms: Cluster appears but shows 0 GPUs
The NVIDIA device plugin must be running:
kubectl get pods -n kube-system -l name=nvidia-device-plugin-ds
If not running, install it from NVIDIA’s documentation.
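One common installation route is NVIDIA's Helm chart; pin the version NVIDIA's documentation currently recommends:
# Install the NVIDIA device plugin via Helm (check NVIDIA's docs for the recommended version)
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvdp nvdp/nvidia-device-plugin -n kube-system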
Check that nodes report GPU resources:
kubectl get nodes -o custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.nvidia\\.com/gpu
Nodes should show a GPU count; if a node shows <none>, the NVIDIA drivers or device plugin may not be configured correctly on it.
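To confirm GPUs are actually schedulable rather than just advertised, a throwaway pod that requests one GPU and runs nvidia-smi makes a quick smoke test (the CUDA image tag is illustrative):
# Smoke test: request one GPU and run nvidia-smi
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

# Once the pod completes, check the output and clean up
kubectl logs gpu-smoke-test
kubectl delete pod gpu-smoke-test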

Jobs Not Tracked

Symptoms: Jobs run but don’t appear in Chamber
Jobs must have the team label to be tracked:
metadata:
  labels:
    chamber.io/team: your-team-slug
If you configured watchNamespaces, verify your job’s namespace is included. By default, the agent watches all namespaces.
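For reference, a minimal Job manifest with the label in place (all names are placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
  labels:
    chamber.io/team: your-team-slug
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: main
          image: your-image:latest
You can list jobs along with their team label to spot untracked ones:
kubectl get jobs --all-namespaces -L chamber.io/team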

Agent Not Starting

Symptoms: Pod in CrashLoopBackOff or Error state
# Check logs from crashed pod
kubectl logs -l app.kubernetes.io/name=chamber-agent --previous
| Error | Solution |
| --- | --- |
| Token/auth errors | Generate a new token from the Chamber dashboard |
| Connection errors | Check that the firewall allows outbound HTTPS |
| Permission errors | Reinstall the agent to fix RBAC |
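For permission errors, you can confirm the RBAC gap before reinstalling by impersonating the agent's service account; the namespace and account name below are assumptions, so adjust them to your install:
# Check whether the agent's service account can list jobs cluster-wide
kubectl auth can-i list jobs --all-namespaces \
  --as=system:serviceaccount:default:chamber-agent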

Proxy Configuration

If your cluster uses an HTTP proxy:
# values.yaml
env:
  - name: HTTPS_PROXY
    value: "http://proxy.example.com:8080"
  - name: NO_PROXY
    value: "10.0.0.0/8,172.16.0.0/12,.cluster.local"
Then upgrade the agent:
helm upgrade chamber-agent oci://public.ecr.aws/q4a1a5s3/chamber-agent-chart -f values.yaml
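After the upgrade, confirm the variables landed on the agent's deployment (the deployment name assumes a release called chamber-agent):
# List environment variables set on the agent deployment
kubectl set env deployment/chamber-agent --list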

Getting Help

If you’re still having issues:
  1. Collect logs: kubectl logs -l app.kubernetes.io/name=chamber-agent > agent-logs.txt
  2. Contact support with logs and your agent version
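For step 2, one way to find the agent version is the image tag on the running pod:
# Print the agent container image; the version is the tag
kubectl get pods -l app.kubernetes.io/name=chamber-agent \
  -o jsonpath='{.items[0].spec.containers[0].image}'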