The Standalone Agent is in beta. OTLP integration behavior may change.
The Standalone Agent includes a built-in OpenTelemetry (OTLP) gRPC receiver. Any application instrumented with the OpenTelemetry SDK can send metrics directly to the agent, which forwards them to the Chamber dashboard alongside GPU and host metrics. This lets you correlate application-level metrics (training loss, throughput, queue depth, request latency) with infrastructure metrics in a single view.

How It Works

  1. Your application exports OTLP metrics over gRPC to localhost:4317
  2. The agent receives, transforms, and buffers the metrics
  3. Metrics are uploaded to Chamber in batches every 60 seconds
  4. They appear on the Services dashboard alongside infrastructure metrics
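The receive/buffer/upload steps above can be sketched roughly as follows. `MetricBuffer` and its method names are hypothetical illustrations of the documented behavior, not the agent's actual internals; the drop-when-full behavior mirrors the `CHAMBER_OTLP_METRICS_QUEUE_SIZE` limit.

```python
import queue

class MetricBuffer:
    """Hypothetical sketch of the agent's bounded metric batch queue."""

    def __init__(self, max_batches=10000):
        # Bounded queue: new batches are dropped once it is full
        self.q = queue.Queue(maxsize=max_batches)
        self.dropped = 0

    def enqueue(self, batch):
        """Called when an OTLP export arrives; returns False if dropped."""
        try:
            self.q.put_nowait(batch)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Called on each upload tick (every 60 seconds) to collect pending batches."""
        batches = []
        while True:
            try:
                batches.append(self.q.get_nowait())
            except queue.Empty:
                return batches
```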

Configuration

The OTLP receiver is enabled by default on port 4317. No additional configuration is required for most setups.
| Variable | Default | Description |
| --- | --- | --- |
| CHAMBER_OTLP_GRPC_ENABLED | true | Enable or disable the OTLP gRPC receiver |
| CHAMBER_OTLP_GRPC_PORT | 4317 | Port the gRPC server listens on |
| CHAMBER_OTLP_GRPC_HOST | 0.0.0.0 | Bind address for the gRPC server |
| CHAMBER_OTLP_METRICS_QUEUE_SIZE | 10000 | Maximum queued metric batches before dropping |
To disable the receiver, set CHAMBER_OTLP_GRPC_ENABLED=false in /etc/chamber/agent.env and restart the agent.
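For example, assuming the agent runs as the `chamber-agent-standalone` systemd unit (adjust the unit name for your install):

```shell
# Append the override and restart the agent
echo 'CHAMBER_OTLP_GRPC_ENABLED=false' | sudo tee -a /etc/chamber/agent.env
sudo systemctl restart chamber-agent-standalone
```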

Sending Metrics from Your Application

Point your application’s OTLP exporter to the agent’s gRPC endpoint. Here is an example using the Python OpenTelemetry SDK:
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.resources import Resource

# Configure exporter to send to the local Chamber agent
exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True,  # Local connection, no TLS needed
)

reader = PeriodicExportingMetricReader(exporter, export_interval_millis=30000)
provider = MeterProvider(
    resource=Resource.create({"service.name": "my-training-job"}),
    metric_readers=[reader],
)
metrics.set_meter_provider(provider)

meter = metrics.get_meter("my-training-job")

# Create metrics
loss_gauge = meter.create_gauge("training.loss", description="Current training loss")
throughput_counter = meter.create_counter(
    "training.samples.processed",
    unit="1",
    description="Total training samples processed",
)

# Record metrics in your training loop
batch_size = 64  # example value; use your actual batch size
loss_gauge.set(0.42)
throughput_counter.add(batch_size)
Set service.name in your resource attributes — it appears as the service name label on all metrics in the Chamber dashboard.

Metric Name Transformation

The agent transforms OTLP metric names to a Prometheus-compatible format before uploading:
| Transformation | Example |
| --- | --- |
| Dots and hyphens become underscores | training.loss → training_loss |
| Unit suffixes are appended | request_duration (unit: s) → request_duration_seconds |
| Monotonic sums get _total suffix | samples.processed (sum, monotonic) → samples_processed_total |
| Consecutive underscores are collapsed | my__metric → my_metric |

Unit Suffix Mapping

| OTel Unit | Suffix |
| --- | --- |
| s | _seconds |
| ms | _milliseconds |
| By | _bytes |
| KBy | _kilobytes |
| MBy | _megabytes |
| GBy | _gigabytes |
| 1 | _ratio |
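Taken together, the rules in the two tables above behave roughly like this hypothetical re-implementation. The function and argument names are illustrative only, not the agent's actual code:

```python
import re

def to_prometheus_name(name, unit=None, monotonic_sum=False):
    """Sketch of the documented OTLP -> Prometheus name transform."""
    # Dots and hyphens become underscores
    n = re.sub(r"[.\-]", "_", name)
    # Unit suffixes are appended (subset shown; see the unit table above)
    suffixes = {
        "s": "_seconds", "ms": "_milliseconds", "By": "_bytes",
        "KBy": "_kilobytes", "MBy": "_megabytes", "GBy": "_gigabytes",
        "1": "_ratio",
    }
    if unit in suffixes and not n.endswith(suffixes[unit]):
        n += suffixes[unit]
    # Monotonic sums get a _total suffix
    if monotonic_sum and not n.endswith("_total"):
        n += "_total"
    # Consecutive underscores are collapsed
    return re.sub(r"__+", "_", n)
```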

Histogram Support

OTLP histograms are decomposed into Prometheus-style metrics:
| Generated Metric | Description |
| --- | --- |
| {name}_bucket | Cumulative bucket counts with le (less-than-or-equal) labels |
| {name}_count | Total number of observations |
| {name}_sum | Sum of all observed values |
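As a concrete, hypothetical illustration: a histogram exported as request.duration with unit s and bucket bounds [0.1, 0.5, 1.0] would be decomposed along these lines (the counts and sum here are made-up sample data):

```python
# One histogram data point: per-bucket counts, with a final overflow
# bucket for observations above the last bound (the +Inf bucket).
bounds = [0.1, 0.5, 1.0]
counts = [3, 5, 1, 0]
observed_sum = 2.7

# _bucket series are cumulative, so accumulate the per-bucket counts
cumulative, running = [], 0
for c in counts:
    running += c
    cumulative.append(running)

series = {}
for le, c in zip(bounds + [float("inf")], cumulative):
    series[f'request_duration_seconds_bucket{{le="{le}"}}'] = c
series["request_duration_seconds_count"] = cumulative[-1]  # total observations
series["request_duration_seconds_sum"] = observed_sum      # sum of observed values
```

Note that the name transformation has already been applied here: request.duration with unit s becomes request_duration_seconds before the histogram suffixes are added.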

Supported Metric Types

| OTel Type | Supported |
| --- | --- |
| Gauge | Yes |
| Sum (monotonic and non-monotonic) | Yes |
| Histogram | Yes |
| ExponentialHistogram | No |
| Summary | No |

Verifying Metrics Are Flowing

After configuring your application, check the agent logs:
sudo journalctl -u chamber-agent-standalone -f | grep -i otlp
You should see log lines indicating metrics received. Then check the Services tab in the dashboard — your application metrics will appear with the service_name label you configured.

Next Steps

- Metrics Reference: full list of infrastructure metrics collected
- Configuration: all environment variables and settings