
Telemetry

Prover Nodes run a local OpenTelemetry Collector sidecar that collects metrics and traces from the prover node process, scrapes host-level metrics, and forwards everything to the Fermah Gateway via authenticated gRPC.

The Datadog Agent has been fully deprecated as of April 17, 2026. All telemetry is now handled through the OTel Collector. If you are still running a Datadog Agent, see Migrating from Datadog Agent below.

Installation

Telemetry is installed through the Fermah install script. During installation, select the Telemetry option to automatically set up the OTel Collector.

The installer creates the collector configuration, Docker Compose file, and .env file with the required environment variables.

You can re-run the install script at any time and select only the Telemetry step to update or reinstall the collector.

Prerequisites

The OTel Collector requires root Docker (not rootless) to access host metrics through /proc, /sys, and the host filesystem.

Your server needs both Docker installations:

| Docker Mode | Used By |
| --- | --- |
| Root (privileged) | Telemetry collector (host metrics) |
| Rootless | Prover containers |

Both are set up during the server preparation step.
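To confirm which daemon a given Docker command talks to, one option is to inspect the security options reported by `docker info`. The sketch below assumes that a rootless daemon lists `name=rootless` among its security options; the helper name `is_rootless` is hypothetical and not part of the Fermah tooling.

```shell
# Succeeds when a SecurityOptions string from `docker info` names the rootless option.
# Assumption: rootless daemons report "name=rootless" in this field.
is_rootless() {
    case "$1" in
        *name=rootless*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (needs Docker, so it is left commented out):
#   opts=$(sudo docker info --format '{{.SecurityOptions}}')
#   if is_rootless "$opts"; then
#       echo "rootless daemon: host metrics will not be reachable"
#   else
#       echo "root daemon: suitable for the collector"
#   fi
```

Running the example with `sudo docker info` should describe the root daemon used by the collector, while running it without `sudo` as the prover user should describe the rootless one, assuming the server preparation step configured both.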

Architecture

The collector receives traces and metrics from the prover node over localhost:4317, scrapes host-level metrics (CPU, memory, disk, network), and forwards everything to the Fermah Gateway.

Prover Node ──▶ OTel Collector (localhost) ──▶ Fermah Gateway
                        ▲
                        │
        Host Metrics (CPU, memory, disk, network)

Logs are not collected by the telemetry pipeline. Only metrics and traces are forwarded to the gateway.

Environment Variables

The collector uses three environment variables, which the installer writes automatically to a .env file:

| Variable | Description |
| --- | --- |
| FERMAH_OTEL_TOKEN | Bearer token for authenticating with the Fermah Gateway |
| FERMAH_GATEWAY_ENDPOINT | Gateway gRPC endpoint (e.g. telemetry.fermah.xyz:4317) |
| FERMAH_OPERATOR_HOST | Friendly name for this operator (e.g. fermah-cp-1a2b). Shows as fermah.operator_host tag. Defaults to OS hostname if unset. |
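If the collector fails to authenticate or reports an unexpected host name, a quick check of the .env file can help. The following is a minimal sketch, not part of the installer: the function name `missing_env_vars` is hypothetical, and the path to the .env file is passed in because its location depends on where the installer placed the Compose files.

```shell
# Print the name of each expected variable that has no non-empty value in an env file.
# Usage: missing_env_vars path/to/.env
missing_env_vars() {
    env_file=$1
    for name in FERMAH_OTEL_TOKEN FERMAH_GATEWAY_ENDPOINT FERMAH_OPERATOR_HOST; do
        # Matches lines of the form NAME=value with at least one character after '='.
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            printf '%s\n' "$name"
        fi
    done
}
```

An empty result means all three variables are present. A missing FERMAH_OPERATOR_HOST is not necessarily an error, since it falls back to the OS hostname.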

Running Telemetry

The collector runs as a root Docker container. A shared fermah-net network is required for connectivity:

docker network create fermah-net

Start the collector:

sudo docker compose -p fermah-telemetry up -d

Verify it is running:

sudo docker compose -p fermah-telemetry logs
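For a scripted check (for example in a cron job), the container state can be queried with `docker inspect` instead of reading logs. This is a sketch under assumptions: it relies on the container name fermah-otel-collector from the Compose file, and the `DOCKER_CMD` variable and `collector_running` function are hypothetical names introduced here so the Docker command can be swapped out.

```shell
# Return success if the named container reports State.Running = true.
# DOCKER_CMD is an assumption of this sketch; it defaults to "sudo docker".
collector_running() {
    name=${1:-fermah-otel-collector}
    state=$(${DOCKER_CMD:-sudo docker} inspect --format '{{.State.Running}}' "$name" 2>/dev/null)
    [ "$state" = "true" ]
}

# Example (needs Docker, so it is left commented out):
#   collector_running && echo "collector is up" || echo "collector is not running"
```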

Collector Reference

Below is the full collector configuration and Docker Compose file for reference. These are managed automatically by the installer.

docker-compose.yml

services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.150.1
    container_name: fermah-otel-collector
    restart: unless-stopped
    pid: host
    env_file: .env
    environment:
      - FERMAH_OTEL_TOKEN=${FERMAH_OTEL_TOKEN}
      - FERMAH_GATEWAY_ENDPOINT=${FERMAH_GATEWAY_ENDPOINT}
      - FERMAH_OPERATOR_HOST=${FERMAH_OPERATOR_HOST}
    volumes:
      - ./otel-collector.yml:/etc/otelcol-contrib/config.yaml:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/hostfs:ro
    ports:
      - "4317:4317"
      - "4318:4318"
    networks:
      - fermah-net

networks:
  fermah-net:
    external: true

otel-collector.yml

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  hostmetrics:
    collection_interval: 10s
    root_path: /hostfs
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
          system.cpu.physical.count:
            enabled: true
          system.cpu.logical.count:
            enabled: true
      disk:
      filesystem:
        metrics:
          system.filesystem.utilization:
            enabled: true
        exclude_mount_points:
          mount_points: ["/snap/*", "/boot", "/hostfs/snap/*", "/hostfs/boot"]
          match_type: regexp
        exclude_fs_types:
          fs_types: [squashfs, tmpfs, devtmpfs, sysfs, proc]
          match_type: strict
      load:
      memory:
      network:
      paging:
        metrics:
          system.paging.utilization:
            enabled: true
      processes:
  prometheus:
    config:
      scrape_configs:
        - job_name: otelcol
          scrape_interval: 10s
          static_configs:
            - targets: ["0.0.0.0:8888"]

processors:
  batch:
    send_batch_max_size: 1000
    send_batch_size: 100
    timeout: 10s
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  resourcedetection:
    detectors: [env, system]
    system:
      resource_attributes:
        os.description:
          enabled: true
        host.arch:
          enabled: true
        host.cpu.vendor.id:
          enabled: true
        host.cpu.family:
          enabled: true
        host.cpu.model.id:
          enabled: true
        host.cpu.model.name:
          enabled: true
        host.cpu.stepping:
          enabled: true
        host.cpu.cache.l2.size:
          enabled: true
  resource/operator_host:
    attributes:
      - key: host.name
        value: "${FERMAH_OPERATOR_HOST}"
        action: upsert
      - key: fermah.operator_host
        value: "${FERMAH_OPERATOR_HOST}"
        action: upsert

exporters:
  otlp_grpc:
    endpoint: "${FERMAH_GATEWAY_ENDPOINT}"
    headers:
      authorization: "Bearer ${FERMAH_OTEL_TOKEN}"
    tls:
      insecure: true
    sending_queue:
      enabled: true
      queue_size: 1000
    retry_on_failure:
      enabled: true

connectors:
  datadog/connector:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resourcedetection, resource/operator_host, batch]
      exporters: [datadog/connector, otlp_grpc]
    metrics:
      receivers: [datadog/connector, otlp, hostmetrics, prometheus]
      processors: [memory_limiter, resourcedetection, resource/operator_host, batch]
      exporters: [otlp_grpc]

Prover Node Configuration

The telemetry settings live in ~/.fermah/config/prover-node-config.toml:

[telemetry]
mode = "Otlp"
layers = "metrics,traces"
level = "info"
filters = []
interval = 30
temporality = "Cumulative"

Make sure layers is set to "metrics,traces" (without logs) in both the [telemetry] and [zksync.telemetry] sections. The collector does not forward logs to the gateway, so exporting them would generate unnecessary overhead. If your config still has "logs,metrics,traces", remove the logs entry.
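To spot a leftover logs entry without opening the file, a simple grep over the `layers` lines is usually enough. The sketch below is an illustration only: the function name `layers_with_logs` is hypothetical, and the pattern is a plain text match rather than a TOML parser, so it would also flag a `layers` line that mentions logs only in a trailing comment.

```shell
# Print every `layers = ...` line (with its line number) that still mentions logs.
# Exit status 0 means at least one such line was found; non-zero means none.
layers_with_logs() {
    config_file=$1
    grep -nE '^[[:space:]]*layers[[:space:]]*=.*logs' "$config_file"
}

# Example:
#   layers_with_logs ~/.fermah/config/prover-node-config.toml \
#       && echo "remove the logs entry from the lines above"
```

Because both [telemetry] and [zksync.telemetry] use the same key name, a single pass covers both sections.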

Custom Forwarding

Since every operator runs a local OTel Collector, you can easily add your own backends as additional export destinations. See Custom Export for details.


Migrating from Datadog Agent

If you are still running the Datadog Agent, bring it down first:

docker ps                     # find the Datadog Agent container ID
docker rm -f <container_id>   # remove the Datadog Agent container

Then re-run the install script and select only the Telemetry step.
