Multi-GPU Setup
Machines with multiple GPUs can run one Prover Node instance per GPU. Each instance runs in its own container with a dedicated Fermah home directory and its own machine secret.
Each instance must have a unique machine secret. Sharing the same secret across instances will cause registration conflicts.
Configuration
Your prover-node-config.toml must include an entry for each GPU on the machine:
[[hardware.gpus]]
price = "117"
resource.gpuId = "unknown-gpu-0"
[[hardware.gpus.resource.specs]]
VRAM = 25769803776
[[hardware.gpus]]
price = "117"
resource.gpuId = "unknown-gpu-1"
[[hardware.gpus.resource.specs]]
VRAM = 25769803776

If your machine has more than two GPUs, add additional [[hardware.gpus]] entries following the same pattern, incrementing the gpuId index.
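The VRAM value is in bytes (25769803776 bytes = 24 GiB). For machines with many identical GPUs, a small shell loop can generate the entries instead of writing them by hand. This is a sketch under the assumption that every GPU shares the same price and VRAM; `gen_gpu_entries` is my own helper name, not part of the prover node tooling:

```shell
# gen_gpu_entries NUM_GPUS PRICE VRAM_BYTES
# Prints one [[hardware.gpus]] TOML entry per GPU, with gpuId
# indices unknown-gpu-0 .. unknown-gpu-(NUM_GPUS-1).
gen_gpu_entries() {
    num=$1; price=$2; vram=$3
    i=0
    while [ "$i" -lt "$num" ]; do
        cat <<EOF
[[hardware.gpus]]
price = "$price"
resource.gpuId = "unknown-gpu-$i"
[[hardware.gpus.resource.specs]]
VRAM = $vram

EOF
        i=$((i + 1))
    done
}

# Two GPUs, price 117, 24 GiB of VRAM each:
gen_gpu_entries 2 117 25769803776
```

Append the output to your prover-node-config.toml and review it before starting the instances.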
Container Image
Build a minimal container image for the prover node:
FROM nvidia/cuda:12.9.1-runtime-ubuntu24.04
RUN apt-get update && \
apt-get install -y --no-install-recommends ca-certificates tini && \
rm -rf /var/lib/apt/lists/*
ENV HOME=/root
WORKDIR /root
ENTRYPOINT ["/usr/bin/tini", "--", "/root/.fermah/bin/fpn"]

docker build -t fermah-fpn:24.04 .

Replace the base image with nvidia/cuda:12.2.0-runtime-ubuntu22.04 if you are running CUDA 12.2.
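The base image's CUDA runtime should not be newer than what the host driver supports (shown as "CUDA Version" in the nvidia-smi header). As a sketch, a helper that maps the two versions named in this guide to their tags; `cuda_base_image` is my own name, and any other version needs its own mapping:

```shell
# Pick a CUDA base image tag for a given host CUDA version.
# Only the two versions used in this guide are mapped; extend as needed.
cuda_base_image() {
    case "$1" in
        12.9*) echo "nvidia/cuda:12.9.1-runtime-ubuntu24.04" ;;
        12.2*) echo "nvidia/cuda:12.2.0-runtime-ubuntu22.04" ;;
        *)     echo "unmapped CUDA version: $1" >&2; return 1 ;;
    esac
}

cuda_base_image 12.2   # → nvidia/cuda:12.2.0-runtime-ubuntu22.04
```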
Preparing Home Directories
Each GPU instance needs its own Fermah home directory with a unique machine secret. Generate a new secret before copying each directory:
# GPU 0 — generate the first machine secret
prover-node gen-machine-secret
# Copy the home directory for GPU 1, then generate a new secret
cp -r ~/.fermah ~/.fermah-gpu1
prover-node gen-machine-secret
# The base ~/.fermah now has a new secret (GPU 1)
# Swap so GPU 0 keeps the original
mv ~/.fermah ~/.fermah-tmp
mv ~/.fermah-gpu1 ~/.fermah
mv ~/.fermah-tmp ~/.fermah-gpu1

For additional GPUs, repeat the same pattern: copy the directory, generate a new secret, then swap so the base ~/.fermah keeps its original secret.
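The copy/generate/swap dance above can be scripted. This is a sketch, not part of the prover node tooling: `prepare_gpu_homes` is my own name, and it assumes the secret-generation command always writes into the base directory (as prover-node gen-machine-secret does for ~/.fermah), so the command is passed in as a parameter:

```shell
# prepare_gpu_homes NUM_GPUS BASE_DIR GEN_CMD
# Creates BASE_DIR-gpu1 .. BASE_DIR-gpu(N-1), each with a fresh machine
# secret, while BASE_DIR (GPU 0) keeps its original secret.
# GEN_CMD is whatever command writes a new secret into BASE_DIR,
# e.g. "prover-node gen-machine-secret".
prepare_gpu_homes() {
    num_gpus=$1
    base=$2
    gen_cmd=$3

    n=1
    while [ "$n" -lt "$num_gpus" ]; do
        cp -r "$base" "$base-gpu$n"   # copy still holds the old secret
        $gen_cmd                      # base now holds a fresh secret
        # Swap so the per-GPU copy keeps the fresh secret
        # and base keeps the original one.
        mv "$base" "$base-tmp"
        mv "$base-gpu$n" "$base"
        mv "$base-tmp" "$base-gpu$n"
        n=$((n + 1))
    done
}

# Example: prepare_gpu_homes 4 "$HOME/.fermah" "prover-node gen-machine-secret"
```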
After creating each home directory, you must register each instance independently.
Docker Compose
Docker Compose is the recommended approach for multi-GPU setups. Use the device_ids field to pin each service to a specific GPU.
This requires the NVIDIA Container Toolkit to be installed and configured. See Install CUDA containers toolkit.
services:
  fpn-gpu0:
    image: fermah-fpn:24.04
    restart: unless-stopped
    environment:
      NVIDIA_VISIBLE_DEVICES: 0
    volumes:
      - /home/fermah/.fermah:/root/.fermah:rw
      - /var/log/fermah-gpu0:/var/log/fermah:rw
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
    networks:
      - fermah-net

  fpn-gpu1:
    image: fermah-fpn:24.04
    restart: unless-stopped
    environment:
      NVIDIA_VISIBLE_DEVICES: 1
    volumes:
      - /home/fermah/.fermah-gpu1:/root/.fermah:rw
      - /var/log/fermah-gpu1:/var/log/fermah:rw
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1']
              capabilities: [gpu]
    networks:
      - fermah-net

networks:
  fermah-net:
    external: true

Create the network and start:
docker network create fermah-net
docker compose up -d

For additional GPUs, add more services following the same pattern, incrementing device_ids, NVIDIA_VISIBLE_DEVICES, and the host volume paths.
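That pattern can also be generated rather than copy-pasted. A sketch, assuming the home directory naming used above (/home/fermah/.fermah for GPU 0, /home/fermah/.fermah-gpuN otherwise); `compose_service` is my own helper name:

```shell
# Emit a compose service definition for GPU N (fpn-gpuN),
# matching the layout shown above.
compose_service() {
    n=$1
    if [ "$n" -eq 0 ]; then
        home=/home/fermah/.fermah
    else
        home=/home/fermah/.fermah-gpu$n
    fi
    cat <<EOF
  fpn-gpu$n:
    image: fermah-fpn:24.04
    restart: unless-stopped
    environment:
      NVIDIA_VISIBLE_DEVICES: $n
    volumes:
      - $home:/root/.fermah:rw
      - /var/log/fermah-gpu$n:/var/log/fermah:rw
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['$n']
              capabilities: [gpu]
    networks:
      - fermah-net
EOF
}

# Print the full services section for a 4-GPU machine:
echo "services:"
for n in 0 1 2 3; do compose_service "$n"; done
```

Redirect the output into your compose file and append the networks section by hand.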
Systemd Alternative
If you prefer systemd, create one service per GPU. The key differences between services are:
- Container name: unique per instance (e.g. fermah-gpu0, fermah-gpu1)
- GPU device: pinned via --gpus '"device=N"' and NVIDIA_VISIBLE_DEVICES
- Host directory: each instance mounts its own Fermah home to /root/.fermah inside the container
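Since the per-GPU unit files below differ only in those three places, machines with many GPUs could instead use a single systemd template unit. This is a sketch, not from the Fermah docs: it uses %i as the GPU index (started as fermah-gpu@0.service, fermah-gpu@1.service, and so on) and assumes every instance's home directory is named /root/.fermah-gpuN, including GPU 0, since %i cannot express a special case:

```ini
# /etc/systemd/system/fermah-gpu@.service (sketch)
# Start with: systemctl start fermah-gpu@0 fermah-gpu@1
[Unit]
Description=Fermah Prover Node - GPU %i
After=network.target docker.service
Requires=docker.service

[Service]
Type=simple
User=root
ExecStartPre=/bin/mkdir -p /var/log/fermah-gpu%i
ExecStart=/usr/bin/docker run --rm --name fermah-gpu%i \
  --gpus '"device=%i"' \
  --network host \
  -e HOME=/root \
  -e NVIDIA_VISIBLE_DEVICES=%i \
  -v /root/.fermah-gpu%i:/root/.fermah \
  -v /var/log/fermah-gpu%i:/var/log/fermah \
  -v /var/run/docker.sock:/var/run/docker.sock \
  fermah-fpn:24.04
ExecStop=/usr/bin/docker stop -t 20 fermah-gpu%i
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

If you prefer explicit per-GPU units, use the files below as written.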
GPU 0
Create /etc/systemd/system/fermah-gpu0.service:
[Unit]
Description=Fermah Prover Node - GPU 0
After=network.target docker.service
Requires=docker.service
[Service]
Type=simple
User=root
ExecStartPre=/bin/mkdir -p /var/log/fermah-gpu0
ExecStartPre=/bin/chmod 755 /var/log/fermah-gpu0
ExecStart=/usr/bin/docker run --rm --name fermah-gpu0 \
--gpus '"device=0"' \
--network host \
-e HOME=/root \
-e NVIDIA_VISIBLE_DEVICES=0 \
-v /root/.fermah:/root/.fermah \
-v /var/log/fermah-gpu0:/var/log/fermah \
-v /var/run/docker.sock:/var/run/docker.sock \
fermah-fpn:24.04
ExecStop=/usr/bin/docker stop -t 20 fermah-gpu0
Restart=on-failure
RestartSec=5
TimeoutStopSec=20
LimitNOFILE=65535
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target

GPU 1
Create /etc/systemd/system/fermah-gpu1.service:
[Unit]
Description=Fermah Prover Node - GPU 1
After=network.target docker.service
Requires=docker.service
[Service]
Type=simple
User=root
ExecStartPre=/bin/mkdir -p /var/log/fermah-gpu1
ExecStartPre=/bin/chmod 755 /var/log/fermah-gpu1
ExecStart=/usr/bin/docker run --rm --name fermah-gpu1 \
--gpus '"device=1"' \
--network host \
-e HOME=/root \
-e NVIDIA_VISIBLE_DEVICES=1 \
-v /root/.fermah-gpu1:/root/.fermah \
-v /var/log/fermah-gpu1:/var/log/fermah \
-v /var/run/docker.sock:/var/run/docker.sock \
fermah-fpn:24.04
ExecStop=/usr/bin/docker stop -t 20 fermah-gpu1
Restart=on-failure
RestartSec=5
TimeoutStopSec=20
LimitNOFILE=65535
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable fermah-gpu0.service fermah-gpu1.service
sudo systemctl start fermah-gpu0.service fermah-gpu1.service

Monitoring
Each instance writes logs to its own directory (/var/log/fermah-gpu0, /var/log/fermah-gpu1, etc.):
# Docker Compose
docker compose logs fpn-gpu0 -f
docker compose logs fpn-gpu1 -f
# Systemd
journalctl -u fermah-gpu0.service -f
journalctl -u fermah-gpu1.service -f