Resilience (Multi-Node)

A solo pool is only as available as the Bitcoin Core node behind it. A single bitcoind is a single point of failure: if it crashes, falls behind, or is restarted for an upgrade, the pool can't build templates — and worst of all, a node that's down at the exact second you solve a block can cost you the reward. dvb-WarpPool lets you run one or more backup nodes so a primary outage becomes a non-event.

Add a backup node

Backup nodes live in config.toml. The [node] block is your primary; add any number of [[node.backup]] entries below it:

[node]
rpc_url = "http://127.0.0.1:8332"
zmq_hashblock_addr = "tcp://127.0.0.1:28332"

[[node.backup]]
rpc_url = "http://192.168.1.50:8332"
zmq_hashblock_addr = "tcp://192.168.1.50:28332"   # "" = no ZMQ from this node (poll-only)
# rpc_cookie_path = "/mnt/node-b/.bitcoin/.cookie" # optional — see auth below

Authentication is resolved per backup, in this order:

  1. rpc_cookie_path — a local second node with its own data dir.
  2. user:pass embedded in the URL (http://user:pass@host:8332), for a remote node with its own credentials. These credentials sit in plain text in config.toml, so restrict the file's permissions.
  3. Nothing — the global secrets.rpc_user / rpc_pass are reused (handy when the backup is provisioned identically to the primary).
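The resolution order above can be sketched in a few lines of Python. This is an illustration only, not the daemon's actual code: BackupNode and resolve_auth are hypothetical names, and the sketch assumes the standard Bitcoin Core cookie file layout (a single line of the form __cookie__:<secret>).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit


@dataclass
class BackupNode:
    """Hypothetical in-memory form of one [[node.backup]] entry."""
    rpc_url: str
    rpc_cookie_path: Optional[str] = None


def resolve_auth(node: BackupNode, global_user: str, global_pass: str) -> Tuple[str, str, str]:
    """Return (source, user, password) following the documented order."""
    # 1. A cookie file wins when a path is configured.
    #    Assumption: an unreadable cookie raises here instead of falling through.
    if node.rpc_cookie_path:
        line = Path(node.rpc_cookie_path).read_text().strip()
        user, _, password = line.partition(":")
        return ("cookie", user, password)

    # 2. Credentials embedded in the URL come next.
    parts = urlsplit(node.rpc_url)
    if parts.username is not None:
        return ("url", unquote(parts.username), unquote(parts.password or ""))

    # 3. Otherwise the global secrets are reused.
    return ("global", global_user, global_pass)
```

For example, resolve_auth(BackupNode("http://192.168.1.50:8332"), "u", "p") falls through to the global credentials because neither a cookie path nor URL credentials are present.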

The [node] block is read at startup only — restart the daemon after adding or changing a backup. This briefly interrupts mining; miners reconnect on their own.

The full field reference is in the Configuration Reference.

How failover behaves

  • Sticky, primary-preferred. Calls run against the last working node. On a transport, auth, or warm-up error they hop to the next node within the same call — the job loop never stalls. While running on a backup, about once a minute one call probes the primary first and switches back the moment it answers.
  • submitblock goes to every node in parallel. The block moment is the one instant a dead node can cost real money, so the solved block is broadcast to all nodes at once; the first Accepted wins.
  • ZMQ subscribes to every node that sets zmq_hashblock_addr. Block events are deduplicated by hash, so whichever node reports a block first triggers the template refresh, and the pool always reacts at the speed of the fastest node.
  • Degraded ≠ down. A pool happily running on a backup is degraded, not down: the health snapshot carries a warning, and rpc-down fires only when no node is reachable at all.
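The sticky, primary-preferred behaviour can be modelled roughly as follows. This is a minimal sketch under stated assumptions, not the pool's implementation: FailoverClient and NodeUnavailable are hypothetical names, each node is reduced to a plain callable, the wrap-around order after a failed backup is a guess, and the 60-second probe interval mirrors the "about once a minute" wording above.

```python
import time
from typing import Any, Callable, List, Optional


class NodeUnavailable(Exception):
    """Stands in for a transport, auth, or warm-up error."""


class FailoverClient:
    def __init__(self, nodes: List[Callable[..., Any]],
                 probe_interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = nodes            # index 0 = primary, then backups in config order
        self.active = 0               # last node that answered (sticky)
        self.probe_interval = probe_interval
        self.clock = clock
        self.last_probe = clock()

    def _attempt_order(self) -> List[int]:
        n = len(self.nodes)
        # Start at the sticky node, then hop through the others in order.
        order = [(self.active + i) % n for i in range(n)]
        on_backup = self.active != 0
        if on_backup and self.clock() - self.last_probe >= self.probe_interval:
            # Periodic probe: this one call tries the primary first.
            self.last_probe = self.clock()
            order = [0] + [i for i in order if i != 0]
        return order

    def call(self, method: str, *params: Any) -> Any:
        last_error: Optional[Exception] = None
        for idx in self._attempt_order():
            try:
                result = self.nodes[idx](method, *params)
            except NodeUnavailable as exc:
                last_error = exc      # hop to the next node within the same call
                continue
            if self.active == 0 and idx != 0:
                self.last_probe = self.clock()   # start the probe timer at failover
            self.active = idx
            return result
        raise NodeUnavailable("no node reachable") from last_error
```

Because the hop happens inside call, the caller (the job loop in the description above) only ever sees an error when every node has failed.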

Verified live: primary kill → failover in under a second with the job stream uninterrupted; primary restart → automatic return within ~60 s.
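The parallel submitblock broadcast described above might look like the sketch below. It is an assumption-laden illustration: broadcast_block and the per-node callables are hypothetical, and the only external fact relied on is that Bitcoin Core's submitblock RPC returns null on acceptance and a reason string (such as "duplicate") otherwise.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Optional, Tuple

# Each submitter wraps one node's submitblock call and returns None when the
# block is accepted, or a rejection reason string otherwise.
Submitter = Callable[[str], Optional[str]]


def broadcast_block(submitters: Dict[str, Submitter],
                    block_hex: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Send the block to every node at once.

    Returns (winner, results): winner is the first node whose acceptance
    arrived (None if no node accepted), results maps node name to outcome.
    """
    winner: Optional[str] = None
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(submitters))) as pool:
        futures = {pool.submit(fn, block_hex): name
                   for name, fn in submitters.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                reason = fut.result()
            except Exception as exc:      # a dead node must not block the others
                results[name] = f"error: {exc}"
                continue
            if reason is None:
                results[name] = "accepted"
                if winner is None:
                    winner = name         # the first acceptance wins
            else:
                results[name] = reason
    return winner, results
```

A node that already received the block from a peer may answer "duplicate"; the sketch records that as a non-accepting result rather than an error.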

Notifications

  • node-failover: the active node changed (failover to a backup, or return to the primary). Delivered to sinks that have on_rpc_down enabled.
  • rpc-down: no node is reachable at all.
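How the two events relate to the degraded and down states can be sketched as a small state function. Everything here is hypothetical (the Health enum, evaluate, and using None to mean "no node reachable"); in particular, firing each event only on a state change and staying silent on recovery from a full outage are assumptions, not documented behaviour.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Health(Enum):
    OK = "ok"              # running on the primary
    DEGRADED = "degraded"  # running on a backup
    DOWN = "down"          # no node reachable


def evaluate(prev_active: Optional[int],
             new_active: Optional[int]) -> Tuple[Health, List[str]]:
    """Map a change of active node index (0 = primary) to health and events."""
    events: List[str] = []
    if new_active is None:
        # Assumption: rpc-down fires once, on the transition into the outage.
        if prev_active is not None:
            events.append("rpc-down")
        return Health.DOWN, events
    if prev_active is not None and prev_active != new_active:
        # Covers both failover to a backup and the return to the primary.
        events.append("node-failover")
    health = Health.OK if new_active == 0 else Health.DEGRADED
    return health, events
```

For instance, evaluate(0, 1) yields the degraded state together with a node-failover event, while evaluate(1, None) yields down together with rpc-down.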

See Notifications for sink setup.

Tips

  • Put the backup on separate hardware and power — a backup on the same machine only covers bitcoind crashes, not host failures.
  • A node without ZMQ still works as an RPC failover target (poll-only); set zmq_hashblock_addr = "" to skip ZMQ from a node that doesn't expose it.
  • Order matters only as preference: the primary is always tried first on the periodic probe; backups are tried in the order they appear.
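To make the poll-only tip concrete, here is a rough sketch of how ZMQ endpoints could be selected and how block events from several nodes could be deduplicated by hash. The names zmq_endpoints and BlockDeduper are hypothetical, nodes are plain dicts for illustration, and the bounded history size is an arbitrary choice rather than anything the pool documents.

```python
from collections import OrderedDict
from typing import Dict, List


def zmq_endpoints(nodes: List[Dict[str, str]]) -> List[str]:
    """Collect ZMQ addresses, skipping nodes configured as poll-only ("")."""
    return [n["zmq_hashblock_addr"] for n in nodes
            if n["zmq_hashblock_addr"] != ""]


class BlockDeduper:
    """Remembers recent block hashes so only the first report triggers work."""

    def __init__(self, capacity: int = 64) -> None:
        self.capacity = capacity
        self.seen: "OrderedDict[str, bool]" = OrderedDict()

    def first_sighting(self, block_hash: str) -> bool:
        if block_hash in self.seen:
            return False                   # another node already reported it
        self.seen[block_hash] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # drop the oldest hash
        return True
```

A template refresh would then run only when first_sighting returns True, regardless of which node delivered the event.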

See also