CPU Monitoring

Once you've set up your relays and/or gateways, you may wish to set up automated monitoring. This page describes two major methods of verifying the health and capacity of your strongDM relays and gateways.

Functionality/Liveness Check

The strongDM binary includes a configurable 'liveness' URL that you can use to verify that the relay/gateway is alive and functioning properly. To enable this URL:

  • Docker: Add -e SDM_ORCHESTRATOR_PROBES=:9090 to the invocation. 9090 is the default port; you can replace it with any port.
  • Kubernetes: The liveness check is already enabled in the Kubernetes configuration.
  • Direct configuration: Add the SDM_ORCHESTRATOR_PROBES environment variable when starting the relay/gateway process, setting it to :9090 or whichever port you prefer.

Once configured, you can check http://ip-of-relay:9090/liveness, replacing 9090 with the port you configured in the environment variable. If it returns HTTP status 200, then the relay/gateway is in good health.
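For automated monitoring, the check above could be scripted. The following is a minimal sketch using only the Python standard library, not an official strongDM tool. The function names liveness_url and is_relay_alive, the 5-second timeout, and the decision to treat any connection error as "not alive" are assumptions made for illustration; the /liveness path and port 9090 come from the description above.

```python
import urllib.error
import urllib.request


def liveness_url(host: str, port: int = 9090) -> str:
    """Build the liveness URL for a relay/gateway (hypothetical helper)."""
    return f"http://{host}:{port}/liveness"


def is_relay_alive(host: str, port: int = 9090, timeout: float = 5.0) -> bool:
    """Return True only if the liveness URL answers with HTTP status 200.

    Assumption: any network failure (refused connection, timeout) or any
    non-200 response is reported as "not alive".
    """
    try:
        with urllib.request.urlopen(liveness_url(host, port), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) is a subclass of URLError; timeouts are OSError.
        return False
```

A monitoring job might call is_relay_alive("ip-of-relay", 9090) on a schedule and raise an alert when it returns False, substituting the port set in SDM_ORCHESTRATOR_PROBES.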

Relay/Gateway Capacity

The strongDM binary is carefully designed to use a relatively constant amount of RAM, so its memory utilization should not change significantly through the process lifecycle. Because of this, strongDM recommends watching the CPU load of the underlying machine to assess the need for additional capacity.

To increase gateway/relay capacity, you can either add another gateway or relay or, if you are running in a virtualized environment, add CPUs to the existing machine.

Load Average

The strongDM binary will use all available CPUs. If more than 50% of your CPU cores are consistently saturated, this is a good indication that it is time to scale up.
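One way to turn this rule of thumb into an automated check is to compare the system load average with the number of cores. The sketch below assumes that "load average divided by core count" is an acceptable approximation of the fraction of saturated cores, and that the 15-minute load average is a reasonable stand-in for "consistently"; both are illustrative assumptions, and the function names are hypothetical. os.getloadavg() is only available on Unix-like systems.

```python
import os


def saturated_fraction(load_average: float, cpu_count: int) -> float:
    """Approximate the fraction of busy cores as load average / core count."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def should_scale_up(load_average: float, cpu_count: int, threshold: float = 0.5) -> bool:
    """Return True when the approximate busy fraction exceeds the threshold.

    The default threshold of 0.5 mirrors the 50% guideline in the text.
    """
    return saturated_fraction(load_average, cpu_count) > threshold


def local_host_should_scale_up() -> bool:
    """Apply the check to the machine this script runs on (Unix only)."""
    # getloadavg() returns the 1-, 5-, and 15-minute averages; use the last one.
    fifteen_minute_load = os.getloadavg()[2]
    # cpu_count() can return None when the count is undetermined; assume 1 core then.
    return should_scale_up(fifteen_minute_load, os.cpu_count() or 1)
```

For example, a 15-minute load average of 5.0 on an 8-core host gives a fraction of 0.625, which exceeds the 50% guideline, while a load of 4.0 on the same host sits exactly at the threshold and does not trigger.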

CPU Time of sdm Process

If the CPU time of the sdm process is increasing faster than real time (for instance, if it uses 30 hours of CPU time in 15 hours of real time), this is another indication that it is time to scale up capacity.
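The comparison can be expressed as a ratio of CPU time consumed to real time elapsed between two samples. The following sketch assumes you already collect the cumulative CPU time of the sdm process from your own tooling (for example, a process monitor); how the samples are gathered is not covered here, and the function names are hypothetical.

```python
def cpu_to_real_time_ratio(
    cpu_seconds_start: float,
    cpu_seconds_end: float,
    real_seconds_start: float,
    real_seconds_end: float,
) -> float:
    """Return CPU seconds consumed per second of real time between two samples."""
    elapsed = real_seconds_end - real_seconds_start
    if elapsed <= 0:
        raise ValueError("the second sample must be taken after the first")
    return (cpu_seconds_end - cpu_seconds_start) / elapsed


def cpu_time_outpaces_real_time(ratio: float) -> bool:
    """True when CPU time grows faster than real time (ratio above 1.0)."""
    return ratio > 1.0


# Worked example from the text: 30 hours of CPU time in 15 hours of real time.
HOUR = 3600
example_ratio = cpu_to_real_time_ratio(0, 30 * HOUR, 0, 15 * HOUR)
```

Here example_ratio evaluates to 2.0, meaning the process consumed two CPU-seconds for every second of real time, so cpu_time_outpaces_real_time(example_ratio) returns True.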