Docker Swarm with Swarmprom for real-time monitoring and alerts

Intro

Here’s how you can set up Swarmprom to monitor your cluster.

It will allow you to:

  • Monitor CPU, disk, memory usage, etc.
  • Monitor it all per node, per service, per container, etc.
  • Have an interactive, real-time dashboard with all the data plotted.
  • Trigger alerts (for example, in Slack, Rocket.chat, etc.) when your services/nodes pass certain thresholds.
  • And more…

Swarmprom is actually just a set of tools pre-configured in a smart way for a Docker Swarm cluster.

It includes Prometheus, Grafana, Alertmanager, and Unsee, among other tools.

Here’s what it looks like:

Instructions

  • Clone the Swarmprom repository and enter its directory:
git clone https://github.com/stefanprodan/swarmprom.git
cd swarmprom
  • Set and export an ADMIN_USER environment variable:
export ADMIN_USER=admin
  • Set and export an ADMIN_PASSWORD environment variable:
export ADMIN_PASSWORD=changethis
  • Set and export a hashed version of the ADMIN_PASSWORD using openssl. Traefik's HTTP Basic Auth will use it for most of the services:
export HASHED_PASSWORD=$(openssl passwd -apr1 "$ADMIN_PASSWORD")
  • You can check the contents with:
echo "$HASHED_PASSWORD"

It will look like:

$apr1$89eqM5Ro$CxaFELthUKV21DpI3UTQO.
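
The hash has the shape $apr1$<salt>$<digest>, and openssl picks a random salt on each run, so hashing the same password twice gives different strings. As a rough sketch of how you could confirm that a hash matches a password, the helper below (verify_apr1 is a hypothetical name, not part of Swarmprom) extracts the salt and recomputes the hash with openssl's -salt option:

```shell
# Hypothetical helper: succeeds if "$2" is an apr1 hash of the password "$1".
verify_apr1() {
  password=$1
  hash=$2
  # Splitting on "$" gives: empty field, "apr1", the salt, the digest.
  salt=$(printf '%s' "$hash" | cut -d'$' -f3)
  # Recompute with the same salt and compare against the stored hash.
  [ "$(openssl passwd -apr1 -salt "$salt" "$password")" = "$hash" ]
}
```

For example, verify_apr1 "$ADMIN_PASSWORD" "$HASHED_PASSWORD" && echo "hash matches" should print the message when both variables were set as described above.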
  • Create and export an environment variable DOMAIN, e.g.:
export DOMAIN=example.com

and make sure that the following sub-domains point to your Docker Swarm cluster IPs:

  • grafana.example.com
  • alertmanager.example.com
  • unsee.example.com
  • prometheus.example.com

(and replace example.com with your actual domain).

Note: You can also use a subdomain, like swarmprom.example.com. Just make sure that the subdomains point to (at least one of) your cluster IPs. Or set up a wildcard subdomain (*).
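
Before going on, it can help to confirm that the four sub-domains resolve. The following is a minimal sketch, assuming a Linux host where getent is available; the function names swarmprom_hosts and check_dns are made up for this example:

```shell
# Print the four host names used by the stack for a given domain.
swarmprom_hosts() {
  domain=$1
  for sub in grafana alertmanager unsee prometheus; do
    echo "$sub.$domain"
  done
}

# Try to resolve each host name and report the result.
# Assumption: getent exists on this machine (glibc-based Linux).
check_dns() {
  for host in $(swarmprom_hosts "$1"); do
    if getent hosts "$host" >/dev/null; then
      echo "ok      $host"
    else
      echo "missing $host"
    fi
  done
}

# Example call (not run automatically): check_dns "$DOMAIN"
```

This only checks that a DNS record exists. It does not check that the record points to one of your cluster IPs.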

  • Set and export an environment variable with the tag used by Traefik public to filter services (by default, it’s traefik-public):
export TRAEFIK_PUBLIC_TAG=traefik-public
  • If you are using Slack and want to integrate it, set the following environment variables:
export SLACK_URL=https://hooks.slack.com/services/TOKEN
export SLACK_CHANNEL=devops-alerts
export SLACK_USER=alertmanager

Note: by using export when declaring all the environment variables above, the next command will be able to use them.
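
As a small sanity check before deploying, you could verify that the variables are really exported. This is only a sketch: missing_vars is a hypothetical helper, and it relies on printenv, which only sees exported variables (the same ones a child process such as docker would receive):

```shell
# Variables set in the steps above (the SLACK_* ones are optional, so they are left out).
required_vars="ADMIN_USER ADMIN_PASSWORD HASHED_PASSWORD DOMAIN TRAEFIK_PUBLIC_TAG"

# Hypothetical helper: report any variable that is empty or not exported.
# Returns 1 if at least one is missing, 0 otherwise.
missing_vars() {
  missing=""
  for name in "$@"; do
    # printenv prints nothing for variables that are unset or not exported.
    if [ -z "$(printenv "$name")" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing or not exported:$missing"
    return 1
  fi
  echo "All variables are exported"
}
```

You could then run missing_vars $required_vars and only continue with the deploy when it succeeds.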

  • Deploy the Traefik version of the stack:
docker stack deploy -c docker-compose.traefik.yml swarmprom
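
The services can take a moment to start. One way to see which ones are not fully up yet is to compare the running and desired replica counts reported by docker stack services. The sketch below assumes the stack name swarmprom from the command above; unready_services is a hypothetical helper, and the service names in the usage example are only illustrative:

```shell
# Hypothetical helper: read lines of the form "<name> <running>/<desired>"
# from standard input and print the names whose counts differ.
unready_services() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Example call (not run automatically):
#   docker stack services --format '{{.Name}} {{.Replicas}}' swarmprom | unready_services
```

If the pipeline prints nothing, every service has as many running replicas as it asked for. With input such as "swarmprom_grafana 1/1" and "swarmprom_prometheus 0/1", only swarmprom_prometheus would be printed.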

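To test it, open each of the sub-domains listed above (grafana, alertmanager, unsee, and prometheus) in your browser and log in with the ADMIN_USER and ADMIN_PASSWORD you set.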

About me

Creator of FastAPI and Typer. Dev at Exposion AI. APIs, Deep Learning/Machine Learning, full-stack distributed systems, SQL/NoSQL, Python, Docker, JS, TS, etc.
