Uptime monitoring for Traefik reverse proxy

Traefik sits in front of everything — your APIs, your microservices, your databases. When Traefik goes down, everything behind it goes down with it. But because Traefik is infrastructure rather than an application, it often gets skipped when teams set up uptime monitoring. The assumption is "it just runs." It doesn't always.

This tutorial covers production-grade uptime monitoring for Traefik using Vigilmon. We will walk through:

Monitoring the Traefik /ping health endpoint
Monitoring backend services through Traefik
SSL certificate monitoring for routed domains
Webhook alerts for DOWN/UP events

Prerequisites

Traefik 2.x or 3.x running as a reverse proxy (Docker, Kubernetes, or bare metal)
A free account at vigilmon.online

Part 1: Enable the Traefik /ping health endpoint

Traefik exposes a built-in API and dashboard. The /ping endpoint is a lightweight health check specifically designed for external monitoring tools — it returns HTTP 200 with the text OK when Traefik is healthy.

Enable ping in static configuration

File-based (traefik.yml):

# traefik.yml
api:
  dashboard: true
  insecure: false

ping: {}

entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
  traefik:
    address: ":8080"

CLI flags:

traefik \
  --api.dashboard=true \
  --ping=true \
  --entrypoints.web.address=:80 \
  --entrypoints.websecure.address=:443 \
  --entrypoints.traefik.address=:8080

Docker Compose:

# docker-compose.yml
version: "3.8"

services:
  traefik:
    image: traefik:v3.0
    command:
      - "--api.dashboard=true"
      - "--ping=true"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.traefik.address=:8080"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
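      # traefik.yml is not mounted here: Traefik treats file, CLI, and environment
      # static configuration as mutually exclusive, and this service uses CLI flags.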

Verify the ping endpoint

curl http://localhost:8080/ping

Expected response:

OK

If Traefik is running and healthy, /ping returns HTTP 200 with the body OK. During a graceful shutdown it returns 503 by default (configurable through ping.terminatingStatusCode), and if the process is down or hung the request is refused or times out.
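The behaviour above can be turned into a small external probe. The following is a minimal sketch in JavaScript (Node 18+ for the global fetch), not Vigilmon's implementation; the function names and the 5-second timeout are assumptions chosen for illustration.

```javascript
// Classify a /ping status code: only HTTP 200 counts as healthy.
function classifyPing(status) {
  if (status === 200) return { up: true, reason: 'ok' };
  if (status === 503) return { up: false, reason: 'unavailable' };
  return { up: false, reason: `unexpected status ${status}` };
}

// Probe the endpoint, treating a timeout or connection error as DOWN.
// timeoutMs is an illustrative default, not a Traefik or Vigilmon setting.
async function probePing(url, timeoutMs = 5000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return classifyPing(res.status);
  } catch (err) {
    const reason = err.name === 'TimeoutError' ? 'timeout' : 'connection error';
    return { up: false, reason };
  }
}
```

Calling probePing('http://localhost:8080/ping') against the setup above should resolve to an object whose up field is true while Traefik is healthy.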

Part 2: Expose the ping endpoint on a public entrypoint

By default the ping endpoint is served on the internal traefik entrypoint (port 8080), which may not be publicly reachable. To let Vigilmon poll it, either:

Option A — allow external access to port 8080 (simpler, but expose only the /ping path)

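Keep api.insecure disabled so that the dashboard and API are not served automatically on port 8080, and enable ping, which listens on the traefik entrypoint by default: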

# traefik.yml (static config)
api:
  dashboard: true

ping: {}

entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
  traefik:
    address: ":8080"

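Then restrict port 8080 at the firewall. ufw filters on addresses and ports, not on URL paths, so limit access to the monitoring source addresses:

# Allow port 8080 only from the monitoring source range.
# 203.0.113.0/24 is a placeholder; substitute Vigilmon's IP ranges.
ufw allow from 203.0.113.0/24 to any port 8080 proto tcp
# Use application-level routing below for tighter control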

Option B — expose /ping on your main entrypoint via a router (recommended for production)

# dynamic-config/ping-router.yml
http:
  routers:
    ping:
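      # The Host matcher tells the certificate resolver which domain to request.
      rule: "Host(`yourproxy.example.com`) && Path(`/ping`)"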
      entryPoints:
        - websecure
      service: ping@internal
      tls:
        certResolver: letsencrypt
      middlewares:
        - ping-ip-whitelist

  middlewares:
    ping-ip-whitelist:
      # ipAllowList requires Traefik 2.11+ or 3.x; earlier 2.x releases use ipWhiteList.
      ipAllowList:
        sourceRange:
          - "0.0.0.0/0"  # Replace with Vigilmon's IP ranges if needed

This exposes https://yourproxy.example.com/ping over TLS on your main entrypoint, which Vigilmon can poll without special firewall rules.
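To make the effect of sourceRange concrete, here is a rough IPv4-only sketch of CIDR matching in JavaScript. It is not Traefik's code (Traefik does its own matching internally), and the addresses are documentation ranges used as placeholders, not Vigilmon's real IPs.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True when `ip` falls inside `cidr`, for example "203.0.113.0/24".
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  // A /0 prefix has an all-zero mask and therefore matches every address.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// A request is allowed when its source address matches any listed range.
function isAllowed(ip, sourceRange) {
  return sourceRange.some((cidr) => inCidr(ip, cidr));
}
```

With these helpers, isAllowed('198.51.100.7', ['0.0.0.0/0']) is true for any address, which is why the 0.0.0.0/0 entry in the example should be replaced with specific ranges if you want the allow list to restrict anything.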

Part 3: Set up HTTP monitoring in Vigilmon

Monitor Traefik itself

Log in to vigilmon.online and click Add Monitor.
Choose HTTP(S) monitor.
Enter: https://yourproxy.example.com/ping
Set interval to 1 minute.
Add a keyword check: must contain OK.
Add your alert channel.
Click Save.

The keyword check matters because a load balancer or CDN in front of Traefik can answer with HTTP 200 from a cached or default page even when Traefik itself is not responding. Checking for the text OK confirms that the Traefik ping handler produced the response.
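As a rough illustration of what a keyword check adds over a status check, consider the sketch below. The function name is hypothetical and the logic is an assumption about how such checks typically work, not a description of Vigilmon's internals.

```javascript
// A check passes only when the status is 200 AND the body contains the keyword.
function passesKeywordCheck(status, body, keyword) {
  return status === 200 && body.includes(keyword);
}

// The Traefik ping handler answers with the plain text "OK".
const fromTraefik = passesKeywordCheck(200, 'OK', 'OK');

// An intermediary's cached page also has status 200, but lacks the keyword.
const fromCache = passesKeywordCheck(200, '<html>Service temporarily unavailable</html>', 'OK');
```

Here fromTraefik is true and fromCache is false, even though both responses carry status 200.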

Monitor backend services through Traefik

Add a separate monitor for each critical service routed through Traefik. This gives you visibility into whether a service is down even when Traefik itself is healthy:

| Service | Monitor URL | Keyword |
|---------|-------------|---------|
| API server | https://api.example.com/health | "status":"ok" |
| Frontend | https://app.example.com/ | My App |
| Auth service | https://auth.example.com/health | "ok":true |

Use descriptive monitor names (Traefik /ping, API via Traefik, etc.) so the Vigilmon dashboard clearly shows which layer failed.
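The reasoning behind separate monitors can be sketched as a small decision function: if the /ping monitor is down, the proxy layer is the likely cause and every routed service is affected; if /ping is up but a service monitor is down, the fault is in that service. The function and field names below are hypothetical.

```javascript
// Decide which layer failed from the latest monitor states.
// `proxyUp` is the state of the Traefik /ping monitor;
// `backends` maps monitor names to true (up) or false (down).
function diagnose(proxyUp, backends) {
  const names = Object.keys(backends);
  if (!proxyUp) return { layer: 'traefik', affected: names };
  const affected = names.filter((name) => !backends[name]);
  if (affected.length === 0) return { layer: 'none', affected: [] };
  return { layer: 'backend', affected };
}
```

For example, diagnose(true, { 'API via Traefik': false, 'Frontend': true }) reports the backend layer with only 'API via Traefik' affected.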

Part 4: SSL certificate monitoring

Traefik handles Let's Encrypt certificates automatically with its ACME provider, but certificate renewal can fail silently — a missing DNS record, a rate limit hit, or a misconfigured resolver. By the time the certificate expires, users see browser warnings and your site is effectively down.

In Vigilmon, click Add Monitor.
Choose SSL monitor.
Enter each domain routed through Traefik: yourproxy.example.com, api.example.com, etc.
Set alert threshold to 14 days before expiry.
Add your alert channel.

Pair SSL monitors with your HTTP monitors so you get coverage at the application, infrastructure, and certificate layers.
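For intuition about the 14-day threshold, the sketch below computes the days remaining on a certificate and decides whether to alert. It is an illustrative assumption about how an expiry check can work, not Vigilmon's implementation; fetchNotAfter uses Node's built-in tls module and will reject if the certificate is already invalid.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between `now` and the certificate's notAfter date.
function daysUntilExpiry(notAfter, now = new Date()) {
  return Math.floor((new Date(notAfter) - now) / MS_PER_DAY);
}

// Alert once the remaining lifetime drops to the threshold or below.
function shouldAlert(notAfter, thresholdDays = 14, now = new Date()) {
  return daysUntilExpiry(notAfter, now) <= thresholdDays;
}

// Read the served certificate's expiry date over a TLS connection.
async function fetchNotAfter(host, port = 443) {
  const tls = await import('node:tls');
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const { valid_to } = socket.getPeerCertificate();
      socket.end();
      resolve(valid_to);
    });
    socket.on('error', reject);
  });
}
```

A typical use would be to await fetchNotAfter('api.example.com') for each routed domain and pass the result to shouldAlert.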

Part 5: Webhook alerts

Add a webhook receiver to your infrastructure to receive Vigilmon DOWN/UP events. This can live on any backend service behind Traefik:

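// Example: Express.js webhook receiver
import express from 'express';

const app = express();
app.use(express.json());

// Placeholder: replace with your paging, Slack, or incident integration.
function notifyOnCall({ monitor, url }) {
  console.error(`[ON-CALL] ${monitor} is DOWN: ${url}`);
}

app.post('/webhook/vigilmon', (req, res) => {
  // Fall back to an empty object if the request carried no JSON body.
  const { monitor_name, status, url, response_code, checked_at } = req.body ?? {};

  if (status === 'down') {
    console.error('[VIGILMON] Monitor DOWN', {
      monitor: monitor_name,
      url,
      code: response_code,
      at: checked_at,
    });

    // Page on-call, post to Slack, create incident...
    notifyOnCall({ monitor: monitor_name, url });
  } else if (status === 'up') {
    console.info('[VIGILMON] Monitor recovered', { monitor: monitor_name });
  }

  res.sendStatus(204);
});

// The port is an example; use whichever port your Traefik router forwards to.
app.listen(3000, () => console.info('Webhook receiver listening on :3000'));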

Or use Vigilmon's native Slack integration to post directly to your team channel without deploying a webhook receiver.

Vigilmon sends this payload on DOWN and UP transitions:

{
  "monitor_id": "mon_abc123",
  "monitor_name": "Traefik /ping",
  "status": "down",
  "url": "https://yourproxy.example.com/ping",
  "checked_at": "2026-06-30T08:01:00Z",
  "response_code": 503,
  "response_time_ms": 1502
}
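Before acting on a webhook it helps to validate the payload. The following sketch assumes the payload has exactly the fields shown in the sample above; response_code and response_time_ms are treated as optional because the sample only shows a DOWN event. The function name and the returned shape are hypothetical.

```javascript
// Validate an incoming payload against the fields in the sample above
// and return a normalized event object.
function parseVigilmonEvent(body) {
  if (typeof body !== 'object' || body === null) {
    throw new TypeError('payload must be a JSON object');
  }
  const { monitor_id, monitor_name, status, url, checked_at } = body;
  if (status !== 'down' && status !== 'up') {
    throw new RangeError(`unexpected status: ${status}`);
  }
  const required = { monitor_id, monitor_name, url, checked_at };
  for (const [key, value] of Object.entries(required)) {
    if (typeof value !== 'string' || value.length === 0) {
      throw new TypeError(`missing or invalid field: ${key}`);
    }
  }
  return {
    monitorId: monitor_id,
    monitorName: monitor_name,
    isDown: status === 'down',
    url,
    checkedAt: new Date(checked_at),
    responseCode: body.response_code ?? null,
    responseTimeMs: body.response_time_ms ?? null,
  };
}
```

In the Express handler, wrapping parseVigilmonEvent(req.body) in a try/catch and answering 400 on failure keeps malformed requests from triggering pages.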

Part 6: Kubernetes deployments

If you run Traefik as an Ingress controller in Kubernetes, the same /ping endpoint is available inside the cluster. Expose it for external monitoring:

# traefik-values.yaml (for Helm chart)
# The official Traefik Helm chart takes extra static flags under additionalArguments.
additionalArguments:
  - "--ping=true"

ports:
  traefik:
    expose:
      default: true
    port: 9000
    protocol: TCP

service:
  spec:
    type: LoadBalancer

Then create a dedicated service for the ping endpoint:

apiVersion: v1
kind: Service
metadata:
  name: traefik-ping
  namespace: kube-system
spec:
  selector:
    app.kubernetes.io/name: traefik
  ports:
    - name: ping
      port: 9000
      targetPort: 9000
  type: LoadBalancer

Once the service has an external IP:

kubectl get svc traefik-ping -n kube-system
# NAME           TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)
# traefik-ping   LoadBalancer   10.0.0.12      203.0.113.42    9000:31234/TCP

Point Vigilmon at http://203.0.113.42:9000/ping.

IngressRoute for /ping on the main entrypoint

Alternatively, expose /ping through a dedicated IngressRoute on your TLS entrypoint so it goes through the same IP as your other services:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-ping
  namespace: kube-system
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`traefik.example.com`) && PathPrefix(`/ping`)
      kind: Rule
      services:
        - name: ping@internal
          kind: TraefikService
  tls:
    certResolver: letsencrypt  # assumes the resolver name used elsewhere in this tutorial

Part 7: Monitoring the Traefik dashboard

The Traefik dashboard (/dashboard/) gives you real-time visibility into routers, services, and middlewares. You can monitor it separately to catch dashboard failures (which indicate API layer issues) even when the proxy itself is functional:

Expose the dashboard on a secured entrypoint:

# dynamic-config/dashboard.yml
http:
  routers:
    dashboard:
      rule: "Host(`traefik.example.com`) && (PathPrefix(`/dashboard`) || PathPrefix(`/api`))"
      entryPoints:
        - websecure
      service: api@internal
      middlewares:
        - dashboard-auth
      tls:
        certResolver: letsencrypt

  middlewares:
    dashboard-auth:
      basicAuth:
        users:
          - "admin:$apr1$..."  # htpasswd-generated hash

In Vigilmon, add an HTTP monitor for https://traefik.example.com/dashboard/ and supply the basic auth credentials as a custom header: Authorization: Basic <credentials>, where <credentials> is the Base64 encoding of user:pass.
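The header value can be generated with a few lines of JavaScript. This is a small sketch using Node's built-in Buffer; the credentials are placeholders, and the function name is made up for illustration.

```javascript
// Build the value of the Authorization header for HTTP basic auth.
function basicAuthHeader(user, password) {
  const token = Buffer.from(`${user}:${password}`, 'utf8').toString('base64');
  return `Basic ${token}`;
}

// Placeholder credentials; use the account behind your htpasswd hash.
const header = basicAuthHeader('admin', 'secret');
// header === 'Basic YWRtaW46c2VjcmV0'
```

Paste the resulting value into the custom header field of the monitor, and treat it as a secret, since Base64 is an encoding and not encryption.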

Summary

Your Traefik deployment now has four layers of monitoring:

/ping endpoint — confirms Traefik is alive and healthy, polled every 60 seconds by Vigilmon.
Backend service monitors — each service routed through Traefik has its own Vigilmon check so you know whether a failure is in Traefik or in the service.
SSL monitors — alerts you 14 days before any certificate expires, before users see browser warnings.
Webhook/Slack alerts — DOWN and UP events are delivered to your team channel and/or on-call rotation within one check interval.

Vigilmon handles check scheduling, multi-region polling, alert routing, and uptime history. With 1-minute checks, a failure in Traefik or a backend service is detected within about one check interval, and certificate problems are flagged 14 days before expiry.

Monitor your Traefik infrastructure free at vigilmon.online
