How to Monitor Nginx Uptime and SSL Certificates with Vigilmon

Nginx sits in front of everything — but stub_status is not uptime monitoring. Learn how to use Vigilmon to monitor nginx availability, upstream backend health, SSL certificate expiry, and log rotation job liveness.

Nginx is the front door to your infrastructure: reverse proxy, SSL terminator, load balancer, static file server. When it goes down, everything behind it goes down too. Yet nginx monitoring is often limited to checking whether the nginx process is running, which misses the failures that matter most: an upstream backend whose errors nginx quietly surfaces as 502s, an SSL certificate expiring in seven days, or a stalled log rotation that will fill your disk overnight.

Vigilmon gives you external visibility into nginx availability, upstream health, SSL certificate expiry, and maintenance job liveness through HTTP probe and heartbeat monitoring. This tutorial walks through the complete setup.


Why nginx Needs External Monitoring

nginx's built-in stub_status module gives you connection counts and request rates — useful for capacity planning, but not for uptime alerting. It does not tell you:

  • Whether nginx is reachable from the public internet (your VPN might still reach it when users can't)
  • Whether upstream backends are returning errors that nginx is passing through
  • Whether your SSL certificates are within days of expiry
  • Whether log rotation or other maintenance tasks have silently stopped
  • Whether nginx is serving the correct content (a misconfigured server block can serve a default page instead of your application)

External monitoring through Vigilmon catches all of these.


Step 1: Enable the nginx stub_status Module

The stub_status module exposes basic nginx metrics. Enable it for internal monitoring (not as a public endpoint):

# /etc/nginx/conf.d/stub_status.conf
server {
    listen 127.0.0.1:8080;
    server_name localhost;

    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}

Reload nginx:

nginx -t && systemctl reload nginx

Verify locally:

curl http://127.0.0.1:8080/nginx_status
# Active connections: 12
# server accepts handled requests
#  45231 45231 182410
# Reading: 0 Writing: 5 Waiting: 7

This endpoint is for your sidecar health script — do not expose it publicly.
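
If the sidecar needs more than a reachability check, the stub_status text can be parsed into numbers. The following is a minimal sketch, assuming the four-line output format shown above; the function name parse_stub_status and the dictionary keys are illustrative choices, not part of nginx or Vigilmon.

```python
import re


def parse_stub_status(text: str) -> dict:
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:[ \t]*(\d+)", text)
    # The totals line holds three counters: accepts, handled, requests.
    totals = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:[ \t]*(\d+)[ \t]+Writing:[ \t]*(\d+)[ \t]+Waiting:[ \t]*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unrecognized stub_status output")
    accepts, handled, requests = (int(n) for n in totals.groups())
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


# Sample taken from the curl output above.
SAMPLE = """Active connections: 12
server accepts handled requests
 45231 45231 182410
Reading: 0 Writing: 5 Waiting: 7
"""
```

A gap between accepts and handled suggests nginx dropped connections (for example, when worker_connections is exhausted), which a sidecar could report as a degraded state.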


Step 2: Build an nginx Health Endpoint for Vigilmon

Create a dedicated health endpoint for Vigilmon to probe. The simplest version is a static response served by nginx itself, which proves that nginx is up and routing requests for this server block. Add this to your nginx configuration:

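# /etc/nginx/conf.d/health.conf
server {
    # Plain HTTP shown for brevity; to probe over https://, add a
    # "listen 443 ssl;" listener with your certificate directives.
    listen 80;
    server_name your-app.example.com;

    # Public health endpoint Vigilmon will probe
    location /health/nginx {
        access_log off;
        # default_type sets the Content-Type of the body produced by "return";
        # add_header would emit a second, conflicting Content-Type header.
        default_type application/json;
        return 200 '{"status":"ok"}';
    }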

    # Route everything else to your app
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

For a more sophisticated health check that verifies an upstream is responding, use a Lua script (OpenResty) or a small sidecar:

#!/bin/bash
# /usr/local/bin/nginx-health-check.sh
# Called by a local HTTP server (e.g. Python http.server)

NGINX_STATUS=$(curl -fs http://127.0.0.1:8080/nginx_status)
if [ $? -ne 0 ]; then
  echo '{"status":"down","reason":"stub_status_unavailable"}'
  exit 1
fi

ACTIVE=$(echo "$NGINX_STATUS" | head -1 | awk '{print $3}')
echo "{\"status\":\"ok\",\"active_connections\":$ACTIVE}"

If you would rather not add a health route to nginx itself, a small Python server can run the same stub_status check and expose the result over HTTP:

# sidecar_health.py
from http.server import HTTPServer, BaseHTTPRequestHandler
import subprocess, json

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != '/health/nginx':
            self.send_response(404)
            self.end_headers()
            return

        try:
            result = subprocess.run(
                ['curl', '-fs', 'http://127.0.0.1:8080/nginx_status'],
                capture_output=True, text=True, timeout=3
            )
            if result.returncode != 0:
                raise RuntimeError("stub_status unreachable")

            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(body)
        except Exception as e:
            body = json.dumps({"status": "down", "error": str(e)}).encode()
            self.send_response(503)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress access logs

HTTPServer(('0.0.0.0', 8090), HealthHandler).serve_forever()

Step 3: Configure Vigilmon HTTP Monitor for nginx

  1. Log in to vigilmon.online and go to Monitors → New Monitor
  2. Choose HTTP / HTTPS
  3. Set the URL to your nginx health endpoint: https://your-app.example.com/health/nginx
  4. Set the check interval to 1 minute
  5. Under Expected response, configure:
    • Status code: 200
    • Response body contains: "status":"ok"
    • Response time threshold: 1000ms
  6. Under Alert channels, assign your Slack or PagerDuty channel
  7. Save the monitor
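
To make the three expected-response settings concrete, here is a minimal sketch of how a single probe result could be judged against them. It is an illustration only: ProbeResult and evaluate_probe are hypothetical names, and how Vigilmon itself evaluates a probe (for example, whether a slow response alerts or is merely flagged) is not specified here.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    status_code: int
    body: str
    elapsed_ms: float


def evaluate_probe(
    result: ProbeResult,
    expected_status: int = 200,
    body_contains: str = '"status":"ok"',
    max_ms: float = 1000.0,
) -> list:
    """Return a list of human-readable failures; an empty list means healthy."""
    failures = []
    if result.status_code != expected_status:
        failures.append(f"status {result.status_code}, expected {expected_status}")
    if body_contains not in result.body:
        failures.append(f"body does not contain {body_contains}")
    if result.elapsed_ms > max_ms:
        failures.append(f"response took {result.elapsed_ms:.0f}ms, limit {max_ms:.0f}ms")
    return failures
```

The body check is what catches a misconfigured server block: a default page can still return 200, but it will not contain the expected JSON fragment.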

Vigilmon probes from multiple geographic regions simultaneously, requiring multi-region consensus before opening an incident. This prevents paging you for a single-probe hiccup.
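
The idea behind multi-region consensus can be sketched in a few lines. The quorum rule below (at least two failing regions) and the region names are assumptions for illustration; the exact rule Vigilmon applies is not documented in this tutorial.

```python
def should_open_incident(region_ok: dict, min_failing: int = 2) -> bool:
    """Open an incident only when enough regions agree the target is down.

    region_ok maps a region name to True (probe passed) or False (probe failed).
    min_failing is a hypothetical quorum, not a documented Vigilmon setting.
    """
    failing = [region for region, ok in region_ok.items() if not ok]
    return len(failing) >= min_failing
```

With this rule, a single failing probe in one region is treated as a network hiccup, while failures from two or more regions are treated as a real outage.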

Monitoring Upstream Backends with Vigilmon

If nginx serves as a reverse proxy to multiple upstream backends, monitor each backend independently:

# Expose a per-upstream health route in nginx
location /health/api-backend {
    proxy_pass http://api-backend/health;
    proxy_connect_timeout 2s;
    proxy_read_timeout 5s;
}

location /health/worker-backend {
    proxy_pass http://worker-backend/health;
    proxy_connect_timeout 2s;
    proxy_read_timeout 5s;
}

Create separate Vigilmon monitors for each upstream proxy health route. This way Vigilmon tells you exactly which backend is failing rather than just reporting nginx as down.
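
The benefit of separate monitors is that failures can be attributed. The sketch below shows that reasoning under simple assumptions: each route's latest status code is known (None meaning no response at all), and the function name diagnose and its return strings are purely illustrative.

```python
from typing import Dict, Optional


def diagnose(route_status: Dict[str, Optional[int]], nginx_route: str = "/health/nginx") -> str:
    """Attribute a failure to nginx itself or to specific upstream routes."""
    # If nginx's own static health route fails, the upstream results are not trustworthy.
    if route_status.get(nginx_route) != 200:
        return "nginx itself is unreachable or unhealthy"
    failing = sorted(
        route
        for route, code in route_status.items()
        if route != nginx_route and code != 200
    )
    if failing:
        return "failing upstreams: " + ", ".join(failing)
    return "all routes healthy"
```

For example, a 502 on /health/api-backend while /health/nginx returns 200 points at the API backend rather than at nginx.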


Step 4: SSL Certificate Expiry Monitoring

Expired SSL certificates are one of the most embarrassing and avoidable outages. Vigilmon monitors SSL certificate validity automatically when you use HTTPS endpoints.

In your Vigilmon HTTP monitor:

  1. Use the https:// URL for your monitor endpoint
  2. Under SSL / TLS, enable Check SSL certificate expiry
  3. Set the Warning threshold: 30 days
  4. Set the Critical threshold: 7 days
  5. Assign alert channels for each threshold level

Vigilmon checks the certificate on every probe cycle and alerts you:

  • 30 days before expiry: email warning (time to renew via Certbot / ACME)
  • 7 days before expiry: Slack + PagerDuty page (certificate renewal is now urgent)
  • On certificate error (expired, wrong domain, chain failure): immediate P1 alert
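
The threshold logic above amounts to comparing the time left on the certificate against two cutoffs. A minimal sketch, assuming the 30-day and 7-day thresholds from this step; the function name and the returned labels are illustrative rather than Vigilmon's own terminology.

```python
from datetime import datetime, timedelta, timezone


def classify_certificate(
    not_after: datetime,
    now: datetime,
    warning_days: int = 30,
    critical_days: int = 7,
) -> str:
    """Map a certificate's expiry time to an alert level."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= timedelta(days=critical_days):
        return "critical"
    if remaining <= timedelta(days=warning_days):
        return "warning"
    return "ok"
```

Passing now explicitly, rather than calling datetime.now() inside the function, keeps the classification easy to test.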

For Let's Encrypt certificates managed by Certbot, also set up a heartbeat monitor for the renewal job (covered in Step 5). An expired certificate usually means the renewal job silently failed — the heartbeat monitor catches this before the expiry date itself becomes an incident.

Testing Your SSL Configuration

Verify your nginx SSL setup before relying on Vigilmon to monitor it:

# Check certificate expiry from the command line
echo | openssl s_client -servername your-app.example.com \
  -connect your-app.example.com:443 2>/dev/null \
  | openssl x509 -noout -dates

# Check with curl
curl -vI https://your-app.example.com 2>&1 | grep -E "expire|SSL|TLS"
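
If you want to script this check, the notAfter line printed by openssl x509 -noout -dates can be converted to a number of days remaining. This sketch uses ssl.cert_time_to_seconds from the Python standard library; the helper names and the sample dates are made up for illustration.

```python
import ssl
from datetime import datetime, timezone


def not_after_from_openssl(dates_output: str) -> datetime:
    """Extract the notAfter timestamp from `openssl x509 -noout -dates` output."""
    for line in dates_output.splitlines():
        if line.startswith("notAfter="):
            cert_time = line.split("=", 1)[1].strip()
            seconds = ssl.cert_time_to_seconds(cert_time)  # interprets the time as UTC
            return datetime.fromtimestamp(seconds, tz=timezone.utc)
    raise ValueError("no notAfter line found")


def days_remaining(dates_output: str, now: datetime) -> int:
    """Whole days between `now` and certificate expiry (negative if expired)."""
    return (not_after_from_openssl(dates_output) - now).days


# Made-up sample in the format openssl prints.
SAMPLE_DATES = """notBefore=Jan  2 00:00:00 2024 GMT
notAfter=Apr  1 00:00:00 2024 GMT
"""
```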

Step 5: Heartbeat Monitoring for Log Rotation and Maintenance Scripts

nginx log rotation (logrotate), certificate renewal (certbot renew), and cache purge scripts run in the background via cron. When they fail:

  • Log files grow until disk space is exhausted, causing nginx to stop accepting new connections
  • Certificates expire
  • Cache directories fill with stale entries

Vigilmon heartbeat monitors detect silent job failures: your maintenance script pings Vigilmon at the end of each successful run.

Set Up Heartbeat Monitors

Create one heartbeat per maintenance job in Vigilmon:

| Job | Interval | Grace Period |
|---|---|---|
| logrotate /etc/logrotate.d/nginx | 24 hours | 2 hours |
| certbot renew | 12 hours | 2 hours |
| Cache purge script | 6 hours | 1 hour |
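
The interval and grace period combine into a single deadline. The sketch below assumes a heartbeat counts as missed once the interval plus the grace period has passed since the last ping; that rule is an assumption about how such monitors typically work, and is_overdue is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def is_overdue(
    last_ping: datetime,
    now: datetime,
    interval: timedelta,
    grace: timedelta,
) -> bool:
    """True when no ping has arrived within interval + grace of the last one."""
    return now - last_ping > interval + grace
```

For the logrotate row above (24 hours plus 2 hours of grace), a job that last pinged at 02:00 would be flagged if nothing has arrived by shortly after 04:00 the following day.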

Wire Heartbeats Into Your Cron Jobs

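#!/bin/bash
# /etc/cron.daily/nginx-logrotate
# VIGILMON_LOGROTATE_HEARTBEAT_URL must be set for this script: cron does not
# inherit your login shell's environment, and set -u aborts on an unset variable.
set -euo pipefail

/usr/sbin/logrotate /etc/logrotate.d/nginx

# Only ping if logrotate succeeded (set -e stops the script before this line otherwise)
curl -fsS "$VIGILMON_LOGROTATE_HEARTBEAT_URL" > /dev/null

# /etc/cron.d/certbot
# Keep each entry on one line: crontab files do not support backslash continuation.
# Define VIGILMON_CERTBOT_HEARTBEAT_URL at the top of this file.
0 */12 * * * root certbot renew --quiet && curl -fsS "$VIGILMON_CERTBOT_HEARTBEAT_URL" > /dev/null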

The && operator ensures the heartbeat ping fires only when certbot renew exits successfully. A failed renewal leaves the heartbeat silent, and Vigilmon alerts once the expected interval plus the grace period has passed without a ping.
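
The same ping-only-on-success pattern can be written as a small wrapper for jobs that are launched from Python rather than directly from cron. This is a sketch using only the standard library; run_with_heartbeat and ping_heartbeat are hypothetical names, and the heartbeat URL is whatever Vigilmon issues for your monitor.

```python
import subprocess
import urllib.request
from typing import Callable, Sequence


def ping_heartbeat(url: str, timeout: float = 10.0) -> None:
    """Send a GET request to the heartbeat URL; raises on HTTP or network errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_with_heartbeat(command: Sequence[str], ping: Callable[[], None]) -> bool:
    """Run a job and call `ping` only if it exits with status 0."""
    completed = subprocess.run(list(command))
    if completed.returncode != 0:
        return False  # stay silent so the heartbeat monitor notices the failure
    ping()
    return True


# Example wiring (heartbeat_url is a placeholder for your own monitor's URL):
# run_with_heartbeat(["certbot", "renew", "--quiet"], lambda: ping_heartbeat(heartbeat_url))
```

Passing the ping as a callable keeps the success/failure logic testable without making a network request.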


Step 6: Vigilmon as External Validator vs nginx Internal Health Checks

nginx can health-check its upstreams: active health checks are available in the commercial NGINX Plus tier and, for open source nginx, via the third-party nginx_upstream_check_module. These are useful for upstream load balancing, but they are internal checks that nginx makes to its upstreams from within your network.

Vigilmon serves a different and complementary role:

| Check Type | nginx Health Checks | Vigilmon |
|---|---|---|
| What it monitors | nginx → upstream connectivity | Internet → nginx connectivity |
| Who runs it | nginx process itself | Vigilmon's global probe network |
| SSL certificate validity | No | Yes |
| Alert routing | Configured in nginx | Configured in Vigilmon |
| Geographic multi-region | No | Yes |
| Survives nginx crash | No | Yes |

Use nginx internal health checks for upstream load balancing decisions. Use Vigilmon for public-facing availability monitoring, SSL certificate expiry, and maintenance job liveness — these are the signals your team needs for on-call alerting.


Summary

nginx availability monitoring is more than checking whether the process is running. Vigilmon gives you the external validation that matters for on-call alerting:

| Monitor Type | What It Covers |
|---|---|
| HTTP monitor on /health/nginx | Public nginx reachability, upstream errors |
| HTTP monitor per upstream proxy route | Individual backend health |
| SSL certificate expiry check | Certificate validity, days to expiry |
| Heartbeat: logrotate | Log rotation job success |
| Heartbeat: certbot renew | Certificate renewal job success |

Get started free at vigilmon.online — your first nginx monitor is running in under two minutes.
