Monitoring ntfy with Vigilmon: Health Endpoint, TCP Port Check, Topic Availability & SSL Alerts

ntfy is a self-hosted push notification service that lets you send alerts to your phone or any subscriber with a simple HTTP POST — no app accounts, no proprietary SDKs. The irony of ntfy downtime is obvious: if your notification service is down, none of your other monitoring alerts can reach you. Teams that use ntfy as their alerting pipeline have a single point of failure that is itself unmonitored. Vigilmon watches ntfy's health endpoint, TCP port, topic availability, and SSL certificate externally — so you have an independent monitoring layer that doesn't rely on ntfy itself to tell you ntfy is down.

What You'll Build

An HTTP monitor for ntfy's root health check (returns HTML with 200)
An HTTP monitor for the /v1/health API endpoint if available
A TCP port check on port 80 or 443
An HTTP monitor for topic subscription endpoint availability
SSL certificate monitoring for your ntfy domain
An alerting setup that doesn't route through ntfy itself

Prerequisites

A running ntfy instance with a public or network-reachable domain (e.g., https://ntfy.example.com)
HTTPS configured with a valid certificate
A free account at vigilmon.online

Step 1: Understand ntfy's Liveness Signals

ntfy exposes multiple signals you can monitor externally:

Root path (/): Returns HTTP 200 with an HTML page — the ntfy web UI. This confirms the Go process is running and the HTTP server is accepting connections.

Health endpoint (/v1/health): Available in ntfy v2.0+, returns JSON confirming service health:

curl https://ntfy.example.com/v1/health

Response:

{"healthy":true}

Topic and stats endpoints (/<topic>/json, /<topic>/sse, /v1/stats): Polling a topic or requesting stats returns a live response, confirming that the message layer behind the HTTP server is working and not just the web UI.

Start with the simplest check first.

Step 2: Create a Vigilmon Monitor for the Root Health Check

Log in to Vigilmon → Add Monitor → HTTP.
URL: https://ntfy.example.com/.
Check interval: 60 seconds.
Response timeout: 10 seconds.
Expected status: 200.
Keyword: ntfy.
Label: ntfy Root Health.
Click Save.

The keyword ntfy confirms the ntfy web UI is being served rather than a generic placeholder page from a misconfigured reverse proxy. This catches:

ntfy Go process crashes
Container or VM restarts
Web server or reverse proxy failures
Host resource exhaustion

Because ntfy is probably delivering your team's other monitoring alerts, a 60-second check interval keeps the gap between an outage and the first failed external check to about a minute.
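The status-plus-keyword rule from this step can be sketched in a few lines of Python. This is an illustration of the logic, not Vigilmon's implementation: the function names `fetch` and `evaluate_http_check` are hypothetical, and case-sensitive substring matching is an assumption about how the keyword is compared.

```python
import urllib.error
import urllib.request


def fetch(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Fetch a URL and return (status, body). HTTP error statuses are
    returned rather than raised; connection failures still raise URLError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def evaluate_http_check(
    status: int,
    body: str,
    expected_status: int = 200,
    keyword: str = "ntfy",
) -> tuple[bool, str]:
    """Apply the two rules from the monitor settings: status, then keyword."""
    if status != expected_status:
        return False, f"unexpected status {status}"
    if keyword not in body:  # assumed: case-sensitive substring match
        return False, f"keyword {keyword!r} missing"
    return True, "ok"
```

Calling `evaluate_http_check(*fetch("https://ntfy.example.com/"))` would reproduce one check by hand. The keyword rule is what separates a real ntfy page from a proxy placeholder that also answers 200.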

Step 3: Monitor the /v1/health API Endpoint

If you're running ntfy v2.0 or later, add a dedicated API health check:

Add Monitor → HTTP.
URL: https://ntfy.example.com/v1/health.
Check interval: 60 seconds.
Expected status: 200.
Keyword: "healthy":true (so a response that reports false does not pass).
Label: ntfy API Health.
Click Save.

This endpoint exercises ntfy's API handler rather than the static web UI, so it can fail while a cached or proxied root page still loads. Treat any non-200 status, or a body that does not contain "healthy":true, as a failure. How much internal state the endpoint reflects depends on the ntfy version, so do not rely on it alone to detect message cache or database problems; the topic monitor in Step 5 covers that layer.

Note: If your ntfy version doesn't expose /v1/health, skip this monitor — the root path check provides sufficient coverage.
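A keyword match is a plain substring test, so it is worth seeing what a stricter check looks like. The sketch below parses the body as JSON and requires the `healthy` field to be exactly `true`. The field name follows ntfy's published health response, and `health_ok` is a hypothetical helper name.

```python
import json


def health_ok(status: int, body: str) -> bool:
    """Return True only for a 200 response whose JSON body has healthy == true."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page from a proxy lands here.
        return False
    return isinstance(payload, dict) and payload.get("healthy") is True
```

A bare keyword such as `health` would also match a body reporting `false`, which is why a more specific keyword or a parsed check is the safer choice.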

Step 4: Add a TCP Port Check

Add a TCP-level check to catch network-layer failures independently of the application:

Add Monitor → TCP Port.
Host: ntfy.example.com.
Port: 443 (HTTPS) or 80 (HTTP, if applicable).
Check interval: 5 minutes.
Label: ntfy TCP 443.
Click Save.

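Read the TCP and HTTP results together. If the TCP check passes while the HTTP monitors fail, the port is still accepting connections, so the problem sits in ntfy or the reverse proxy in front of it. If both fail, suspect the host, the network path, a firewall rule, or a process that is no longer listening. Either way, the combination narrows your investigation when you get paged.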
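As a rough sketch of what a TCP port check does and how its result combines with an HTTP result, assuming nothing about Vigilmon's internals (`tcp_port_open` and `triage` are hypothetical names):

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, DNS failure
        return False


def triage(tcp_ok: bool, http_ok: bool) -> str:
    """Map the two check results to the most likely failure domain."""
    if tcp_ok and http_ok:
        return "healthy"
    if tcp_ok:
        return "application or reverse proxy: port open but HTTP check failing"
    if not http_ok:
        return "host, network, firewall, or listener: port not reachable"
    return "inconsistent: HTTP passed but TCP failed, recheck the TCP probe"
```

For example, `triage(tcp_port_open("ntfy.example.com", 443), http_ok)` turns two monitor states into a first guess at where to look.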

Step 5: Monitor Topic Subscription Availability

ntfy's core function is delivering messages to topic subscribers. Monitor that the pub/sub layer is functional by checking that a test topic endpoint responds:

Add Monitor → HTTP.
URL: https://ntfy.example.com/vigilmon-healthcheck/json?poll=1 (or use ntfy's stats endpoint: https://ntfy.example.com/v1/stats).
Check interval: 5 minutes.
Expected status: 200.
Label: ntfy Topic Layer.
Click Save.

Polling a topic with poll=1 returns any cached messages as newline-delimited JSON (one object per line, or an empty body if nothing is cached) and then closes the connection, so the check confirms the topic layer is alive without holding a stream open. It works without credentials only if your instance allows anonymous reads; if access control denies them, expect a 401 or 403 and give the monitor credentials.

This monitor catches failures in ntfy's pub/sub infrastructure that wouldn't be visible in a root health check.
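To show what a healthy poll response looks like to a client, here is a small parser for ntfy's line-delimited JSON format. It is a sketch: `parse_poll_body` and `topic_layer_ok` are hypothetical helpers, and treating an empty body as healthy is an assumption that suits a topic nobody publishes to.

```python
import json


def parse_poll_body(body: str) -> list[dict]:
    """Parse a /<topic>/json?poll=1 response: one JSON object per line.
    Only events of type "message" are returned."""
    messages = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)  # raises JSONDecodeError on non-JSON lines
        if isinstance(event, dict) and event.get("event") == "message":
            messages.append(event)
    return messages


def topic_layer_ok(status: int, body: str) -> bool:
    """Healthy means: status 200 and every non-empty line is valid JSON."""
    if status != 200:
        return False
    try:
        parse_poll_body(body)
    except json.JSONDecodeError:
        return False
    return True
```

An HTML error page served with status 200 fails this check, which a status-only monitor would miss.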

Step 6: Monitor SSL Certificates

ntfy is often used to deliver security alerts — intrusion detection notifications, failed login alerts, server down notifications. If your ntfy domain's SSL certificate expires:

ntfy mobile apps and the web UI show certificate errors
Automated publishers (your scripts sending curl https://ntfy.example.com/mytopic -d "...") start failing because curl rejects the expired certificate, and they fail silently if the script never checks the exit code
You lose your notification pipeline at the moment you most need it

Add Monitor → SSL Certificate.
Domain: ntfy.example.com.
Alert when expiry is within: 30 days.
Alert again: 14 days, 7 days, 3 days, 1 day.
Click Save.
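The expiry arithmetic behind these thresholds can be sketched with Python's standard `ssl` module. The helper names are hypothetical, and `REMINDER_DAYS` simply mirrors the schedule configured above.

```python
import socket
import ssl
from datetime import datetime, timezone

REMINDER_DAYS = (30, 14, 7, 3, 1)  # thresholds from the monitor settings


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jun  1 12:00:00 2030 GMT'.
    With the default context, an already expired certificate fails the
    handshake and raises ssl.SSLCertVerificationError instead."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between now (timezone-aware) and the notAfter timestamp."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def crossed_thresholds(days_left: int) -> list[int]:
    """Return every reminder threshold the certificate has already crossed."""
    return [d for d in REMINDER_DAYS if days_left <= d]
```

Running `days_until_expiry(fetch_not_after("ntfy.example.com"), datetime.now(timezone.utc))` gives the number to compare against the thresholds.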

Step 7: Configure Alerting

The critical architectural point: do not route Vigilmon alerts through ntfy itself. If ntfy is down, alerts sent through ntfy will be silently dropped.

In Vigilmon under Settings → Notifications, configure a separate alert channel — email, Slack, PagerDuty, or any channel that doesn't depend on ntfy:

| Monitor | Trigger | Action |
|---|---|---|
| Root Health | Non-200 or keyword missing | Check ntfy process; inspect container/service status |
| API Health | Non-200 or health missing | Check ntfy version; inspect ntfy logs for internal errors |
| TCP Port | Connection refused | Check firewall; verify ntfy is listening on the expected port |
| Topic Layer | Non-200 response | Check ntfy database/message queue; restart if needed |
| SSL Certificate | < 30 days to expiry | Renew immediately; check ACME automation |

Alert after: 2 failures for HTTP monitors. 1 failure for SSL and TCP monitors.

Escalation: The ntfy root health monitor should have the lowest possible alert threshold — if ntfy is your alerting backbone, its downtime silences every other monitor in your stack.
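The "alert after N failures" rule can be pictured as a per-monitor counter. The sketch below assumes the failures must be consecutive and that one alert is sent per outage; both are assumptions about the behaviour, and `FailureCounter` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class FailureCounter:
    """Tracks consecutive failures for one monitor."""
    threshold: int
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, ok: bool) -> bool:
        """Record one check result; return True only when a new alert is due."""
        if ok:
            # Recovery resets the streak and re-arms the alert.
            self.consecutive_failures = 0
            self.alerted = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False
```

With a threshold of 2, a single failed check followed by a success never alerts, which filters one-off network blips. A threshold of 1 alerts on the first failure, which is the trade-off to weigh for the root health monitor.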

Common ntfy Failure Modes and What Vigilmon Catches

| Scenario | What Vigilmon catches |
|---|---|
| ntfy process crash | Root health returns 502/503 from the reverse proxy, or the connection is refused if ntfy is exposed directly; first failed check within 60 s |
| Container OOM killed | TCP check fires; HTTP monitors follow |
| Reverse proxy misconfiguration | HTTP monitors fire; TCP check may stay green |
| SSL certificate expires | SSL monitor alerts at 30-day threshold; apps reject connection |
| Message database corruption | API health check may fail (non-200 or no "healthy":true); topic monitor may fire |
| Disk full (message storage) | ntfy may stop accepting publishes; health endpoint may degrade |
| DNS misconfiguration | All HTTP and SSL monitors fire simultaneously |
| ntfy upgrade breaks API | /v1/health or topic endpoint returns unexpected status |
| Port conflict after reboot | TCP check fires; ntfy failed to bind its port |
| Firewall rule added | TCP check fires; HTTP monitors unreachable from outside |

ntfy is the notification channel for the rest of your monitoring, so watching it with an external tool like Vigilmon closes the loop that ntfy itself cannot close. Vigilmon's checks run from external infrastructure completely independent of your ntfy server, so an ntfy outage doesn't prevent you from learning about it.

Start monitoring ntfy in under 5 minutes — register free at vigilmon.online.