Monitoring Immich with Vigilmon: Health Endpoint, Web App Availability, ML Service TCP Port & SSL Alerts

How to monitor Immich self-hosted photo management with Vigilmon — server ping health checks, web app availability, machine learning microservice TCP port monitoring, and SSL certificate alerts.

Immich is a self-hosted alternative to Google Photos — providing AI-powered photo and video backup, face recognition, EXIF search, and shared albums, all on your own hardware. When Immich's server goes down, your mobile app stops syncing, backups from family members fail silently, and access to years of photos is cut off until you manually diagnose the problem. The machine learning microservice handles facial recognition and smart search — if it crashes, those features degrade without any obvious error on screen. Vigilmon gives you external visibility into Immich's health endpoint, web interface, ML service TCP port, and SSL certificate so you catch failures in under a minute.

What You'll Build

  • A monitor on Immich's /api/server-info/ping health endpoint
  • An HTTP monitor for the Immich web app
  • A TCP monitor for the Immich machine learning microservice (port 3003)
  • SSL certificate monitoring for your Immich domain
  • An alerting setup that separates API failures from ML service degradation

Prerequisites

  • A running Immich instance with a public or network-reachable domain
  • HTTPS configured (e.g., https://immich.example.com)
  • A free account at vigilmon.online

Step 1: Verify Immich's Ping Endpoint

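Immich exposes a lightweight, unauthenticated ping endpoint at /api/server-info/ping that returns a simple alive response from the main server. Newer Immich releases serve the same check at /api/server/ping, so if the path below returns 404, use that one throughout this guide: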

curl https://immich.example.com/api/server-info/ping

A healthy Immich instance returns HTTP 200 with a JSON body:

{
  "res": "pong"
}

This endpoint requires no authentication and confirms that the Immich Node.js server process is running and its internal routing is functional. It does not validate database connectivity, but it confirms the primary process that handles all API calls and mobile sync is alive.
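
To make the check concrete, here is a minimal Python sketch of the validation an external monitor would perform against this endpoint. It is an illustration under stated assumptions, not Vigilmon's implementation: the function names is_pong and check_ping are hypothetical, and the 10-second default timeout simply mirrors the value used in the next step.

```python
import json
import urllib.error
import urllib.request


def is_pong(status: int, body: str) -> bool:
    """Return True only for HTTP 200 with a JSON object whose "res" is "pong"."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # An HTML error page served with status 200 ends up here.
        return False
    return isinstance(payload, dict) and payload.get("res") == "pong"


def check_ping(url: str, timeout: float = 10.0) -> bool:
    """Fetch the ping URL and validate it; any network or HTTP error counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return is_pong(resp.status, body)
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Calling check_ping("https://immich.example.com/api/server-info/ping") returns True only when the server answers with the pong body. Parsing the JSON rather than trusting the status code alone is what separates a real Immich response from a proxy page that happens to return 200.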


Step 2: Create a Vigilmon HTTP Monitor for the Ping Endpoint

  1. Log in to Vigilmon and go to Add Monitor → HTTP.
  2. URL: https://immich.example.com/api/server-info/ping.
  3. Check interval: 60 seconds.
  4. Response timeout: 10 seconds.
  5. Expected status: 200.
  6. Keyword: pong.
  7. Label: Immich Server Ping.
  8. Click Save.

This monitor catches:

  • Immich server process crashes (Node.js container down)
  • Out-of-memory kills — Immich is memory-intensive due to image processing and thumbnail generation
  • Container restart loops caused by misconfiguration after upgrades
  • Network failures between your reverse proxy and the Immich server container

The pong keyword check ensures you receive a genuine Immich response and not a proxy error page or maintenance page returning 200.

Why this matters: Immich is a memory-critical application. Large photo libraries, video transcoding jobs, and concurrent mobile sync sessions can exhaust RAM and kill the container. Unavailability means users lose access to their entire photo archive until the process is manually restarted.


Step 3: Monitor the Immich Web App

Immich's web interface is a SvelteKit application that, depending on your Immich version and deployment, is served either by a separate frontend container or by the main server itself. Monitoring it independently catches frontend-only failures that wouldn't affect the mobile app API:

  1. Add Monitor → HTTP.
  2. URL: https://immich.example.com.
  3. Check interval: 60 seconds.
  4. Expected status: 200.
  5. Keyword: Immich.
  6. Label: Immich Web App.
  7. Click Save.

This monitor catches reverse proxy failures, web UI container crashes, and static asset serving errors — failures that would prevent users from accessing their photo library in a browser while the mobile API might still be alive.


Step 4: Create a TCP Monitor for the Machine Learning Microservice

Immich's machine learning microservice runs as a separate Python container and listens on port 3003 by default. It handles:

  • Facial recognition and face clustering
  • CLIP-based smart search (search photos by description)
  • Image classification and tagging

When this service crashes, smart search stops working, new faces aren't clustered, and Immich logs errors for every photo that needs ML processing. The failures are silent from the user's perspective — photos still appear, but AI features degrade.

  1. Add Monitor → TCP.
  2. Host: immich.example.com (or the internal host/container name if the ML port is not publicly exposed).
  3. Port: 3003.
  4. Check interval: 60 seconds.
  5. Response timeout: 10 seconds.
  6. Label: Immich ML Service TCP.
  7. Click Save.

Note: In most Immich deployments the ML microservice port is not exposed externally; it is reachable only from the Immich server container on a Docker internal network. If that's your setup, use the internal container name (immich-machine-learning) as the host and make sure the Vigilmon check can reach that network. If it cannot, skip the TCP monitor, but be aware that /api/server-info/ping keeps returning pong while the ML container is down, so ML failures will then surface only in the Immich server logs.
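
A TCP monitor does little more than attempt a connection and report whether it completed. The sketch below shows that idea in Python so you can run the same probe by hand from a host that can reach the ML container. The function name tcp_port_open is hypothetical, and the code illustrates the technique rather than Vigilmon's own checker.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port completes within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False
```

For example, tcp_port_open("immich-machine-learning", 3003) run from another container on the same Docker network tells you whether the port accepts connections. A successful handshake shows only that the process is listening; it does not prove that model inference works.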


Step 5: Monitor SSL Certificates

Immich is accessed by mobile apps, desktop browsers, and shared album links. An expired SSL certificate:

  • Blocks the Immich mobile app from syncing on iOS and Android
  • Prevents browser access to your photo library
  • Breaks shared album links sent to family members
  • May silently fail without obvious error messages on older mobile clients

  1. Add Monitor → SSL Certificate.
  2. Domain: immich.example.com.
  3. Alert when expiry is within: 30 days.
  4. Alert again: 14 days, 7 days, 3 days, 1 day.
  5. Click Save.
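
The reminder schedule above comes down to simple date arithmetic on the certificate's notAfter field. The following Python sketch shows one way to compute it, assuming the thresholds listed in the steps. The names THRESHOLDS, days_until, crossed_thresholds, and fetch_not_after are hypothetical and are not part of Vigilmon.

```python
import socket
import ssl
from datetime import datetime, timezone

# Days-before-expiry reminders, taken from the steps above.
THRESHOLDS = (30, 14, 7, 3, 1)


def days_until(not_after: str, now: datetime) -> int:
    """Whole days from now until a notAfter string like 'Jan 15 00:00:00 2030 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def crossed_thresholds(days_left: int) -> list:
    """Return every reminder threshold that has already been reached."""
    return [t for t in THRESHOLDS if days_left <= t]


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Read notAfter from the certificate the server presents.

    The default context verifies the chain, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A manual check could then call days_until(fetch_not_after("immich.example.com"), datetime.now(timezone.utc)) and pass the result to crossed_thresholds to see which reminders would already have fired.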

Step 6: Configure Alerting

In Vigilmon under Settings → Notifications, configure your alert channels:

| Monitor | Trigger | Action |
|---|---|---|
| /api/server-info/ping | Non-200 or pong missing | Check Immich server container; inspect memory usage; review Docker logs |
| Web App | Non-200 or keyword missing | Check reverse proxy; verify web UI container is running |
| ML Service TCP | Connection refused or timeout | Restart immich-machine-learning container; check GPU/CPU resource limits |
| SSL certificate | < 30 days to expiry | Renew certificate; verify ACME/Let's Encrypt auto-renewal is running |

Alert after: 2 consecutive failures for HTTP monitors. 1 failure for the TCP monitor — a crashed ML container won't recover on its own.
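
The consecutive-failure rule can be pictured as a small counter that resets on every successful check. The sketch below is only an illustration of that rule; the FailureCounter class is hypothetical and says nothing about how Vigilmon implements it internally.

```python
from dataclasses import dataclass


@dataclass
class FailureCounter:
    threshold: int      # consecutive failures required before alerting
    failures: int = 0   # current run of consecutive failures

    def record(self, ok: bool) -> bool:
        """Record one check result; return True exactly when the alert should fire."""
        if ok:
            self.failures = 0
            return False
        self.failures += 1
        # Fire once when the run reaches the threshold, not on every later failure.
        return self.failures == self.threshold
```

With threshold=2, a single failed check followed by a success never alerts, which filters out one-off network blips on the HTTP monitors. With threshold=1, the first failed TCP check alerts immediately.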


Common Immich Failure Modes and What Vigilmon Catches

| Scenario | What Vigilmon sees |
|---|---|
| Immich server container OOM-killed | Ping unreachable; first failed check within 60 s |
| PostgreSQL database down | Server ping may still return 200; API calls return 500 errors |
| Redis cache failure | Server may return degraded responses; search and job queues fail |
| ML container crash | ML TCP monitor fires; facial recognition and smart search stop working |
| Reverse proxy misconfiguration | Web app monitor fires; mobile API may still be reachable |
| Disk full (photo storage) | Upload API returns 500; mobile sync fails silently |
| SSL certificate expires | SSL monitor alerts at 30 days; mobile apps stop syncing |
| Video transcoding job fills RAM | Server goes OOM; ping endpoint becomes unreachable |
| DNS misconfiguration | All monitors fire simultaneously |
| Port conflict after Docker restart | ML TCP monitor fires; server logs show connection refused to ML service |
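
Reading the table in the other direction, the combination of monitors that are failing at the same moment already narrows down the likely cause. The sketch below encodes that triage as a plain function. The name likely_cause and the wording of its results are hypothetical, and the mapping is a rough first guess drawn from the table, not a diagnosis.

```python
def likely_cause(ping_ok: bool, web_ok: bool, ml_ok: bool) -> str:
    """Map the current state of three monitors to a first place to look."""
    if ping_ok and web_ok and ml_ok:
        return "healthy"
    if not (ping_ok or web_ok or ml_ok):
        # Everything failing together points away from any single container.
        return "dns or network"
    if not ping_ok:
        return "immich server or reverse proxy"
    if not web_ok:
        return "web app or reverse proxy"
    # Ping and web are fine, so only the ML monitor is failing.
    return "ml service"
```

For instance, likely_cause(True, True, False) returns "ml service", which matches the ML container crash and port conflict rows. Failures that keep the ping endpoint alive, such as a PostgreSQL outage or a full disk, still report "healthy" here and need log or API-level checks.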


Your photo archive is irreplaceable — and if Immich goes down without alerting you, backups stop, mobile sync silently fails, and family members lose access to shared albums. Vigilmon watches Immich's server health, web app, machine learning microservice, and SSL certificate so you're notified within 60 seconds of any failure, before anyone has to ask why their photos stopped syncing.

Start monitoring Immich in under 5 minutes — register free at vigilmon.online.
