Monitoring Proxmox VE with Vigilmon: API Health Endpoint, Web UI & SSL Certificate Alerts

How to monitor Proxmox VE with Vigilmon — API health endpoint checks, web UI availability, SSL certificate alerts, and node availability monitoring for your self-hosted virtualisation platform.

Proxmox Virtual Environment (Proxmox VE) is a widely used open-source virtualisation platform that homelabbers, SMBs, and enterprises use to run KVM virtual machines and LXC containers on bare-metal hardware. When a Proxmox node goes down, every VM and container on that node goes down with it, and those workloads are not recovered on other nodes unless high availability (HA) is configured. Vigilmon gives you external visibility into Proxmox's health (the REST API, web UI availability, SSL certificates, and node reachability) so you catch Proxmox failures before they cascade to the workloads running on top of it.

What You'll Build

  • A monitor on Proxmox's API health endpoint (/api2/json/version)
  • A web UI availability check on the Proxmox management interface
  • SSL certificate monitoring for your Proxmox node
  • Node availability monitoring for each node in a multi-node cluster
  • Alerting that distinguishes API failures from web layer failures

Prerequisites

  • A running Proxmox VE node (version 7.x or 8.x) with HTTPS access
  • Proxmox accessible at a public or network-reachable URL (e.g., https://proxmox.example.com:8006)
  • A free account at vigilmon.online

Step 1: Verify Proxmox's API Endpoint

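Proxmox exposes its REST API over HTTPS on port 8006. The version endpoint returns the Proxmox version and is used throughout this guide as an unauthenticated health check. Confirm that behaviour on your own node first: if the first command below answers 401 rather than 200, repeat it with an API token in the same way as the node status command.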

# Version endpoint — no authentication required
curl -k https://proxmox.example.com:8006/api2/json/version

# Web UI — returns HTML for the management interface
curl -k -I https://proxmox.example.com:8006

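# Node status (requires an API token; replace "proxmox" in the path with your node name)
# Single quotes stop an interactive shell from treating "!" as history expansion
curl -k -H 'Authorization: PVEAPIToken=USER@REALM!TOKENID=UUID' \
  https://proxmox.example.com:8006/api2/json/nodes/proxmox/status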

A healthy Proxmox node returns JSON from the version endpoint with a data object containing version, release, and repoid fields:

{
  "data": {
    "version": "8.2.4",
    "release": "8",
    "repoid": "b46aac3b"
  }
}
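
To make that response shape concrete, here is a minimal Python sketch that parses a body and confirms the three fields are present. The helper name parse_version_response is hypothetical and is not part of Proxmox or Vigilmon; the sample string simply repeats the JSON shown above.

```python
import json

REQUIRED_FIELDS = ("version", "release", "repoid")


def parse_version_response(body: str) -> dict:
    """Return the `data` object, or raise ValueError if the body has an unexpected shape."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, dict):
        raise ValueError("response has no 'data' object")

    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return data


# Sample body copied from the healthy response shown above.
sample = '{"data": {"version": "8.2.4", "release": "8", "repoid": "b46aac3b"}}'
print(parse_version_response(sample)["version"])  # prints 8.2.4
```

A structural check like this is stricter than a keyword match, which is useful when you script your own verification alongside the hosted monitor.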

Note on self-signed certificates: Proxmox ships with a self-signed certificate by default. The -k flag in curl bypasses certificate validation. For production deployments, configure a valid certificate (Let's Encrypt via the Proxmox ACME plugin or a commercial cert) so that Vigilmon's SSL certificate monitor works correctly.


Step 2: Create a Vigilmon Monitor for the API Health Endpoint

The Proxmox API version endpoint (/api2/json/version) is the best health check for Proxmox because it:

  • Requires no authentication
  • Confirms the pveproxy daemon is running
  • Returns a predictable JSON response with a keyword you can check

  1. Log in to Vigilmon and open Add Monitor → HTTP.
  2. URL: https://proxmox.example.com:8006/api2/json/version.
  3. Check interval: 60 seconds.
  4. Response timeout: 10 seconds.
  5. Expected status: 200.
  6. Keyword: version (always present in the data object on a healthy node).
  7. Label: Proxmox API.
  8. Click Save.

If this monitor fires, the pveproxy service has stopped or the Proxmox node is unreachable. Every VM and container on that node may be affected.
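
The pass/fail rule behind those settings can be sketched in a few lines of Python. This is an illustration of the logic only, not Vigilmon's implementation: evaluate_check and fetch are hypothetical helper names, and fetch is defined but not called here so the example runs without network access.

```python
import ssl
import urllib.error
import urllib.request


def evaluate_check(status, body, expected_status=200, keyword="version"):
    """Pass only when the status code matches and the keyword appears in the body."""
    return status == expected_status and keyword in body


def fetch(url, timeout=10.0, verify_tls=True):
    """Return (status, body). Connection errors and timeouts propagate to the caller."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent to curl -k; only for nodes still using the default self-signed certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Non-2xx responses still carry a status code and body worth inspecting.
        return err.code, err.read().decode("utf-8", errors="replace")


print(evaluate_check(200, '{"data": {"version": "8.2.4"}}'))  # prints True
print(evaluate_check(401, "No ticket"))                       # prints False
```

To try it against a real node, call fetch("https://proxmox.example.com:8006/api2/json/version") and pass the returned status and body to evaluate_check; wrap the call in a try/except OSError block to treat connection failures as "unreachable".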


Step 3: Monitor the Web UI Availability

The Proxmox web UI (served on port 8006 alongside the API) is what administrators use to manage VMs, containers, storage, and cluster configuration. Monitor it separately to distinguish application failures from proxy failures:

  1. Add Monitor → HTTP.
  2. URL: https://proxmox.example.com:8006.
  3. Check interval: 5 minutes.
  4. Response timeout: 15 seconds.
  5. Expected status: 200.
  6. Keyword: Proxmox Virtual Environment (present in the page title on a healthy instance).
  7. Label: Proxmox Web UI.
  8. Click Save.

If this monitor fires while the API endpoint is healthy, the web UI service component (ExtJS frontend serving) may have failed while the API backend remains operational. Administrators lose GUI access but API and CLI access still work.
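
Reading the two monitors together is what separates the failure modes. The sketch below is one possible way to express that decision table in Python; the function name diagnose and the wording of each result are assumptions made for illustration.

```python
def diagnose(api_ok: bool, web_ui_ok: bool) -> str:
    """Combine the two monitor states into a first-guess diagnosis."""
    if api_ok and web_ui_ok:
        return "healthy"
    if api_ok:
        return "web-ui-only: GUI access is lost, API and CLI access still work"
    if web_ui_ok:
        return "api-only: the page loads but the API check fails, inspect the API response"
    return "both: pveproxy has stopped or the node is unreachable"


for api_ok, web_ui_ok in [(True, True), (True, False), (False, False)]:
    print(diagnose(api_ok, web_ui_ok))
```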


Step 4: Monitor Node Availability for Each Cluster Node

In a multi-node Proxmox cluster, monitor each node independently. A single node failure doesn't bring down the rest of the cluster, but it takes all VMs on that node offline, and they stay offline unless HA is configured to recover them on another node.

Repeat the API endpoint monitor for each node in your cluster:

  1. Add Monitor → HTTP.
  2. URL: https://pve-node-2.example.com:8006/api2/json/version.
  3. Check interval: 60 seconds.
  4. Response timeout: 10 seconds.
  5. Expected status: 200.
  6. Keyword: version.
  7. Label: Proxmox Node 2 API.
  8. Click Save.

Add one monitor per node. In a three-node cluster, you should have three API monitors — one per node — so you can identify exactly which node failed rather than receiving a generic cluster alert.
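
If you keep your node list in code, the per-node settings can be generated rather than typed by hand. The following Python sketch builds one definition per hostname using the values from the steps above. The dictionary keys are hypothetical and do not correspond to a documented Vigilmon import format or API; treat the output as a checklist for the form fields.

```python
def build_node_monitors(hostnames, port=8006):
    """Return one monitor definition per node, numbered from 1."""
    monitors = []
    for index, host in enumerate(hostnames, start=1):
        monitors.append({
            "label": f"Proxmox Node {index} API",
            "type": "http",
            "url": f"https://{host}:{port}/api2/json/version",
            "interval_seconds": 60,
            "timeout_seconds": 10,
            "expected_status": 200,
            "keyword": "version",
        })
    return monitors


# Example hostnames following the pve-node-N.example.com pattern used above.
nodes = [f"pve-node-{n}.example.com" for n in (1, 2, 3)]
for monitor in build_node_monitors(nodes):
    print(monitor["label"], monitor["url"])
```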

HA considerations: If your cluster has HA enabled, a node failure may cause HA-managed VMs to be recovered on surviving nodes. In addition to per-node checks, monitor an endpoint that follows the workload, such as a shared IP or load balancer in front of the HA-managed services, to confirm workloads are accessible after a node failure triggers HA failover.


Step 5: Monitor SSL Certificates

Proxmox's web UI and API are both served over HTTPS. If the certificate expires, administrators cannot access the management interface and API clients fail with TLS errors — which can block automated Terraform, Ansible, and backup operations:

  1. Add Monitor → SSL Certificate.
  2. Domain: proxmox.example.com.
  3. Port: 8006.
  4. Alert when expiry is within: 30 days.
  5. Alert again: 14 days, 7 days, 3 days, 1 day.
  6. Click Save.

Proxmox ACME plugin: Proxmox VE 7+ includes a built-in ACME (Let's Encrypt) plugin under Datacenter → ACME that auto-renews certificates. Even with auto-renewal enabled, monitor the expiry externally — ACME renewal can fail silently if your DNS or HTTP challenge isn't reachable, and the Proxmox UI only shows the current certificate status, not renewal failure alerts.
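
The threshold arithmetic behind these alerts is simple enough to sketch in Python. The helper names below are hypothetical, and this is not how Vigilmon is implemented. Note that fetch_not_after relies on normal certificate validation, so it only works once the node presents a trusted certificate; it is defined but not called here so the example runs offline.

```python
import socket
import ssl
from datetime import datetime, timezone

ALERT_THRESHOLDS_DAYS = (30, 14, 7, 3, 1)


def days_until_expiry(not_after: str, now: datetime) -> int:
    """`not_after` uses the format Python's ssl module reports, e.g. 'Jun  1 12:00:00 2025 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def thresholds_reached(days_left: int) -> list:
    """Return every alert threshold that the remaining lifetime has already reached."""
    return [t for t in ALERT_THRESHOLDS_DAYS if days_left <= t]


def fetch_not_after(host: str, port: int = 8006, timeout: float = 10.0) -> str:
    """Read notAfter from the live certificate; a self-signed certificate fails the handshake."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Fixed dates so the result is reproducible.
now = datetime(2025, 5, 20, 12, 0, 0, tzinfo=timezone.utc)
days = days_until_expiry("Jun  1 12:00:00 2025 GMT", now)
print(days, thresholds_reached(days))  # prints 12 [30, 14]
```

With 12 days left, the 30-day and 14-day thresholds have both been reached, which is the point at which a failed ACME renewal would already be visible.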


Step 6: Configure Alerting

In Vigilmon under Settings → Notifications, configure your alert channels:

| Monitor | Trigger | Action |
|---|---|---|
| API endpoint (/api2/json/version) | Non-200 or keyword missing | pveproxy down or node unreachable; all VMs on node at risk |
| Web UI (port 8006 root) | Non-200 or keyword missing | GUI inaccessible; check pveproxy and ExtJS serving |
| Per-node API monitors | Non-200 | That specific node is down; check HA failover status |
| SSL certificate | < 30 days to expiry | Renew certificate; check ACME plugin renewal logs |

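Alert after: 1 consecutive failure for the API endpoint, because a Proxmox node failure means workloads are immediately affected. Use 2 failures for the web UI to avoid false positives during Proxmox upgrades that briefly restart pveproxy. Because pveproxy also serves the API, the same restart can trip the API monitor's 1-failure threshold, so a short-lived API alert during upgrades is possible.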
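
The effect of the two thresholds can be illustrated with a small Python sketch. FailureTracker is a hypothetical class written for this example; it models a consecutive-failure counter in general and is not Vigilmon's code.

```python
class FailureTracker:
    """Counts consecutive failed checks and signals when the threshold is reached."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, check_ok: bool) -> bool:
        """Record one check result; return True when an alert should fire."""
        if check_ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, on the check that reaches the threshold.
        return self.consecutive_failures == self.threshold


api = FailureTracker(threshold=1)
web_ui = FailureTracker(threshold=2)

# One failed check followed by a recovery, as during a brief pveproxy restart.
results = [False, True]
print([api.record(ok) for ok in results])     # prints [True, False]
print([web_ui.record(ok) for ok in results])  # prints [False, False]
```

A single failed check alerts on the API monitor straight away, while the web UI monitor stays quiet unless the next check fails as well.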


Common Proxmox Failure Modes and What Vigilmon Catches

| Scenario | Vigilmon monitor |
|---|---|
| pveproxy service crash | API endpoint returns connection refused; alert within 60 s |
| Proxmox node kernel panic or power loss | All per-node monitors fire; no response from node |
| Network cable/switch failure on management interface | API and web UI monitors fire; storage and VM traffic may be unaffected |
| Full disk on root or data partition | API may degrade; node may go into read-only mode |
| Corosync quorum loss (cluster split-brain) | Individual nodes may appear healthy; check cluster quorum separately |
| Let's Encrypt ACME renewal failure | SSL monitor alerts at 30-day threshold |
| Proxmox upgrade restarts pveproxy | Brief API outage; 2-failure threshold avoids false positive |
| HA fencing kills a node | Per-node monitor fires; surviving nodes should take over VMs |
| Storage backend (Ceph/NFS/iSCSI) failure | Proxmox API healthy but VMs may freeze or crash |
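
For the quorum row, a separate check has to look at cluster state rather than a single node's version endpoint. The Python sketch below is one way to interpret a cluster status response. It assumes the body comes from an authenticated request to /api2/json/cluster/status and that the response is a list containing one entry of type "cluster" with a quorate flag plus one entry of type "node" per member with an online flag. That shape and the sample data are assumptions to verify against your own cluster; cluster_health is a hypothetical helper name.

```python
import json


def cluster_health(body: str) -> dict:
    """Summarise quorum and offline nodes from an assumed /cluster/status response body."""
    entries = json.loads(body).get("data", [])
    cluster = next((e for e in entries if e.get("type") == "cluster"), None)
    offline = sorted(
        e.get("name", "?")
        for e in entries
        if e.get("type") == "node" and not e.get("online")
    )
    return {
        "quorate": bool(cluster and cluster.get("quorate")),
        "offline_nodes": offline,
    }


# Illustrative sample only; field names are assumed, not copied from a real node.
sample_status = json.dumps({
    "data": [
        {"type": "cluster", "name": "lab", "nodes": 3, "quorate": 1},
        {"type": "node", "name": "pve-node-1", "online": 1},
        {"type": "node", "name": "pve-node-2", "online": 1},
        {"type": "node", "name": "pve-node-3", "online": 0},
    ]
})
print(cluster_health(sample_status))
# prints {'quorate': True, 'offline_nodes': ['pve-node-3']}
```

A response with no cluster entry is reported as not quorate by this sketch, which may not suit a standalone node.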


Proxmox VE outages are high-impact because every VM and container on the affected node stops simultaneously. Unlike cloud providers that migrate workloads transparently, self-hosted Proxmox requires manual intervention or pre-configured HA groups. Vigilmon's per-node API monitoring gives you immediate, external visibility when a node goes down, so you can check or trigger HA failover, notify users, and start recovery even when automated processes miss the failure.

Start monitoring Proxmox VE in under 5 minutes — register free at vigilmon.online.
