Monitoring Woodpecker CI with Vigilmon: Health Endpoint, Agent TCP Port & SSL Certificate Alerts

How to monitor Woodpecker CI with Vigilmon — health endpoint checks, web UI availability, agent TCP port monitoring, SSL certificate alerts, and pipeline failure notifications for your lightweight Git-based CI/CD platform.

Woodpecker CI is a lightweight, community-driven CI/CD platform that integrates with Gitea, Forgejo, GitHub, and GitLab. It is popular with teams that want a self-hosted pipeline runner without the complexity of Jenkins or the overhead of GitLab CI. When your Woodpecker server goes down, new commits stop triggering pipelines, developers lose visibility into build status, and failed deployments go unnoticed. Vigilmon gives you external visibility into Woodpecker's health (the dedicated health endpoint, web UI availability, agent connectivity, and SSL certificates) so you catch CI outages before they block your team's workflow.

What You'll Build

  • A monitor on Woodpecker's /healthz endpoint
  • A web UI availability check
  • A TCP port monitor for the Woodpecker agent connection port
  • SSL certificate monitoring for your Woodpecker domain
  • Alerting that distinguishes server failures from agent disconnections

Prerequisites

  • A running self-hosted Woodpecker CI server with HTTPS access
  • Woodpecker accessible at a public or network-reachable URL (e.g., https://ci.example.com)
  • A free account at vigilmon.online

Step 1: Verify Woodpecker's Health Endpoint

Woodpecker exposes a dedicated health endpoint that confirms the server process is running:

# Health endpoint — returns "OK" with HTTP 200 when healthy
curl https://ci.example.com/healthz

# Web UI — should return 200 with HTML content
curl -I https://ci.example.com

# Agent gRPC/TCP port (default: 9000) — check TCP connectivity
nc -zv ci.example.com 9000

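A healthy Woodpecker server returns OK (plain text) with HTTP 200 from /healthz. Confirm the status code and body on your own instance with curl -i before relying on them in a monitor; if your server answers with a different 2xx code or an empty body, adjust the expected status and keyword in Step 2 accordingly. The web UI serves HTML on port 443. Agents connect to the server over gRPC on port 9000 by default. If this port is unreachable, agents cannot receive pipeline jobs even if the server appears healthy via HTTP.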


Step 2: Create a Vigilmon Monitor for the Health Endpoint

  1. Log in to Vigilmon and go to Add Monitor → HTTP.
  2. URL: https://ci.example.com/healthz.
  3. Check interval: 60 seconds.
  4. Response timeout: 10 seconds.
  5. Expected status: 200.
  6. Keyword: OK.
  7. Label: Woodpecker Health.
  8. Click Save.

The keyword check on OK catches cases where the server returns 200 but the application is in a degraded or starting state. This is the primary monitor for Woodpecker server health.
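The status-plus-keyword rule configured above can be reproduced by hand when you want to confirm what the monitor should see. The following is a minimal sketch, not Vigilmon's implementation: the function names classify_health and probe_health are hypothetical, and the 200/OK expectation is the one this tutorial assumes.

```shell
# Classify a response the way the monitor above is configured:
# "healthy" only when the status is 200 AND the body contains "OK".
classify_health() {
  local status="$1" body="$2"
  if [ "$status" != "200" ]; then
    echo "down"
    return
  fi
  case "$body" in
    *OK*) echo "healthy" ;;
    *)    echo "degraded" ;;
  esac
}

# Hypothetical wrapper: fetch the URL once and classify the result.
# -s silences progress output, -m 10 mirrors the 10-second timeout,
# -o writes the body to a temp file, -w prints only the status code.
# When curl cannot connect it reports status 000, which maps to "down".
probe_health() {
  local url="$1" tmp status
  tmp="$(mktemp)" || return 1
  status="$(curl -s -m 10 -o "$tmp" -w '%{http_code}' "$url")"
  classify_health "$status" "$(cat "$tmp")"
  rm -f "$tmp"
}

# Usage (URL is the tutorial's placeholder):
#   probe_health https://ci.example.com/healthz
```

Keeping the classification separate from the network call makes it easy to see which of the two conditions failed when a check reports a problem.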


Step 3: Monitor the Web UI Availability

The Woodpecker web UI is what your developers use to view pipeline runs, logs, and repository settings. Monitor it separately from the health endpoint — the UI can fail due to reverse proxy misconfiguration even when the Go server process is healthy:

  1. Add Monitor → HTTP.
  2. URL: https://ci.example.com.
  3. Check interval: 5 minutes.
  4. Response timeout: 15 seconds.
  5. Expected status: 200.
  6. Keyword: Woodpecker (present in the page title on a healthy instance).
  7. Label: Woodpecker Web UI.
  8. Click Save.

If this monitor fires while the health endpoint is healthy, the reverse proxy (nginx/Caddy/Traefik) has failed at the web layer while the Go backend is still running.


Step 4: Monitor the Agent TCP Port

Woodpecker agents connect to the server over gRPC on port 9000 (configurable via WOODPECKER_GRPC_ADDR). If this port is blocked or the gRPC listener fails, agents cannot receive jobs and all pipeline executions queue indefinitely without error messages visible in the UI:

  1. Add Monitor → TCP Port.
  2. Host: ci.example.com.
  3. Port: 9000 (or your configured WOODPECKER_GRPC_ADDR port).
  4. Check interval: 2 minutes.
  5. Response timeout: 10 seconds.
  6. Label: Woodpecker Agent Port.
  7. Click Save.

This is the most operationally important monitor for teams running Woodpecker at scale. A failed gRPC port means agents are silently disconnected — pipelines are queued but never executed, and the web UI may show agents as connected from a stale registration even when they cannot receive jobs.

Firewall note: If your agents run on separate hosts, ensure port 9000 is accessible from the agent network. A TCP monitor from Vigilmon's probing network confirms external reachability but does not validate connectivity from agent hosts specifically.
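Because the external TCP monitor cannot see the path from your agent hosts, a complementary check run on an agent host itself may be useful. This is a sketch under stated assumptions: grpc_port and check_agent_port are hypothetical helper names, nc is assumed to be installed on the agent host, and the address format follows the ":9000" style of WOODPECKER_GRPC_ADDR.

```shell
# Extract the port from a WOODPECKER_GRPC_ADDR-style value such as
# ":9000" or "0.0.0.0:9000"; fall back to 9000 when the value is empty.
grpc_port() {
  local addr="${1:-:9000}"
  echo "${addr##*:}"
}

# Return 0 when a TCP connection to host:port succeeds within 10 seconds.
# -z connects without sending data, -w sets the timeout in seconds.
check_agent_port() {
  nc -z -w 10 "$1" "$2"
}

# Usage from an agent host (hostname is the tutorial's placeholder):
#   port="$(grpc_port ":9000")"
#   check_agent_port ci.example.com "$port" && echo reachable || echo unreachable
```

A successful connection only shows that the port accepts TCP connections from that host; it does not prove the gRPC service behind it is handing out jobs.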


Step 5: Monitor SSL Certificates

Woodpecker's web UI and API must remain on HTTPS: the UI authenticates to your Git provider over TLS, and agents depend on the same certificate when they reach the server over TLS. An expired certificate breaks OAuth login and, where agents connect over TLS, agent connectivity as well:

  1. Add Monitor → SSL Certificate.
  2. Domain: ci.example.com.
  3. Alert when expiry is within: 30 days.
  4. Alert again: 14 days, 7 days, 3 days, 1 day.
  5. Click Save.

If Woodpecker uses a separate domain for the gRPC endpoint (e.g., grpc.ci.example.com), add a second SSL monitor for that domain. An expired gRPC certificate disconnects all agents even if the web UI certificate is still valid.
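To spot-check a certificate by hand, the remaining lifetime can be read with openssl and compared against the thresholds from the steps above. This is a rough sketch, assuming openssl and GNU date (the -d option) are available; cert_days_left and alert_tier are hypothetical helper names, not part of Vigilmon or Woodpecker.

```shell
# Map days-until-expiry to the tightest alert threshold it has crossed
# (30, 14, 7, 3, or 1), or "none" when more than 30 days remain.
alert_tier() {
  local days="$1" tier="none" t
  for t in 30 14 7 3 1; do
    if [ "$days" -le "$t" ]; then tier="$t"; fi
  done
  echo "$tier"
}

# Print the whole number of days until the certificate on host:port expires.
# openssl x509 -enddate prints "notAfter=<date>"; cut keeps the date part.
cert_days_left() {
  local host="$1" port="${2:-443}" end end_s now_s
  end="$(echo | openssl s_client -servername "$host" -connect "$host:$port" 2>/dev/null \
        | openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)"
  [ -n "$end" ] || return 1
  end_s="$(date -d "$end" +%s)" || return 1
  now_s="$(date +%s)"
  echo $(( (end_s - now_s) / 86400 ))
}

# Usage (domain is the tutorial's placeholder):
#   days="$(cert_days_left ci.example.com)" && alert_tier "$days"
```

If the gRPC endpoint uses its own domain, the same check can be pointed at that host and port.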


Step 6: Configure Alerting

In Vigilmon under Settings → Notifications, configure your alert channels:

| Monitor | Trigger | Action |
|---|---|---|
| Health endpoint (/healthz) | Non-200 or OK keyword missing | Woodpecker server process down; check container/systemd unit |
| Web UI | Non-200 or keyword missing | Reverse proxy failure; check nginx/Caddy/Traefik |
| Agent TCP port | Connection refused or timeout | Agents disconnected; pipelines queuing but not running |
| SSL certificate | < 30 days to expiry | Renew certificate; check auto-renewal configuration |

Alert after 1 failed check for the health endpoint and the agent port, because pipeline disruption is immediate. Alert after 2 consecutive failures for the web UI to avoid false positives during server restarts.
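The alerting table can also be read as a decision order when several monitors fire at once: a failed health check explains the others, so it is evaluated first. The sketch below encodes that reading; triage is a hypothetical helper, and the "up"/"down" arguments are an assumed convention rather than anything Vigilmon emits.

```shell
# Arguments are "up" or "down" for, in order:
# the health endpoint, the web UI, and the agent TCP port.
triage() {
  local health="$1" ui="$2" port="$3"
  if [ "$health" = "down" ]; then
    echo "server process down: check container/systemd unit"
  elif [ "$port" = "down" ]; then
    echo "agents disconnected: pipelines queue but do not run"
  elif [ "$ui" = "down" ]; then
    echo "reverse proxy failure: check nginx/Caddy/Traefik"
  else
    echo "all monitors passing"
  fi
}

# Usage:
#   triage up down up
```

The ordering is a judgment call: the agent port is checked before the web UI because a stalled pipeline queue is usually the more urgent of the two.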


Common Woodpecker CI Failure Modes and What Vigilmon Catches

| Scenario | What Vigilmon sees |
|---|---|
| Woodpecker server process crash | Health endpoint returns connection refused; alert within 60 s |
| SQLite/PostgreSQL database unreachable | Health endpoint may degrade; pipeline history inaccessible |
| gRPC listener crash (agents disconnect) | TCP port monitor fires; pipelines queue but never run |
| Reverse proxy misconfiguration | Web UI monitor fires; health endpoint may still pass |
| Docker socket access revoked | Agents connected but cannot start pipeline containers |
| Let's Encrypt renewal failure | SSL monitor alerts at 30-day threshold |
| Woodpecker upgrade fails to restart | Health endpoint unreachable; web UI returns 502 |
| Port 9000 blocked by firewall change | TCP monitor fires; agents silently disconnect |
| Server OOM / memory pressure | Health endpoint times out; check container memory limits |


Woodpecker CI failures are particularly disruptive because the impact is delayed — when the agent port fails, developers don't see errors immediately. Commits are pushed, webhooks fire, jobs are queued, and hours pass before anyone notices that no pipelines have run. Vigilmon's combination of the health endpoint check, web UI availability, and TCP port monitor for the agent connection gives you the external visibility to catch CI outages within minutes rather than hours.

Start monitoring Woodpecker CI in under 5 minutes — register free at vigilmon.online.
