Jenkins is the CI/CD workhorse for thousands of engineering teams — running builds, deploying to production, and executing scheduled jobs around the clock. When Jenkins goes down, developers can't merge PRs (if Jenkins gates merges), builds queue up silently, and scheduled deployments stop without notice. When the TCP port is blocked or SSL certificates expire, every pipeline agent and webhook delivery fails with connection errors. Vigilmon gives you external visibility into Jenkins before your developers notice: the health check endpoint, login page availability, TCP port reachability, and SSL certificate expiry.
What You'll Build
- A monitor on Jenkins's `/login` page to catch UI and auth failures
- A monitor on the `/jenkins/health` (or `/health`) health check endpoint
- A TCP port check for direct Jenkins JNLP agent connectivity
- SSL certificate monitoring for your Jenkins domain
- An alerting setup that distinguishes process failures from network issues
Prerequisites
- A running Jenkins instance (version 2.x) accessible via HTTPS
- A domain pointing to Jenkins (e.g., `https://jenkins.example.com`)
- A free account at vigilmon.online
Step 1: Verify Jenkins's Health Endpoints
Jenkins serves its login page and its health check endpoint relative to its context path, so the exact URLs depend on your configuration. Start with the login page:
# Standard Jenkins (no context path)
curl -I https://jenkins.example.com/login
# With context path /jenkins
curl -I https://jenkins.example.com/jenkins/login
The login page returns HTTP 200 when Jenkins is running and its session management is functioning. It's the most user-facing test — if the login page is broken, no developer can log in or trigger builds.
For a minimal health check (available in Jenkins 2.4 and later), use the built-in health endpoint:
curl https://jenkins.example.com/health
# or
curl https://jenkins.example.com/jenkins/health
A healthy Jenkins returns HTTP 200. This endpoint is designed for load balancers and monitoring tools — it's unauthenticated and lightweight.
Jenkins Configuration as Code users: If you restrict anonymous access, the `/login` page is still publicly reachable (it doesn't require authentication). The `/health` endpoint may require authentication depending on your security configuration, so test with `curl -I` first and confirm it returns 200 before you build a monitor on it.
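The manual checks above can be wrapped in a small script that probes every candidate path and prints the status code next to each URL. This is a minimal sketch, assuming `curl` is installed; the function names `http_status`, `candidate_urls`, and `probe_jenkins` are illustrative and not part of Jenkins or Vigilmon.

```shell
# Requires: curl. All function names here are hypothetical helpers.

# Print the HTTP status code for a URL ("000" if the connection fails).
http_status() {
  curl -s -o /dev/null --max-time 15 -w '%{http_code}' "$1" 2>/dev/null || true
}

# List the login and health URLs, with and without the /jenkins context path.
candidate_urls() {
  local base="${1%/}"   # drop one trailing slash, if present
  local path
  for path in /login /jenkins/login /health /jenkins/health; do
    printf '%s%s\n' "$base" "$path"
  done
}

# Probe each candidate and print "<status> <url>", one per line.
probe_jenkins() {
  local url
  candidate_urls "$1" | while IFS= read -r url; do
    printf '%s %s\n' "$(http_status "$url")" "$url"
  done
}
```

Running `probe_jenkins https://jenkins.example.com` lists one line per candidate. The URLs that answer 200 are the ones to use in the monitors below.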
Step 2: Create a Vigilmon HTTP Monitor for the Login Page
- Log in to Vigilmon → Add Monitor → HTTP.
- URL: `https://jenkins.example.com/login`.
- Check interval: 60 seconds.
- Response timeout: 15 seconds (Jenkins can be slow to respond under heavy build load).
- Expected status: `200`.
- Keyword: `Jenkins` (appears in the page title of every Jenkins login page).
- Click Save.
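To see what this monitor evaluates, the same rule (status 200 plus the keyword in the body) can be reproduced locally. The sketch below assumes `curl` and `mktemp` are available; `login_page_ok` and `check_login_page` are hypothetical helper names, and this is not how Vigilmon itself is implemented.

```shell
# Requires: curl, mktemp. Function names are illustrative only.

# Pass only if the status is 200 and the body contains the keyword.
login_page_ok() {
  local status="$1" body="$2" keyword="${3:-Jenkins}"
  [ "$status" = "200" ] || return 1
  case "$body" in
    *"$keyword"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch the page once (15 s timeout, as in the monitor) and apply the rule.
check_login_page() {
  local url="$1" tmp status rc=0
  tmp="$(mktemp)" || return 1
  status="$(curl -s --max-time 15 -o "$tmp" -w '%{http_code}' "$url" 2>/dev/null)"
  login_page_ok "$status" "$(cat "$tmp")" || rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, `check_login_page https://jenkins.example.com/login && echo up || echo down` gives a quick local approximation of the monitor's verdict.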
This monitor catches:
- Jenkins process crashes or JVM out-of-memory kills
- Plugin loading failures that prevent Jenkins from starting
- JENKINS_HOME corruption that breaks startup
- Deployment failures after upgrades
Alert sensitivity: Set alerts to trigger after 2 consecutive failures. Jenkins can momentarily return errors during garbage collection or heavy plugin activity; two consecutive failures reliably indicate a real outage.
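The effect of the consecutive-failure setting can be illustrated with a small sketch. It reads one check result per line (`ok` or `fail`) and reports the first check at which an alert would trigger. The function name `first_alert_line` and the input format are assumptions made for illustration.

```shell
# Read "ok"/"fail" results from stdin, one per line, and print the 1-based
# position of the check that first completes a run of <threshold> failures.
# Prints nothing and returns 1 if the threshold is never reached.
first_alert_line() {
  local threshold="${1:-2}" streak=0 n=0 result
  while IFS= read -r result; do
    n=$((n + 1))
    if [ "$result" = "fail" ]; then
      streak=$((streak + 1))
    else
      streak=0   # any success resets the run
    fi
    if [ "$streak" -ge "$threshold" ]; then
      echo "$n"
      return 0
    fi
  done
  return 1
}
```

With a threshold of 2, the sequence `ok, fail, ok, fail, fail` alerts on the fifth check: the isolated failure at position 2 is ignored, which is the behavior you want for a brief garbage collection pause.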
Step 3: Create a Vigilmon HTTP Monitor for the Health Endpoint
The /health endpoint is faster and lighter than the login page — ideal for high-frequency checks:
- Add Monitor → HTTP.
- URL: `https://jenkins.example.com/health`.
- Check interval: 60 seconds.
- Response timeout: 10 seconds.
- Expected status: `200`.
- Label: `Jenkins health check`.
- Click Save.
Context path: If Jenkins runs at `/jenkins`, use `https://jenkins.example.com/jenkins/health`. Check your Jenkins configuration under Manage Jenkins → System → Jenkins URL.
When the health monitor fires but the login page monitor is green, Jenkins is failing its health check while still serving the UI. This can indicate plugin failures, executor exhaustion, or build queue overflow, which the health check catches before the UI degrades.
Step 4: Create a TCP Monitor for the JNLP Agent Port
Jenkins build agents connect to the Jenkins controller over TCP — typically on port 50000 (the JNLP inbound agents port). If this port is blocked, build agents disconnect and queued builds hang indefinitely:
# Verify the JNLP port is open
nc -zv jenkins.example.com 50000
- Add Monitor → TCP.
- Host: `jenkins.example.com`.
- Port: `50000` (or your configured JNLP port; check Manage Jenkins → Security → TCP port for inbound agents).
- Check interval: 2 minutes.
- Response timeout: 10 seconds.
- Label: `Jenkins JNLP agent port`.
- Click Save.
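The same reachability test can be scripted with a timeout so that it never hangs on a filtered port. This is a rough sketch built on the `nc` command used above; it assumes your `nc` variant supports `-z` and `-w`, and `tcp_open` and `report_agent_port` are hypothetical helper names.

```shell
# Requires: nc with -z (scan only) and -w (timeout in seconds) support.

# Return 0 if a TCP connection to host:port succeeds within the timeout.
tcp_open() {
  local host="$1" port="$2" timeout="${3:-10}"
  nc -z -w "$timeout" "$host" "$port" >/dev/null 2>&1
}

# Print a one-line verdict for the agent port (50000 unless overridden).
report_agent_port() {
  local host="$1" port="${2:-50000}"
  if tcp_open "$host" "$port"; then
    echo "agent port $port reachable on $host"
  else
    echo "agent port $port NOT reachable on $host"
    return 1
  fi
}
```

Run `report_agent_port jenkins.example.com` from a network outside your own, since that is the vantage point an external monitor has.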
This monitor fires when:
- A firewall change blocks port 50000
- Jenkins restarts with a different or disabled JNLP port
- The host is unreachable at the network level
When the TCP monitor fires but HTTP monitors are green, you have a port-specific firewall or routing issue — Jenkins is running but agents can't connect.
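The way these monitors combine can be written down as a small lookup. The sketch below maps the state of the login, health, and TCP monitors (each `up` or `down`) to the likely cause described in this guide. The function name `diagnose` and the wording of the messages are illustrative, and real incidents may not fit these buckets cleanly.

```shell
# Map monitor states to a likely cause. Arguments: login, health, tcp,
# each "up" or "down". Patterns are checked top to bottom.
diagnose() {
  local login="$1" health="$2" tcp="$3"
  case "$login/$health/$tcp" in
    up/up/up)       echo "all monitors green" ;;
    up/up/down)     echo "agent port unreachable: check firewall rules and the inbound agent port setting" ;;
    up/down/*)      echo "UI up but health check failing: check plugins, executors, and the build queue" ;;
    down/down/down) echo "everything down: check host, network, and DNS" ;;
    down/*/*)       echo "login page down: check the Jenkins process and the reverse proxy" ;;
    *)              echo "unknown state" ;;
  esac
}
```

For example, `diagnose up up down` points at the firewall rather than at Jenkins itself, which matches the TCP-only failure case above.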
SSH agents and Docker agents: If you use SSH-based build agents or Docker containers instead of JNLP, you may not need this check. Monitor the SSH port (22) instead if your agents connect via SSH.
Step 5: Monitor SSL Certificates
Jenkins webhooks (from GitHub, GitLab, Bitbucket) are sent over HTTPS to your Jenkins instance. An expired certificate causes:
- All incoming webhooks to fail silently (the VCS provider gets a TLS error, so the delivery never reaches Jenkins)
- Build agent connections via HTTPS to fail
- Developers to get browser warnings when accessing the UI
To create the monitor:
- Add Monitor → SSL Certificate.
- Domain: `jenkins.example.com`.
- Alert when expiry is within: 30 days.
- Alert again: 14 days, 7 days, 3 days, 1 day.
- Click Save.
Jenkins-managed certificates: Some Jenkins installations manage their own TLS certificates (e.g., via the embedded Jetty server). Check the certificate expiry with `openssl s_client -connect jenkins.example.com:443 -servername jenkins.example.com </dev/null 2>/dev/null | openssl x509 -noout -dates` (the `</dev/null` keeps `s_client` from waiting on stdin) and make sure the Certbot or ACME renewal process covers the certificate that Jenkins presents.
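To turn the certificate dates into the same thresholds the monitor uses, the expiry date can be converted into days remaining. This is a minimal sketch, assuming GNU `date` (the `-d` flag; BSD and macOS `date` need different options) and `openssl`. The function names are hypothetical.

```shell
# Requires: openssl, GNU date. Function names are illustrative only.

# Print the "notAfter=..." line of the certificate a host presents.
fetch_not_after() {
  local host="$1" port="${2:-443}"
  openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate
}

# Whole days between now_epoch (default: current time) and the expiry date.
days_until_expiry() {
  local not_after="${1#notAfter=}" now_epoch="${2:-$(date +%s)}" end_epoch
  end_epoch="$(date -u -d "$not_after" +%s)" || return 1
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Tightest reminder threshold reached (1, 3, 7, 14, or 30 days), else "ok".
expiry_threshold() {
  local days="$1" t
  for t in 1 3 7 14 30; do
    if [ "$days" -le "$t" ]; then
      echo "$t"
      return 0
    fi
  done
  echo "ok"
}
```

A typical call chain is `expiry_threshold "$(days_until_expiry "$(fetch_not_after jenkins.example.com)")"`, which prints `ok` while the certificate has more than 30 days left.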
Step 6: Monitor the Jenkins API
The Jenkins REST API powers integrations, status dashboards, and build-triggering scripts. A degraded API may not affect the web UI immediately but breaks all automation:
# Quote the URL so the shell does not treat ? as a glob; -i prints the status line
curl -i 'https://jenkins.example.com/api/json?tree=mode,nodeDescription'
# Returns 401 if authentication is required (means the API is up)
# Returns 200 with JSON if anonymous access is allowed
- Add Monitor → HTTP.
- URL: `https://jenkins.example.com/api/json`.
- Check interval: 5 minutes.
- Expected status: `401` (unauthenticated requests confirm the API is up and requiring auth).
- Label: `Jenkins API`.
- Click Save.
Configure expected status as `401` if you have anonymous access disabled: a `401` from Jenkins means the API is healthy and requiring authentication as expected. If anonymous access is allowed, expect `200` instead. A connection error, `502`, or `503` means the API is down.
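The status rules for the API check can be captured in a small classifier, which is handy when scripting the same check outside Vigilmon. This is a sketch of the rules stated above; the function name `classify_api_status` is hypothetical, and `000` is the value `curl -w '%{http_code}'` prints when no HTTP response was received.

```shell
# Classify the status code of an unauthenticated request to /api/json.
classify_api_status() {
  case "$1" in
    200|401)     echo "up" ;;          # JSON served, or auth required as expected
    000|502|503) echo "down" ;;        # no response, or the proxy cannot reach Jenkins
    # Assumption, not from this guide: if your instance answers anonymous
    # requests with another code (for example 403), add it to the "up" case.
    *)           echo "investigate" ;; # anything else: check the Jenkins log
  esac
}
```

A quick usage example: `classify_api_status "$(curl -s -o /dev/null -w '%{http_code}' https://jenkins.example.com/api/json)"`.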
Step 7: Configure Alerting
In Vigilmon under Settings → Notifications, configure your alert channels:
| Monitor | Trigger | Action |
|---|---|---|
| Login page | Non-200 or keyword `Jenkins` missing | Check `systemctl status jenkins`; inspect Jenkins logs at `$JENKINS_HOME/logs/` |
| Health endpoint | Non-200 | Jenkins health checks failing; check plugin status and executor availability |
| JNLP TCP port | Connection refused or timeout | Check firewall rules; verify JNLP port setting in Jenkins security config |
| SSL certificate | < 30 days to expiry | Renew certificate; check webhook deliveries from VCS providers |
| API | Non-401/200 | API layer issue; check Jenkins log for exceptions |
Alert after: 2 consecutive failures for HTTP monitors. 1 failure for TCP monitors — TCP timeouts are rarely transient.
Common Jenkins Failure Modes and What Vigilmon Catches
| Scenario | Vigilmon monitor |
|---|---|
| Jenkins JVM crash / OOM | Login page unreachable; alert within 60 s |
| Plugin loading failure on startup | Login page returns error; health endpoint fails |
| Build agents disconnect | JNLP TCP monitor fires; HTTP monitors stay green |
| Firewall blocks JNLP port | TCP monitor fires; login page monitor stays green |
| SSL certificate expires | SSL monitor alerts at 30-day threshold; webhooks fail silently |
| Jenkins upgrade breaks UI | Login page keyword missing; health endpoint may pass |
| JENKINS_HOME disk full | Jenkins starts failing builds; health endpoint may return errors |
| Reverse proxy misconfiguration | Login page returns 502/503; direct TCP port may still respond |
| DNS misconfiguration | All monitors fire simultaneously |
Jenkins downtime has a delayed blast radius: developers don't notice immediately, but builds pile up in the queue and deployments stop silently. By the time someone notices, there may be hours of backlog to untangle. Vigilmon gives you early warning through the health endpoint, login page availability, agent port reachability, and SSL certificate expiry — so you catch Jenkins failures in seconds rather than discovering them when a deployment was supposed to run.
Start monitoring Jenkins in under 5 minutes — register free at vigilmon.online.