Monitoring HashiCorp Consul with Vigilmon (Free, Multi-Region)
HashiCorp Consul is the control plane for your microservices. Service discovery, health checking, distributed configuration, and the service mesh all flow through it. Most teams think of Consul as infrastructure that monitors other things — but Consul itself can fail, and when it does, the consequences cascade fast.
This guide covers how to monitor the Consul HTTP API and UI with Vigilmon, giving you an external uptime check that operates independently of Consul's own health system.
Why Consul needs external monitoring
Consul has excellent built-in health checking for services registered in its catalog. But there's a blind spot: Consul's own availability.
If Consul becomes unreachable — because the node crashed, a network partition isolated it, or the server process hung — the built-in health checks stop reporting. Services that rely on Consul for discovery start failing. But nothing in Consul's own system can report the outage, because the reporter is down.
External monitoring solves this. Vigilmon probes from outside your infrastructure, so it detects Consul failures regardless of what's happening inside your cluster.
Step 1: Understand Consul's health endpoints
Consul exposes a comprehensive HTTP API. Three endpoints are particularly useful for monitoring.
/v1/status/leader
Returns the current cluster leader's address as a JSON string. The body changes when there is no leader (split-brain, quorum loss); a healthy cluster responds like this:
curl http://consul.internal:8500/v1/status/leader
# "10.0.0.1:8300"
An empty string or error response here means Consul has lost quorum — a serious cluster event.
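That rule can be written down as a small check. The following is a minimal sketch in Python, assuming the response body has already been fetched as text; the function name leader_address is illustrative and is not part of Consul or Vigilmon.

```python
import json
from typing import Optional


def leader_address(body: str) -> Optional[str]:
    """Return the leader address from a /v1/status/leader body, or None.

    The endpoint returns a JSON string such as "10.0.0.1:8300". An empty
    string, or a body that is not a JSON string at all, is treated as
    "no leader".
    """
    try:
        value = json.loads(body)
    except json.JSONDecodeError:
        return None  # error text or truncated response
    if not isinstance(value, str) or not value:
        return None  # "" means no leader is currently elected
    return value
```

Returning None for both the empty string and an unparseable body keeps the caller's logic to a single question: is there a leader or not.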
/v1/health/service/consul
Returns the health status of the Consul servers themselves, which register in the catalog as the consul service:
curl http://consul.internal:8500/v1/health/service/consul
Returns an array of entries, one per node, each with Node, Service, and Checks fields. Any entry that contains a check with Status "critical" is failing.
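To act on this response in a script, each entry's checks can be scanned for a critical status. This is a minimal sketch, assuming the body has been decoded into Python objects and that each entry carries the Node and Checks fields returned by the Consul health API; critical_nodes and the sample data are hand-written for illustration, not real output.

```python
from typing import Any, Dict, List


def critical_nodes(entries: List[Dict[str, Any]]) -> List[str]:
    """Return names of nodes that have at least one check in "critical" state.

    `entries` is the decoded JSON array from /v1/health/service/consul.
    """
    failing = []
    for entry in entries:
        checks = entry.get("Checks") or []
        if any(check.get("Status") == "critical" for check in checks):
            failing.append(entry.get("Node", {}).get("Node", "<unknown>"))
    return failing


# Hand-written sample shaped like the API response (illustrative only).
sample = [
    {"Node": {"Node": "consul-1"}, "Checks": [{"Status": "passing"}]},
    {"Node": {"Node": "consul-2"}, "Checks": [{"Status": "critical"}]},
]
print(critical_nodes(sample))  # ['consul-2']
```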
/v1/agent/self
Returns information about the local Consul agent. Useful for monitoring a specific node:
curl http://consul.internal:8500/v1/agent/self
# {"Config":{"Datacenter":"dc1","NodeName":"..."},...}
Returns 200 when the agent is healthy, 500 or connection error when it's not.
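When a monitor targets one specific node, it helps to confirm that the response really comes from that node. The sketch below pulls the datacenter and node name out of the Config object shown above; agent_identity is a hypothetical helper name, and the body is assumed to have been fetched already.

```python
import json
from typing import Dict


def agent_identity(body: str) -> Dict[str, str]:
    """Extract datacenter and node name from a /v1/agent/self body."""
    config = json.loads(body).get("Config", {})
    return {
        "datacenter": config.get("Datacenter", ""),
        "node": config.get("NodeName", ""),
    }
```

Comparing the returned node name with the host you intended to probe catches a monitor that has silently been routed to a different agent, for example behind a load balancer.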
Which endpoint to monitor
For external uptime monitoring, use /v1/status/leader. It's lightweight, requires no authentication by default, and the presence of a leader address confirms the cluster is functional.
Step 2: Set up Vigilmon monitoring
With the endpoint identified, point Vigilmon at Consul:
- Sign up at vigilmon.online — free tier, no credit card
- Click New Monitor → HTTP
- Enter http://your-consul-host:8500/v1/status/leader
- Set the check interval (1 or 5 minutes depending on your tier)
- Save
Vigilmon probes from multiple geographic regions. A connection failure or non-200 response from any region triggers an incident.
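The "any region" rule is easy to reason about once written out. The following sketch only models the rule as stated above; it is not Vigilmon's implementation, and the function names and region labels are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def failing_regions(results: Dict[str, Optional[int]]) -> List[str]:
    """Return the regions whose probe failed, sorted by name.

    `results` maps a region name to the HTTP status it observed, or None
    if the connection itself failed.
    """
    return sorted(region for region, status in results.items() if status != 200)


def incident_open(results: Dict[str, Optional[int]]) -> bool:
    """An incident opens as soon as any single region reports a failure."""
    return bool(failing_regions(results))
```

For example, incident_open({"US-East": 200, "EU-West": None}) evaluates to True, because one region could not connect even though the other saw a 200.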
Add a keyword monitor for leader validation
The /v1/status/leader endpoint returns 200 even when there's no leader; the body is simply an empty JSON string (""). A plain HTTP monitor therefore keeps passing during quorum loss. A keyword monitor lets you assert that a leader address is actually present:
- New Monitor → Keyword
- URL: http://your-consul-host:8500/v1/status/leader
- Keyword: :8300" (the default server RPC port, which carries Raft traffic; it appears at the end of every leader address unless you have changed the server port)
- If absent, Vigilmon treats it as a failure
This catches quorum loss, where Consul responds but has no elected leader.
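Before relying on the keyword, it can be worth checking it against bodies you expect to see. This is a rough local model of a keyword assertion, assuming a plain substring match; keyword_present is an illustrative name and may not reflect how Vigilmon matches internally.

```python
def keyword_present(body: str, keyword: str = ':8300"') -> bool:
    """Mimic a keyword monitor: pass only if `keyword` occurs in the body."""
    return keyword in body
```

Including the closing quote in the keyword anchors the match to the end of the address, so a body such as "10.0.0.1:83000" would not pass by accident. If your servers use a non-default RPC port, pass that port in the keyword instead.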
Monitor the Consul UI
If your team uses the Consul UI for service discovery visibility, add a monitor for it:
- URL: http://your-consul-host:8500/ui/
- Type: HTTP
- Expected status: 200
Recommended monitor set for Consul
| Monitor | URL | Type | What it catches |
|---|---|---|---|
| Cluster leader | /v1/status/leader | HTTP | Consul process down |
| Leader present | /v1/status/leader | Keyword | Quorum loss, split brain |
| Agent health | /v1/agent/self | HTTP | Specific node failure |
| Consul UI | /ui/ | HTTP | Web UI unreachable |
Step 3: Consul ACL considerations
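If you've enabled Consul ACLs (which you should in production), some of these endpoints require a token. /v1/status/leader needs none, but /v1/health/service/consul requires node and service read access, and /v1/agent/self requires agent read access. You have two options: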
Option A: Create a read-only monitoring token
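# monitoring-policy.hcl
# Read-only access for external monitoring
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "read"
}
# Required by /v1/agent/self
agent_prefix "" {
  policy = "read"
}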
consul acl policy create -name monitoring -rules @monitoring-policy.hcl
consul acl token create -description "Vigilmon monitoring" -policy-name monitoring
Add the token to your Vigilmon monitor as a header:
- Header name: X-Consul-Token
- Header value: your monitoring token
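To confirm the token works before configuring the monitor, the same header can be sent from a short script. The sketch below only builds the request, using Python's standard urllib; consul_request is a hypothetical helper name and the URL and token in the usage note are placeholders.

```python
import urllib.request


def consul_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the ACL token in X-Consul-Token."""
    return urllib.request.Request(url, headers={"X-Consul-Token": token})
```

Sending it is then a matter of calling urllib.request.urlopen(consul_request(url, token), timeout=5) from a machine that can reach Consul. A 403 response at that point usually means the token's policy does not cover the endpoint.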
Option B: Expose an unauthenticated health endpoint
Attach a narrow read-only policy to Consul's anonymous token so that requests without a token can still read what your monitors need. This is less suitable for security-sensitive environments.
Step 4: Configure alert delivery
Set up where Vigilmon sends alerts when Consul goes down.
Slack:
- Create an incoming webhook in your Slack workspace
- In Vigilmon: Notifications → New Channel → Slack
- Enable the channel on your Consul monitors
When Consul becomes unavailable, you'll get:
🔴 DOWN: consul.internal:8500/v1/status/leader
Status: Connection refused
Regions: US-East, EU-West
Started: 2 minutes ago
Alert your on-call engineer immediately — a Consul outage is a P1 incident. Services relying on Consul for DNS-based discovery will start failing within seconds of Consul going down.
Step 5: Multi-datacenter monitoring
If you run Consul across multiple datacenters, monitor each one independently. A WAN federation issue can take down one datacenter's Consul without affecting others:
| Datacenter | Monitor URL |
|---|---|
| us-east | http://consul-us-east:8500/v1/status/leader |
| eu-west | http://consul-eu-west:8500/v1/status/leader |
| ap-south | http://consul-ap-south:8500/v1/status/leader |
Create a separate Vigilmon monitor for each datacenter so alerts are datacenter-specific and actionable.
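With more than a handful of datacenters, generating the URL list is less error-prone than typing it. This is a small sketch that assumes hosts follow the consul-<datacenter> naming used in the table above; leader_urls is an illustrative helper, and the output still has to be entered into Vigilmon by whatever means you normally use.

```python
from typing import Dict, Iterable


def leader_urls(datacenters: Iterable[str], port: int = 8500) -> Dict[str, str]:
    """Map each datacenter name to its /v1/status/leader monitor URL.

    Assumes hosts are named consul-<datacenter>, as in the table above.
    """
    return {
        dc: f"http://consul-{dc}:{port}/v1/status/leader" for dc in datacenters
    }


for dc, url in leader_urls(["us-east", "eu-west", "ap-south"]).items():
    print(f"{dc}: {url}")
```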
Step 6: Create an internal status page for your ops team
Consul availability affects every team running services in your infrastructure. Give your engineering teams visibility:
- Status Pages → New Status Page in Vigilmon
- Name it "Infrastructure Status"
- Add your Consul monitors (and Vault, if you have it)
- Share the internal URL with your engineering org
Teams can check the status page themselves during incidents rather than pinging the ops team.
What you've built
| What | How |
|---|---|
| Process health | HTTP monitor on /v1/status/leader |
| Quorum validation | Keyword monitor asserting leader address present |
| Specific node check | HTTP monitor on /v1/agent/self |
| UI availability | HTTP monitor on /ui/ |
| ACL-safe monitoring | Read-only monitoring token |
| Multi-datacenter | Separate monitor per datacenter |
| Instant alerts | Slack/email on down + recovery |
Your Consul cluster is now observable from outside. The next time the cluster loses quorum, a network partition isolates your Consul servers, or an agent process crashes under load, you'll know about it in minutes, not after your services start returning DNS lookup failures to end users.
Get started free at vigilmon.online — your first monitor is running in under a minute.