
Google Cloud Platform (GCP) Uptime Monitoring Guide


Google Cloud Platform provides powerful infrastructure for running applications globally, but its built-in monitoring capabilities have a structural blind spot: they're designed to observe your infrastructure from the inside. GCP Cloud Monitoring can tell you that your Cloud Run service is healthy, that your GKE pods are running, and that your App Engine instance is serving requests — but it can't reliably confirm that customers in Frankfurt, São Paulo, or Singapore can actually reach your endpoints from the public internet.

This guide explains how to monitor GCP-hosted applications effectively with Vigilmon: covering Cloud Run services, App Engine deployments, GKE endpoints, Cloud Functions heartbeats, and integration with Google Cloud Pub/Sub for incident workflows. It also explains where GCP Cloud Monitoring's native uptime checks fall short, and how multi-region external monitoring fills those gaps.


GCP Cloud Monitoring's Limitations for External Users

GCP Cloud Monitoring (formerly Stackdriver) is a solid platform for internal observability. It collects metrics from GCP services automatically, supports custom metrics via the Cloud Monitoring API, centralizes logs from Cloud Logging, and provides alerting policies with Google Cloud notification channels.

Its uptime check feature lets you configure HTTP, HTTPS, and TCP checks against your services. But several limitations matter for teams that need reliable external availability monitoring:

Limited visibility into probe regions: GCP uptime checks run from Google's probe locations, and alerting triggers when a configurable percentage of probes fail. You get limited transparency into which regions are failing and why, and you don't get the independent multi-region consensus that distinguishes a real outage from a probe blip.

Internal perspective bias: GCP probes have Google-native routing to your GCP endpoints. A CDN misconfiguration, a Cloud Load Balancer rule error, or a DNS propagation failure affecting external users may not be visible the same way to Google's own probe network as it is to users on other ASNs.

No customer-facing status page: GCP Cloud Monitoring doesn't generate a public status page for your application. Teams that want to communicate availability to customers during incidents need a separate solution.

Alert routing complexity: Cloud Monitoring has built-in notification channels for tools such as Slack and PagerDuty, but anything beyond those channels (custom payloads, conditional routing, fan-out to several systems) typically means adding Cloud Pub/Sub and Cloud Functions. That is more infrastructure to manage for what should be a simple "alert me on Slack when this URL is down" requirement.

Cost at scale: GCP uptime checks are billed per check request at $0.10/1,000 checks after the free tier. At 1-minute intervals, a single monitor generates about 43,200 checks per month (60 × 24 × 30) from each probe location. That is well within the free tier, but multiple monitors and multi-region configurations multiply the count.
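The arithmetic behind that estimate can be sketched as follows. The function names and the `probe_regions` parameter are illustrative, and the price and free-tier allowance are left as parameters to fill in from the current GCP pricing page rather than treated as fixed facts.

```python
def monthly_check_executions(monitors: int, interval_minutes: float,
                             probe_regions: int = 1, days: int = 30) -> int:
    """Estimate check executions per month for a set of identical monitors."""
    checks_per_day = (24 * 60) / interval_minutes
    return int(monitors * probe_regions * checks_per_day * days)


def monthly_cost(executions: int, free_executions: int,
                 price_per_1000: float) -> float:
    """Cost of the executions that exceed the free allowance."""
    billable = max(0, executions - free_executions)
    return billable / 1000 * price_per_1000


# One monitor, one probe location, 1-minute interval
print(monthly_check_executions(1, 1))                     # 43200
# Ten monitors probed from six locations (an assumed configuration)
print(monthly_check_executions(10, 1, probe_regions=6))   # 2592000
```

The second call shows how quickly the count grows once several monitors and probe locations are involved.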


Setting Up Vigilmon for GCP Applications

Vigilmon provides multi-region external uptime monitoring for your GCP endpoints with no GCP-side configuration required. It monitors your public URLs exactly as your customers do — from multiple independent geographic regions — and alerts only when a quorum of probes confirms a failure.
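To make the quorum idea concrete, here is a minimal sketch of consensus-based alerting. It is not Vigilmon's actual algorithm: the function name, the region labels, and the 50% threshold are all assumptions chosen for illustration.

```python
from typing import Mapping


def quorum_down(probe_results: Mapping[str, bool], quorum: float = 0.5) -> bool:
    """Return True when the share of failing probes strictly exceeds `quorum`.

    `probe_results` maps a probe region name to True (check passed) or
    False (check failed).
    """
    if not probe_results:
        return False  # no data: do not alert
    failures = sum(1 for ok in probe_results.values() if not ok)
    return failures / len(probe_results) > quorum


# A single failing probe is treated as a blip, not an outage.
blip = {"frankfurt": False, "sao-paulo": True, "singapore": True}
outage = {"frankfurt": False, "sao-paulo": False, "singapore": True}
print(quorum_down(blip), quorum_down(outage))  # False True
```

Requiring agreement between independent vantage points is what filters out failures that only exist on one probe's network path.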

Step 1: Create Your Vigilmon Account

Navigate to vigilmon.online and create a free account. The free tier includes 5 monitors, 1-minute check intervals, multi-region consensus, a status page, and Slack alerts.

Step 2: Add Your GCP Endpoints as Monitors

In the Vigilmon dashboard, click Add Monitor and enter your endpoint URL. Vigilmon will immediately begin checking from multiple regional probes.


Monitoring Cloud Run Services

Cloud Run is GCP's fully managed container execution environment. Cloud Run services expose HTTPS endpoints automatically — these are exactly what Vigilmon monitors.

For a standard Cloud Run service:

https://my-service-<hash>-uc.a.run.app

Or with a custom domain mapped via Cloud Run's domain mapping feature:

https://api.yourdomain.com

Add this URL as a Vigilmon monitor. Configure a path that returns a meaningful health signal — avoid monitoring the root path if it's not representative. A dedicated /health or /livez endpoint that returns 200 OK with a lightweight response is ideal:

// Cloud Run health endpoint (Node.js/Express)
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'ok', timestamp: new Date().toISOString() });
});
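From the checking side, a response from this endpoint could be evaluated roughly as below. This is a sketch that assumes the JSON shape returned by the Express handler above (`status: 'ok'`); the function name is hypothetical and it does not describe how Vigilmon itself evaluates responses.

```python
import json


def is_healthy(status_code: int, body: str) -> bool:
    """Treat a response as healthy only if it is 200 with {"status": "ok"}."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # e.g. an HTML error page served with a 200 status
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

Checking the body as well as the status code guards against a proxy or CDN answering 200 with an error page.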

Why not rely on GCP Cloud Monitoring alone for Cloud Run?

Cloud Run reports service health based on container instance status and request success rates within GCP's infrastructure. A Cloud Run service can appear healthy in GCP Console while being unreachable to external users due to:

  • IAM policy misconfiguration blocking unauthenticated access
  • Cloud Armor security policy blocking traffic from certain regions
  • Custom domain routing misconfiguration
  • Ingress setting changes (for example, restricting the service to internal traffic)

Vigilmon's external probes catch all of these because they access the endpoint exactly as external users do — HTTP requests from outside Google's network, with no special routing or credentials.


Monitoring App Engine Applications

App Engine applications expose a default service endpoint at https://<project-id>.appspot.com and custom domains via App Engine custom domains configuration.

Add your App Engine endpoint to Vigilmon:

https://your-project-id.appspot.com/health

For App Engine applications with multiple services (default, api, worker), add a monitor for each service endpoint that serves external traffic:

https://api-dot-your-project-id.appspot.com/health
https://your-project-id.appspot.com/health
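If you manage many services, the monitor URLs can be generated rather than typed by hand. The helper below is a small sketch that follows the URL pattern shown above; the function name and the `/health` default path are assumptions.

```python
def app_engine_health_url(project_id: str, service: str = "default",
                          path: str = "/health") -> str:
    """Build the appspot.com URL for a service, using the -dot- convention."""
    host = f"{project_id}.appspot.com"
    if service != "default":
        host = f"{service}-dot-{host}"
    return f"https://{host}{path}"


for name in ("default", "api"):
    print(app_engine_health_url("your-project-id", name))
```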

App Engine version traffic splitting caveat: If you're using App Engine's traffic splitting to canary a new version, Vigilmon's monitors will hit whichever version App Engine routes each probe to. This is intentional: you want to monitor what your users get, including the canary version. With a small split, only some checks reach the canary, so a broken canary may surface as intermittent failures rather than a clean outage. Even so, if a new version breaks the monitored endpoint, Vigilmon can alert you before your traffic split completes.


Monitoring GKE Endpoints

Google Kubernetes Engine applications typically expose endpoints via:

  1. GKE Ingress (backed by Google Cloud Load Balancer) with an external IP or custom domain
  2. Cloud Load Balancing directly in front of GKE workloads
  3. Istio Gateway or Anthos Service Mesh for service mesh deployments

For any of these, the externally reachable endpoint is what you add to Vigilmon. For a GKE Ingress:

https://api.yourdomain.com/health

Multi-region GKE monitoring: For GKE deployments spanning multiple regions (e.g., a GKE cluster in us-central1 and another in europe-west1 behind a Global Load Balancer), a single Vigilmon monitor pointing at the global load balancer address monitors the globally routed service. For regional failover scenarios, consider adding individual monitors for region-specific health endpoints if your load balancer exposes them, to distinguish a regional GKE failure from a global load balancer issue.

GKE Autopilot health checks: GKE Autopilot handles node provisioning automatically but still requires your application containers to respond to HTTP health checks. Configure a /readyz or /healthz endpoint and add it to Vigilmon for external confirmation that your GKE-hosted service is accepting traffic:

# Kubernetes liveness probe; expose the same endpoint through your Ingress for external monitoring
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10

Add https://your-gke-ingress.yourdomain.com/healthz to Vigilmon alongside the Kubernetes-internal probe configuration.


Cloud Functions Heartbeat Monitoring

Cloud Functions (and Cloud Run functions in 2nd gen) are event-driven and don't maintain persistent running processes — they execute on invocation. Monitoring whether your Cloud Functions are healthy requires a different approach: heartbeat monitoring via scheduled invocations.

Pattern: HTTP heartbeat function invoked on a schedule by Vigilmon

  1. Create an HTTP-triggered Cloud Function that accepts GET requests and returns 200 OK:

# functions/health_ping/main.py
import functions_framework

@functions_framework.http
def health_ping(request):
    return ('OK', 200)

  2. Deploy the function:

# --entry-point names the Python function; the deployed name health-ping
# contains a hyphen and cannot be used as the entry point directly
gcloud functions deploy health-ping \
  --runtime python311 \
  --entry-point health_ping \
  --trigger-http \
  --allow-unauthenticated \
  --region us-central1

  3. Add the function's HTTPS URL to Vigilmon:

https://us-central1-your-project-id.cloudfunctions.net/health-ping

This confirms that the Cloud Functions service in that region is executing correctly, that your function code is deployed and running, and that the HTTPS endpoint is reachable. For critical business logic functions, deploy a more meaningful health check that exercises a representative code path (e.g., a lightweight database read) rather than just returning 200.
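A more meaningful health check could look roughly like the sketch below. The `dependency_check` callable is hypothetical: it stands in for whatever lightweight operation represents your critical path (for example, a single-row database read), and it is expected to raise on failure.

```python
from typing import Callable, Tuple


def health_response(dependency_check: Callable[[], None]) -> Tuple[str, int]:
    """Run a lightweight dependency check and map the outcome to an HTTP reply.

    Returns a (body, status) tuple, the same shape the health_ping
    function above returns.
    """
    try:
        dependency_check()
    except Exception as exc:
        # 503 tells the external monitor that the function ran but a dependency failed
        return (f"UNHEALTHY: {type(exc).__name__}", 503)
    return ("OK", 200)
```

Inside `health_ping`, you would return `health_response(...)` with your own check passed in. Keep the check cheap, because it runs on every probe.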

Multi-region Cloud Functions: If you deploy the same function to multiple regions for latency or redundancy, add a Vigilmon monitor for each regional URL:

https://us-central1-your-project-id.cloudfunctions.net/health-ping
https://europe-west1-your-project-id.cloudfunctions.net/health-ping
https://asia-northeast1-your-project-id.cloudfunctions.net/health-ping

Regional monitoring gives you visibility into which deployment is failing when an incident occurs, rather than a single alert that could be any of your regions.
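The per-region URLs follow a predictable pattern, so the list of monitors can be derived from a list of regions. This sketch mirrors the URL format used above; the function name is an assumption.

```python
from typing import Iterable, List


def regional_function_urls(project_id: str, function_name: str,
                           regions: Iterable[str]) -> List[str]:
    """Build one cloudfunctions.net URL per deployment region."""
    return [
        f"https://{region}-{project_id}.cloudfunctions.net/{function_name}"
        for region in regions
    ]


urls = regional_function_urls(
    "your-project-id", "health-ping",
    ["us-central1", "europe-west1", "asia-northeast1"],
)
print("\n".join(urls))
```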


Webhook Integration with Google Cloud Pub/Sub

For teams that want Vigilmon alerts to trigger workflows inside GCP — creating incidents in incident management tools, writing availability events to BigQuery, updating a GCP-hosted status system, or triggering a Cloud Run job to investigate failures — webhook integration via Cloud Pub/Sub is the pattern.

Architecture

Vigilmon Alert → Webhook → Cloud Run HTTP endpoint → Pub/Sub publish → Subscribers

Or more directly:

Vigilmon Alert → Webhook → Cloud Run HTTP endpoint → (BigQuery / PagerDuty / custom logic)

Step 1: Create a Cloud Run Webhook Receiver

Deploy a Cloud Run service that accepts Vigilmon's webhook payload:

// webhook-receiver/index.js (Node.js)
const express = require('express');
const { PubSub } = require('@google-cloud/pubsub');

const app = express();
app.use(express.json());

const pubsub = new PubSub();
const topicName = 'vigilmon-alerts';

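app.post('/webhook', async (req, res) => {
  const alert = req.body;

  try {
    // Publish to Pub/Sub for downstream processing
    const dataBuffer = Buffer.from(JSON.stringify(alert));
    await pubsub.topic(topicName).publishMessage({ data: dataBuffer });
    res.status(200).json({ received: true });
  } catch (err) {
    // Return a 5xx status so a failed publish is visible to the sender and in logs
    console.error('Failed to publish alert to Pub/Sub', err);
    res.status(500).json({ received: false });
  }
});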

const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`Webhook receiver on ${port}`));

Step 2: Configure the Webhook in Vigilmon

In the Vigilmon dashboard, navigate to Settings > Notifications > Webhooks and add your Cloud Run endpoint URL:

https://webhook-receiver-<hash>-uc.a.run.app/webhook

Vigilmon will send a POST request to this URL with a JSON payload on monitor state changes (up → down, down → up).

Step 3: Subscribe to the Pub/Sub Topic

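Create the topic that the receiver from Step 1 publishes to (ideally before the first alert arrives), then create subscribers for your downstream processing: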

# Create the topic
gcloud pubsub topics create vigilmon-alerts

# Create a push subscription to a Cloud Function for automated incident response
gcloud pubsub subscriptions create vigilmon-incidents \
  --topic=vigilmon-alerts \
  --push-endpoint=https://us-central1-your-project-id.cloudfunctions.net/handle-incident

This pattern enables complex incident workflows: automatically creating JIRA tickets, posting to a dedicated incident Slack channel, writing to a BigQuery availability table for SLA reporting, or triggering runbooks via Cloud Workflows.
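On the subscriber side, a push subscription delivers each message wrapped in a JSON envelope, with the payload base64-encoded under `message.data`. The sketch below unwraps that envelope to recover the alert the receiver published. The alert fields used in the example (`monitor`, `status`) are hypothetical, since the actual Vigilmon payload schema is not shown here.

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> dict:
    """Extract the JSON alert from a Pub/Sub push request body."""
    message = envelope["message"]
    raw = base64.b64decode(message["data"])
    return json.loads(raw)


# Simulate what the push endpoint would receive (field names are assumed)
alert = {"monitor": "https://api.yourdomain.com/health", "status": "down"}
envelope = {
    "message": {
        "data": base64.b64encode(json.dumps(alert).encode("utf-8")).decode("ascii"),
        "messageId": "1",
    },
    "subscription": "projects/your-project-id/subscriptions/vigilmon-incidents",
}
print(decode_push_envelope(envelope)["status"])  # down
```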


Multi-Region Monitoring for GCP Global Load Balancers

GCP's Global External HTTP(S) Load Balancer routes traffic to backends across multiple regions based on user proximity and backend health. From GCP's perspective, the load balancer appears healthy as long as at least one backend region is serving traffic. But if your European backends go down, European users are degraded even if your US backends are fine and GCP reports the load balancer as operational.

Vigilmon's multi-region probe network catches this where GCP cannot.

When Vigilmon's European probes detect your endpoint as unreachable while US probes see it as healthy, the alert identifies a regional availability issue — not a full outage. This information is immediately actionable: your European GCP backends are failing, and you need to investigate Cloud Armor rules, regional backend health, or backend service configuration for europe-west* backends.
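The distinction between a regional issue and a full outage can be expressed as a small classification over probe results grouped by area. This is an illustrative sketch only: the grouping, the labels, and the rule that an area counts as failing when all of its probes fail are assumptions, not a description of Vigilmon's alert logic.

```python
from typing import List, Mapping, Sequence


def classify_outage(results_by_area: Mapping[str, Sequence[bool]]) -> str:
    """Classify probe results as healthy, a regional issue, or a full outage.

    `results_by_area` maps an area name to its probe outcomes
    (True = reachable). An area is failing when it has probes and
    none of them succeeded.
    """
    failing: List[str] = sorted(
        area for area, results in results_by_area.items()
        if results and not any(results)
    )
    if not failing:
        return "healthy"
    if len(failing) == len(results_by_area):
        return "full outage"
    return "regional issue: " + ", ".join(failing)


print(classify_outage({"europe": [False, False], "us": [True, True]}))
# regional issue: europe
```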

Recommended monitoring setup for GCP Global Load Balancer deployments:

  1. Global endpoint monitor: Add the load balancer's external IP or custom domain — https://api.yourdomain.com/health
  2. Regional backend health endpoints (if exposed): For each regional backend service, add a monitor if the backends have direct health endpoints accessible externally
  3. SSL certificate monitor: Your load balancer terminates TLS — Vigilmon's SSL certificate monitoring will alert you before certificate expiry causes an outage

For Global Load Balancers using Google-managed SSL certificates, certificate renewal is automatic — but bugs in renewal, custom certificate configurations, and SNI mismatches still happen. Vigilmon catches them before your users do.
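For reference, the core of a certificate expiry check is simple date arithmetic. The sketch below uses Python's standard `ssl.cert_time_to_seconds` to parse the `notAfter` string found in a certificate (the format returned by `SSLSocket.getpeercert()`). The function name and the 14-day threshold in the example are assumptions.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate's notAfter timestamp.

    `not_after` uses the certificate time format, e.g. "Jan 31 00:00:00 2030 GMT".
    `now` is a Unix timestamp and defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


# Fixed reference time so the example is reproducible
now = ssl.cert_time_to_seconds("Jan 11 00:00:00 2030 GMT")
remaining = days_until_expiry("Jan 31 00:00:00 2030 GMT", now)
print(remaining, remaining < 14)  # 20.0 False
```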


Recommended GCP Monitoring Stack

For a well-monitored GCP deployment, the combination of GCP-native tools and Vigilmon covers both internal and external observability:

| Concern | Tool |
|---|---|
| Internal service metrics (CPU, memory, request rate) | GCP Cloud Monitoring |
| Application traces and performance | Cloud Trace / OpenTelemetry → Cloud Trace |
| Log aggregation and analysis | Cloud Logging |
| Structured error tracking | Cloud Error Reporting |
| External uptime and availability | Vigilmon |
| Multi-region reachability confirmation | Vigilmon |
| SSL certificate expiry alerts | Vigilmon |
| Customer-facing status page | Vigilmon |

GCP Cloud Monitoring handles the inside-out view — what's happening within your GCP infrastructure. Vigilmon handles the outside-in view — what your users actually experience when they try to reach your application from the public internet.


Quick-Start Checklist for GCP Teams

  • [ ] Add your primary external domain to Vigilmon (https://yourdomain.com/health)
  • [ ] Add each Cloud Run service that serves external traffic
  • [ ] Add App Engine default service + any named services with external traffic
  • [ ] Add GKE Ingress / Load Balancer endpoints
  • [ ] Create heartbeat endpoints for critical Cloud Functions and add to Vigilmon
  • [ ] Connect Vigilmon to your team's Slack workspace
  • [ ] Configure a Vigilmon webhook → Cloud Run receiver → Pub/Sub for GCP-native incident workflows (optional)
  • [ ] Set up your Vigilmon status page and map a custom domain (status.yourdomain.com)
  • [ ] Verify SSL certificate monitoring is active for all HTTPS monitors

Conclusion

GCP Cloud Monitoring is a capable internal observability platform that gives you deep visibility into your GCP infrastructure health. But external uptime monitoring — confirming that customers in different parts of the world can actually reach your application — is a structurally different problem that requires probes outside of Google's network.

Vigilmon's multi-region consensus architecture is purpose-built for this: independent probes across multiple geographic regions, filtering transient blips and only alerting when multiple vantage points agree that something is down. For GCP teams, it takes two minutes to set up, integrates cleanly with existing GCP incident workflows via webhooks and Pub/Sub, and covers the external visibility gap that GCP Cloud Monitoring's design leaves open.

Start monitoring your GCP services externally for free at vigilmon.online — 5 monitors, 1-minute check intervals, multi-region consensus, status page included, no credit card required.


