Your nightly test suite stopped running three weeks ago. Nobody noticed until a critical regression shipped to production. GitHub Actions had no errors to report — the workflow simply wasn't being triggered anymore. Your email notifications only fire on failures, and you can't fail if you never run.
This is the silent CI/CD failure problem. HTTP monitors can't catch it because there's nothing to ping. The fix is heartbeat monitoring: your workflow pings a unique URL at the end of every successful run, and if Vigilmon doesn't receive that ping within the expected window, you get alerted.
This tutorial shows you how to instrument GitHub Actions workflows with Vigilmon heartbeat monitors to catch failures before your team does.
What You'll Cover
- Heartbeat monitoring for scheduled workflows
- Catching workflows that stop triggering entirely
- Alerting on long-running or stuck jobs
- Multi-environment CI monitoring (staging and production)
- Detecting scheduled workflow drift
Prerequisites
- A GitHub repository with Actions workflows
- A free account at vigilmon.online
The Problem: What Standard CI Monitoring Misses
GitHub Actions has built-in email notifications — but they only fire when a job fails. They won't alert you when:
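- A schedule trigger stops firing (GitHub can delay or drop scheduled runs during periods of high load, and it automatically disables scheduled workflows in public repositories after 60 days without repository activity)
- A workflow is accidentally disabled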
- A required status check is removed from branch protection, so nobody notices CI is skipped
- A deployment pipeline runs but silently skips the actual deployment step
Vigilmon's heartbeat pattern closes all of these gaps.
Step 1: Create a Heartbeat Monitor in Vigilmon
A heartbeat monitor expects a ping on a regular interval. No ping → alert.
- Log in to Vigilmon and click New Monitor → Heartbeat.
- Give it a name like Nightly Test Suite — main.
- Set the Expected interval to match your workflow schedule. For a nightly cron, use 24 hours. For hourly CI, use 90 minutes (adds a 50% buffer).
- Set the Grace period to 30 minutes for short intervals, or 1 hour for daily jobs. This prevents false alerts from slight schedule drift.
- Save and copy the Ping URL, which looks like https://vigilmon.online/api/heartbeat/<unique-id>.
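As a rough worked example of how the interval and grace period combine, the sketch below assumes that an alert fires once the expected interval plus the grace period has passed since the last ping. That timing rule and the helper name `alert_after_seconds` are assumptions made for illustration, not documented Vigilmon behavior.

```shell
# Sketch: seconds after the last ping at which a missed-heartbeat alert would fire.
# Assumption: alert time = expected interval + grace period (both given in minutes).
alert_after_seconds() {
  echo $(( ($1 + $2) * 60 ))
}

alert_after_seconds 1440 60   # nightly job: 24 h interval + 1 h grace -> prints 90000 (25 h)
alert_after_seconds 90 30     # hourly job: 90 min interval + 30 min grace -> prints 7200 (2 h)
```

Under that assumption, a nightly job that normally finishes at 02:10 UTC would raise an alert at about 03:10 UTC the following night if no new ping arrives.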
Step 2: Add the Heartbeat Ping to a Scheduled Workflow
Here's a complete example of a nightly test workflow with Vigilmon instrumentation:
# .github/workflows/nightly-tests.yml
name: Nightly Tests
on:
  schedule:
    - cron: "0 2 * * *"  # 02:00 UTC every night
  workflow_dispatch:      # manual trigger for testing
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: "npm"
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Ping Vigilmon heartbeat
        if: success()
        run: |
          curl -fsS -X POST "${{ secrets.VIGILMON_NIGHTLY_HEARTBEAT_URL }}" \
            --max-time 10 \
            --retry 3 \
            --retry-delay 2
      # Heartbeat is only sent on success.
      # A failed or cancelled job skips this step, triggering a Vigilmon alert
      # after the grace period expires.
Key points about this setup:
- if: success() sends the heartbeat ping only when all previous steps pass. If tests fail, the job fails, and no ping is sent.
- --retry 3 keeps transient network issues from causing a false "missed heartbeat" alert.
- --max-time 10 keeps the ping step from hanging and blocking the runner.
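To try the same ping from a local terminal before wiring it into a workflow, the curl call can be wrapped in a small function. This is only a sketch: the function name `ping_heartbeat` is made up for this example, and the comments describe standard curl flags rather than anything specific to Vigilmon.

```shell
# ping_heartbeat URL: send one heartbeat ping with the same curl flags as the workflow step.
ping_heartbeat() {
  # -f   exit non-zero on HTTP 4xx/5xx responses instead of reporting success
  # -sS  hide the progress meter but still print error messages
  curl -fsS -X POST "$1" --max-time 10 --retry 3 --retry-delay 2
}
```

Calling `ping_heartbeat "https://vigilmon.online/api/heartbeat/<unique-id>"` with the Ping URL from Step 1 should register as a ping on the monitor. A non-zero exit status means curl could not deliver it.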
Store the secret
Go to your GitHub repo → Settings → Secrets and variables → Actions → New repository secret:
- Name: VIGILMON_NIGHTLY_HEARTBEAT_URL
- Value: the ping URL from Step 1
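If you prefer the command line to the web UI, the GitHub CLI can create the same secret. The sketch below assumes `gh` is installed, authenticated, and run from inside a clone of the repository. The wrapper name `store_heartbeat_secret` is hypothetical.

```shell
# store_heartbeat_secret NAME URL: save URL as a repository secret called NAME.
store_heartbeat_secret() {
  # gh reads the secret value from stdin when --body is not given.
  printf '%s' "$2" | gh secret set "$1"
}
```

For example, `store_heartbeat_secret VIGILMON_NIGHTLY_HEARTBEAT_URL "<ping URL from Step 1>"` stores the value used by the nightly workflow above.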
Step 3: Monitor Your Deployment Pipeline
Heartbeats are even more valuable for deployment pipelines than for tests — a silently stuck deploy leaves production stale without any error to page you.
# .github/workflows/deploy-production.yml
name: Deploy to Production
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: npm run build
      - name: Deploy
        run: |
          # your deployment command here
          ./scripts/deploy.sh production
      - name: Run smoke tests
        run: npm run test:smoke
      - name: Ping Vigilmon deployment heartbeat
        if: success()
        env:
          HEARTBEAT_URL: ${{ secrets.VIGILMON_DEPLOY_PROD_HEARTBEAT_URL }}
        run: |
          curl -fsS -X POST "$HEARTBEAT_URL" \
            --max-time 10 \
            --retry 3
Create a separate heartbeat monitor for this workflow with an interval of 48 hours (or however often you expect to deploy). If no successful deployment reaches production within that window, Vigilmon pages you.
This is useful for catching deploy freezes — situations where commits pile up on main but deployments silently stop going out.
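A deploy heartbeat shows that the workflow finished, not that new code is actually live. One possible guard, sketched below, is to compare the commit that production reports with the commit that was just built, and to ping only when they match. The plain-text `/version` endpoint mentioned in the usage note and the helper name `deployed_sha_matches` are assumptions. Substitute whatever your service really exposes.

```shell
# deployed_sha_matches EXPECTED DEPLOYED: succeed only if both commit SHAs are identical.
deployed_sha_matches() {
  local expected="$1"
  local deployed
  # Drop whitespace such as a trailing newline in the endpoint's response.
  deployed="$(printf '%s' "$2" | tr -d '[:space:]')"
  [ -n "$expected" ] && [ "$deployed" = "$expected" ]
}
```

In the ping step this could be used as `deployed_sha_matches "$GITHUB_SHA" "$(curl -fsS https://example.com/version)" && curl -fsS -X POST "$HEARTBEAT_URL"`, where `https://example.com/version` stands in for your own endpoint.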
Step 4: Detect Scheduled Workflow Drift
GitHub's schedule trigger has known limitations: scheduled runs can be delayed, or dropped entirely, when GitHub Actions is under heavy load, and in public repositories scheduled workflows are disabled automatically after 60 days without repository activity. Either way, a nightly cron can stop running without anything failing.
Vigilmon's heartbeat monitor catches this automatically — if the cron misses a run for any reason, no ping arrives, and you get alerted.
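When an alert does arrive, it helps to see when the schedule last fired. A small sketch using the GitHub CLI follows. It assumes `gh` is installed and authenticated, and the function name `last_scheduled_run` is made up for this example.

```shell
# last_scheduled_run WORKFLOW_FILE: print the most recent schedule-triggered run as JSON.
last_scheduled_run() {
  gh run list --workflow "$1" --event schedule --limit 1 \
    --json createdAt,conclusion
}
```

Running `last_scheduled_run nightly-tests.yml` prints the creation time and conclusion of the latest scheduled run, so a `createdAt` value that is several days old suggests skipped triggers rather than failing tests.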
To make each ping easier to trace back to a specific run, add run metadata to the heartbeat payload:
- name: Ping Vigilmon heartbeat with metadata
  if: success()
  run: |
    curl -fsS -X POST "${{ secrets.VIGILMON_NIGHTLY_HEARTBEAT_URL }}" \
      -H "Content-Type: application/json" \
      -d "{\"workflow\": \"${{ github.workflow }}\", \"run_id\": \"${{ github.run_id }}\", \"sha\": \"${{ github.sha }}\"}" \
      --max-time 10 \
      --retry 3
Vigilmon accepts the body but doesn't require it. It only cares whether the ping arrived before the expected interval and grace period ran out.
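The same payload can also be assembled from the default environment variables that every runner sets (`GITHUB_WORKFLOW`, `GITHUB_RUN_ID`, `GITHUB_SHA`), which avoids pasting expression values straight into the shell script. The sketch below uses a hypothetical helper name, `heartbeat_payload`, and assumes the workflow name contains no double quotes or backslashes, since it does no JSON escaping.

```shell
# heartbeat_payload: print a JSON body describing the current workflow run.
# Reads the default GitHub Actions variables; performs no JSON escaping.
heartbeat_payload() {
  printf '{"workflow": "%s", "run_id": "%s", "sha": "%s"}' \
    "$GITHUB_WORKFLOW" "$GITHUB_RUN_ID" "$GITHUB_SHA"
}
```

Inside a step, the body could then be passed with `-d "$(heartbeat_payload)"` in place of the hand-escaped string.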
Step 5: Multi-Environment Monitoring
For teams with staging and production pipelines, create one heartbeat monitor per environment per workflow:
| Workflow | Environment | Monitor name | Interval |
|---|---|---|---|
| Deploy workflow | Production | Deploy → Production | 48 h |
| Deploy workflow | Staging | Deploy → Staging | 24 h |
| Nightly tests | main | Nightly Tests — main | 24 h |
| Weekly security scan | — | Weekly Security Scan | 8 days |
Use separate secrets per environment:
- name: Ping heartbeat
  if: success()
  run: |
    curl -fsS -X POST "${{ secrets[format('VIGILMON_DEPLOY_{0}_HEARTBEAT_URL', env.ENVIRONMENT)] }}" \
      --max-time 10
  env:
    ENVIRONMENT: ${{ github.ref == 'refs/heads/main' && 'PROD' || 'STAGING' }}
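The branch-to-environment rule in that expression is easy to get backwards, so it can help to check it in isolation. Here is a shell sketch of the same mapping. The function name `deploy_env_for_ref` is made up for this example.

```shell
# deploy_env_for_ref REF: map a Git ref to the environment label used in the secret name.
deploy_env_for_ref() {
  case "$1" in
    refs/heads/main) echo PROD ;;     # main deploys to production
    *)               echo STAGING ;;  # every other ref counts as staging
  esac
}

deploy_env_for_ref refs/heads/main        # prints PROD
deploy_env_for_ref refs/heads/feature-x   # prints STAGING
```

Note that every ref other than main maps to STAGING, so a workflow that also runs on tags or pull requests would ping the staging monitor unless the rule is tightened.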
Step 6: Alert Channels
In Vigilmon, go to Notifications → New Channel and configure:
- Email: immediate alert when a heartbeat is missed
- Slack webhook: ping your #alerts or #ci-cd channel
When a workflow stops pinging:
🔴 MISSED HEARTBEAT: Nightly Tests — main
Last ping: 26 hours ago
Expected interval: 24 hours
When it resumes:
✅ HEARTBEAT RECOVERED: Nightly Tests — main
Gap: 26 hours
Step 7: Protect Against Accidental Workflow Disabling
One more failure mode: a developer accidentally disables a workflow in the GitHub Actions UI (the Disable workflow button is easy to click). Since the workflow never runs, no ping arrives, and Vigilmon alerts you once the expected interval and grace period have passed.
No code change needed — the heartbeat monitor already covers this case.
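Once the alert fires, the GitHub CLI offers a quick way to confirm that a disabled workflow is the cause and to switch it back on. The wrapper names below are hypothetical, and the sketch assumes `gh` is installed and authenticated for the repository.

```shell
# show_all_workflows: list every workflow with its state, including disabled ones.
show_all_workflows() {
  gh workflow list --all
}

# reenable_workflow WORKFLOW_FILE: turn a disabled workflow back on.
reenable_workflow() {
  gh workflow enable "$1"
}
```

For instance, `show_all_workflows` followed by `reenable_workflow nightly-tests.yml` restores the nightly run, and the next successful run should clear the missed-heartbeat alert.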
Complete Workflow Template
Here's a reusable template you can copy into any workflow:
# Add this step at the end of any job you want heartbeat monitoring on
- name: Ping Vigilmon heartbeat
  if: success()
  run: |
    curl -fsS -X POST "${{ secrets.VIGILMON_HEARTBEAT_URL }}" \
      --max-time 10 \
      --retry 3 \
      --retry-delay 2
Replace VIGILMON_HEARTBEAT_URL with the specific secret name for that workflow's monitor (use one heartbeat monitor per workflow).
What You're Now Catching
| Silent failure mode | How Vigilmon detects it |
|---|---|
| Scheduled cron skipped by GitHub | No ping arrives → alert after grace period |
| Workflow accidentally disabled | No ping arrives → alert after grace period |
| Tests pass but deployment step skipped | No ping on deploy job → alert |
| Workflow hung on a stuck step | Job timeout → no ping → alert |
| Branch protection removed, CI bypassed | No ping on missed runs → alert |
| Deploy pipeline stopped shipping | Deploy heartbeat missed → alert |
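The stuck-step row relies on the job eventually ending. On GitHub-hosted runners a job is cancelled after 360 minutes by default, and the `timeout-minutes` key can shorten that per job or per step. Inside a single `run` script, the coreutils `timeout` command gives a similar bound. The sketch below assumes GNU `timeout` is available, and the wrapper name `run_with_deadline` is made up for this example.

```shell
# run_with_deadline LIMIT CMD [ARGS...]: run CMD, stopping it once LIMIT has elapsed.
# LIMIT uses timeout's syntax, e.g. 90 (seconds), 30m, or 2h.
# Exits with status 124 when the limit is hit, which fails the step and skips the ping.
run_with_deadline() {
  local limit="$1"
  shift
  timeout "$limit" "$@"
}
```

A test step could then call `run_with_deadline 30m npm test`, so a hung test run fails after half an hour instead of holding the runner until the job-level timeout.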
GitHub Actions is reliable — until it quietly stops working. Vigilmon's heartbeat monitors give you the external signal that GitHub itself can't provide: a definitive alert when your CI/CD pipeline hasn't run in longer than expected.
Add heartbeat monitoring to your CI/CD pipelines today — register free at vigilmon.online.