A GitHub Action for collecting system metrics during workflows.
Note: This is a fork of dev-hato/actions-workflow-metrics with improvements focused on accessibility and reliability. This fork improves accessibility by presenting data in clear tables instead of relying on Mermaid graphs to convey information. It also improves reliability by not running a server in the background, making it less likely to interfere with the operations under examination.
- System Metrics Collection: Collects CPU load, memory usage, and disk usage in real time during workflow execution
- Timestamp-Based Tables: Displays all collected metrics with timestamps in clear, easy-to-read tables
- Threshold Alerts: Automatically detects and highlights when resource usage exceeds configurable thresholds
- Job Summary Output: Automatically displays tables and alerts in GitHub Actions job summary
GitHub Actions runners have fixed resource limits. When workflows run slow or fail unexpectedly, it's often unclear whether you're hitting CPU, memory, or disk constraints. This action gives you immediate visibility into resource usage patterns during your workflow execution.
What you'll learn:
- Identify bottlenecks: See resource consumption patterns over time during your workflow
- Optimize efficiency: Spot opportunities to reduce resource usage in your builds or tests
- Make informed decisions: Determine whether to optimize your workflow or upgrade to larger runners
- Prevent failures: Catch resource exhaustion before it causes workflow failures
The metrics are displayed directly in your GitHub Actions job summary; no external services or complex setup required. Your team can see the data immediately after each workflow run.
- Your team manages the project: The same engineers writing code are monitoring workflows
- You need quick insights: You want immediate feedback in GitHub Actions job summaries
- Simplicity matters: You prefer zero-configuration monitoring without external dependencies
- Cost is a concern: You want free, built-in monitoring without additional services
For comprehensive monitoring needs, consider enterprise solutions:
- DataDog CI Visibility: When you need centralized monitoring across multiple repositories, historical trend analysis, and correlation with production metrics
- OpenTelemetry: When you require standardized instrumentation, integration with existing observability platforms, or custom metrics collection
If an external team monitors CI/CD performance or you need rich historical analysis and alerting, these enterprise solutions are more appropriate. This action is designed for teams who want straightforward, in-context resource monitoring without the overhead of external systems.
The output rendered into the step summary contains the collected metrics; any triggered alerts are rendered as well.
One line is added to the alerts section for each resource that triggered an alert. If no alerts were triggered, this section is omitted entirely.
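For illustration only, a triggered alert might render along these lines (the exact wording and formatting are produced by the action; the values here are hypothetical):

```markdown
> Memory usage exceeded the 80% threshold at 2026-02-16T11:25:35.456Z (85.2% used)
```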
Note: Metrics are displayed in collapsible sections.
Shows CPU utilization over time with timestamps.
| Timestamp | Used | Available |
|---|---|---|
| 2026-02-16T11:25:30.123Z | 15.45% | 84.55% |
| 2026-02-16T11:25:35.456Z | 23.78% | 76.22% |
| 2026-02-16T11:25:40.789Z | 8.12% | 91.88% |
Shows memory utilization over time with timestamps.
| Timestamp | Used | Available |
|---|---|---|
| 2026-02-16T11:25:30.123Z | 1024.50 MB | 14965.98 MB |
| 2026-02-16T11:25:35.456Z | 2048.75 MB | 13941.73 MB |
| 2026-02-16T11:25:40.789Z | 1536.25 MB | 14454.23 MB |
Shows disk usage over time with timestamps.
| Timestamp | Used | Available |
|---|---|---|
| 2026-02-16T11:25:30.123Z | 57.73 GB | 86.51 GB |
| 2026-02-16T11:25:35.456Z | 62.85 GB | 81.39 GB |
| 2026-02-16T11:25:40.789Z | 58.12 GB | 86.12 GB |
This action is designed to be executed at the beginning of a workflow.
```yaml
name: Example Workflow

on: [push]

jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      # Run runner-resource-usage at the beginning of the workflow
      - name: Start Workflow Telemetry
        uses: garbee/runner-resource-usage@v1

      # Subsequent regular steps
      - name: Checkout
        uses: actions/checkout@v6

      - name: Run tests
        run: npm test

      # ... other steps
```

The action will automatically:
- Collect CPU load, memory usage, and disk usage metrics at regular intervals
- Display metrics with timestamps in clear tables
- Generate alerts for threshold violations
For teams that want metric collection available but not always running, you can conditionally enable the action using the runner.debug context. This allows you to leave the action in your workflow without incurring the performance overhead on every run.
```yaml
name: Example Workflow

on: [push]

jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      # Only collect metrics when debug mode is enabled
      - name: Start Workflow Telemetry
        if: ${{ runner.debug == '1' }}
        uses: garbee/runner-resource-usage@v1

      # Regular workflow steps
      - name: Checkout
        uses: actions/checkout@v6

      - name: Run tests
        run: npm test

      # ... other steps
```

Enabling debug mode:
When you need to collect metrics, enable debug mode as follows:

1. Go to your failed or completed workflow run
2. Click "Re-run jobs" in the top-right corner
3. Check the "Enable debug logging" checkbox
4. Click "Re-run jobs"
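If you prefer the command line, the GitHub CLI can also re-run a workflow run with debug logging enabled (assuming a recent `gh` version; the run ID is a placeholder):

```shell
# List recent runs to find the run ID, then re-run with debug logging
gh run list --limit 5
gh run rerun <run-id> --debug
```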
This approach provides several benefits:
- No re-instrumentation needed: The action stays in your workflow, ready to use when needed
- Performance optimization: Metrics collection only runs when you need it, avoiding overhead on regular runs
- On-demand debugging: Enable collection instantly when investigating slow builds or resource issues
- Team flexibility: Any team member can enable metrics collection without modifying the workflow file
The runner.debug context is documented in the GitHub Actions contexts reference.
| Input | Description | Required | Default |
|---|---|---|---|
| `interval_seconds` | Interval between metrics collection in seconds | No | `5` |
| `memory_alert_threshold` | Memory utilization threshold percentage (0-100) | No | `80` |
| `cpu_alert_threshold` | Sustained CPU usage threshold percentage (0-100) | No | `85` |
| `cpu_alert_duration` | Duration in seconds CPU must remain above the threshold | No | `60` |
| `disk_alert_threshold` | Disk usage threshold percentage (0-100) | No | `90` |
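All inputs are optional. A step overriding the defaults might look like this (the values are illustrative):

```yaml
- name: Start Workflow Telemetry
  uses: garbee/runner-resource-usage@v1
  with:
    interval_seconds: 10
    memory_alert_threshold: 75
    cpu_alert_threshold: 90
    cpu_alert_duration: 120
    disk_alert_threshold: 85
```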
- main (workflow start): Starts metrics collection as a background process that stores metrics in memory
- Workflow steps: Execute normally while metrics are collected in the background
- post (workflow end): The collector saves its state to GitHub Actions; the post action then reads the metrics from state, renders them as timestamped tables with threshold indicators, and writes the result to the job summary
Note: Metrics are displayed with timestamps rather than correlated with specific workflow steps. You can correlate them manually by matching timestamps with the execution times shown in your workflow run logs.
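For example, a small helper along these lines (hypothetical, not part of the action; the names are illustrative) could filter samples to a step's time window once you have its start and end times from the run logs:

```typescript
// A metric sample as shown in the tables above.
interface Sample {
  timestamp: string; // ISO-8601 timestamp
  used: number;      // e.g. percent or MB, depending on the table
}

// Return the samples that fall inside [stepStart, stepEnd].
function samplesDuringStep(
  samples: Sample[],
  stepStart: string,
  stepEnd: string,
): Sample[] {
  const start = Date.parse(stepStart);
  const end = Date.parse(stepEnd);
  return samples.filter((s) => {
    const t = Date.parse(s.timestamp);
    return t >= start && t <= end;
  });
}
```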
```shell
npm ci
```

This automatically runs gitleaks on commit. It checks for sensitive information like API keys or tokens.
```shell
# Bundle for operation in a workflow
npm run build

# Run unit tests (Node test runner)
npm test
```

```
src/
├── lib.ts               # Common schema and configuration
├── main/
│   ├── index.ts         # main entry point (collector startup)
│   ├── collector.ts     # Background metrics collection process
│   ├── metrics.ts       # Metrics class (metrics management)
│   └── metrics.test.ts  # Metrics class tests
└── post/
    ├── index.ts         # post entry point (job summary output)
    ├── lib.ts           # Metrics fetch, alert detection, and rendering
    ├── lib.test.ts      # Rendering logic tests
    ├── renderer.ts      # Table generation
    ├── renderer.test.ts # Table generation tests
    └── alerts.test.ts   # Alert detection tests
```
- `src/main/index.ts` is executed
- Node.js spawns `src/main/collector.ts` as a detached background process
- The `Metrics` class collects CPU/memory/disk information every 5 seconds using the `systeminformation` library
- Metrics data is stored in memory only (no disk writes during collection)
- On process termination (SIGTERM/SIGINT), metrics are saved to GitHub Actions state
- `src/post/index.ts` is executed
- Reads metrics data from GitHub Actions state via `getState()`
- Fetches workflow step information from the GitHub API (token required)
- Detects threshold violations and generates alerts
- The `Renderer` class generates tables with step-by-step metrics and threshold indicators
- Outputs to the job summary using the `@actions/core` `summary` API, including:
  - Alerts section for threshold violations
  - Step summary table with durations
  - CPU, Memory, and Disk usage tables with threshold-exceeded indicators
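The sustained-CPU rule implied by `cpu_alert_threshold` and `cpu_alert_duration` can be sketched as follows (a simplified illustration, not the action's actual detection code; parameter names mirror the inputs table):

```typescript
interface CpuSample {
  timestamp: string;   // ISO-8601 timestamp
  usedPercent: number; // CPU utilization at that moment
}

// Alert only if CPU stays above `threshold` for at least `durationSeconds`,
// given samples taken roughly `intervalSeconds` apart.
function sustainedCpuAlert(
  samples: CpuSample[],
  threshold: number,       // e.g. cpu_alert_threshold = 85
  durationSeconds: number, // e.g. cpu_alert_duration = 60
  intervalSeconds: number, // e.g. interval_seconds = 5
): boolean {
  const needed = Math.ceil(durationSeconds / intervalSeconds);
  let run = 0; // length of the current above-threshold streak
  for (const s of samples) {
    run = s.usedPercent > threshold ? run + 1 : 0;
    if (run >= needed) return true;
  }
  return false;
}
```

A single spike above the threshold does not fire the alert; only an unbroken streak long enough to cover the configured duration does.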