Docker Healthcheck Failure After Migrating to Chainguard Node Images

Summary

Migrating from Node Alpine to Chainguard Node images caused the Docker healthcheck to fail silently. The container stayed in an unhealthy state because the healthcheck command relied on wget, which Chainguard’s minimal images do not include. The fix is to use built-in Node.js mechanisms for health verification instead of depending on external utilities.

Root Cause

Chainguard images follow a “distroless” philosophy and ship with only the absolute minimum packages required to run the application
The original healthcheck used wget --spider http://localhost:3000/ which expected the wget binary to be present
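Chainguard Node images do not include wget by default. They are built on Wolfi rather than Alpine, and the default variant leaves out BusyBox, which is what provides wget on Alpine-based images
When the binary is missing, Docker cannot execute the healthcheck command and records each attempt as a failure; once the configured retries are exhausted, the container is marked unhealthy
The failure looks silent because nothing appears in the application logs (docker logs). Docker treats the missing command as a failed healthcheck, not a configuration error, and records the error only in the container's health log, which is visible through docker inspect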
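
Although nothing shows up in the application logs, Docker keeps the output of recent healthcheck runs in the State.Health section of docker inspect. The following is a minimal sketch of a helper that prints that section in readable form. The file name health-log.js and the function summarizeHealth are hypothetical, and the script assumes the Docker CLI is available on the host PATH.

```javascript
// health-log.js (hypothetical helper, not part of the original setup)
const { execFileSync } = require('node:child_process');

// Turn the parsed JSON output of `docker inspect <container>` into readable lines.
function summarizeHealth(inspectOutput) {
  const health = inspectOutput[0]?.State?.Health;
  if (!health) return ['no healthcheck configured'];
  const lines = [`status=${health.Status} failingStreak=${health.FailingStreak}`];
  for (const entry of health.Log ?? []) {
    lines.push(`exit=${entry.ExitCode} output=${String(entry.Output).trim()}`);
  }
  return lines;
}

// Usage: node health-log.js <container-name>
const container = process.argv[2];
if (container) {
  const raw = execFileSync('docker', ['inspect', container], { encoding: 'utf8' });
  console.log(summarizeHealth(JSON.parse(raw)).join('\n'));
}
```

If the healthcheck binary is missing, the output field of each entry should carry the runtime's error text, which usually points at the healthcheck configuration rather than the application.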

Why This Happens in Real Systems

Teams copy healthcheck configurations from one image to another without verifying tool availability in the new base image
Alpine Linux ships BusyBox, which provides common utilities such as wget and sh; other minimal images omit them (curl and bash are not part of Alpine's default install either)
Chainguard images prioritize security and minimal attack surface over convenience, intentionally excluding tools that could be security risks
Documentation often fails to highlight these breaking changes when switching base images
The migration from Alpine to Chainguard is becoming common due to Chainguard’s security-first approach, making this a widespread issue

Real-World Impact

Containers start but never report healthy, so production environments that gate load balancer registration on health checks never send them traffic
Deployment pipelines may hang or timeout waiting for healthy containers
Orchestrators such as Docker Swarm will not route traffic to unhealthy containers. Kubernetes ignores the Docker healthcheck, but an exec probe that calls wget fails in the same way
Monitoring alerts may fire incorrectly, creating noise and masking real issues
Teams waste debugging time assuming the application itself is broken rather than the healthcheck configuration

Example or Code

The original failing configuration:

healthcheck:
  test: ["CMD", "wget", "--spider", "http://localhost:3000/"]
  interval: 5s
  timeout: 3s
  retries: 30

A working solution using Node.js built-in HTTP module:

healthcheck:
  test: ["CMD", "node", "-e", "require('http').get('http://localhost:3000/', (r) => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"]
  interval: 5s
  timeout: 3s
  retries: 30

Alternative using the global fetch API (available in Node.js 18 and later) against a dedicated health endpoint in the application:

healthcheck:
  test: ["CMD", "node", "-e", "fetch('http://localhost:3000/health').then(r => { if (!r.ok) throw new Error(); process.exit(0); }).catch(() => process.exit(1))"]
  interval: 5s
  timeout: 3s
  retries: 30
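
The /health endpoint that the last healthcheck calls has to exist in the application. As a rough sketch, assuming a plain node:http server with no framework, it could look like the code below. The ready flag, the handleRequest name, and the JSON response body are illustrative assumptions, not part of the original application.

```javascript
const http = require('node:http');

// Hypothetical readiness flag; set it to false while dependencies are unavailable.
let ready = true;

// Answer GET /health with 200 when ready and 503 otherwise; everything else is a 404.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/health') {
    const status = ready ? 200 : 503;
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: ready ? 'ok' : 'unavailable' }));
    return;
  }
  res.writeHead(404, { 'Content-Type': 'text/plain' });
  res.end('not found');
}

// Listen only when PORT is provided (for example PORT=3000 node server.js),
// so the handler can be imported or tested without binding a socket.
if (process.env.PORT) {
  http.createServer(handleRequest).listen(Number(process.env.PORT));
}
```

Returning 503 while the application is not ready makes the r.ok check in the healthcheck fail, so Docker keeps the container out of the healthy state until the flag flips.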

How Senior Engineers Fix It

Audit all external dependencies in healthcheck commands when changing base images
Prefer native language solutions (Node.js HTTP module) over system utilities for portability
Create a dedicated /health or /healthz endpoint in the application for explicit health verification
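
The audit of healthcheck dependencies can be partly automated. The sketch below is one possible approach and is meant to run inside the target image, where Node.js is available: it takes a healthcheck test array, works out which executable Docker would invoke, and checks whether that executable exists. The function names healthcheckBinary and isOnPath are hypothetical, and the assumption that shell-form healthchecks need /bin/sh applies to Linux containers.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Return the executable a healthcheck `test` value would invoke, or null if there is none.
function healthcheckBinary(test) {
  // A plain string and the CMD-SHELL form both run through the container's shell.
  if (typeof test === 'string' || (Array.isArray(test) && test[0] === 'CMD-SHELL')) {
    return '/bin/sh';
  }
  if (Array.isArray(test) && test[0] === 'CMD') return test[1] ?? null;
  return null; // for example ["NONE"], which disables the healthcheck
}

// Check whether a binary is an executable file, searching PATH for bare names.
function isOnPath(binary, searchPath = process.env.PATH ?? '') {
  const candidates = binary.includes('/')
    ? [binary]
    : searchPath.split(path.delimiter).filter(Boolean).map((dir) => path.join(dir, binary));
  return candidates.some((file) => {
    try {
      fs.accessSync(file, fs.constants.X_OK);
      return fs.statSync(file).isFile();
    } catch {
      return false;
    }
  });
}

// Example: the original wget-based healthcheck from this article.
const test = ['CMD', 'wget', '--spider', 'http://localhost:3000/'];
const binary = healthcheckBinary(test);
console.log(`${binary}: ${binary && isOnPath(binary) ? 'found' : 'missing'}`);
```

Running a check like this in CI against each new base image would surface a missing tool before deployment rather than after the container is marked unhealthy.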

Add a small healthcheck script to the project that can be reused across environments:

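// healthcheck.js
// Exits 0 when the app answers with HTTP 200, and 1 on any other status or connection error.
const http = require('http');
const port = process.argv[2] || 3000; // optional port argument, defaults to 3000
const req = http.get(`http://localhost:${port}/`, (res) => {
  process.exit(res.statusCode === 200 ? 0 : 1);
});
req.on('error', () => process.exit(1));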

Document base image requirements in the project’s README
Use multi-stage builds to include debugging tools only in development images while keeping production minimal

Why Juniors Miss It

Assume all Linux distributions include standard utilities like wget, curl, and bash
Focus only on application code functionality and overlook infrastructure configuration
Lack awareness of the differences between Alpine, Debian, and distroless-based images
Do not test healthchecks in non-production environments before deploying to production
Trust that migration guides cover all edge cases (they often do not)
May not understand that healthcheck failures prevent container orchestration systems from working correctly
