Summary
This incident review examines the risks of replacing Supervisor with a custom infinite-loop Bash script that runs Laravel’s schedule:run command every minute. The script looks simple, but it introduces subtle reliability and operational hazards that become significant in real production systems.
Root Cause
The core issue is relying on a hand-rolled process manager instead of a battle‑tested, fault‑tolerant supervisor designed to keep long‑running processes healthy.
Key contributing factors include:
- No automatic restart on failure: under set -e, the first non-zero exit from php artisan schedule:run ends the loop, and nothing starts it again
- No memory or resource monitoring
- No log rotation or structured output handling
- No protection against runaway processes
- No built‑in backoff or throttling
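The first point can be reproduced without Laravel. The following is a minimal sketch, assuming a hypothetical stand-in function (fake_schedule_run) that fails on its third call; it is not the real scheduler, only an illustration of how set -e turns one failure into a permanent stop.

```shell
#!/usr/bin/env bash
# Reproduce the loop pattern in a child shell so its exit can be observed.
# fake_schedule_run is a hypothetical stand-in for `php artisan schedule:run`:
# it succeeds on iterations 1 and 2, then fails on iteration 3.
output="$(
    bash -c '
        set -e
        fake_schedule_run() { [ "$1" -ne 3 ]; }
        for i in 1 2 3 4 5; do
            fake_schedule_run "$i"   # a non-zero exit here aborts the whole script
            echo "tick $i"
        done
    ' 2>&1
)" && status=0 || status=$?

echo "loop exited with status $status"
echo "iterations completed before exit:"
echo "$output"
```

Iterations 4 and 5 never run: the child shell exits at the first failure, and without an external supervisor nothing restarts it.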
Why This Happens in Real Systems
Engineers often underestimate how fragile long-running shell loops can be. Real systems experience:
- Transient failures (network hiccups, PHP segfaults, OOM kills)
- Environment drift (updated PHP binaries, changed paths)
- Unexpected output that breaks loops or pipes
- Zombie processes when child processes aren’t reaped
- Schedule drift: each iteration takes the command's runtime plus the 60-second sleep, so runs slip later and eventually skip a minute entirely
These issues accumulate silently until the scheduler stops running entirely.
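The drift point is simple arithmetic, sketched below under stated assumptions: a fixed, hypothetical job runtime and two illustrative helper functions (drift_after and seconds_until_next_minute) that are not part of the original script. The second helper shows how a loop could sleep until the next minute boundary instead of a fixed 60 seconds, which is roughly what cron provides for free.

```shell
#!/usr/bin/env bash
# Accumulated lateness of the Nth run in a "run job, then sleep 60" loop,
# assuming every run takes the same number of seconds (an illustration only).
drift_after() {
    local runs="$1" job_seconds="$2"
    echo $(( (runs - 1) * job_seconds ))
}

# Seconds remaining until the next wall-clock minute boundary
# for a given epoch timestamp.
seconds_until_next_minute() {
    local now="$1"
    echo $(( 60 - now % 60 ))
}

# With a 5-second job, the 13th run starts a full minute late,
# so one wall-clock minute has passed with no scheduler run at all.
echo "drift at run 13: $(drift_after 13 5)s"

# Boundary-aligned alternative to `sleep 60` (shown, not executed):
# sleep "$(seconds_until_next_minute "$(date +%s)")"
```

Because schedule:run decides what is due based on the current time, a skipped minute means any task scheduled for that minute is silently missed.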
Real-World Impact
Teams relying on DIY loops often encounter:
- Missed scheduled tasks (backups, billing cycles, cleanup jobs)
- Silent failures with no alerting
- High CPU usage if the loop spins unexpectedly
- Memory leaks from PHP or the shell process
- Operational confusion during deploys or restarts
In production, these failures can cascade into:
- Stale caches
- Unsent emails
- Failed invoices
- Data corruption from skipped maintenance tasks
Example or Code
Below is the user-provided loop, reproduced as written:
#!/usr/bin/env bash
set -e
while true
do
    php artisan schedule:run
    sleep 60
done
How Senior Engineers Fix It
Experienced engineers avoid reinventing process management. They use Supervisor, systemd, or Docker restart policies (optionally paired with health checks) because these tools provide:
- Automatic restarts on crash or exit
- Configurable backoff strategies
- Resource limits (memory, CPU)
- Structured logging
- Process isolation
- Graceful shutdown handling
- Monitoring hooks for alerts
They also:
- Run schedule:run via cron every minute, which is perfectly valid
- Or run a queue worker under Supervisor and keep the scheduler under cron
The key is using mature, observable, self-healing infrastructure.
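For the cron option, a minimal sketch follows. It builds the crontab line in the form the Laravel documentation suggests; the helper name scheduler_cron_line and the path /var/www/app are placeholders chosen for illustration, not values from the incident.

```shell
#!/usr/bin/env bash
# Print the crontab line that runs Laravel's scheduler every minute.
# The project path is passed in; /var/www/app below is a placeholder.
scheduler_cron_line() {
    local app_dir="$1"
    printf '* * * * * cd %s && php artisan schedule:run >> /dev/null 2>&1\n' "$app_dir"
}

scheduler_cron_line "/var/www/app"

# To append it to the current user's crontab (not executed here, and it
# would add a duplicate line if run twice):
# ( crontab -l 2>/dev/null; scheduler_cron_line "/var/www/app" ) | crontab -
```

Discarding output with /dev/null keeps the sketch close to the documented form, but given the concern about silent failures it may be preferable to redirect to a rotated log file instead.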
Why Juniors Miss It
Junior engineers often:
- Focus on “it works on my machine” rather than long-term reliability
- Underestimate how often processes fail in production
- Assume while true is equivalent to a real process manager
- Don’t consider logging, monitoring, or restart semantics
- Haven’t yet experienced the pain of silent failures at scale
They see simplicity; seniors see operational risk.