PowerShell script that alerts on unexpected restart/shutdown only

Summary

The core issue is reliance on event IDs alone without distinguishing planned from unplanned shutdowns. The provided script triggers on any of the listed Event IDs (1074, 41, 6008, 1076, 13, 6006), which cover both scheduled maintenance and unexpected crashes. To make this actionable for production monitoring, the alert has to consider what each event means and, for requested shutdowns, who initiated them. Events 41 and 6008 indicate that Windows did not shut down cleanly, and they carry no initiating user because nothing requested the shutdown. Event 1074 records a requested shutdown together with the requesting process, reason, and user, so those fields decide whether it was expected.

Root Cause

The root cause is a lack of context in the event filtering logic. The script monitors:

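  • Event ID 1074 (User32): A user or process requested a shutdown or restart; the event records the process, reason, and user (usually planned).
  • Event ID 41 (Kernel-Power): The system rebooted without shutting down cleanly first, e.g., power loss, hard reset, or crash (unplanned).
  • Event ID 6008 (EventLog): The previous shutdown was unexpected (unplanned).
  • Event ID 6006 (EventLog): The Event Log service was stopped, which happens during a clean shutdown (planned).
  • Event ID 1076 (User32): The reason a user entered, after the fact, for an unexpected shutdown (follows an unplanned event).
  • Event ID 13 (Kernel-General): The operating system is shutting down (clean shutdown). Other providers also use ID 13, so the provider has to be checked as well.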

The script currently treats all of these as equivalent. Unexpected shutdowns show up as Event ID 41 or 6008 and have no initiating user, because nothing requested them. Planned shutdowns are recorded by Event ID 1074, whose event data names the requesting process and account: a domain user such as an administrator, or NT AUTHORITY\SYSTEM when a service such as Windows Update triggers the restart.
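One way to make this classification explicit is a small lookup keyed on provider and event ID rather than on the ID alone. This is only a sketch: the function name Get-ShutdownCategory and the category labels are hypothetical, and the provider names are the ones usually seen in the System log, so they should be confirmed against real events on the target machines.

```powershell
# Hypothetical helper: maps a provider name and event ID to a coarse category.
# Provider names are assumptions based on what the System log usually shows.
function Get-ShutdownCategory {
    param(
        [Parameter(Mandatory)] [string] $ProviderName,
        [Parameter(Mandatory)] [int]    $Id
    )
    # switch on strings is case-insensitive by default
    switch ("$ProviderName/$Id") {
        'Microsoft-Windows-Kernel-Power/41'   { 'Unexpected' }  # no clean shutdown
        'EventLog/6008'                       { 'Unexpected' }  # previous shutdown was unexpected
        'User32/1076'                         { 'Unexpected' }  # reason entered after the fact
        'User32/1074'                         { 'Requested'  }  # check the user and reason
        'EventLog/6006'                       { 'Clean'      }  # Event Log service stopped
        'Microsoft-Windows-Kernel-General/13' { 'Clean'      }  # OS is shutting down
        default                               { 'Unknown'    }  # same ID, unrelated provider
    }
}

Get-ShutdownCategory -ProviderName 'Microsoft-Windows-Kernel-Power' -Id 41
Get-ShutdownCategory -ProviderName 'User32' -Id 1074
```

Keying on the provider also avoids matching unrelated events that happen to reuse an ID such as 13.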

Why This Happens in Real Systems

In production environments, servers are rebooted for:

  • Patching (Windows Update): Usually initiated by NT AUTHORITY\SYSTEM or a patch-management service account.
  • Maintenance: Manual intervention by an administrator.
  • Orchestrated Restarts: Triggered by deployment or cluster tooling that drains and reboots the host.

However, unexpected shutdowns are caused by:

  • Power Failure: Triggers Kernel-Power (41).
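  • BSOD (Blue Screen): Typically logs a BugCheck event (1001) after the reboot, alongside Kernel-Power (41). Kernel-General (13) is not written because the OS never reaches its normal shutdown path.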
  • Hard Reset: Physical power button or loss of VM heartbeat.

A script that runs at startup and matches on event ID alone cannot tell these cases apart: a planned restart also logs 1074, 6006, and 13, so the ID list fires either way. The distinction comes from which events are present (41 or 6008 versus 1074) and, for 1074, from the process, reason, and user recorded in the event properties.
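Reading event properties by position is fragile, so one option is to read the named fields from the event XML instead. The sketch below assumes the event XML exposes its payload as EventData/Data elements; the helper name Get-EventDataMap is hypothetical, and the field name param7 for the requesting account on 1074 is an assumption that should be checked against a real event.

```powershell
# Hypothetical helper: turns the <EventData><Data Name="..."> elements of an
# event's XML into an ordered name -> value map. Unnamed elements get "#<index>".
function Get-EventDataMap {
    param([Parameter(Mandatory)] [string] $EventXml)

    $doc   = [xml]$EventXml
    $map   = [ordered]@{}
    $index = 0
    # local-name() keeps the XPath independent of the event schema namespace
    $nodes = $doc.SelectNodes("//*[local-name()='EventData']/*[local-name()='Data']")
    foreach ($node in $nodes) {
        $name = $node.GetAttribute('Name')
        if (-not $name) { $name = "#$index" }
        $map[$name] = $node.InnerText
        $index++
    }
    $map
}

# Usage on Windows (not executed here):
#   $evt    = Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'User32'; Id = 1074 } -MaxEvents 1
#   $fields = Get-EventDataMap -EventXml $evt.ToXml()
#   $fields['param7']   # assumed to hold the requesting account; verify first
```

Dumping the whole map for one known planned restart is a quick way to confirm which field holds the user, the reason, and the shutdown type before relying on any of them.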

Real-World Impact

  • Alert Fatigue: Sending emails for every scheduled reboot (e.g., Tuesday patching) causes engineers to ignore alerts.
  • Missed Incidents: If the alert is buried in a sea of “scheduled reboot” emails, an actual crash might go unnoticed until SLA violation.
  • Inefficient Triage: Operations spend time verifying if a reboot was planned rather than responding to genuine outages.

Example or Code

Here is the modified script. Events 41 and 6008 always raise an alert, because they are only logged when Windows did not shut down cleanly. Event 1074 records a requested shutdown, so it raises an alert only when the requesting user is not on a list of accounts that are expected to restart the server. Events 6006 and 13 mark a clean shutdown and carry no user, and 1076 only records a reason entered after the fact, so none of them trigger on their own.

# Source - https://stackoverflow.com/a/79867160
# Modified to alert only on unexpected or unapproved shutdowns

$EventID = 1074, 41, 6008, 1076, 13, 6006
$LogName = 'System'

# Get the latest events matching the criteria (Get-WinEvent raises an error when nothing matches)
$Events = Get-WinEvent -FilterHashtable @{ LogName = $LogName; ID = $EventID } -MaxEvents 3 -ErrorAction SilentlyContinue

# Accounts that are expected to restart this server (patching, admins, service accounts)
# Adjust these to match your environment
$PlannedUsers = @('NT AUTHORITY\SYSTEM', 'NT AUTHORITY\LOCAL SERVICE', 'DOMAIN\ServiceAccount')

# Keep only events that indicate an unexpected or unapproved shutdown:
#   41, 6008       - Windows did not shut down cleanly; there is no user to check
#   1074           - a shutdown was requested; alert only if the requester is not in $PlannedUsers
#   1076, 13, 6006 - never trigger on their own
$UnexpectedEvents = $Events | Where-Object {
    if ($_.Id -in @(41, 6008)) {
        $true
    }
    elseif ($_.Id -eq 1074) {
        # On 1074 the requesting account is usually Properties[6]
        # (Properties[4] is the shutdown type); verify against a real event
        $_.Properties[6].Value -notin $PlannedUsers
    }
    else {
        $false
    }
}

if ($UnexpectedEvents) {
    $css = '<style> table, th, td { border: 1px solid } </style>'
    $Subject = "ALERT: $env:COMPUTERNAME Unexpected Restart/Shutdown! - Event ID(s) $($UnexpectedEvents.Id -join ', ') - $(Get-Date -Format 'dd-MM-yyyy HH:mm')"

    # The User column is only meaningful for 1074; other events leave it empty
    $Body = $UnexpectedEvents |
        Select-Object Id, @{ N = 'Source'; E = 'ProviderName' }, @{ N = 'User'; E = { if ($_.Id -eq 1074) { $_.Properties[6].Value } } }, Message |
        ConvertTo-Html -PreContent $css, '<p>Server has restarted or shut down unexpectedly with event ID(s):</p>' |
        Out-String

    $EmailParams = @{
        Subject    = $Subject
        Body       = $Body
        Priority   = [System.Net.Mail.MailPriority]::High
        From       = 'fromaddress@example.com'
        To         = 'toaddress@example.com'
        SmtpServer = 'your smtp address'
        BodyAsHtml = $true
    }
    Send-MailMessage @EmailParams
}

How Senior Engineers Fix It

Senior engineers implement state persistence and classification logic:

  1. Check the Event Type and Requesting User: The most direct way to distinguish planned from unplanned is to check whether a 1074 event precedes the boot and who requested it.
    • Planned: A 1074 event names an expected account (e.g., CORP\Admin, or NT AUTHORITY\SYSTEM if initiated by a script/patch).
    • Unplanned: No 1074 event precedes the boot, and Kernel-Power 41 or 6008 is logged afterwards; these events carry no requesting user.
  2. Differential Comparison: Instead of running on startup, run the script on a schedule (e.g., every 5 minutes). Compare the current uptime with the last recorded uptime in a persistent store (like a file or database). If the uptime decreased without a scheduled maintenance window record, trigger an alert.
  3. Maintenance Window Tagging: Integrate with a ticketing system (like ServiceNow) or a simple config file. If a reboot occurs during a known maintenance window, suppress the alert.
  4. Specific Event Analysis:
    • Event ID 41 (Kernel-Power): Always treat as critical unless explicitly whitelisted (e.g., during a known power test).
    • Event ID 1074: Parse the message string for keywords like “Planned”, “Windows Update”, or specific service names (e.g., “VMware Tools”).
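The maintenance-window suppression in item 3 can be sketched as a pure function that takes the event time and a list of windows. The function name Test-InMaintenanceWindow and the Start/End property names are hypothetical; where the windows come from (a CSV file, a ticketing export) is left open.

```powershell
# Hypothetical helper: returns $true when $Time falls inside any window.
# Each window is any object whose Start and End values cast to [datetime].
function Test-InMaintenanceWindow {
    param(
        [Parameter(Mandatory)] [datetime] $Time,
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Windows
    )
    foreach ($window in $Windows) {
        $windowStart = [datetime]$window.Start
        $windowEnd   = [datetime]$window.End
        if ($Time -ge $windowStart -and $Time -le $windowEnd) { return $true }
    }
    return $false
}

# Example data for illustration only; real windows might be loaded with Import-Csv.
$windows = @(
    [pscustomobject]@{ Start = '2024-01-09T22:00:00'; End = '2024-01-10T02:00:00' }
)

Test-InMaintenanceWindow -Time ([datetime]'2024-01-09T23:30:00') -Windows $windows   # True
Test-InMaintenanceWindow -Time ([datetime]'2024-01-10T09:00:00') -Windows $windows   # False
```

An alerting script would call this with the TimeCreated value of the 1074 event and skip the email when it returns $true. Events 41 and 6008 are probably best left unsuppressed, since a crash during a maintenance window is still a crash.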

Why Juniors Miss It

  • Event ID Confusion: Juniors often assume each Event ID maps to exactly one scenario. They miss that Event ID 1074 covers user-initiated shutdowns, scripted restarts, and restarts requested by software updates, and that Event ID 41 means only that the system did not shut down cleanly, whether the cause was a power loss, a hard reset, or a crash.
  • Lack of User Context: The junior script filters only by ID and LogName. It never inspects the event properties (on 1074, Properties[6] usually holds the requesting account name). Without checking who requested the shutdown, all requested shutdowns look the same.
  • Architecture Blindness: Juniors often write “run once” scripts triggered by Task Scheduler (On Startup). This makes it impossible to compare “before” and “after” states. They don’t realize that to detect an unexpected event, you need a baseline (previous state) to compare against.
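A minimal sketch of such a baseline follows, assuming the script runs on a schedule and keeps its state in a small text file. It compares boot timestamps rather than uptime values, which answers the same question (has the machine booted since the last run?) without depending on how much time passed between runs. The function name Test-RebootSinceLastRun, the state-file path, and the 60-second tolerance are all hypothetical.

```powershell
# Hypothetical helper: compares the current boot time with the one saved by the
# previous run, then overwrites the saved value. Returns $true when the machine
# has booted again since the last run. The first run only records a baseline.
function Test-RebootSinceLastRun {
    param(
        [Parameter(Mandatory)] [datetime] $CurrentBootTime,
        [Parameter(Mandatory)] [string]   $StatePath,
        [int] $ToleranceSeconds = 60   # assumed allowance for small clock adjustments
    )
    $rebooted = $false
    if (Test-Path -LiteralPath $StatePath) {
        $saved    = (Get-Content -LiteralPath $StatePath -Raw).Trim()
        $style    = [System.Globalization.DateTimeStyles]::RoundtripKind
        $previous = [datetime]::Parse($saved, [cultureinfo]::InvariantCulture, $style)
        $rebooted = $CurrentBootTime -gt $previous.AddSeconds($ToleranceSeconds)
    }
    # Round-trip ("o") format keeps the stored value culture-independent
    Set-Content -LiteralPath $StatePath -Value $CurrentBootTime.ToString('o')
    return $rebooted
}

# Usage on Windows (not executed here; the path is illustrative and its folder must exist):
#   $boot = (Get-CimInstance -ClassName Win32_OperatingSystem).LastBootUpTime
#   if (Test-RebootSinceLastRun -CurrentBootTime $boot -StatePath 'C:\ProgramData\RebootMonitor\lastboot.txt') {
#       # a reboot happened since the last run: now inspect 1074 / 41 / 6008 to classify it
#   }
```

This only detects that a reboot occurred. Whether it was expected still depends on the event and maintenance-window checks described above.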