Summary
A critical resource exhaustion incident was identified in a production SCADA environment running on Windows Server 2022. The system experienced a continuous handle leak within the host process responsible for executing VBScript via COM automation. Specifically, the repeated instantiation of WScript.Shell and Scripting.FileSystemObject led to a cumulative increase in kernel handles, eventually reaching millions of handles and causing OS instability. The phenomenon exhibited non-deterministic behavior across identical OS builds, suggesting a deep-seated interaction between the COM runtime and specific patch levels or system configurations.
Root Cause
The investigation points to a failure in COM reference counting and in process lifecycle management at the point where the script engine hands off to external shell processes. Key factors include:
- Dangling COM References: Although `Set sh = Nothing` is called in the script, releasing the script’s reference does not appear to close the kernel handles that the underlying COM object opened in the host process.
- Process Detachment Latency: When `WScript.Shell.Run` is called with the `bWaitOnReturn` parameter set to `False`, the shell process is spawned as a detached child. In certain Windows Server 2022 builds, the parent-child handle relationship remains active in the kernel even after the script object is “released,” preventing the handle from being reclaimed.
- Kernel-Mode Object Retention: The leak occurs not in the script’s memory space but in the host process’s kernel handle table. This explains why application-level memory profilers often fail to detect the issue while `Process Explorer` shows massive growth in the handle count.
- Environmental Non-Determinism: Differences in installed Windows updates (KBs) or in applied security baselines (which may hook process creation via EDR/antivirus) appear to influence whether the COM subsystem successfully cleans up the process handles.
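Because the behavior differs between hosts that are nominally on the same build, a first step is to capture each host’s build and installed updates and diff the results. The following is a minimal sketch using the standard WMI classes `Win32_OperatingSystem` and `Win32_QuickFixEngineering`; the script file name in the comment is hypothetical, and `Win32_QuickFixEngineering` may not list every kind of installed update.

```vbscript
' Sketch: dump OS build and installed hotfixes so two hosts can be compared.
' Example run (file names are hypothetical):
'   cscript //nologo ListHotfixes.vbs > hostA.txt
Option Explicit

Dim wmi, os, fix

Set wmi = GetObject("winmgmts:\\.\root\cimv2")

' Record the OS build first, since "identical builds" is the claim being checked.
For Each os In wmi.ExecQuery("SELECT Caption, Version, BuildNumber FROM Win32_OperatingSystem")
    WScript.Echo os.Caption & " " & os.Version & " (build " & os.BuildNumber & ")"
Next

' One line per installed update: ID, description, install date (may be empty).
For Each fix In wmi.ExecQuery("SELECT HotFixID, Description, InstalledOn FROM Win32_QuickFixEngineering")
    WScript.Echo fix.HotFixID & vbTab & fix.Description & vbTab & fix.InstalledOn
Next
```

Running this on a leaking host and on a healthy one, then comparing the two output files, narrows the search to the updates present on only one side. It does not capture EDR or security-baseline differences, which would need a separate check.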
Why This Happens in Real Systems
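In controlled development environments, a script runs once or twice, so a leak of a few handles per run goes unnoticed. In High-Availability (HA) industrial systems like SCADA, the same pattern repeats every few seconds for months. As an illustration, one leaked handle per execution at one execution every 5 seconds adds 17,280 handles per day, which passes one million in about 58 days.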
- The “Ghost” Handle Problem: When automation interacts with the Windows Shell, it creates a bridge between the script engine and native OS primitives. If every handle opened on one side of the bridge is not closed on the other, handles “leak” into the kernel.
- Dependency on Patch Levels: Windows updates frequently touch the RPC (Remote Procedure Call) and COM+ infrastructure. A security patch might change how process handles are inherited, inadvertently breaking the cleanup logic of older COM objects like `WScript.Shell`.
- Indirect Leakage: The leak isn’t in the code logic, but in the operating system’s bookkeeping of the automation interface.
Real-World Impact
- System Instability: As the handle count reaches millions, the kernel spends increasing amounts of CPU time traversing the Handle Table, leading to high System CPU usage.
- Resource Starvation: Eventually, the process hits the per-process handle limit (about 16 million handles) or exhausts the kernel pool memory that backs the handle table, preventing the creation of new threads, sockets, or files and effectively crashing the SCADA service.
- Cascading Failures: Because the OS becomes unstable, other critical services on the same Windows Server may fail to respond, leading to a total loss of control in the industrial environment.
Example or Code
' VULNERABLE PATTERN
Sub ExecuteCommand()
    Dim sh
    ' Every execution creates new handles that are not properly reaped by the kernel
    Set sh = CreateObject("WScript.Shell")
    ' Window style 0 = hidden; bWaitOnReturn = False returns without waiting for the child
    sh.Run "cmd /c echo task_running", 0, False
    Set sh = Nothing
End Sub

' PROBABLE REMEDIATION PATTERN (if possible within the engine)
' Use a singleton pattern to reuse the object, or move logic to a native DLL
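The singleton idea named in the remediation comment could look roughly like the sketch below. It assumes the host keeps global script state alive between invocations, which is not true of every scripting host; `g_shell` and `GetShell` are illustrative names, not part of the original system. Whether this removes the leak on an affected build would still need to be confirmed by measuring the handle count.

```vbscript
Option Explicit

' Module-level reference, created once and reused for the lifetime of the host.
' (An uninitialized variable is Empty, so it is set to Nothing explicitly
' to make the "Is Nothing" test below valid.)
Dim g_shell
Set g_shell = Nothing

' Returns the shared WScript.Shell instance, creating it on first use.
Function GetShell()
    If g_shell Is Nothing Then
        Set g_shell = CreateObject("WScript.Shell")
    End If
    Set GetShell = g_shell
End Function

Sub ExecuteCommand()
    Dim sh
    Set sh = GetShell()
    ' Window style 0 = hidden; False = do not wait for the child to exit.
    sh.Run "cmd /c echo task_running", 0, False
    ' No "Set sh = Nothing" on the shared object: the global keeps it alive.
End Sub

' Example: three calls reuse one COM object instead of creating three.
Dim i
For i = 1 To 3
    ExecuteCommand
Next
```

The design trade-off is that the shared object is never released while the host runs, so any state it holds lives as long as the process does.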
How Senior Engineers Fix It
Senior engineers move away from “convenience” objects toward deterministic resource management.
- Singleton Pattern: Instead of creating and destroying the object on every execution, instantiate the `WScript.Shell` object once at the start of the application lifecycle and reuse it. This stops the cycle of allocation/deallocation.
- Decoupling via Middleware: Move the heavy lifting (shell commands, file system manipulation) out of the VBScript engine and into a compiled C++ or C# service. Communication is handled via a stable IPC (Inter-Process Communication) mechanism such as named pipes or gRPC.
- Process Monitoring & Recycling: Implement an automated watchdog that monitors the `HandleCount` via WMI. If the count exceeds a predefined threshold, the service is gracefully restarted.
- Environment Standardization: Use Infrastructure as Code (IaC) to ensure that the Windows Server builds are identical, and audit in particular the KB updates that touch the COM runtime.
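The watchdog described above can be sketched with the WMI classes `Win32_Process` (for `HandleCount`) and `Win32_Service` (for `StopService` and `StartService`). The process name, service name, and threshold below are hypothetical placeholders, and the script is assumed to run elevated on a schedule, for example from Task Scheduler.

```vbscript
Option Explicit

' Hypothetical placeholders; replace with the real process, service, and limit.
Const PROCESS_NAME = "ScadaHost.exe"
Const SERVICE_NAME = "ScadaService"
Const HANDLE_LIMIT = 50000

' Requests a stop, waits up to 60 seconds for it to complete, then starts again.
Sub RestartService(wmi, svcName)
    Dim svc, tries
    Set svc = wmi.Get("Win32_Service.Name='" & svcName & "'")
    WScript.Echo "StopService returned " & svc.StopService()
    ' StopService only requests the stop, so poll until the state is Stopped.
    For tries = 1 To 60
        Set svc = wmi.Get("Win32_Service.Name='" & svcName & "'")
        If svc.State = "Stopped" Then Exit For
        WScript.Sleep 1000
    Next
    If svc.State <> "Stopped" Then
        WScript.Echo "Service did not stop in time; not restarting."
        Exit Sub
    End If
    WScript.Echo "StartService returned " & svc.StartService()
End Sub

Dim wmi, proc, mustRestart
Set wmi = GetObject("winmgmts:\\.\root\cimv2")
mustRestart = False

' Check every instance of the process against the threshold.
For Each proc In wmi.ExecQuery("SELECT ProcessId, HandleCount FROM Win32_Process " & _
                               "WHERE Name = '" & PROCESS_NAME & "'")
    WScript.Echo "PID " & proc.ProcessId & " handles: " & proc.HandleCount
    If proc.HandleCount > HANDLE_LIMIT Then mustRestart = True
Next

If mustRestart Then RestartService wmi, SERVICE_NAME
```

A return value of 0 from `StopService` or `StartService` means the request was accepted. The threshold should sit well above the process’s normal steady-state handle count so that the watchdog only fires on sustained growth.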
Why Juniors Miss It
- Focus on Memory vs. Handles: Juniors typically look for Memory Leaks (RAM). They see the application’s private bytes are stable and assume there is no leak, overlooking the Kernel Handle Count.
- Reliance on “Set = Nothing”: There is a common misconception that setting an object to `Nothing` in VBScript is a guarantee of immediate resource cleanup. In COM, `Nothing` only releases that variable’s reference; the object is destroyed only when its last reference is gone, and kernel-level cleanup still depends on the object’s implementation and the COM runtime behaving correctly.
- Isolation of Logic: Juniors assume that if their code is syntactically correct and follows “best practices,” the underlying OS must be behaving correctly. They fail to account for the Leaky Abstraction of the Windows COM layer.
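The memory-versus-handles blind spot is easy to check by sampling both counters side by side. The sketch below logs `HandleCount` next to `PrivatePageCount` from `Win32_Process` at a fixed interval; the process name, sample count, and interval are hypothetical. A leak of the kind described here would show the handle column climbing while the private-memory column stays roughly flat.

```vbscript
Option Explicit

' Hypothetical placeholders; adjust for the process under investigation.
Const PROCESS_NAME = "ScadaHost.exe"
Const SAMPLES = 5
Const INTERVAL_MS = 60000

Dim wmi, proc, i
Set wmi = GetObject("winmgmts:\\.\root\cimv2")

' Tab-separated output so the log can be pasted into a spreadsheet.
WScript.Echo "time" & vbTab & "pid" & vbTab & "handles" & vbTab & "private_page_count"

For i = 1 To SAMPLES
    For Each proc In wmi.ExecQuery("SELECT ProcessId, HandleCount, PrivatePageCount " & _
                                   "FROM Win32_Process WHERE Name = '" & PROCESS_NAME & "'")
        WScript.Echo Now & vbTab & proc.ProcessId & vbTab & _
                     proc.HandleCount & vbTab & proc.PrivatePageCount
    Next
    ' Wait between samples, but not after the last one.
    If i < SAMPLES Then WScript.Sleep INTERVAL_MS
Next
```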