Summary
A graphics application using Vulkan, GLFW, and the X11 windowing system became progressively unstable during window resizing. The failures began as intermittent failed_create_swapchain errors and escalated to a fatal X11 BadValue error that terminated the application process. The root cause is a failure to handle the Zero-Extent state (minimization or rapid resizing), combined with a resource leak involving the Old Swapchain mechanism.
Root Cause
The failure is driven by three interlocking issues:
- Zero-Dimension Extents: During rapid resizing or minimization, the window dimensions can momentarily become 0x0. Passing a width or height of zero to a swapchain creation function is invalid under the Vulkan specification, and here it results in a creation error.
- Improper Old Swapchain Management: The code attempts to use swapchainBuilder.set_old_swapchain(swapChain). While this is a valid optimization, if the old_swapchain is passed but creation of the new one fails, the application enters an inconsistent state where the previous swapchain might be destroyed prematurely or left in a “zombie” state. Vulkan retires a swapchain passed as oldSwapchain even when vkCreateSwapchainKHR fails, so that handle can no longer be presented to or reused as oldSwapchain, yet it still has to be destroyed.
- Resource Leak/Race Condition: The error log shows Failed to create swapchain followed by the destruction of the existing swapchain. The later X11 BadValue error indicates that the application sent a request to the X server (likely via GLFW or the Vulkan surface extension) with a parameter outside the accepted range, most plausibly an attempt to create a surface with an invalid size or with a handle already invalidated by a failed previous attempt.
Why This Happens in Real Systems
In production environments, hardware and software interfaces are asynchronous and non-atomic.
- Event Buffering: OS window managers (like KDE/X11) buffer resize events. The application might receive a “resize to 0” event that the user never intended, but the software must handle it.
- Driver State Machines: Graphics drivers maintain complex internal states. If an application fails to clean up a VkSwapchainKHR correctly before attempting to create a new one with the “old swapchain” flag, the driver’s internal state machine can become desynchronized.
- Window Manager Latency: There is a temporal gap between the window being resized and the surface capabilities being updated. If the application queries capabilities and immediately acts on them without checking for zero values, it hits a race condition.
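One way to absorb buffered resize events is to record only the most recent size and let the render loop act on it once per frame. The sketch below is a minimal illustration of that idea, not the application's actual code: ResizeTracker and its members are hypothetical names, and in a GLFW program onResize would presumably be called from the callback registered with glfwSetFramebufferSizeCallback.

```cpp
#include <cstdint>

// Hypothetical helper: coalesces a burst of resize events into one pending size.
class ResizeTracker {
public:
    // Called for every resize event; only the latest size is kept.
    // Negative values are treated as zero.
    void onResize(int width, int height) {
        pendingWidth_ = width > 0 ? static_cast<std::uint32_t>(width) : 0;
        pendingHeight_ = height > 0 ? static_cast<std::uint32_t>(height) : 0;
        dirty_ = true;
    }

    // Called once per frame. Returns true and writes the size when a swapchain
    // rebuild is both pending and possible. A zero-sized window stays pending
    // until a later event reports a usable size.
    bool consume(std::uint32_t& width, std::uint32_t& height) {
        if (!dirty_ || pendingWidth_ == 0 || pendingHeight_ == 0) {
            return false;
        }
        width = pendingWidth_;
        height = pendingHeight_;
        dirty_ = false;
        return true;
    }

private:
    std::uint32_t pendingWidth_ = 0;
    std::uint32_t pendingHeight_ = 0;
    bool dirty_ = false;
};
```

With this shape, an unintended “resize to 0” event in the middle of a burst never reaches swapchain creation; the render loop only sees the last non-zero size.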
Real-World Impact
- Application Crashes: Instead of a smooth resize, the end user sees the application terminate abruptly, sometimes with an “Application has stopped working” style dialog.
- System Instability: In extreme cases involving X11, invalid opcodes or BadValue errors can lead to ghost windows or hanging window managers, forcing a user to restart their entire desktop session.
- Degraded UX: Intermittent failures during resizing make the software feel “unpolished” and unreliable.
Example Code
// The flawed logic in the provided snippet
void VulkanWindow::createSwapChain(uint32_t &width, uint32_t &height) {
    // ... (fetching capabilities)

    // BUG: If width or height is 0 (minimization), this proceeds to invalid creation
    width = capabilities.maxImageExtent.width;
    height = capabilities.maxImageExtent.height;

    // BUG: Passing an old swapchain that might already be invalid/destroyed
    if (swapChain != VK_NULL_HANDLE) {
        swapchainBuilder.set_old_swapchain(swapChain);
    }

    auto swapchainResult = swapchainBuilder.build();
    if (!swapchainResult.has_value()) {
        // BUG: Destroying the old swapchain here without ensuring the
        // new one actually exists can lead to losing the only valid surface handle.
        if (swapChain) vkDestroySwapchainKHR(nri.getDevice(), swapChain, nullptr);
        // ...
    }
}
How Senior Engineers Fix It
A senior engineer implements defensive programming and state validation:
- Zero-Extent Guard: Immediately check if width or height is zero. If so, the application should pause rendering and wait for the next frame/resize event rather than attempting to create a swapchain.
- Atomic Swapchain Re-creation: Ensure the old swapchain is only destroyed after the new swapchain is successfully created and validated.
- Strict Capability Validation: Instead of blindly using maxImageExtent, ensure the requested dimensions fall within the minImageExtent and maxImageExtent provided by the hardware.
- Error Recovery Path: If swapchain creation fails, implement a fallback that clears all associated image views and framebuffers before attempting a retry, rather than just trying to reuse the broken state.
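The guard, the clamp, and the commit-on-success ordering can be sketched together. This is a minimal illustration under stated assumptions rather than a drop-in fix: Extent2D, SurfaceLimits, and SwapchainHandle are hypothetical stand-ins for VkExtent2D, the extent fields of VkSurfaceCapabilitiesKHR, and VkSwapchainKHR, so the block compiles without Vulkan headers, and the create and destroy callbacks are placeholders for whatever the application uses (for example the swapchain builder and vkDestroySwapchainKHR).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-ins so the sketch compiles without Vulkan headers.
// A handle value of 0 plays the role of VK_NULL_HANDLE.
struct Extent2D { std::uint32_t width; std::uint32_t height; };
struct SurfaceLimits { Extent2D minImageExtent; Extent2D maxImageExtent; };
using SwapchainHandle = std::uint64_t;

enum class Recreate { Created, Deferred, Failed };

// Zero-extent guard plus capability clamp. Returns false when no swapchain
// should be created right now (minimized window or unusable limits).
bool chooseExtent(Extent2D requested, const SurfaceLimits& limits, Extent2D& out) {
    const Extent2D lo = limits.minImageExtent;
    const Extent2D hi = limits.maxImageExtent;
    if (requested.width == 0 || requested.height == 0) return false;
    if (hi.width == 0 || hi.height == 0) return false;
    if (lo.width > hi.width || lo.height > hi.height) return false;
    out.width = std::min(std::max(requested.width, lo.width), hi.width);
    out.height = std::min(std::max(requested.height, lo.height), hi.height);
    return true;
}

// Rebuilds the swapchain without leaving `current` pointing at a destroyed
// or retired handle. `create(extent, old)` must return 0 on failure and
// `destroy(handle)` must release a handle; both are caller-supplied.
template <typename CreateFn, typename DestroyFn>
Recreate recreateSwapchain(SwapchainHandle& current, Extent2D requested,
                           const SurfaceLimits& limits,
                           CreateFn create, DestroyFn destroy) {
    Extent2D extent{0, 0};
    if (!chooseExtent(requested, limits, extent)) {
        return Recreate::Deferred;  // nothing is created or destroyed
    }
    const SwapchainHandle old = current;
    const SwapchainHandle fresh = create(extent, old);
    // The old handle is released only after the creation attempt. Vulkan
    // retires a swapchain passed as oldSwapchain even when creation fails,
    // so it is destroyed here and never handed to a later attempt.
    if (old != 0) destroy(old);
    current = fresh;  // 0 after a failure, so the next attempt starts clean
    return fresh != 0 ? Recreate::Created : Recreate::Failed;
}
```

A caller would pause rendering on Deferred and retry on Failed once a new resize event arrives. A real implementation would also need to wait for in-flight frames before destroying the old handle, and to rebuild the image views and framebuffers that depend on the swapchain.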
Why Juniors Miss It
- Happy Path Bias: Juniors often write code assuming the window will always have a valid, positive size. They test by dragging the window corner, not by minimizing it or resizing it at extreme speeds.
- API Misunderstanding: They see set_old_swapchain as a “magic” optimization provided by the API and don’t realize that it increases the complexity of the error-handling state machine.
- Ignoring Edge Cases: They treat VK_ERROR_OUT_OF_DATE_KHR or failed_create_swapchain as fatal errors rather than expected lifecycle events that require specific cleanup protocols.
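Treating these results as lifecycle events usually means mapping each one to an explicit recovery action instead of a single fatal path. The sketch below shows one possible policy, offered as an assumption rather than the only correct mapping: PresentStatus is a hypothetical local mirror of VK_SUCCESS, VK_SUBOPTIMAL_KHR, VK_ERROR_OUT_OF_DATE_KHR, VK_ERROR_SURFACE_LOST_KHR, and VK_ERROR_DEVICE_LOST, declared here so the block compiles without Vulkan headers, and Action and actionFor are illustrative names.

```cpp
// Hypothetical mirror of the presentation-related result codes.
enum class PresentStatus { Success, Suboptimal, OutOfDate, SurfaceLost, DeviceLost };

// Recovery actions the render loop can take.
enum class Action { Continue, RecreateSwapchain, RecreateSurface, Abort };

// One possible policy: only a lost device is treated as fatal.
Action actionFor(PresentStatus status) {
    switch (status) {
        case PresentStatus::Success:
            return Action::Continue;
        case PresentStatus::Suboptimal:
            // Still presentable, but the swapchain no longer matches the
            // surface; this policy rebuilds it at the next opportunity.
            return Action::RecreateSwapchain;
        case PresentStatus::OutOfDate:
            // Expected after a resize: rebuild before presenting again.
            return Action::RecreateSwapchain;
        case PresentStatus::SurfaceLost:
            // The surface must be recreated before any new swapchain.
            return Action::RecreateSurface;
        case PresentStatus::DeviceLost:
            return Action::Abort;
    }
    return Action::Abort;  // unreachable for valid enum values
}
```

Making the mapping explicit keeps the cleanup path for each result in one place, so a resize-triggered failure is handled as a routine rebuild rather than a crash.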