Summary
During the development of a 16-bit bootloader, a critical bug was identified where the print function failed to output the intended string, instead printing only a single character (or truncated output) depending on the state of the stack. The issue was traced back to stack corruption caused by the improper use of pusha and popa in conjunction with a specific interrupt behavior in Real Mode. This postmortem explores why “correct-looking” assembly code fails in low-level environments.
Root Cause
The root cause is a stack pointer misalignment and register corruption during the execution of the BIOS interrupt int 0x10.
- The pusha/popa Trap: The developer used pusha to save all general-purpose registers before the loop and popa to restore them.
- Stack Pointer Shift: The mov sp, 0x7C00 instruction sets the stack to grow downwards from the bootloader start address. It sets only the offset, so the stack is in a known place only if ss also holds a known value.
- Interrupt Side Effects: BIOS interrupts like int 0x10 (Teletype output) run on the caller's stack and may rely on, or fail to preserve, specific register states.
- The “S” Symptom: When popa executes, if the stack pointer (sp) has been modified by the BIOS or the stack no longer lines up with the pushed values, the registers are restored with garbage data. Specifically, the si register (which holds the string pointer) is overwritten with a value that points either to memory containing only the character ‘s’ or to an invalid address, so the next loop that reads through si terminates prematurely.
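One way to test the stack-shift hypothesis directly is to record sp right after pusha and compare it just before popa. The following is a minimal diagnostic sketch in NASM syntax, not the original routine: the names print_checked and saved_sp and the '!' marker are hypothetical, and it assumes ds addresses both the string and the saved_sp variable.

```nasm
bits 16

; Hypothetical diagnostic variant of print.
; Input: DS:SI -> NUL-terminated string. Prints '!' if SP moved
; between pusha and popa.
print_checked:
    pusha                   ; pushes AX, CX, DX, BX, original SP, BP, SI, DI
    mov [saved_sp], sp      ; remember SP once the 8 words are on the stack
    cld                     ; DF = 0 so lodsb walks forward
.loop:
    lodsb                   ; AL = [DS:SI], SI += 1
    test al, al             ; NUL terminator?
    jz .check
    mov ah, 0x0E            ; reloaded every pass in case the BIOS changed AH
    xor bx, bx              ; BH = display page 0
    int 0x10
    jmp .loop
.check:
    cmp sp, [saved_sp]      ; equal means the frame is where pusha left it
    je .done
    mov ax, 0x0E21          ; AH = 0x0E (teletype), AL = '!' flags the imbalance
    xor bx, bx
    int 0x10
    mov sp, [saved_sp]      ; point SP back at the pusha frame
.done:
    popa                    ; pops DI, SI, BP, skips saved SP, then BX, DX, CX, AX
    ret

saved_sp: dw 0
```

If the '!' never appears, sp itself is not moving. The check does not catch the frame contents being overwritten in place, for example by a stack that overlaps code or data.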
Why This Happens in Real Systems
In modern operating systems, the stack is managed by a robust kernel with guard pages and strict abstractions. In Real Mode (16-bit), you are operating in a “wild west” environment:
- No Memory Protection: Any instruction can overwrite any memory address, including the stack.
- BIOS Non-Determinism: BIOS interrupts are written in assembly and are not guaranteed to be “transparent.” They may use the stack for their own internal logic, potentially clobbering values if the stack is too close to the code or data segments.
- Segment/Offset Confusion: In 16-bit mode, memory is addressed via segment:offset. If ds or es are modified incorrectly during an interrupt, the lodsb instruction (which uses ds:si) will fetch data from the wrong physical memory location.
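The segment:offset point can be made concrete with the real-mode address formula, physical = segment * 16 + offset. The sketch below shows a bootloader entry sequence that pins down ds, es, ss, and sp before any string is printed. It assumes the common convention of org 0x7C00 with all segment registers set to zero, which the original code may not follow, and the labels start and .halt are hypothetical.

```nasm
bits 16
org 0x7C00                  ; assumption: labels are offsets within segment 0

start:
    cli                     ; no interrupts while SS:SP is half-updated
    xor ax, ax
    mov ds, ax              ; DS = 0: lodsb reads from 0x0000:SI
    mov es, ax
    mov ss, ax              ; SS = 0: the stack lives at 0x0000:SP
    mov sp, 0x7C00          ; first push lands at 0x7BFE, just below the code
    sti
    cld                     ; DF = 0 so lodsb increments SI

    ; physical address = segment * 16 + offset
    ;   DS = 0x0000, SI = 0x7C40 -> 0x00000 + 0x7C40 = 0x07C40
    ;   DS = 0x07C0, SI = 0x0040 -> 0x07C00 + 0x0040 = 0x07C40 (same byte)
    ;   DS = 0x07C0, SI = 0x7C40 -> 0x07C00 + 0x7C40 = 0x0F840 (wrong byte)
.halt:
    hlt
    jmp .halt
```

The third case is the classic mismatch: offsets computed for org 0x7C00 combined with ds = 0x07C0 make lodsb read 0x7C00 bytes past the intended string.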
Real-World Impact
- Silent Failures: The code does not crash with a “Segmentation Fault”; it simply produces incorrect output, making it incredibly difficult to debug without a hardware debugger or QEMU logs.
- Unpredictable Bootstrapping: A bootloader that prints the wrong characters might appear to work on one machine but fail on another due to different BIOS implementations, leading to non-portable software.
- System Instability: If the stack pointer is corrupted, the next ret instruction will jump to a random memory address, leading to a complete system hang or a reboot loop.
Example or Code
The faulty logic involves saving the entire state when only specific registers are needed, creating a dependency on the stack integrity.
print:
    pusha               ; Save ALL registers (AX, CX, DX, BX, SP, BP, SI, DI)
    mov ah, 0x0E        ; BIOS Teletype function
.loop:
    lodsb               ; Loads [ds:si] into AL and increments SI
    test al, al         ; Check for null terminator
    jz .done
    int 0x10            ; Call BIOS
    jmp .loop
.done:
    popa                ; RESTORES registers, but SI might be corrupted if stack shifted
    ret
How Senior Engineers Fix It
Senior engineers apply the Principle of Least Privilege to register management. Instead of saving everything, only save what is strictly necessary.
- Selective Pushing: Instead of pusha, only push si and push di (or whatever is actually being used). This reduces the “surface area” for stack corruption.
- Stack Isolation: Ensure the stack is placed in a known-safe memory region (for example, growing down from 0x0000:0x7C00 into the free memory below the bootloader, with ss set explicitly), and set the data segment (ds) explicitly so that lodsb does not fetch from the wrong segment.
- Explicit Register Management: Avoid relying on the side effects of popa. If you need si to remain constant for a caller, manually save and restore it.
Corrected Implementation Pattern:
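print:
    push ax             ; Only save what we actually change
    push bx
    push si
    mov ah, 0x0E        ; BIOS Teletype function
    xor bx, bx          ; BH = display page 0, which int 0x10 reads as an input
.loop:
    lodsb               ; Loads [ds:si] into AL and increments SI
    test al, al         ; Check for null terminator
    jz .done
    int 0x10
    jmp .loop
.done:
    pop si              ; Restore SI specifically
    pop bx
    pop ax
    ret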
Why Juniors Miss It
- Over-reliance on “Safe” Patterns: Juniors often see pusha/popa as a “magic bullet” to make functions re-entrant or safe, not realizing it creates a heavy dependency on the stack’s integrity.
- Abstracted Thinking: Most programmers are used to high-level languages where the stack is an abstraction. They struggle to visualize the physical movement of the Stack Pointer during a BIOS interrupt.
- Debugging Tool Gap: Juniors often debug using printf or high-level debuggers. In bootloader development, you must debug using register inspection (e.g., gdb with QEMU), which requires understanding exactly what si holds after every instruction.
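When no external debugger is attached, a small hex-dump routine can stand in for register inspection. The following is a sketch of such a helper in NASM syntax. The name print_hex16 is hypothetical, it assumes a working stack, and it relies on the BIOS teletype call preserving cx and dx, which is typical but not guaranteed on every BIOS.

```nasm
bits 16

; Hypothetical helper: prints the 16-bit value in DX as four hex digits.
; Restores AX, BX, CX, and DX before returning.
print_hex16:
    push ax
    push bx
    push cx
    push dx
    mov cx, 4               ; four nibbles in a 16-bit value
.next:
    rol dx, 4               ; rotate the top nibble into the low 4 bits
    mov al, dl
    and al, 0x0F            ; isolate the nibble (0..15)
    add al, '0'             ; 0..9 -> '0'..'9'
    cmp al, '9'
    jbe .emit
    add al, 7               ; 10..15 -> 'A'..'F' ('A' - '9' - 1 = 7)
.emit:
    mov ah, 0x0E            ; BIOS teletype output
    xor bx, bx              ; BH = display page 0
    int 0x10
    loop .next              ; CX -= 1, repeat until all nibbles are printed
    pop dx
    pop cx
    pop bx
    pop ax
    ret
```

Copying si into dx and calling print_hex16 immediately before and after the print routine shows whether the pointer the caller sees has changed, without leaving the bootloader.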