Summary
This postmortem analyzes a wkhtmltopdf layout defect where repeated table headers overlap the first row on subsequent pages when rows contain long, multi‑line text. Although the HTML renders correctly in browsers, wkhtmltopdf’s patched Qt layout engine fails to recalculate row heights before placing repeated headers, causing truncation and overlap.
Root Cause
The failure stems from wkhtmltopdf’s outdated WebKit/Qt rendering engine, which has known limitations around:
- Incorrect page-break calculations when table rows dynamically expand due to wrapped text
- Non‑reflowing table layout: the engine computes row heights after header placement
- Inaccurate multi-page table segmentation when thead is used with large column counts
- Unreliable support for paged-media CSS such as display: table-header-group and page-break-inside: avoid in complex tables
The result: the header is rendered using stale layout metrics, so it overlaps the first row on the new page.
Why This Happens in Real Systems
wkhtmltopdf is widely used in production pipelines, and these issues appear when:
- Tables span multiple pages
- Rows contain variable-height content
- There are many columns, forcing aggressive width/height recalculation
- The PDF engine uses patched Qt 4.8, which is over a decade old
- The system relies on pixel-perfect alignment for compliance or reporting
In real systems, wkhtmltopdf cannot fully emulate modern browser layout behavior, especially for multi-page tables.
Real-World Impact
Teams typically observe:
- Overlapping headers on page breaks
- Truncated first rows on new pages
- Unreadable PDFs in compliance or audit workflows
- Inconsistent output depending on text length
- Production incidents when customer-facing PDFs break unexpectedly
These failures often appear only with real data, not synthetic test cases.
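Because the defect depends on row height, a test fixture needs rows whose text length varies widely. The following is a minimal sketch of such a fixture generator, not part of the original incident tooling; the function name build_stress_table, the word list, and the default row and column counts are all hypothetical choices for illustration.

```python
import html
import random


def build_stress_table(rows: int = 60, cols: int = 8, seed: int = 0) -> str:
    """Return an HTML document containing a long table with variable-height rows."""
    rng = random.Random(seed)  # fixed seed keeps the fixture reproducible
    # Placeholder vocabulary (an assumption); real reports would use real data.
    words = ["invoice", "adjustment", "reconciliation", "pending", "approved", "note"]
    header = "".join(f"<th>Column {c + 1}</th>" for c in range(cols))
    body = []
    for _ in range(rows):
        cells = []
        for _ in range(cols):
            # 1 to 80 words per cell, so some rows wrap onto many lines.
            text = " ".join(rng.choice(words) for _ in range(rng.randint(1, 80)))
            cells.append(f"<td>{html.escape(text)}</td>")
        body.append(f"<tr>{''.join(cells)}</tr>")
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'></head><body>"
        f"<table><thead><tr>{header}</tr></thead>"
        f"<tbody>{''.join(body)}</tbody></table></body></html>"
    )
```

Writing the returned string to a file such as fixture.html and running wkhtmltopdf fixture.html out.pdf gives a repeatable input for checking whether a given stylesheet still produces overlapping headers.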
Example or Code
A minimal workaround example using forced page-break isolation:
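tr {
  /* Ask the engine to keep each row on a single page where possible. */
  page-break-inside: avoid;
}
thead {
  /* Mark the header group as repeatable on every page. */
  display: table-header-group;
}
tbody {
  display: table-row-group;
}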
This does not fix the underlying engine bug but reduces the frequency of header overlap in some layouts.
How Senior Engineers Fix It
Experienced engineers know wkhtmltopdf’s limitations and apply multi-layered mitigations:
- Reduce column count or increase page width/margins to stabilize layout
- Force predictable row heights using min-height or fixed-height wrappers
- Insert manual page breaks before extremely tall rows
- Split long text into multiple rows to avoid single-row expansion
- Switch to a modern rendering engine such as:
  - WeasyPrint
  - Paged.js
  - Chromium-based PDF generation (headless Chrome)
- Avoid wkhtmltopdf for complex tables entirely when reliability is required
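The "split long text into multiple rows" mitigation from the list above can be sketched as a preprocessing step applied before the HTML is handed to wkhtmltopdf. This is an illustrative sketch only: the function name split_long_row and the max_chars threshold are hypothetical, and character count is only a rough proxy for rendered height, so the threshold would need tuning against real output.

```python
import html
import textwrap


def split_long_row(cells: list[str], max_chars: int = 200) -> list[str]:
    """Split one logical row into several <tr> strings with bounded text per cell.

    Assumes `cells` is non-empty and holds plain (unescaped) text.
    """
    # Wrap each cell into chunks of at most max_chars characters.
    # textwrap.wrap returns [] for empty text, so fall back to one empty chunk.
    chunks = [textwrap.wrap(text, width=max_chars) or [""] for text in cells]
    depth = max(len(c) for c in chunks)  # rows needed for the longest cell
    rows = []
    for i in range(depth):
        # Cells that ran out of text get an empty continuation cell.
        tds = "".join(
            f"<td>{html.escape(c[i]) if i < len(c) else ''}</td>" for c in chunks
        )
        rows.append(f"<tr>{tds}</tr>")
    return rows
```

The trade-off is visual: continuation rows carry empty cells in the shorter columns, so borders or row striping may need adjusting to keep the pieces reading as one record.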
The most robust fix is migrating away from wkhtmltopdf, as the rendering engine is no longer maintained at the level needed for complex pagination.
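For teams evaluating the Chromium route, a migration spike can be as small as the sketch below, which shells out to a headless Chromium binary. The binary name "chromium" is an assumption (installations may expose google-chrome, chromium-browser, or a full path instead), and the helper names chromium_pdf_command and render_pdf are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def chromium_pdf_command(html_path: str, pdf_path: str, binary: str = "chromium") -> list[str]:
    """Build the argument list for a headless Chromium print-to-PDF run."""
    return [
        binary,  # assumed binary name; adjust for the local installation
        "--headless",
        f"--print-to-pdf={pdf_path}",
        Path(html_path).resolve().as_uri(),  # Chromium expects a URL, so use file://
    ]


def render_pdf(html_path: str, pdf_path: str, binary: str = "chromium") -> None:
    """Render html_path to pdf_path, raising if the binary is missing or fails."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(
        chromium_pdf_command(html_path, pdf_path, binary),
        check=True,   # surface a non-zero exit code as an exception
        timeout=120,  # arbitrary guard against a hung renderer
    )
```

Keeping the command construction separate from the subprocess call makes it easy to unit-test the arguments without a browser installed, and to run the same HTML fixture through both engines for comparison.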
Why Juniors Miss It
Less experienced engineers often assume:
- wkhtmltopdf behaves like a modern browser
- thead repetition is fully supported
- CSS rules like page-break-inside: avoid always work
- Increasing padding or tweaking CSS will fix the issue
They miss the deeper truth: wkhtmltopdf’s rendering engine is fundamentally incapable of correct multi-page table layout under certain conditions, and no amount of CSS can fully compensate.
Senior engineers recognize this pattern quickly because they’ve seen wkhtmltopdf fail in similar ways across many systems.