# Postmortem: Improper Relative Path Rewriting Caused Broken Asset Loading
## Summary
A user script (Tampermonkey) was deployed to rewrite relative paths on a web page to point to a new domain (`www.mypage.com`). Absolute paths were meant to be preserved. The script incorrectly modified some absolute paths and mishandled relative path resolution, causing CSS/JS files and images to fail loading on the target page.
## Root Cause
The primary root causes were:
- **Absolute path misidentification**: The script used a naive string-matching approach to detect relative paths, mistaking absolute paths (starting with `/`, `http:`, or `https:`) for relative ones during replacement.
- **Inconsistent path resolution**: The script failed to account for the different relative path syntaxes (`../`, `./`, and bare references with no leading segment, such as `folder/file`), resulting in incorrect URL reconstruction.
- **Domain injection without normalization**: Inserting a new domain without properly resolving against the original base URL caused path duplication (e.g., `//www.mypage.com//folder/...`).
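
To make the resolution differences concrete, the sketch below resolves one sample of each reference style with the standard `URL` constructor. The base URL `https://www.example.com/app/pages/index.html` and the sample paths are assumptions chosen for illustration; they are not taken from the incident.

```javascript
// Hypothetical URL of the page being rewritten (an assumption for illustration).
const base = 'https://www.example.com/app/pages/index.html';

// One sample per reference style discussed above.
const samples = [
  '../sheet.css',             // parent-directory reference
  './pic.jpg',                // current-directory reference
  'folder/pic.jpg',           // bare reference with no leading segment
  '/folder/pic.jpg',          // root-relative path
  '//cdn.example.com/lib.js', // protocol-relative URL
];

// Resolve each reference against the base, the same way the browser does.
const resolved = samples.map(ref => [ref, new URL(ref, base).href]);

for (const [ref, href] of resolved) {
  console.log(ref.padEnd(26), '->', href);
}
```

Each style lands somewhere different: `../sheet.css` resolves to `/app/sheet.css`, while `folder/pic.jpg` resolves under `/app/pages/`. Prepending a domain to the raw string cannot reproduce this.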
## Why This Happens in Real Systems
- **Regex over-reliance**: Engineers often use regex for URL manipulation without leveraging browser-native URL resolution.
- **Ambiguous specification**: Relative paths (e.g., `../` versus bare references such as `folder/file`) must be joined with the base URL in a context-aware way, a non-trivial task that simple scripts handle inconsistently.
- **Hurried hotfixes**: Pressure to "quickly change domains" leads to rushed implementations without comprehensive testing.
## Real-World Impact
- **Site functionality breakage**: Images/CSS/JS failed to load, causing visual breakage and interactive failures.
- **Increased load on origin**: Unrewritten absolute paths continued hitting `www.example.com`, confusing monitoring systems.
- **User experience degradation**: Broken pages led to support tickets and temporary service abandonment.
## Example or Code
Original flawed Tampermonkey script snippet:
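```javascript
// Naive implementation that incorrectly modifies paths
document.querySelectorAll('img, script, link').forEach(element => {
  const attr = element.src ? 'src' : 'href';
  // Note: getAttribute returns null for inline <script> elements, so startsWith below throws
  let path = element.getAttribute(attr);
  // Flawed absolute-path check: root-relative paths (a single leading `/`)
  // and non-HTTP schemes such as data: fall through and get rewritten
  if (!path.startsWith('http') && !path.startsWith('//')) {
    // Blindly prepends the new domain instead of resolving against the base URL
    element.setAttribute(attr, 'https://www.mypage.com' + path);
  }
});
```
This breaks:
- Absolute paths beginning with `/` (e.g., `/folder/pic.jpg` becomes `https://www.mypage.com/folder/pic.jpg`, although it was meant to be preserved).
- Relative paths like `../sheet.css`, which become `https://www.mypage.com../sheet.css`: no separator is inserted and `../` is never resolved against the page's directory.
- Protocol-relative URLs (e.g., `//cdn.example.com/lib.js`) in any variant of the check that drops the `//` guard; in the snippet as shown, that guard is the only thing preserving them.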
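
Running the snippet's string logic outside the browser shows what it actually produces. The function below is a sketch that models only the `startsWith` check and the concatenation; `naiveRewrite` and the sample inputs are hypothetical names and values introduced for illustration.

```javascript
// Pure-function model of the flawed check, so it can run without a DOM.
function naiveRewrite(path) {
  if (!path.startsWith('http') && !path.startsWith('//')) {
    return 'https://www.mypage.com' + path; // same concatenation as the userscript
  }
  return path; // treated as absolute and left alone
}

// Root-relative path: rewritten even though it was meant to be preserved.
console.log(naiveRewrite('/folder/pic.jpg'));

// Parent-directory reference: no separator added, "../" never resolved.
console.log(naiveRewrite('../sheet.css'));

// Bare reference: domain and path run together.
console.log(naiveRewrite('folder/pic.jpg'));

// data: URI: not HTTP, so it is corrupted by the prefix.
console.log(naiveRewrite('data:image/png;base64,AAAA'));
```

None of these outputs depends on the page's own URL, which is why the same script can look correct on one page and break on another.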
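## How Senior Engineers Fix It
- **Use browser-native URL resolution**: `const absoluteUrl = new URL(path, document.baseURI).href;`
- **Conditionally replace the origin only**: Resolve first with `const newUrl = new URL(path, document.baseURI);` and rewrite only when `shouldRewrite(newUrl.origin)` is true. `URL.origin` is read-only, so assigning `newUrl.origin = 'https://www.mypage.com'` is ignored (or throws in strict mode); set `newUrl.protocol` and `newUrl.host` (and `newUrl.port` where relevant) instead, then call `element.setAttribute(attr, newUrl.href)`.
- **Specify rewrite rules explicitly**: Rewrite only if the element's path is relative or its resolved URL matches a specific origin.
- **Handle all URL schemes**: Skip `data:`, `blob:`, and absolute HTTP(S) URLs that fall outside the rewrite rules.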
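
The points above can be combined into one small function. This is a minimal sketch, not the script that was eventually deployed. `TARGET_ORIGIN` and `rewriteReference` are hypothetical names, and the sketch covers only the "rewrite relative paths, preserve everything else" rule, leaving out origin matching. The scheme test follows the RFC 3986 scheme grammar, and the path resolution itself is delegated to `URL`.

```javascript
// Hypothetical configuration (an assumption for illustration).
const TARGET_ORIGIN = 'https://www.mypage.com';

// Returns the rewritten URL string, or null when the value should be left untouched.
function rewriteReference(raw, baseURI) {
  if (!raw) return null; // missing or empty attribute (e.g., inline <script>)
  const value = raw.trim();

  // Explicit scheme (http:, https:, data:, blob:, ...): preserve as-is.
  if (/^[a-z][a-z0-9+.-]*:/i.test(value)) return null;

  // Root-relative (/x) and protocol-relative (//host/x) references are preserved.
  if (value.startsWith('/')) return null;

  // Resolve the relative reference against the page's base URL.
  const resolved = new URL(value, baseURI);

  // URL.origin is read-only, so rebuild from the target origin plus the
  // resolved path, query string, and fragment (pathname always starts with "/").
  return new URL(TARGET_ORIGIN).origin + resolved.pathname + resolved.search + resolved.hash;
}

// Apply to the page only when a DOM is available (e.g., inside the userscript).
if (typeof document !== 'undefined') {
  document.querySelectorAll('img[src], script[src], link[href]').forEach(element => {
    const attr = element.hasAttribute('src') ? 'src' : 'href';
    const rewritten = rewriteReference(element.getAttribute(attr), document.baseURI);
    if (rewritten !== null) element.setAttribute(attr, rewritten);
  });
}
```

Keeping the decision logic in a pure function that takes `baseURI` as a parameter means it can be exercised with plain strings before the script ever touches a live page.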
## Why Juniors Miss It
- **Underestimating URL complexity**: Treating URLs as simple strings instead of structured entities with protocols, origins, and paths.
- **Incomplete testing**: Validating only obvious cases (e.g., `folder/file`) but missing `../`, `//`, and root-relative paths.
- **Lack of debugging**: Not inspecting rewritten paths in the browser devtools Network tab.
- **Premature optimization**: Prioritizing "clever" regex over robust browser APIs.
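
One way to close the testing gap is a table of inputs that names every reference style explicitly, so a missing case is visible at a glance. The sketch below is an assumption about how such a check might look; `classifyReference`, its category labels, and the sample inputs are hypothetical.

```javascript
// Classify a raw attribute value by the kind of reference it is.
function classifyReference(raw) {
  if (/^[a-z][a-z0-9+.-]*:/i.test(raw)) return 'has-scheme'; // http:, https:, data:, blob:
  if (raw.startsWith('//')) return 'protocol-relative';
  if (raw.startsWith('/')) return 'root-relative';
  return 'relative'; // ../x, ./x, folder/x
}

// Table-driven cases: one row per style that the naive script mishandled or skipped.
const cases = [
  ['folder/file', 'relative'],
  ['../sheet.css', 'relative'],
  ['./pic.jpg', 'relative'],
  ['/folder/pic.jpg', 'root-relative'],
  ['//cdn.example.com/lib.js', 'protocol-relative'],
  ['https://www.example.com/a.js', 'has-scheme'],
  ['data:image/png;base64,AAAA', 'has-scheme'],
];

// Collect every row whose classification differs from the expectation.
const failures = cases.filter(([input, expected]) => classifyReference(input) !== expected);
console.log(failures.length === 0 ? 'all cases pass' : failures);
```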