# Postmortem: Improper Relative Path Rewriting Caused Broken Asset Loading

## Summary
A user script (Tampermonkey) was deployed to rewrite relative paths on a web page to point to a new domain (`www.mypage.com`). Absolute paths were meant to be preserved. The script incorrectly modified some absolute paths and mishandled relative path resolution, causing CSS/JS files and images to fail loading on the target page.

## Root Cause
The primary root causes were:
- **Absolute path misidentification**: The script used a naive string-matching approach to detect relative paths, mistaking absolute paths (starting with `/`, `http:`, or `https:`) for relative ones during replacement.
- **Inconsistent path resolution**: The script failed to account for the different relative path syntaxes (`../`, `./`, and bare references such as `folder/file` that have no leading path marker), resulting in incorrect URL reconstruction.
- **Domain injection without normalization**: Inserting a new domain without properly resolving against the original base URL caused path duplication (e.g., `//www.mypage.com//folder/...`).

## Why This Happens in Real Systems
- **Regex over-reliance**: Engineers often use regex for URL manipulation without leveraging browser-native URL resolution.
- **Ambiguous specification**: Relative paths (e.g., `../` vs. bare `folder/file` references) require context-aware joining with the base URL, a non-trivial task that simple scripts handle inconsistently.
- **Hurried hotfixes**: Pressure to "quickly change domains" leads to rushed implementations without comprehensive testing.
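
As a minimal sketch of what context-aware joining means, the snippet below lets the standard `URL` constructor resolve each reference syntax against a base. The base URL and the sample references are assumptions chosen for illustration; they do not come from the affected page.

```javascript
// Assumed page URL, chosen only for illustration.
const base = 'https://www.example.com/blog/posts/index.html';

// Each reference is resolved the way a browser resolves src/href values.
const references = [
  'pic.jpg',                    // bare reference     -> /blog/posts/pic.jpg
  './pic.jpg',                  // current directory  -> /blog/posts/pic.jpg
  '../sheet.css',               // parent directory   -> /blog/sheet.css
  '/folder/pic.jpg',            // leading slash      -> /folder/pic.jpg on the same origin
  '//cdn.example.com/lib.js',   // protocol-relative  -> https://cdn.example.com/lib.js
  'https://other.example/a.js', // fully absolute     -> unchanged
];

const resolved = references.map(ref => new URL(ref, base).href);

references.forEach((ref, i) => console.log(ref.padEnd(28), '->', resolved[i]));
```

The same constructor is available in browsers and in Node.js, so the resolution rules can be checked outside the page before any rewriting logic is written.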

## Real-World Impact
- **Site functionality breakage**: Images/CSS/JS failed to load, causing visual breakage and interactive failures.
- **Increased load on origin**: Unrewritten absolute paths continued hitting `www.example.com`, confusing monitoring systems.
- **User experience degradation**: Broken pages led to support tickets and temporary service abandonment.

## Example or Code
Original flawed Tampermonkey script snippet:
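```javascript
// Naive implementation that modifies paths incorrectly
document.querySelectorAll('img, script, link').forEach(element => {
  const attr = element.src ? 'src' : 'href';
  // null when the element has neither attribute (e.g., an inline script),
  // which makes path.startsWith() below throw
  let path = element.getAttribute(attr);

  // Flawed check: everything except http(s) and protocol-relative URLs is
  // treated as relative, including paths starting with "/" and data: URIs
  if (!path.startsWith('http') && !path.startsWith('//')) {
    // Blind concatenation: no base URL resolution, no separator handling
    element.setAttribute(attr, 'https://www.mypage.com' + path);
  }
});
```
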
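This breaks:

- Absolute paths beginning with `/` (e.g., `/folder/pic.jpg` becomes `https://www.mypage.com/folder/pic.jpg`, even though it was meant to be preserved)
- Non-HTTP schemes such as `data:` and `blob:`, which pass the check and get the domain prepended; protocol-relative URLs (e.g., `//cdn.example.com/lib.js`) are spared only by the explicit `//` test
- Relative paths like `../sheet.css` (becomes `https://www.mypage.com../sheet.css`, which has a malformed host and loses the original directory context)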
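
To make the failure modes concrete, here is a small sketch that isolates the same concatenation as a pure function and prints what it produces. The function name `naiveRewrite` and the sample inputs are hypothetical and used only for this illustration.

```javascript
// Same condition and concatenation as the flawed snippet, without the DOM.
function naiveRewrite(path) {
  if (!path.startsWith('http') && !path.startsWith('//')) {
    return 'https://www.mypage.com' + path;
  }
  return path;
}

// Hypothetical attribute values covering the problem cases.
const samples = [
  '/folder/pic.jpg',
  'folder/pic.jpg',
  '../sheet.css',
  'data:image/png;base64,AAAA',
];

const naiveResults = samples.map(naiveRewrite);
samples.forEach((s, i) => console.log(s, '->', naiveResults[i]));
```

Because nothing inserts a separator or resolves against the page URL, a bare reference such as `folder/pic.jpg` ends up glued directly onto the host name.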

## How Senior Engineers Fix It

1. Use browser-native URL resolution:

       const absoluteUrl = new URL(path, document.baseURI).href;

2. Conditionally replace the origin only. `URL.origin` is read-only, so set `protocol`, `host`, and `port` instead:

       const newUrl = new URL(path, document.baseURI);
       if (shouldRewrite(newUrl.origin)) {
         newUrl.protocol = 'https:';
         newUrl.host = 'www.mypage.com';
         newUrl.port = '';  // drop any port carried over from the old origin
         element.setAttribute(attr, newUrl.href);
       }

3. Specify rewrite rules explicitly: rewrite only if the element's path is relative or matches a specific origin.
4. Handle all URL schemes: skip `data:`, `blob:`, and absolute HTTP(S) URLs that do not match the rewrite rules.
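
The steps above can be combined into one sketch. It is a sketch under stated assumptions, not the deployed fix: `isRelativeReference` and `rewriteReference` are hypothetical names, and treating a leading `/` as "absolute, leave untouched" follows this postmortem's wording rather than any standard rule.

```javascript
const TARGET_ORIGIN = 'https://www.mypage.com';

// Assumption: anything with a scheme (http:, data:, blob:, ...) or a leading
// "/" (which also covers "//host/...") counts as absolute and is preserved.
function isRelativeReference(value) {
  const hasScheme = /^[a-z][a-z0-9+.-]*:/i.test(value);
  return !hasScheme && !value.startsWith('/');
}

function rewriteReference(raw, baseURI, targetOrigin = TARGET_ORIGIN) {
  if (typeof raw !== 'string' || raw.trim() === '') return raw;
  const value = raw.trim();
  if (!isRelativeReference(value)) return raw;

  let resolved;
  try {
    resolved = new URL(value, baseURI); // resolves ./, ../ and bare names
  } catch {
    return raw; // unparsable input is left alone
  }
  // URL.origin is read-only, so rebuild the URL from its resolved parts.
  const origin = new URL(targetOrigin).origin;
  return origin + resolved.pathname + resolved.search + resolved.hash;
}

// Apply to the page only when a DOM is present (e.g., inside the userscript).
if (typeof document !== 'undefined') {
  document.querySelectorAll('img[src], script[src], link[href]').forEach(el => {
    const attr = el.hasAttribute('src') ? 'src' : 'href';
    const raw = el.getAttribute(attr);
    const next = rewriteReference(raw, document.baseURI);
    if (next !== raw) el.setAttribute(attr, next);
  });
}
```

Keeping the rewrite logic in a pure function separate from the DOM loop makes it possible to exercise every path syntax without loading the page.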

## Why Juniors Miss It

- **Underestimating URL complexity**: Treating URLs as simple strings instead of structured entities with protocols, origins, and paths.
- **Incomplete testing**: Validating only obvious cases (e.g., `folder/file`) but missing `../`, `//`, and root-relative paths.
- **Lack of debugging**: Not inspecting rewritten paths in the browser devtools Network tab.
- **Premature optimization**: Prioritizing "clever" regex over robust browser APIs.
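
One way to close the testing gap is a table of edge cases run against whatever rewrite function is under review. The harness below is a hypothetical sketch: the page URL, the expected values, and the names `findFailures` and `naive` are assumptions, with the expectations following this postmortem's rule that only relative references move to the new domain.

```javascript
// Assumed page URL for the expectations below.
const BASE = 'https://www.example.com/blog/posts/index.html';

// Edge cases beyond the obvious "folder/file" input.
const cases = [
  { input: 'folder/file.js', expected: 'https://www.mypage.com/blog/posts/folder/file.js' },
  { input: '../sheet.css', expected: 'https://www.mypage.com/blog/sheet.css' },
  { input: '/folder/pic.jpg', expected: '/folder/pic.jpg' },
];

// Runs a rewrite function over the table and returns the mismatching rows.
function findFailures(rewrite, testCases) {
  return testCases
    .map(c => ({ ...c, actual: rewrite(c.input, BASE) }))
    .filter(c => c.actual !== c.expected);
}

// Function under test: the naive concatenation from the flawed snippet.
const naive = path =>
  path.startsWith('http') || path.startsWith('//')
    ? path
    : 'https://www.mypage.com' + path;

const failures = findFailures(naive, cases);
console.table(failures);
```

Run against the naive implementation, every row in this table is reported as a mismatch, which is the signal a pre-deployment check would have needed to surface.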