Building a Document Automation Tool for Blind Users

Summary

This postmortem analyzes the technical considerations behind building an accessible document‑automation tool for blind and visually impaired users. The original question asked which technologies—python-docx, VSTO, Office Scripts, Google Apps Script, or others—are best suited for automating formatting tasks in Microsoft Word and Google Docs.

Root Cause

The underlying issue is that document‑formatting APIs differ drastically across ecosystems, and many developers underestimate how fragmented the automation landscape is. This leads to:

  • Choosing tools that cannot perform required formatting operations
  • Running into API limitations too late in the project
  • Over‑engineering solutions that could have been simpler
  • Building tools that are not screen‑reader‑friendly

Why This Happens in Real Systems

Real systems suffer from this because:

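  • Microsoft Word has multiple automation stacks (python-docx, COM/VSTO, Office Add-ins built on the Word JavaScript API, Graph API), each with different capabilities. Office Scripts, despite the name, currently targets Excel rather than Word.
  • Google Docs exposes two automation paths: Google Apps Script, which runs inside Google's environment, and the Google Docs REST API for external programs. Both are subject to quotas and cover less formatting than desktop Word.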
  • Accessibility requirements add constraints that typical automation tutorials never address.
  • Cross‑platform automation is inherently messy—desktop Word ≠ Word Online ≠ Google Docs.

Real-World Impact

When the wrong stack is chosen, teams experience:

  • Incomplete automation (e.g., python-docx can resize inline images but has no high-level API for floating, anchored images in a .docx created by Word)
  • Inconsistent formatting output across platforms
  • High maintenance cost due to API instability
  • Blocked accessibility workflows, forcing blind users to rely on sighted assistance again

Example or Code

Below is a minimal example showing how python-docx resizes images. It illustrates both the library's usefulness and a limitation: doc.inline_shapes covers only inline pictures, so floating (anchored) images are left untouched.

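from docx import Document
from docx.shared import Inches

doc = Document("input.docx")
target_width = Inches(4.0)

# inline_shapes lists only inline pictures; floating (anchored) images are skipped.
for shape in doc.inline_shapes:
    # Scale the height by the same factor so the picture keeps its aspect ratio.
    ratio = target_width / shape.width
    shape.height = int(shape.height * ratio)
    shape.width = target_width

doc.save("output.docx")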
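
Before committing to python-docx, it can help to check how many drawings in a document it would actually reach. The following is a minimal sketch using only the standard library, under the assumption that counting wp:inline and wp:anchor elements in word/document.xml is a good enough proxy for inline versus floating drawings. The function name count_drawings is hypothetical, and headers, footers, and other document parts are not inspected.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the WordprocessingML drawing elements that position graphics.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def count_drawings(docx_file):
    """Count inline vs. floating drawings in the main body of a .docx file.

    docx_file may be a filesystem path or a binary file-like object.
    Only word/document.xml is read; headers and footers are ignored.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return {
        "inline": len(root.findall(f".//{{{WP_NS}}}inline")),
        "floating": len(root.findall(f".//{{{WP_NS}}}anchor")),
    }
```

If the "floating" count is greater than zero for typical input files, the resize loop above would silently miss those images, which is a signal to evaluate a different stack or to handle the anchored drawings separately.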

How Senior Engineers Fix It

Experienced engineers approach this problem by choosing the stack based on capability, not convenience:

For Microsoft Word

  • python-docx

    • Great for generating new documents
    • Limited for editing complex existing documents
    • Cannot access all Word features (for example, floating images, tracked changes, and field updates have no high-level API)
  • VSTO (C#)

    • Most powerful and complete Word automation
    • Full access to the Word object model
    • Best for desktop-only workflows on Windows, since it requires an installed copy of Word
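  • Office Scripts (JavaScript)

    • Currently available for Excel only, so it cannot automate Word documents
    • For Word on the web, the JavaScript option is an Office Add-in built on the Word JavaScript API (Office.js)
    • Add-ins run in Word on the web, Windows, and Mac, but expose less of the object model than VSTO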

Senior recommendation:
If you need maximum control, use VSTO.
If you need cross‑platform cloud automation, use an Office Add-in built on the Word JavaScript API, since Office Scripts does not currently support Word.
If you need simple batch processing, python-docx is fine.
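
Choosing by capability can be made explicit by writing the required operations down and filtering the candidate stacks against them. The sketch below is only an illustration: the STACKS table, its capability names, and the function stacks_supporting are all hypothetical, and the entries are assumptions that should be checked against current documentation before being relied on.

```python
# Illustrative capability table. The entries are assumptions for this sketch,
# not an authoritative feature matrix; extend and verify them for a real project.
STACKS = {
    "python-docx": {
        "create_document",
        "resize_inline_image",
        "runs_without_word",
    },
    "Word COM/VSTO": {
        "create_document",
        "resize_inline_image",
        "resize_floating_image",
        "update_fields",
    },
}


def stacks_supporting(required):
    """Return the sorted names of stacks that cover every required capability."""
    required = set(required)
    return sorted(name for name, caps in STACKS.items() if required <= caps)
```

Calling stacks_supporting with the full list of operations the tool needs, before any code is written, surfaces a capability gap on day one instead of late in the project. An empty result means no single stack in the table is sufficient.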

For Google Docs

  • Google Apps Script is the most direct choice; the Google Docs REST API is the alternative when the code must run outside Google's environment
  • It integrates directly with Docs, Drive, and Sheets
  • It supports triggers, menus, and add‑ons
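  • Custom menus added by a script appear in the standard Docs menu bar, which is reachable by keyboard; confirm the full workflow with a real screen reader rather than assuming it is accessible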

Senior recommendation:
Master Google Apps Script for anything involving Google Docs.
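
Apps Script itself is JavaScript. For readers who would rather stay in Python, the Google Docs REST API accepts formatting changes as a list of requests sent to documents.batchUpdate. The sketch below only builds such a request body for turning a text range into a heading, which matters here because screen readers rely on real heading styles for navigation. The helper names heading_request and batch_update_body are hypothetical, and authentication and the actual HTTP call are deliberately left out.

```python
def heading_request(start_index, end_index, level=1):
    """Build an updateParagraphStyle request that applies a named heading style.

    start_index and end_index are character indexes in the document body,
    as used by the Docs API; level selects HEADING_1 through HEADING_6.
    """
    if not 1 <= level <= 6:
        raise ValueError("level must be between 1 and 6")
    if start_index >= end_index:
        raise ValueError("start_index must be smaller than end_index")
    return {
        "updateParagraphStyle": {
            "range": {"startIndex": start_index, "endIndex": end_index},
            "paragraphStyle": {"namedStyleType": f"HEADING_{level}"},
            # The field mask limits the update to the named style only.
            "fields": "namedStyleType",
        }
    }


def batch_update_body(requests):
    """Wrap individual requests in the body expected by documents.batchUpdate."""
    return {"requests": list(requests)}
```

With the google-api-python-client package and valid credentials, the resulting body would typically be passed to service.documents().batchUpdate(documentId=DOCUMENT_ID, body=body).execute(), where service and DOCUMENT_ID are placeholders for this example.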

Why Juniors Miss It

Junior developers often overlook:

  • API capability gaps (assuming python-docx can do everything Word can)
  • Differences between Word desktop and Word Online
  • The importance of accessibility-first design
  • The need to test with real screen-reader workflows
  • The hidden complexity of document formatting engines

They tend to pick the tool they already know (Python or JavaScript) instead of the tool that actually solves the problem.
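
Accessibility-first design also applies to how the tool reports what it did. One possible approach, sketched below, is to emit plain sentences with the total stated first, so a screen-reader user hears the summary immediately and can step through changes one line at a time. The function name format_report and the (location, description) tuple format are assumptions made for this example, and the output should still be tested with an actual screen reader.

```python
def format_report(changes):
    """Turn a list of (location, description) tuples into plain-text lines.

    The first line is a summary; each following line is a full sentence
    that makes sense when read aloud on its own.
    """
    count = len(changes)
    noun = "change" if count == 1 else "changes"
    lines = [f"{count} {noun} made."]
    for number, (location, description) in enumerate(changes, start=1):
        lines.append(f"Change {number} of {count}: {location}: {description}.")
    return lines
```

Printing the returned lines one per row avoids tables, progress bars, and decorative characters, all of which tend to read poorly through speech output.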

