Issues converting Figma designs to a full WordPress theme using OpenAI

Summary

The project aims to automate the conversion of Figma designs into functional WordPress themes using the Figma and OpenAI APIs. The primary technical hurdle is pipeline fragility caused by unmanaged API rate limits and context erosion in the AI model during large-file processing. While the intent is to replicate the fluid experience of tools like Cursor or Builder.io, the current implementation treats the process as a single-shot request rather than a resilient, state-aware data pipeline. The immediate failures are the Figma API rate-limiting requests and the OpenAI model generating structurally inconsistent code when context is split across multiple calls.

Root Cause

The root cause is twofold: infrastructure constraints (Figma API) and model limitations (OpenAI) manifesting as a systemic failure in the ingestion-to-generation pipeline.

  • Figma API Rate Limiting (HTTP 429): The Figma API enforces strict rate limits (e.g., 30-120 requests per minute depending on the endpoint and user tier). Large design files often require recursive fetching of node trees, styles, and assets. Sending these requests synchronously without a backoff strategy exhausts the quota immediately.
  • Context Window Saturation: When large Figma JSON payloads are sanitized and sent to OpenAI, they often exceed the context window limits or push the model into a state where it “hallucinates” or truncates output to save tokens.
  • Deterministic Parsing Failure: Passing loosely structured Figma data (frames, components) directly to the LLM results in non-deterministic HTML/CSS generation. Without a strict schema or intermediate representation, the AI fails to map visual constraints to CSS properties consistently.
  • MCP Tooling Null Responses: Figma-Context-MCP returning null responses suggests that the MCP server is failing to retrieve the data or bridge the context properly, most likely due to the same underlying rate limits or an authentication failure, which effectively breaks the tool-use loop.
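The context-saturation failure can be detected before any OpenAI call is made by budgeting tokens up front. A minimal sketch, assuming a rough 4-characters-per-token heuristic and an illustrative 100k-token budget (a real implementation would use a proper tokenizer such as tiktoken):

```python
import json

def estimate_tokens(payload: dict) -> int:
    # Rough heuristic: ~4 characters per token for JSON-heavy text.
    return len(json.dumps(payload)) // 4

def split_for_context(nodes: list, token_budget: int = 100_000) -> list:
    """Greedily pack Figma nodes into chunks that fit the model's window."""
    chunks, current, current_tokens = [], [], 0
    for node in nodes:
        cost = estimate_tokens(node)
        if current and current_tokens + cost > token_budget:
            chunks.append(current)       # flush the full chunk
            current, current_tokens = [], 0
        current.append(node)
        current_tokens += cost
    if current:
        chunks.append(current)
    return chunks
```

Splitting deterministically in code, rather than letting the model silently truncate, turns "it hallucinated the footer" into an explicit, testable chunking step.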

Why This Happens in Real Systems

This scenario is a classic impedance mismatch problem between a graph-based design source (Figma) and a hierarchical code target (WordPress), mediated by a probabilistic engine (LLM).

  • Rate Limits are Immutable: Unlike internal APIs, third-party APIs (Figma) enforce global concurrency limits. Without asynchronous processing or queue-based architectures, applications will inevitably crash or hang during bulk operations.
  • LLMs are Not Databases: You cannot rely on an LLM to retain “design consistency” over long conversations without a grounding mechanism. If you split a design into 5 chunks and send them to the AI in 5 separate calls, the model has no memory of the first chunk when processing the fifth. It will generate different class names, spacing units, and layout strategies for each chunk.
  • Non-Determinism in Code Gen: LLMs are probabilistic token generators. Without rigid constraints (system prompts, JSON schemas, few-shot prompting), generating valid PHP syntax (which relies on exact opening/closing tags and strict naming conventions) is high-entropy and prone to failure.
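Because each call is stateless, the only way to keep chunk five consistent with chunk one is to ship the same grounding context with every request. A minimal sketch of a "style contract" injected into each prompt (the contract values and message shape are illustrative; any chat-completions-style API accepts a messages list like this):

```python
# Shared "style contract" sent with every chunk request so the model
# regenerates the same class names, spacing units, and color tokens each time.
STYLE_CONTRACT = {
    "spacing_unit": "8px",
    "class_prefix": "theme-",
    "color_tokens": {"primary": "#1a73e8", "surface": "#ffffff"},
}

def build_messages(chunk_description: str) -> list:
    """Build a stateless request that carries its own consistency rules."""
    system = (
        "You convert Figma sections into WordPress template markup. "
        f"Always follow this style contract exactly: {STYLE_CONTRACT}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": chunk_description},
    ]
```

The model never "remembers" the contract; it is simply re-sent on every call, which is what makes the output reproducible across chunks.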

Real-World Impact

  • Broken Build Artifacts: The resulting WordPress theme will likely have missing sections, CSS conflicts (e.g., margin-top: 10px in one chunk vs margin-top: 12px in another), and broken PHP hooks, rendering the theme unusable.
  • Exponential Latency: Retrying failed requests due to rate limits increases the total generation time significantly. A “quick” conversion becomes a long-running process that times out.
  • Token Burn/Cost: Repeatedly sending the same large Figma JSON payload to OpenAI to retry failed requests wastes API credits and increases latency.

Example or Code

The solution requires moving from a synchronous “Fetch -> Send -> Generate” loop to an asynchronous “Queue -> Process -> Assemble” architecture.

1. The Asynchronous Queue Worker (Python/Pseudo-code)
Instead of blocking the main thread, offload tasks to a queue (Redis/Celery) with exponential backoff to respect rate limits.

import time
import backoff
import requests

# Figma API Wrapper with Exponential Backoff
class FigmaClient:
    def __init__(self, token):
        self.token = token
        self.base_url = "https://api.figma.com/v1"

    @backoff.on_exception(backoff.expo, requests.exceptions.HTTPError, max_tries=5)
    def get_file_nodes(self, file_id, node_ids):
        url = f"{self.base_url}/files/{file_id}/nodes"
        # OAuth tokens use "Authorization: Bearer"; personal access tokens
        # go in the "X-Figma-Token" header instead.
        headers = {"Authorization": f"Bearer {self.token}"}
        params = {"ids": ",".join(node_ids)}

        response = requests.get(url, headers=headers, params=params)

        # Explicitly handle 429 (Rate Limit): honor Retry-After, then raise
        # so the backoff decorator retries (it adds its own jittered delay).
        if response.status_code == 429:
            retry_after = int(response.headers.get("Retry-After", "60"))
            print(f"Rate limited. Sleeping for {retry_after} seconds.")
            time.sleep(retry_after)
            raise requests.exceptions.HTTPError("Rate limited", response=response)

        response.raise_for_status()
        return response.json()

# Example usage in a worker
def process_figma_chunk(file_id, node_ids):
    client = FigmaClient("your_figma_token")
    # This call will retry automatically if rate limited
    data = client.get_file_nodes(file_id, node_ids)
    # ... send to OpenAI ...
    return data
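The worker above still runs inline; the "Queue -> Process -> Assemble" shape itself can be sketched with the standard library before swapping in Redis/Celery/SQS (the job payload format, retry cap, and delay parameter here are illustrative):

```python
import queue
import threading
import time

job_queue = queue.Queue()
MAX_ATTEMPTS = 5

def worker(handle_job, results, base_delay=1.0):
    """Drain the queue; failed jobs are re-enqueued with exponential delay."""
    while True:
        job = job_queue.get()
        if job is None:  # poison pill: shut the worker down cleanly
            break
        try:
            results.append(handle_job(job["payload"]))
        except Exception:
            job["attempts"] += 1
            if job["attempts"] < MAX_ATTEMPTS:
                time.sleep(base_delay * 2 ** job["attempts"])
                job_queue.put(job)  # requeue before task_done so join() waits
        finally:
            job_queue.task_done()
```

A transient 429 now costs one delayed retry instead of crashing the whole conversion; a production system would replace `queue.Queue` with a durable broker so retries survive process restarts.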

2. Structured Output for OpenAI
To prevent “inconsistent HTML/CSS”, force the model to output valid JSON that can be parsed and validated before writing to files.

// System Prompt: "You are a WordPress theme generator. Output ONLY valid JSON."
{
  "file_path": "templates/homepage.php",
  "content": "...",
  "assets": ["hero-bg.jpg"],
  "block_mapping": {
    "figma_component_id": "wp-block/hero"
  }
}
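Whatever schema is chosen, the model's reply must be parsed and checked before anything touches disk. A minimal validator for the shape above (the required keys mirror the example; jsonschema or Pydantic would be the production-grade version):

```python
import json

REQUIRED_KEYS = {"file_path", "content", "assets", "block_mapping"}

def validate_generation(raw: str) -> dict:
    """Parse model output and reject it before any theme file is written."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Model did not return valid JSON: {e}")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Missing keys in model output: {sorted(missing)}")
    # Never let generated paths escape the theme directory.
    if ".." in data["file_path"] or data["file_path"].startswith("/"):
        raise ValueError("Refusing path outside the theme directory")
    return data
```

A rejected response can then be retried or flagged for review, instead of silently writing a half-formed template into the theme.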

How Senior Engineers Fix It

Senior engineers treat this not as a “prompting problem” but as a data pipeline engineering problem.

  • Implement a Queue System (Redis/Celery/SQS): Decouple the request from the work. If Figma rate limits, the job goes back into the queue to retry later, keeping the user interface responsive.
  • Pre-Process Figma Data (Normalization): Do not send raw Figma JSON to the AI. Instead, write a “Normalizer” script that traverses the Figma tree and converts it into a simplified, flattened JSON structure containing only essential info (dimensions, colors, text content, hierarchy). This reduces token usage by 80% and improves context retention.
  • RAG (Retrieval-Augmented Generation) for Consistency: Store successfully generated code snippets (e.g., a specific button style) in a vector database (like Pinecone or even a local JSON file). Before generating a new chunk, query this database for “existing styles” and inject those into the system prompt to maintain consistency.
  • Sandboxing & Validation: Before writing files, run the generated PHP through a linter (e.g., php -l) and the HTML through a validator. If it fails, flag it for human review or auto-correct via a smaller, cheaper model.
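The "Normalizer" idea from the second bullet can be sketched as a recursive tree walk that keeps only what the model needs. The field names read here (`name`, `type`, `children`, `absoluteBoundingBox`, `characters`) come from Figma's documented node schema; the output shape is an assumption of this sketch:

```python
def normalize_node(node: dict, depth: int = 0) -> dict:
    """Flatten a raw Figma node into only the fields the LLM needs."""
    box = node.get("absoluteBoundingBox") or {}
    slim = {
        "name": node.get("name"),
        "type": node.get("type"),
        "depth": depth,
        "size": [box.get("width"), box.get("height")],
    }
    if "characters" in node:  # TEXT nodes carry their copy here
        slim["text"] = node["characters"]
    children = [normalize_node(c, depth + 1) for c in node.get("children", [])]
    if children:
        slim["children"] = children
    return slim  # everything else (vector paths, plugin data, etc.) is dropped
```

Sending this slim representation instead of the raw export is what makes the token savings possible: the LLM sees hierarchy, dimensions, and text, not kilobytes of vector geometry.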

Why Juniors Miss It

  • Assumption of Magic: Juniors often believe an LLM acts like a human designer who “remembers” the previous page. They fail to realize that every API call is stateless. Without explicit context injection, the AI forgets everything.
  • Linear Workflow Thinking: They build the tool as User Click -> Run Script -> Done. They don’t account for the “unhappy paths” (network errors, rate limits) that require asynchronous handling and user notification.
  • Prompt Engineering vs. Software Architecture: They spend days tweaking prompts to “fix” hallucinations, rather than fixing the upstream data (cleaner Figma extraction) or downstream constraints (JSON schemas) that guarantee correctness.