How to proxy large GZIP streams with FastAPI on AWS Lambda (Mangum) without decompression?

Summary

The goal is to create a proxy service using FastAPI deployed on AWS Lambda that can fetch a large GZIP-compressed response from an upstream API and stream it directly to the client without decompressing the data. The current implementation using StreamingResponse and passing the upstream headers directly is causing 502 Bad Gateway errors due to conflicts between the Content-Length header and chunked transfer encoding.

Root Cause

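The root cause is a conflict between the Content-Length header copied from the upstream response and the way a StreamingResponse is delivered. StreamingResponse does not know the body size in advance, so it sets no Content-Length of its own, and the response is typically sent with chunked transfer encoding by the layer that writes it to the wire. A forwarded Content-Length contradicts that framing: HTTP/1.1 does not allow a message to carry both Content-Length and Transfer-Encoding.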
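The conflict described above can be checked mechanically before a response is returned. The following is a minimal sketch, not part of the original implementation; the function name framing_conflict and the sample header values are hypothetical and only illustrate the check.

```python
def framing_conflict(headers):
    """Return True if both Content-Length and Transfer-Encoding are present.

    Header names are compared case-insensitively, because HTTP header
    names are not case-sensitive.
    """
    names = {name.lower() for name in headers}
    return "content-length" in names and "transfer-encoding" in names


# Hypothetical upstream headers, as they might arrive from the upstream API.
upstream = {
    "Content-Type": "application/json",
    "Content-Encoding": "gzip",
    "Content-Length": "1048576",
}

# The same headers after a chunked framing header has been added on the way out.
forwarded = {**upstream, "Transfer-Encoding": "chunked"}

print(framing_conflict(upstream))
print(framing_conflict(forwarded))
```

A check like this is mainly useful in a unit test or as a log statement while debugging the 502 responses.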

Why This Happens in Real Systems

This issue occurs in real systems because:

  • AWS Lambda functions have limited memory, so decompressing (or fully buffering) a large GZIP-compressed response inside the function is impractical
  • FastAPI's StreamingResponse is designed for large responses, but it streams a body of unknown length, which conflicts with a Content-Length header forwarded from upstream
  • Mangum adapts FastAPI to the AWS Lambda event model, but it does not provide a built-in solution for this specific issue
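The memory argument above can be illustrated with the standard library alone. This is a sketch under stated assumptions: io.BytesIO stands in for the upstream connection, and iter_raw is a hypothetical helper, not a FastAPI or requests API. It shows that forwarding fixed-size chunks keeps the compressed bytes intact, so only the client ever needs to decompress.

```python
import gzip
import io


def iter_raw(stream, chunk_size=8192):
    """Yield the stream's bytes unchanged, at most chunk_size bytes at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a large upstream payload (illustrative data only).
original = b"example payload " * 10_000
compressed = gzip.compress(original)

# io.BytesIO plays the role of the upstream response body here.
upstream_body = io.BytesIO(compressed)

# The proxy only ever holds one chunk in memory at a time.
forwarded = b"".join(iter_raw(upstream_body))

# The bytes the client receives are identical to what upstream sent,
# and they still decompress to the original payload on the client side.
assert forwarded == compressed
assert gzip.decompress(forwarded) == original
```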

Real-World Impact

The impact of this issue is:

  • 502 Bad Gateway errors, which can lead to a poor user experience
  • Increased latency and resource usage as clients retry the failed requests
  • Potential data corruption or loss due to incorrect handling of the compressed stream

Example or Code

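import requests
from fastapi.responses import StreamingResponse

# Headers that must not be copied from the upstream response: the body length
# is not known up front when streaming, and framing is decided per connection.
EXCLUDED_HEADERS = {"content-length", "transfer-encoding", "connection"}

def proxy_request(url):
    # stream=True defers the body download; the timeout value is illustrative
    response = requests.get(url, stream=True, timeout=30)
    response.raw.decode_content = False

    def raw_stream_generator():
        try:
            while True:
                # decode_content=False returns the bytes exactly as received,
                # so the GZIP payload is never decompressed
                chunk = response.raw.read(8192, decode_content=False)
                if not chunk:
                    break
                yield chunk
        finally:
            # Release the upstream connection once streaming ends
            response.close()

    # Compare header names case-insensitively; Content-Encoding is kept so the
    # client knows the body is still GZIP-compressed
    headers = {
        name: value
        for name, value in response.headers.items()
        if name.lower() not in EXCLUDED_HEADERS
    }

    return StreamingResponse(
        content=raw_stream_generator(),
        status_code=response.status_code,
        headers=headers,
    )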

How Senior Engineers Fix It

Senior engineers fix this issue by:

  • Removing the Content-Length header from the upstream response to avoid conflicts with chunked transfer encoding
  • Using StreamingResponse with a custom generator to yield the raw chunks of the compressed stream
  • Verifying that Mangum and the API Gateway integration return the compressed body as binary data (base64-encoded in the Lambda response) rather than as text
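The first point above can be generalized into a small header filter. This is a minimal sketch: the names HOP_BY_HOP and sanitize_proxy_headers are hypothetical, and the list covers the headers commonly treated as hop-by-hop in HTTP/1.1, which a proxy should not forward. Content-Encoding is deliberately kept so the client knows the body is still compressed.

```python
# Headers commonly treated as hop-by-hop: they describe a single connection
# and should not be forwarded by a proxy.
HOP_BY_HOP = frozenset({
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
})


def sanitize_proxy_headers(upstream_headers):
    """Return a copy of the headers that is safe to attach to a streamed body.

    Content-Length is dropped because the streamed body length is not
    declared up front; names are matched case-insensitively.
    """
    drop = HOP_BY_HOP | {"content-length"}
    return {
        name: value
        for name, value in upstream_headers.items()
        if name.lower() not in drop
    }


# Hypothetical upstream headers used only for illustration.
sample = {
    "content-length": "1048576",
    "Content-Encoding": "gzip",
    "Content-Type": "application/json",
    "Connection": "keep-alive",
}
cleaned = sanitize_proxy_headers(sample)
```

In the proxy function, the result of such a filter would be passed as the headers argument of StreamingResponse in place of the unfiltered upstream headers.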

Why Juniors Miss It

Juniors may miss this issue because:

  • They may not fully understand the implications of using StreamingResponse with chunked transfer encoding
  • They may not be aware of the potential conflicts between the Content-Length header and chunked transfer encoding
  • They may not have experience with handling large compressed streams in FastAPI and AWS Lambda