Summary
A production application using Duende IdentityServer and the Microsoft.AspNetCore.Authentication.MicrosoftAccount library suffered a complete sign-in outage. During the OAuth2 callback phase, Microsoft (Entra ID) redirected users to the /signin-microsoft endpoint with an extremely large query string of more than 20,000 characters. IIS rejected these requests with HTTP 404.15 (Query String Too Long) before they reached the application, which broke the Single Sign-On (SSO) flow for all users.
Root Cause
The issue is not a bug in the application code, but a mismatch between the OAuth2 payload size sent by the Identity Provider (IdP) and the default web server configuration.
- Opaque Token Bloat: The code parameter in the callback is an authorization code. While typically small, certain configurations in Entra ID/Microsoft Account can result in extremely long, encoded strings if they encapsulate complex session states or nested claims.
- Protocol Specification: The OAuth2/OpenID Connect protocol does not strictly define a maximum length for the code or state parameters, leaving it up to the IdP and the RP (Relying Party) to handle the payload.
- IIS/Kestrel Constraints: By default, web servers like IIS and Kestrel impose strict limits on URI length to prevent Denial of Service (DoS) attacks via large header/query string injections.
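The mismatch can be made concrete with a small check against the documented defaults: IIS request filtering allows 2048 bytes of query string (maxQueryString), and Kestrel allows 8192 bytes for the whole request line (MaxRequestLineSize). The sketch below is illustrative only. The class and method names are hypothetical, the character count is used as a stand-in for the byte count (reasonable for a percent-encoded URL), and the request line overhead is an approximation rather than either server's exact accounting.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class CallbackSizeCheck
{
    // Documented defaults at the time of writing.
    public const int IisDefaultMaxQueryString = 2048;         // IIS requestLimits maxQueryString
    public const int KestrelDefaultMaxRequestLineSize = 8192; // KestrelServerLimits.MaxRequestLineSize

    // Length of the query string, excluding the leading '?'.
    public static int QueryLength(string pathAndQuery)
    {
        int index = pathAndQuery.IndexOf('?');
        return index < 0 ? 0 : pathAndQuery.Length - index - 1;
    }

    // Approximate check against the IIS default query string limit.
    public static bool FitsIisDefault(string pathAndQuery) =>
        QueryLength(pathAndQuery) <= IisDefaultMaxQueryString;

    // Approximates the request line as "GET <target> HTTP/1.1\r\n".
    public static bool FitsKestrelDefault(string pathAndQuery)
    {
        int overhead = "GET ".Length + " HTTP/1.1".Length + 2;
        return pathAndQuery.Length + overhead <= KestrelDefaultMaxRequestLineSize;
    }
}
```

Calling CallbackSizeCheck.FitsIisDefault with a callback URL carrying a 20,000-character code returns false for both checks, which matches the failure described in the summary.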
Why This Happens in Real Systems
In a perfect testing environment, tokens are small and predictable. In real-world distributed systems, complexity scales non-linearly:
- Identity Provider Variability: Different IdPs (Microsoft, Google, Okta) use different encoding schemes. Microsoft’s implementation can produce significantly larger strings when handling complex enterprise identities.
- Security Layers: Security appliances (WAFs, Load Balancers, Reverse Proxies) sit between the user and the app. Each layer has its own Max Query String limit.
- Protocol Evolution: As more metadata is packed into the authentication handshake to improve “statelessness,” the size of the redirect URL grows.
Real-World Impact
- Service Unavailability: Users are unable to log in, leading to a total loss of authentication functionality.
- Cascading Failures: If the authentication service is a dependency for other microservices, the entire ecosystem may appear “down.”
- Operational Overhead: Engineers often waste hours debugging the application code or the Identity Server configuration, when the issue is actually at the Web Server (IIS/Nginx) or Proxy layer.
Example or Code
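To resolve this in an IIS-hosted environment, modify web.config to allow larger query strings by raising the maxQueryString attribute of the requestLimits element under system.webServer/security/requestFiltering. The default is 2048 bytes. If IIS then returns 404.14 (URL Too Long), maxUrl (default 4096) needs raising as well.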
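A minimal sketch of that web.config change, assuming IIS request filtering is the component returning the 404.15. The value 32768 is an illustrative choice that mirrors the Kestrel example, not a recommendation; size it from the largest callback actually observed.

```xml
<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- Default maxQueryString is 2048 bytes; 32768 is an example value. -->
        <requestLimits maxQueryString="32768" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
```

On Windows, HTTP.sys applies its own request size limits ahead of IIS (registry values such as MaxRequestBytes), so those may also need review for requests of this size.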
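If Kestrel serves the request (self-hosted, or behind a reverse proxy rather than in-process under IIS), raise its request line limit through the Limits property. ConfigureKestrel is an extension on the web host builder, so it is called inside ConfigureWebHostDefaults:
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;

public class Program
{
    public static void Main(string[] args) => CreateHostBuilder(args).Build().Run();

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureWebHostDefaults(webBuilder =>
            {
                webBuilder.ConfigureKestrel(serverOptions =>
                {
                    // Covers the whole request line, query string included (default: 8192).
                    serverOptions.Limits.MaxRequestLineSize = 32768;
                });
                webBuilder.UseStartup<Startup>();
            });
}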
How Senior Engineers Fix It
A senior engineer approaches this by looking at the entire request lifecycle, not just the code:
- Identify the Bottleneck: Use network traces (Fiddler/Wireshark) to confirm exactly which component (Browser -> WAF -> Load Balancer -> IIS -> Kestrel) is rejecting the request.
- Infrastructure as Code (IaC): Instead of a “quick fix” on a live server, update the Terraform or ARM templates to ensure the query string limits are globally consistent across all environments.
- Defense in Depth: Don’t just increase the limit. Raise it no further than the observed callback size requires, and make sure the application still strictly validates incoming parameters, since a higher limit also leaves more room for resource-exhaustion (DoS) attempts.
- Negotiate Protocol: If the payload is excessively large, investigate whether the IdP and the authentication handler support POST-based callbacks (Form Post Response Mode, response_mode=form_post) instead of GET-based redirects. This moves the payload from the URL to the request body, where query string limits do not apply.
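The Form Post option can be sketched as follows. This is one possible approach under stated assumptions, not the fix confirmed for this incident: it swaps the OAuth-based AddMicrosoftAccount handler for the OpenID Connect handler (Microsoft.AspNetCore.Authentication.OpenIdConnect), whose ResponseMode option can be set to form_post. The scheme name "entra", the callback path /signin-entra, the configuration keys, and the tenant placeholder are all hypothetical and must be replaced with values from the actual app registration.

```csharp
using Duende.IdentityServer;
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Authentication.OpenIdConnect;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.IdentityModel.Protocols.OpenIdConnect;

// Hypothetical extension method; call it on the result of services.AddAuthentication().
public static class EntraFormPostSetup
{
    public static AuthenticationBuilder AddEntraFormPost(
        this AuthenticationBuilder builder, IConfiguration configuration)
    {
        return builder.AddOpenIdConnect("entra", "Microsoft", options =>
        {
            // Placeholder tenant; substitute the real tenant ID or "common".
            options.Authority = "https://login.microsoftonline.com/<tenant-id>/v2.0";
            options.ClientId = configuration["Entra:ClientId"];         // assumed config key
            options.ClientSecret = configuration["Entra:ClientSecret"]; // assumed config key

            // Authorization code flow, with the code delivered in a POST body
            // instead of the redirect URL's query string.
            options.ResponseType = OpenIdConnectResponseType.Code;
            options.ResponseMode = OpenIdConnectResponseMode.FormPost;

            // Must match a redirect URI registered for the app in Entra ID.
            options.CallbackPath = "/signin-entra";

            // Hands the external identity to IdentityServer's external cookie.
            options.SignInScheme = IdentityServerConstants.ExternalCookieAuthenticationScheme;
        });
    }
}
```

With this in place the callback arrives as a form POST, so the limits that matter become request body limits rather than query string limits. The trade-off is a new redirect URI to register and a different external scheme for IdentityServer to challenge, so it is worth validating in a non-production environment first.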
Why Juniors Miss It
- Narrow Scope: Juniors tend to assume the problem is in the Logic Layer (the C# code) rather than the Infrastructure Layer (IIS/Network).
- Looking at the Wrong End: They attempt to fix the “sender” (Microsoft) or the “receiver” (Identity Server) instead of recognizing that the “carrier” (the HTTP Request) is what failed.
- Ignoring Defaults: They assume standard library defaults (like AddMicrosoftAccount) are “magic” and work out of the box, failing to realize that security-hardened defaults are often the cause of “unexpected” failures in production.