Summary
We encountered a deployment failure in our Infrastructure as Code (IaC) pipeline while attempting to solve a circular dependency involving Azure Container Apps and Managed Identities. The goal was to deploy a Container App that uses a System-Assigned Managed Identity, where that identity’s permissions are granted via an Entra ID Group membership rather than direct role assignment.
To bypass the “chicken-and-egg” problem (the identity doesn’t exist until the app is created, but the app can’t pull its image until the identity has permissions), we attempted a two-stage deployment using two separate Pulumi resources targeting the same Azure resource name. The second resource failed with a “cannot create already existing resource” error, leaving the stack unable to complete a deployment.
Root Cause
The failure is rooted in a fundamental misunderstanding of how IaC State Management interacts with Cloud Provider APIs:
- Resource Identity vs. Logical Name: In Pulumi, every resource is tracked by a unique URN (Uniform Resource Name). We defined two different logical resources (orchestrator-bootstrap and orchestrator-final) but gave them the same physical Azure name.
- The “Create” vs. “Update” Conflict: Because orchestrator-final was a new logical resource in the Pulumi state, Pulumi’s engine attempted a CREATE operation for it rather than an update.
- API Rejection: The create was rejected with an error because a resource with that exact name already existed from the first step.
- State Divergence: The second resource was never successfully “created” in the Pulumi state, meaning subsequent runs would attempt the same failing creation loop indefinitely.
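The mismatch between logical and physical names can be illustrated with a toy model. The sketch below is not Pulumi's engine; it is a simplified simulation, assuming state is keyed by a URN roughly of the form urn:pulumi:<stack>::<project>::<type>::<name> and that a create is rejected when the physical name is already taken. The names ToyEngine and make_urn, and the stack and project values dev and infra, are hypothetical.

```python
from dataclasses import dataclass, field


def make_urn(stack: str, project: str, type_token: str, logical_name: str) -> str:
    """Build a URN-shaped key; note that the logical name is part of it."""
    return f"urn:pulumi:{stack}::{project}::{type_token}::{logical_name}"


@dataclass
class ToyEngine:
    """Toy model: 'state' is keyed by URN, 'cloud' is keyed by physical name."""
    state: dict = field(default_factory=dict)
    cloud: dict = field(default_factory=dict)

    def apply(self, urn: str, physical_name: str, image: str) -> str:
        if urn in self.state:
            # Known URN: the engine updates the resource it already tracks.
            self.cloud[physical_name] = image
            return "update"
        if physical_name in self.cloud:
            # Unknown URN but the physical name is taken: the create is rejected
            # and nothing is written to state, so the next run fails the same way.
            raise RuntimeError(
                f"cannot create already existing resource '{physical_name}'"
            )
        self.cloud[physical_name] = image
        self.state[urn] = physical_name
        return "create"


TYPE = "azure-native:app:ContainerApp"
engine = ToyEngine()
bootstrap_urn = make_urn("dev", "infra", TYPE, "orchestrator-bootstrap")
final_urn = make_urn("dev", "infra", TYPE, "orchestrator-final")

results = [engine.apply(bootstrap_urn, "my-app", "helloworld")]
for _ in range(2):  # two consecutive runs hit the same failure
    try:
        results.append(engine.apply(final_urn, "my-app", "private-image"))
    except RuntimeError as exc:
        results.append(f"error: {exc}")

print(results)
```

In this model, changing the image under the original URN (orchestrator-bootstrap) is an update and succeeds, which is the behavior the two-resource approach was trying to emulate.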
Why This Happens in Real Systems
This specific pattern occurs when engineers attempt to model Temporal Dependencies (things that must happen in a specific order across multiple runs) using Declarative Syntax.
- Identity Latency: In Azure, creating a System-Assigned Identity is instantaneous at the API level, but the propagation of that identity to Entra ID and its availability for role assignments can take seconds or even minutes.
- The Circularity Trap:
  - App needs Identity to pull Image.
  - Identity needs Group Membership to get Permissions.
  - Group Membership needs the Identity’s principal_id.
  - principal_id only exists after the App is provisioned.
- Imperative Thinking in Declarative Tools: Engineers often try to “step through” a complex deployment by splitting it into multiple resource definitions, forgetting that the IaC tool expects a single source of truth for a single physical resource.
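The circularity described above can be made concrete by writing the dependencies down as a graph and asking for a valid ordering. This is a minimal sketch using Python's standard graphlib module; the node names are hypothetical labels for the steps in the list, and the public placeholder image stands in for whatever bootstrap image is used.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the items in its set (node -> prerequisites).
naive_graph = {
    "private_image_pull": {"acr_permissions"},
    "acr_permissions": {"group_membership"},
    "group_membership": {"principal_id"},
    "principal_id": {"container_app"},
    "container_app": {"private_image_pull"},  # the app "needs" its image at creation
}


def plan(graph):
    """Return a valid ordering, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


# Breaking the cycle: create the app with a public placeholder image first,
# so creating the app no longer depends on the private pull.
staged_graph = dict(naive_graph)
staged_graph["container_app"] = {"public_placeholder_image"}

print(plan(naive_graph))   # None: no valid order exists
print(plan(staged_graph))
```

The staged graph has an ordering only because the app's creation and its switch to the private image are treated as two points in time for one resource, not as two resources.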
Real-World Impact
- Deployment Deadlocks: Automated CI/CD pipelines will fail consistently, requiring manual intervention to repair the state.
- Broken Rollbacks: Because the state is inconsistent (the resource exists in Azure but isn’t fully tracked in the “final” form in Pulumi), running pulumi destroy or pulumi up may result in orphaned resources or unpredictable errors.
- Increased MTTR (Mean Time To Recovery): Teams spend hours debugging “provider errors” when the issue is actually a logical error in the dependency graph.
Example or Code
from pulumi_azure_native.app import (
    ContainerApp,
    ContainerArgs,
    ManagedServiceIdentityArgs,
    ManagedServiceIdentityType,
    TemplateArgs,
)

# ERROR PATTERN: Using two resources for one physical entity.
# This causes the "already existing resource" error.

# Stage 1: create the app with a public image so the identity gets provisioned.
bootstrap = ContainerApp(
    "orchestrator-bootstrap",  # Logical name (tracked in Pulumi state)
    container_app_name="my-app",  # Physical Name
    resource_group_name="my-rg",
    location="eastus",
    managed_environment_id="env-id",
    identity=ManagedServiceIdentityArgs(type=ManagedServiceIdentityType.SYSTEM_ASSIGNED),
    template=TemplateArgs(
        containers=[
            ContainerArgs(
                name="web",
                image="mcr.microsoft.com/azuredocs/containerapps-helloworld:latest",
            )
        ]
    ),
)

# Stage 2: a second logical resource pointing at the same physical app.
# This resource will FAIL because "my-app" already exists.
app_final = ContainerApp(
    "orchestrator-final",  # New logical name, so Pulumi plans a create
    container_app_name="my-app",  # Physical Name (CONFLICT!)
    resource_group_name="my-rg",
    location="eastus",
    managed_environment_id="env-id",
    identity=ManagedServiceIdentityArgs(type=ManagedServiceIdentityType.SYSTEM_ASSIGNED),
    template=TemplateArgs(
        containers=[
            ContainerArgs(
                name="web",
                image="my-registry.azurecr.io/private-image:latest",
            )
        ]
    ),
)
How Senior Engineers Fix It
Senior engineers solve this by moving away from “two-resource” hacks and instead using Single Resource Lifecycle Management with State-Aware Overrides.
- Single Resource with ignore_changes: Define one ContainerApp resource. Use ignore_changes on the template and configuration blocks during the initial bootstrap phase if necessary, or better yet, use a single resource and accept the “failed pull” transient state.
- The “Intermediate Image” Strategy: Use a single resource definition but use a Conditional Image. Use a Pulumi Config variable to switch from the bootstrap_image to the private_image.
- Direct Role Assignment: If the goal is to avoid “Group Management” overhead, assign the AcrPull role directly to the principal_id of the identity. While this bypasses the group, it is a single-pass, declarative operation that Pulumi handles natively.
- Side-Loading via Component Resources: If the multi-step process is truly required, wrap the logic in a Pulumi Component Resource that manages the transition via a single logical object, or use an external orchestrator (like a GitHub Action) to run pulumi up twice with different configuration files.
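The “Intermediate Image” strategy and the two-pass orchestration can be sketched together. The example below is one possible shape, not a definitive implementation: it keeps the image choice in a plain function so it can be tested without Pulumi, and it lists the commands an external orchestrator might run. The stage config key, the stage names, and the helper names select_image and two_pass_commands are assumptions for illustration, and the sketch switches a single config value between passes rather than using separate configuration files.

```python
import shlex

STAGES = ("bootstrap", "final")


def select_image(stage: str, bootstrap_image: str, private_image: str) -> str:
    """Pick the container image for the single ContainerApp definition."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}; expected one of {STAGES}")
    return bootstrap_image if stage == "bootstrap" else private_image


def two_pass_commands(stack: str) -> list[str]:
    """Shell commands an external orchestrator could run, in order."""
    quoted = shlex.quote(stack)
    commands = []
    for stage in STAGES:
        # Flip the hypothetical 'stage' config value, then deploy.
        commands.append(f"pulumi config set stage {stage} --stack {quoted}")
        commands.append(f"pulumi up --yes --stack {quoted}")
    return commands


print(select_image(
    "bootstrap",
    "mcr.microsoft.com/azuredocs/containerapps-helloworld:latest",
    "my-registry.azurecr.io/private-image:latest",
))
print("\n".join(two_pass_commands("dev")))
```

Inside the Pulumi program, the stage would be read with pulumi.Config().get("stage") and the result of select_image passed as the image of the one ContainerApp. Because the logical name never changes between passes, the second pulumi up is an update to the tracked resource, not a second create.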
Why Juniors Miss It
- Focus on the “What” not the “How”: Juniors focus on getting the Azure resource to look right, whereas seniors focus on how the IaC State Engine perceives that resource.
- Misunderstanding the State File: Juniors often assume that if a resource exists in Azure, they can just “declare it again” with a different name in code. They fail to realize that the Logical Name in code, through the URN derived from it, is the primary key for the state file, so a new logical name always means a new resource to Pulumi.
- Ignoring Idempotency: A core tenet of DevOps is Idempotency (running the same code multiple times should result in the same state). The two-resource approach is inherently non-idempotent; it works once (partially) and fails every time thereafter.