Fixing AWS BackupSelection Import Loops in CDK

Summary

This incident involved a failed import of an AWS::Backup::BackupSelection resource into an existing CloudFormation stack managed with AWS CDK. The engineer was using cdk import to bring existing resources (S3 buckets, backup vaults, backup plans, and backup selections) under a managed lifecycle. The S3 buckets and Backup Plans imported successfully, but the BackupSelection resource triggered a circular validation error.

The error manifested in two contradictory states:

  • Providing only the SelectionId resulted in a requirement error: “Both BackupPlanId and SelectionId are required.”
  • Providing both identifiers via a mapping file resulted in a schema error: “Only the 'id' field is allowed in the mapping.”

This created a technical deadlock where the CloudFormation import engine and the CDK abstraction layer were fundamentally misaligned on the unique identifier schema for this specific resource type.

Root Cause

The root cause is a mismatch between the CloudFormation Resource Specification and the CloudFormation Import API implementation for the AWS::Backup::BackupSelection resource.

  • Composite Identity: Unlike most AWS resources that use a single unique string as an identifier, AWS::Backup::BackupSelection is a logical child of a Backup Plan. Its identity is functionally composite, requiring both the BackupPlanId and the SelectionId to uniquely address the resource in the AWS backend.
  • API Schema Rigidity: The CloudFormation Import engine expects the mapping.json entry keyed by the resource’s Logical ID to contain a single id field. However, the underlying AWS Backup service validation layer sees a “missing” parent ID when only the selection ID is provided.
  • UI/CLI Limitation: The AWS Console and the standard cdk import CLI command are designed around the assumption of a singular identifier, making it impossible for a user to pass the composite key required by the service-side validation logic.
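
The composite identity described above can be illustrated with a small sketch. This is a toy in-memory model, not AWS code: SelectionKey, SELECTIONS, and get_selection are hypothetical names, and the IDs are the placeholder values used elsewhere in this write-up. It only mirrors the idea that the selection is addressed by both halves of the key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionKey:
    """Composite address: a selection is reached through its parent plan."""
    backup_plan_id: str
    selection_id: str


# Hypothetical in-memory stand-in for the service's storage.
SELECTIONS = {
    SelectionKey("plan-67890", "selection-12345"): {"SelectionName": "nightly"},
}


def get_selection(backup_plan_id: str, selection_id: str) -> dict:
    """Look up a selection; both halves of the key are mandatory."""
    if not backup_plan_id or not selection_id:
        raise ValueError("Both BackupPlanId and SelectionId are required.")
    return SELECTIONS[SelectionKey(backup_plan_id, selection_id)]
```

A tool that can only pass one string has no way to fill in both fields of such a key, which is the gap the import tooling runs into here.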

Why This Happens in Real Systems

This phenomenon occurs due to Leaky Abstractions and Evolutionary Debt in cloud provider APIs:

  • Resource Dependencies: Many AWS services use “Parent-Child” relationships where the child does not have a globally unique ID, only a scoped ID.
  • Inconsistent API Surface: CloudFormation’s “Import” feature was rolled out resource type by resource type, and the Import Identifier logic for some types (like AWS Backup) did not always account for composite keys.
  • Schema Divergence: The orchestration layer (CloudFormation) and the service API (AWS Backup) have different definitions of what constitutes a “unique” resource during an import operation.

Real-World Impact

  • Deployment Deadlock: Engineers become stuck in a loop where automated tooling (CDK) and manual tools (Console) both fail, leading to significant delays in infrastructure migration.
  • Operational Risk: To bypass the error, engineers might feel pressured to delete and recreate resources (like the Backup Plan), which risks data loss or gaps in backup coverage if not handled with extreme care.
  • Broken CI/CD Pipelines: Since the import cannot be scripted via standard cdk import commands, the migration process becomes a manual, high-toil task that cannot be easily audited or repeated.

Example or Code

The failure occurs with either shape of mapping.json. Supplying both identifiers satisfies the AWS Backup service requirement but violates the import schema:

{
  "MyBackupSelection": {
    "id": "selection-12345",
    "BackupPlanId": "plan-67890"
  }
}

Result: Client Error: Only the 'id' field is allowed in the mapping.

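Versus supplying only the id, which satisfies the import schema but violates the AWS Backup service requirement: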

{
  "MyBackupSelection": {
    "id": "selection-12345"
  }
}

Result: Client Error: Both BackupPlanId and SelectionId are required.
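
To make the contradiction concrete, here is a minimal sketch that models the two checks as plain functions. Both validators are hypothetical stand-ins written only to reproduce the error messages quoted above; they are not the actual CloudFormation or AWS Backup logic.

```python
def check_import_schema(identifier: dict) -> None:
    """Toy import-side check: the mapping entry may contain only 'id'."""
    if set(identifier) != {"id"}:
        raise ValueError("Only the 'id' field is allowed in the mapping.")


def check_service(identifier: dict) -> None:
    """Toy service-side check: both the plan ID and the selection ID are needed."""
    if "BackupPlanId" not in identifier or "id" not in identifier:
        raise ValueError("Both BackupPlanId and SelectionId are required.")


def first_error(identifier: dict):
    """Run the checks in order and return the first error message, or None."""
    for check in (check_import_schema, check_service):
        try:
            check(identifier)
        except ValueError as exc:
            return str(exc)
    return None


both = {"id": "selection-12345", "BackupPlanId": "plan-67890"}
only_id = {"id": "selection-12345"}
```

Under this model no mapping entry can pass: first_error(both) fails the schema check, and first_error(only_id) passes the schema check but fails the service check.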

How Senior Engineers Fix It

A senior engineer avoids the “import loop” by recognizing that the standard import path is broken and seeking a side-channel migration strategy:

  • The “Adopt via Ref” Strategy: Instead of using the import command, define the resource in the CDK code with the exact same Logical ID and physical properties as the existing resource, then run cdk deploy. Depending on the resource type, CloudFormation may “adopt” the matching resource or fail without destroying it. A plain deploy normally attempts a create rather than an adoption, so verify the behavior in a non-production account first.
  • The CloudFormation “Resource Import” Workaround: If the CLI/UI fails, use a Custom Resource (Lambda) to manually associate the selection, or use the AWS CLI directly to manipulate the resource state, then use an IMPORT-type change set (aws cloudformation create-change-set --change-set-type IMPORT) only for the parts that work.
  • Orchestrated Import: Attempt to import the entire dependency tree (Plan + Selection) in a single transaction. This requires crafting a single template where the Selection is defined as a child of the Plan within the same import operation, ensuring the BackupPlanId is available in the context of the transaction.
  • Manual State Management: In extreme cases, create the resource via CDK/CloudFormation to generate a new ID, then manually update the Backup Plan via the AWS CLI/SDK to point to the new Selection, effectively performing a “blue-green” migration of the backup logic.
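
For the Orchestrated Import option, the payload for a single import transaction could be assembled along these lines. This is a sketch under assumptions: build_resources_to_import and the logical ID MyBackupPlan are hypothetical, the identifier key names ("BackupPlanId" for the plan, "Id" for the selection) are assumptions that should be checked against the template summary for your stack, and nothing here guarantees that the service will accept the result.

```python
import json


def build_resources_to_import(plan_logical_id, plan_id,
                              selection_logical_id, selection_identifier):
    """Build the resource list for one import covering the Plan and the Selection."""
    return [
        {
            "ResourceType": "AWS::Backup::BackupPlan",
            "LogicalResourceId": plan_logical_id,
            # Assumed identifier key for the plan; verify before use.
            "ResourceIdentifier": {"BackupPlanId": plan_id},
        },
        {
            "ResourceType": "AWS::Backup::BackupSelection",
            "LogicalResourceId": selection_logical_id,
            # Passed through unchanged so the caller decides the key names.
            "ResourceIdentifier": dict(selection_identifier),
        },
    ]


resources = build_resources_to_import(
    plan_logical_id="MyBackupPlan",                  # hypothetical logical ID
    plan_id="plan-67890",
    selection_logical_id="MyBackupSelection",
    selection_identifier={"Id": "selection-12345"},  # assumed key name
)
print(json.dumps(resources, indent=2))
```

The printed JSON can be saved to a file and supplied to aws cloudformation create-change-set with --change-set-type IMPORT and --resources-to-import, alongside a template that declares both resources. Reviewing the change set before executing it keeps the attempt non-destructive.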

Why Juniors Miss It

  • Symmetry Assumption: Juniors often assume that if an S3 bucket (a top-level resource) can be imported easily, all other resources will follow the same singular ID pattern.
  • Tool-Centric Thinking: They tend to focus on fixing the command (trying different mapping.json formats) rather than questioning if the underlying API capability exists.
  • Ignoring the Provider Documentation: They may overlook the nuances in the CloudFormation documentation that specify which resources require composite identifiers for imports.
  • Fear of Deletion: They may attempt to “brute force” the import rather than realizing that sometimes a controlled recreation or a manual state adjustment is the more professional and safer path.
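
One way to question the capability before fiddling with mapping.json is to ask CloudFormation which identifier properties it expects for each resource type. The sketch below parses the ResourceIdentifierSummaries section of an aws cloudformation get-template-summary response. The SAMPLE payload is hand-written for illustration, and the "Id" entry for the BackupSelection is an assumption, not captured output.

```python
def import_identifiers(summary: dict) -> dict:
    """Map each resource type to the identifier properties the import expects."""
    result = {}
    for entry in summary.get("ResourceIdentifierSummaries", []):
        result[entry["ResourceType"]] = list(entry.get("ResourceIdentifiers", []))
    return result


# Hypothetical response fragment; real output comes from get-template-summary.
SAMPLE = {
    "ResourceIdentifierSummaries": [
        {
            "ResourceType": "AWS::S3::Bucket",
            "LogicalResourceIds": ["MyBucket"],
            "ResourceIdentifiers": ["BucketName"],
        },
        {
            "ResourceType": "AWS::Backup::BackupSelection",
            "LogicalResourceIds": ["MyBackupSelection"],
            "ResourceIdentifiers": ["Id"],  # assumed value
        },
    ]
}

identifiers = import_identifiers(SAMPLE)
```

If a resource type reports a single identifier while the service behind it needs a parent ID as well, that mismatch is a signal to stop tuning the mapping file and look for another migration path.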
