## Prerequisites
Before starting the migration, ensure the following:

- OpenTofu 1.10+ is installed (recommended). Terraform also works if OpenTofu is not available.
- AWS CLI is installed and configured with appropriate credentials.
- Your modules are currently on the versions listed in the From column of the Module Version Reference table below.
- You have sufficient IAM permissions to manage EKS, IAM roles, SQS, and related resources.
- Your current infrastructure has a clean plan: run `tofu plan` (or `terraform plan`) and confirm it shows no pending changes before starting. If there is drift between your state and your infrastructure, resolve it first to avoid mixing unrelated changes into the migration.
We recommend confirming that S3 bucket versioning is enabled on your OpenTofu/Terraform state bucket as a best practice before beginning. This allows you to recover any prior state file version if something goes wrong.
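One way to confirm a clean starting point is plan's detailed exit code, which both OpenTofu and Terraform support:

```shell
# Exit code 0 = no pending changes, 1 = error, 2 = changes pending.
tofu plan -detailed-exitcode
# or: terraform plan -detailed-exitcode
```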
## Module Version Reference
The following table summarizes the version changes for each module. Modules marked with * have an intermediate version that must be applied before the final upgrade.

| Module | From | Intermediate | To (v6 compatible) |
|---|---|---|---|
| network | v0.3.10 | — | v0.4.0 |
| eks * | v0.7.20 | v0.7.21 | v0.8.1 |
| efs | v0.4.5 | — | v0.5.0 |
| aws-load-balancer-controller | v0.1.5 | — | v0.2.0 |
| karpenter * | v0.3.12 | v0.3.13 | v0.4.0 |
| tfy-platform-features | v0.4.13 | — | v0.5.0 |
| control-plane (if applicable) | v0.4.24 | — | v0.5.0 |
## Karpenter Upgrade Strategy
Upgrading Karpenter requires releasing an intermediate version before the final release to ensure zero downtime. The migration transitions Karpenter from IRSA to Pod Identity. How the phased upgrade works:

- Version v0.3.13 is deployed first. It creates new resources (SQS queue, IAM role with Pod Identity) that run alongside the older resources.
- The Karpenter Helm chart values are updated to point to the newly created resources.
- A `disable_old_changes` flag controls the cleanup of old resources. When set to `true`, the older IRSA-based resources are removed.
- Version v0.4.0 is the final release that is fully AWS provider v6 compatible.
### Resource transition details
The following table shows how each Karpenter-managed resource transitions during the migration:
| Resource | Old (disable_old_changes = false) | New (disable_old_changes = true) |
|---|---|---|
| SQS queue | <cluster_name>-karpenter | <cluster_name>-karpenter-queue |
| Controller IAM role | <cluster_name>-karpenter | <cluster_name>-karpenter-controller |
| Role trust | IRSA only | Pod Identity |
| Instance profile | <cluster_name>-karpenter-initial | Same (unchanged, in-place update) |
| CloudWatch rules | Managed individually | Managed by sub-module |
| IRSA module | module.karpenter_irsa_role[0] | Removed |
## Migration Steps
### Pin modules to intermediate versions
Ensure all modules are at the following versions before proceeding. If any module is on an older version, update it to the version shown here and apply first.
| Module | Version |
|---|---|
| network | v0.3.10 |
| eks | v0.7.21 |
| efs | v0.4.5 |
| aws-load-balancer-controller | v0.1.5 |
| karpenter | v0.3.12 |
| tfy-platform-features | v0.4.13 |
| control-plane (if applicable) | v0.4.24 |
Run `tofu plan` (or `terraform plan`) and review the output before applying. Verify no unexpected changes are shown.

### Step 1: Prepare EKS and Karpenter modules
This step prepares the EKS cluster for Pod Identity and sets up the intermediate Karpenter version.

1. **Update the cluster module.** Remove the `node_security_group_additional_rules` block from the cluster module and bump it to v0.7.21 to install the EKS Pod Identity Agent.
2. **Update the Karpenter module.** Move the Karpenter module to v0.3.13, keeping `disable_old_changes = false` so the new Pod Identity resources are created alongside the old IRSA-based ones.

Apply the changes after reviewing the plan.
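A minimal sketch of the two module changes for this step. The module `source` paths are illustrative placeholders (use your actual sources); the versions and the `disable_old_changes` flag come from this guide:

```hcl
# 1. Cluster module at the intermediate version; the
#    node_security_group_additional_rules block has been removed.
module "cluster" {
  source  = "example/eks/aws" # illustrative source
  version = "0.7.21"
  # node_security_group_additional_rules = { ... }  <- removed
}

# 2. Karpenter module at the intermediate version; old IRSA resources
#    stay in place while the new Pod Identity resources are created.
module "karpenter" {
  source  = "example/karpenter/aws" # illustrative source
  version = "0.3.13"

  disable_old_changes = false
}
```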
Run `tofu plan` (or `terraform plan`) and review the output. You should see new resources being created (new SQS queue, new IAM role with Pod Identity) while existing resources remain unchanged.

### Step 2: Update Karpenter Helm chart values
These changes apply to the Karpenter Helm chart values, not the Karpenter Config Helm chart values.
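For reference, a hedged before/after sketch of the values changes in this step. The `eks.amazonaws.com/role-arn` key is the standard IRSA annotation; the account ID, role name, and the exact nesting of `interruptionQueue` are illustrative and may differ in your values file:

```yaml
# Before (IRSA): role-arn annotation present, unsuffixed queue name.
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-cluster-karpenter
settings:
  interruptionQueue: my-cluster-karpenter
---
# After (Pod Identity): annotation removed, -queue suffix appended.
serviceAccount:
  annotations: {}
settings:
  interruptionQueue: my-cluster-karpenter-queue
```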
1. **`serviceAccount` annotations.** Find and remove the `serviceAccount.annotations` entries (the IRSA role ARN annotation) from your Karpenter Helm chart values.
2. **`interruptionQueue` name.** Append `-queue` to the end of the `interruptionQueue` value in your Karpenter Helm chart values.

### Step 3: Clean up old Karpenter resources
Run the Karpenter OpenTofu/Terraform module with `disable_old_changes = true` to remove the old IRSA-based resources. Apply the changes after confirming the plan only removes old resources.
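The cleanup toggle looks like this; only the flag changes from the previous step (module source is an illustrative placeholder):

```hcl
module "karpenter" {
  source  = "example/karpenter/aws" # illustrative source
  version = "0.3.13"

  # true removes the old IRSA role, old SQS queue, and the
  # individually managed CloudWatch rules.
  disable_old_changes = true
}
```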
Run `tofu plan` (or `terraform plan`) and review the output. You should see the old IRSA module, old SQS queue, and old CloudWatch rules being destroyed. The new resources created in the previous step should remain untouched.

### Step 4: Upgrade modules to final v6-compatible versions
Now upgrade all modules to their final AWS provider v6-compatible versions.

1. **Version-only bumps.** Update the following modules to their new versions. These require only a version change with no other configuration modifications.
2. **EFS module.** The EFS module now requires the `cluster_oidc_issuer_arn` input.
3. **EBS module.** The EBS module requires `use_name_prefix = false` to prevent the IAM role from being recreated, and a `policy_name` parameter.
4. **Karpenter module (final version).** Upgrade Karpenter to the final v0.4.0 release. The module configuration is simplified since the migration is now complete.
5. **Update tfy-karpenter Helm chart.** When upgrading the Karpenter Terraform module to v0.4.0, also update the tfy-karpenter Helm chart to version 0.5.11.
6. **AWS Load Balancer Controller module.** Upgrade the AWS Load Balancer Controller module to the final v0.2.0 release. The module configuration is simplified since the migration is now complete.
7. **TrueFoundry module.** Update the TrueFoundry module to reference the new EBS IAM role ARN.

Apply all changes after reviewing the plan.
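A sketch of the EFS and EBS inputs described above. The module sources, the OIDC provider ARN reference, and the policy name are illustrative; `cluster_oidc_issuer_arn`, `use_name_prefix`, and `policy_name` are the inputs this guide names:

```hcl
module "efs" {
  source  = "example/efs/aws" # illustrative source
  version = "0.5.0"

  # New required input in v0.5.0.
  cluster_oidc_issuer_arn = module.eks.oidc_provider_arn # illustrative reference
}

module "ebs" {
  source = "example/ebs/aws" # illustrative source

  # Keeps the existing IAM role name so the role is not recreated.
  use_name_prefix = false
  policy_name     = "my-cluster-ebs-csi" # illustrative name
}
```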
Run `tofu plan` (or `terraform plan`) and review the output carefully. If you see any unexpected resource deletions, investigate before applying. The `use_name_prefix = false` addition in the EBS module is specifically to avoid an unnecessary role recreation.

### Post-migration validation
After completing all upgrade steps, verify that everything is working correctly.

1. **Verify Karpenter pods are healthy.** All Karpenter pods should be in `Running` status with all containers ready.
2. **Verify nodes can be provisioned.** Check that existing NodeClaims are in a healthy state. If you have pending pods that require new nodes, verify that Karpenter provisions them.
3. **Verify Pod Identity associations.** Confirm that a Pod Identity association exists for the Karpenter service account.
4. **Verify all module resources.** Run a final plan to confirm no further changes are pending. The output should show `No changes. Your infrastructure matches the configuration.`

## Rollback
If you encounter issues during the migration, you can revert to the previous state.

**Before Step 3 (old resources still exist):**

- Revert the Karpenter Helm chart values to restore the `serviceAccount.annotations` and original `interruptionQueue` name (without the `-queue` suffix).
- Revert the Karpenter module version to v0.3.12 in your `.tf` files.
- Revert the cluster module to restore the `node_security_group_additional_rules` block and set it back to v0.7.20 if needed.
- Run `tofu plan` (or `terraform plan`) to confirm the rollback scope, then apply.
**After Step 3 (old resources already removed):**

- Set `disable_old_changes = false` on the Karpenter module (still at v0.3.13) and apply to recreate the old resources.
- Revert the Karpenter Helm chart values to restore the `serviceAccount.annotations` and original `interruptionQueue` name.
- Once Karpenter is healthy with the old resources, revert the module version to v0.3.12 and apply.
The phased Karpenter upgrade is designed so that you can revert the Helm chart values at any point before Step 3 to fall back to the old IRSA-based resources without disruption.
## Troubleshooting
### Karpenter pods crashlooping after cleaning up old resources
If Karpenter pods are crashlooping after Step 3, verify that:
- The Karpenter Helm chart values were updated before running Step 3: the `serviceAccount.annotations` should be removed and `interruptionQueue` should have the `-queue` suffix.
- The Pod Identity association was created successfully.
- The new SQS queue exists.
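The association and queue checks can be run with the AWS CLI; `<cluster_name>` is a placeholder, and the `karpenter` namespace is an assumption about where the controller runs:

```shell
# Check that the Pod Identity association exists for Karpenter.
aws eks list-pod-identity-associations \
  --cluster-name <cluster_name> --namespace karpenter

# Check that the new SQS queue exists.
aws sqs get-queue-url --queue-name <cluster_name>-karpenter-queue
```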
To recover, set `disable_old_changes` to `false` and apply to recreate the old resources, then follow the steps in order.

### OpenTofu/Terraform plan shows unexpected resource deletions
If `tofu plan` (or `terraform plan`) shows resources being destroyed that you do not expect, do not apply. Common causes include:

- **IAM role recreation:** Ensure the EBS module has `use_name_prefix = false` set. Without this, the role name gets a random suffix and OpenTofu/Terraform sees it as a new resource.
- **State drift:** If resources were modified outside of OpenTofu/Terraform, the plan may show unexpected changes. Run `tofu refresh` (or `terraform refresh`) to sync state before re-running the plan.
- **Module source changes:** Verify all module `source` and `version` fields match the values in this guide exactly.
### SQS queue name mismatch or interruption handler not working
If Karpenter is not processing spot interruption events:
- Confirm the `interruptionQueue` value in the Karpenter Helm chart matches the actual SQS queue name. After migration, it should be `<cluster_name>-karpenter-queue`.
- Verify the queue exists and has the correct permissions.
- Check Karpenter logs for queue-related errors.
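A sketch of those checks; `<cluster_name>` and `<queue_url>` are placeholders, and the `karpenter` namespace and deployment name are assumptions about your installation:

```shell
# Verify the queue exists and inspect its access policy.
aws sqs get-queue-url --queue-name <cluster_name>-karpenter-queue
aws sqs get-queue-attributes \
  --queue-url <queue_url> --attribute-names Policy

# Check Karpenter logs for queue-related errors.
kubectl logs -n karpenter deployment/karpenter | grep -i queue
```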
### Pod Identity not taking effect
If Karpenter is unable to assume its IAM role after the migration:
- Verify the EKS Pod Identity Agent addon is installed and running.
- Confirm the Pod Identity association exists.
- Restart the Karpenter pods to pick up the Pod Identity credentials.
- If the EKS Pod Identity Agent is missing, verify that the cluster module was upgraded to v0.7.21 or later in Step 1, which installs this addon.
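The checks above can be run as follows; `<cluster_name>` is a placeholder, and the `karpenter` namespace and deployment name are assumptions about your installation:

```shell
# Verify the EKS Pod Identity Agent addon is installed and active.
aws eks describe-addon --cluster-name <cluster_name> \
  --addon-name eks-pod-identity-agent

# The agent runs as a DaemonSet named eks-pod-identity-agent in kube-system.
kubectl get daemonset eks-pod-identity-agent -n kube-system

# Confirm the Pod Identity association exists.
aws eks list-pod-identity-associations --cluster-name <cluster_name>

# Restart Karpenter to pick up the Pod Identity credentials.
kubectl rollout restart deployment/karpenter -n karpenter
```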