This guide walks you through migrating TrueFoundry AWS infrastructure modules from AWS Terraform provider v5 to v6. The main module has submodules with dependencies that must be updated in a specific order to avoid downtime. The migration is performed in phases. Karpenter requires special handling through an intermediate version to ensure a seamless transition from IRSA to Pod Identity.
Prerequisites
Before starting the migration, ensure the following:

- OpenTofu 1.10+ is installed (recommended). Terraform also works if OpenTofu is not available.
- AWS CLI is installed and configured with appropriate credentials.
- Your modules are currently on the versions listed in the From column of the Module Version Reference table below.
- You have sufficient IAM permissions to manage EKS, IAM roles, SQS, and related resources.
- Your current infrastructure has a clean plan: run `tofu plan` (or `terraform plan`) and confirm it shows no pending changes before starting. If there is drift between your state and your infrastructure, resolve it first to avoid mixing unrelated changes into the migration.
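
The clean-plan check can be scripted. The following is a minimal sketch, assuming a POSIX-style shell run from the directory that holds your configuration; the function name `check_clean_plan` is illustrative and not part of any TrueFoundry tooling. It relies on the `-detailed-exitcode` flag, which makes `plan` exit with 0 when there are no changes, 1 on error, and 2 when changes are pending.

```shell
# Report whether the current working directory has a clean plan.
# Hypothetical helper; the first argument is the binary to use.
check_clean_plan() {
  local bin="${1:-tofu}"   # pass "terraform" if OpenTofu is not available
  local rc=0
  # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending
  "$bin" plan -detailed-exitcode -input=false >/dev/null || rc=$?
  case "$rc" in
    0) echo "clean" ;;
    2) echo "pending-changes" ;;
    *) echo "error" ;;
  esac
  return "$rc"
}

# Usage:
#   check_clean_plan tofu
```

Only start the migration when the function prints `clean`.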
Module Version Reference
The following table summarizes the version changes for each module. Modules marked with * have an intermediate version that must be applied before the final upgrade.

| Module | From (starting) | Intermediate | To (v6 compatible) |
|---|---|---|---|
| network | v0.3.10 | — | v0.4.0 |
| eks * | v0.7.20 | v0.7.21 | v0.8.1 |
| efs | v0.4.5 | — | v0.5.0 |
| aws-load-balancer-controller | v0.1.5 | — | v0.2.0 |
| karpenter * | v0.3.12 | v0.3.13 | v0.4.0 |
| tfy-platform-features | v0.4.13 | — | v0.5.0 |
| control-plane (if applicable) | v0.4.24 | — | v0.5.0 |
Karpenter Upgrade Strategy
Upgrading Karpenter requires applying an intermediate version before the final release to ensure zero downtime. The migration transitions Karpenter from IRSA to Pod Identity.

How the phased upgrade works:

- Version v0.3.13 is deployed first. It creates new resources (SQS queue, IAM role with Pod Identity) that run alongside the older resources.
- The Karpenter Helm chart values are updated to point to the newly created resources.
- A `disable_old_changes` flag controls the cleanup of old resources. When set to `true`, the older IRSA-based resources are removed.
- Version v0.4.0 is the final release that is fully AWS provider v6 compatible.
Resource transition details
| Resource | Old (disable_old_changes = false) | New (disable_old_changes = true) |
|---|---|---|
| SQS queue | <cluster_name>-karpenter | <cluster_name>-karpenter-queue |
| Controller IAM role | <cluster_name>-karpenter | <cluster_name>-karpenter-controller |
| Role trust | IRSA only | Pod Identity |
| Instance profile | <cluster_name>-karpenter-initial | Same (unchanged, in-place update) |
| CloudWatch rules | Managed individually | Managed by sub-module |
| IRSA module | module.karpenter_irsa_role[0] | Removed |
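
To see which side of the transition a cluster is on, you can check which of the named resources currently exist. The sketch below is an illustration only: the function names are hypothetical, the resource names are taken from the table above, and it assumes the AWS CLI is configured for the account and region of the cluster.

```shell
# Print "<queue name> <controller role name>" for a phase ("old" or "new"),
# following the naming in the resource transition table.
karpenter_resource_names() {
  local cluster="$1" phase="$2"
  case "$phase" in
    old) printf '%s-karpenter %s-karpenter\n' "$cluster" "$cluster" ;;
    new) printf '%s-karpenter-queue %s-karpenter-controller\n' "$cluster" "$cluster" ;;
    *)   echo "phase must be 'old' or 'new'" >&2; return 1 ;;
  esac
}

# Report whether the queue and controller role for a phase exist.
report_phase() {
  local cluster="$1" phase="$2" queue role
  read -r queue role <<EOF
$(karpenter_resource_names "$cluster" "$phase")
EOF
  if aws sqs get-queue-url --queue-name "$queue" >/dev/null 2>&1; then
    echo "$phase queue $queue: present"
  else
    echo "$phase queue $queue: absent"
  fi
  if aws iam get-role --role-name "$role" >/dev/null 2>&1; then
    echo "$phase role $role: present"
  else
    echo "$phase role $role: absent"
  fi
}

# Usage (cluster name is a placeholder):
#   report_phase my-cluster old
#   report_phase my-cluster new
```

While the intermediate version is applied and `disable_old_changes` is `false`, both phases would be expected to report their resources as present.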
Migration Steps
Pin modules to intermediate versions
| Module | Version |
|---|---|
| network | v0.3.10 |
| eks | v0.7.21 |
| efs | v0.4.5 |
| aws-load-balancer-controller | v0.1.5 |
| karpenter | v0.3.13 |
| tfy-platform-features | v0.4.13 |
| control-plane (if applicable) | v0.4.24 |
Run `tofu plan` (or `terraform plan`) and review the output before applying. Verify that no unexpected changes are shown.

Prepare EKS and Karpenter modules
Remove the `node_security_group_additional_rules` block from the cluster module and bump it to v0.7.21 to install the EKS Pod Identity Agent.

Run `tofu plan` (or `terraform plan`) and review the output. You should see new resources being created (a new SQS queue and a new IAM role with Pod Identity) while existing resources remain unchanged.

Update Karpenter Helm chart values
serviceAccount annotations

Find and remove the `serviceAccount.annotations` entries from your Karpenter Helm chart values.

interruptionQueue name

Append `-queue` to the end of the `interruptionQueue` value in your Karpenter Helm chart values, so that it reads `<cluster_name>-karpenter-queue`.

Clean up old Karpenter resources
Set `disable_old_changes = true` on the Karpenter module to remove the old IRSA-based resources.

Run `tofu plan` (or `terraform plan`) and review the output. You should see the old IRSA module, old SQS queue, and old CloudWatch rules being destroyed. The new resources created earlier should remain untouched.

Upgrade modules to final v6-compatible versions
`cluster_oidc_issuer_arn` input:

For `terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts`, note that the required input variables may differ from previous versions. Carefully review and update your module configuration, and adjust variable references as needed. The EBS module requires `use_name_prefix = false` to prevent the IAM role from being recreated, and a `policy_name` parameter.

Remove `oidc_provider_arn = module.cluster.oidc_provider_arn` from the karpenter module definition.

Run `tofu plan` (or `terraform plan`) and review the output carefully. If you see any unexpected resource deletions, investigate before applying. The `use_name_prefix = false` addition in the EBS module is specifically to avoid an unnecessary role recreation.

Update Karpenter Helm chart version
Update the `tfy-karpenter` Helm chart to version 0.5.11.

Post-migration validation
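1. Verify that the Karpenter pods are in Running status with all containers ready.
2. Verify nodes can be provisioned.
3. Run `tofu plan` (or `terraform plan`) and confirm that it reports: No changes. Your infrastructure matches the configuration.

Rollback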
If you encounter issues during the migration, you can revert to the previous state.

Before Step 3, "Clean up old Karpenter resources" (old resources still exist):

- Revert the Karpenter Helm chart values to restore the `serviceAccount.annotations` and the original `interruptionQueue` name (without the `-queue` suffix).
- Revert the Karpenter module version to v0.3.12 in your `.tf` files.
- Revert the cluster module to restore the `node_security_group_additional_rules` block and set it back to v0.7.20 if needed.
- Run `tofu plan` (or `terraform plan`) to confirm the rollback scope, then apply.

After Step 3 (old resources have been removed):

- Set `disable_old_changes = false` on the Karpenter module (still at v0.3.13) and apply to recreate the old resources.
- Revert the Karpenter Helm chart values to restore the `serviceAccount.annotations` and the original `interruptionQueue` name.
- Once Karpenter is healthy with the old resources, revert the module version to v0.3.12 and apply.
Troubleshooting
Karpenter pods crashlooping after cleaning up old resources
Verify the following:

- The Karpenter Helm chart values were updated before running Step 3. The `serviceAccount.annotations` should be removed and `interruptionQueue` should have the `-queue` suffix.
- The Pod Identity association was created successfully.
- The new SQS queue (`<cluster_name>-karpenter-queue`) exists.
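
The last two checks can be run with the AWS CLI. This is a sketch under assumptions: the helper names are hypothetical, and the `karpenter` namespace default is an assumption that this guide does not confirm, so substitute the namespace where your Karpenter release runs.

```shell
# Name of the new interruption queue, per the resource transition table.
new_queue_name() {
  printf '%s-karpenter-queue\n' "$1"
}

# List Pod Identity associations for the cluster in the Karpenter namespace.
# The default namespace "karpenter" is an assumption; pass your own as $2.
check_pod_identity_association() {
  aws eks list-pod-identity-associations \
    --cluster-name "$1" \
    --namespace "${2:-karpenter}"
}

# Succeeds only if the new SQS queue exists; prints its URL.
check_new_queue() {
  aws sqs get-queue-url --queue-name "$(new_queue_name "$1")"
}

# Usage (cluster name is a placeholder):
#   check_pod_identity_association my-cluster karpenter
#   check_new_queue my-cluster
```

An empty `associations` list or a failure from `get-queue-url` suggests that the intermediate Karpenter module version was not applied successfully.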
To recover, set `disable_old_changes` to `false` and apply to recreate the old resources, then follow the steps in order.

OpenTofu/Terraform plan shows unexpected resource deletions
If `tofu plan` (or `terraform plan`) shows resources being destroyed that you do not expect, do not apply. Common causes include:

- IAM role recreation: Ensure the EBS module has `use_name_prefix = false` set. Without this, the role name gets a random suffix and OpenTofu/Terraform sees it as a new resource.
- State drift: If resources were modified outside of OpenTofu/Terraform, the plan may show unexpected changes. Run `tofu refresh` (or `terraform refresh`) to sync state before re-running the plan.
- Module source changes: Verify all module `source` and `version` fields match the values in this guide exactly.
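
To review only the destructive part of a plan, you can filter its human-readable output. The sketch below is one possible approach, not an official tool: the function name is hypothetical, and it matches the "will be destroyed" and "must be replaced" summary lines that `plan` prints for each affected resource.

```shell
# Print only the resources that the plan would destroy or replace.
# Hypothetical helper; the first argument is the binary to use.
list_destructive_changes() {
  local bin="${1:-tofu}"   # pass "terraform" if OpenTofu is not available
  # "|| true" keeps the exit status at 0 when nothing matches.
  "$bin" plan -no-color -input=false \
    | grep -E '^ *# .* (will be destroyed|must be replaced)' || true
}

# Usage:
#   list_destructive_changes tofu
```

Empty output means the plan contains no deletions or replacements; anything listed should be explained by the step you are on before you apply.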
SQS queue name mismatch or interruption handler not working
- Confirm the `interruptionQueue` value in the Karpenter Helm chart matches the actual SQS queue name. After migration, it should be `<cluster_name>-karpenter-queue`.
- Verify that the queue exists and has the correct permissions.
- Check the Karpenter logs for queue-related errors.
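
The queue and log checks above could be run as follows. This is a hedged sketch: the helper names are hypothetical, and the `karpenter` namespace and `karpenter` deployment name are assumptions about your installation rather than values confirmed by this guide.

```shell
# Print the access policy attached to the new interruption queue.
show_queue_policy() {
  local cluster="$1" url
  url="$(aws sqs get-queue-url \
    --queue-name "${cluster}-karpenter-queue" \
    --query QueueUrl --output text)" || return 1
  aws sqs get-queue-attributes --queue-url "$url" --attribute-names Policy
}

# Show recent Karpenter log lines that mention the queue or SQS.
# Namespace and deployment name default to "karpenter" (assumed).
karpenter_queue_logs() {
  local ns="${1:-karpenter}" deploy="${2:-karpenter}"
  kubectl logs -n "$ns" "deployment/$deploy" --tail=500 | grep -iE 'queue|sqs'
}

# Usage (cluster name is a placeholder):
#   show_queue_policy my-cluster
#   karpenter_queue_logs karpenter karpenter
```

If `get-queue-url` fails, the queue name in the Helm values most likely does not match a queue in the current account and region.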
Pod Identity not taking effect
- Verify that the EKS Pod Identity Agent addon is installed and running.
- Confirm that the Pod Identity association exists.
- Restart the Karpenter pods to pick up the Pod Identity credentials.
- If the EKS Pod Identity Agent is missing, verify that the cluster module was upgraded to v0.7.21 or later in Step 1 ("Prepare EKS and Karpenter modules"), which installs this addon.
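
The checks in this list can be sketched with the AWS CLI and `kubectl`. The helper names are hypothetical, and the `karpenter` namespace and deployment name are assumptions; `eks-pod-identity-agent` is the name of the EKS addon.

```shell
# Print the status of the EKS Pod Identity Agent addon (for example ACTIVE).
pod_identity_agent_status() {
  aws eks describe-addon \
    --cluster-name "$1" \
    --addon-name eks-pod-identity-agent \
    --query 'addon.status' --output text
}

# List Pod Identity associations in the Karpenter namespace (assumed default).
list_karpenter_associations() {
  aws eks list-pod-identity-associations \
    --cluster-name "$1" --namespace "${2:-karpenter}"
}

# Restart Karpenter and wait for the rollout to finish.
# Namespace and deployment name default to "karpenter" (assumed).
restart_karpenter() {
  local ns="${1:-karpenter}" deploy="${2:-karpenter}"
  kubectl rollout restart "deployment/$deploy" -n "$ns"
  kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=180s
}

# Usage (cluster name is a placeholder):
#   pod_identity_agent_status my-cluster
#   list_karpenter_associations my-cluster karpenter
#   restart_karpenter karpenter karpenter
```

Restarting matters because pods only receive Pod Identity credentials when they start after the association exists.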