[Bug] User defined multipart User Data scripts do not work properly - breaks nodeadm functionality #7895

@bradwatsonaws

Description

What were you trying to accomplish?

I am trying to create a managed node group with my own multipart user data script passed as the overrideBootstrapCommand. This multipart user data script should run a mix of bash commands and also satisfy the requirements for nodeadm node initialization.

What happened?

When eksctl creates the launch template from the user data script defined by the user, it appears to wrap the script in its own multipart boundaries, which prevents the user-defined multipart user data script from working as expected. The node group is still created with a launch template as usual. However, the nodes are unable to join the cluster: nodeadm defaults to reading its configuration from IMDS, and the boundaries added by eksctl prevent nodeadm from finding a configuration there.

Example user-defined multipart user data script passed into overrideBootstrapCommand:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: application/node.eks.aws

---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: rhel-eks
    apiServerEndpoint: https://myclusterapi.gr7.us-gov-east-1.eks.amazonaws.com
    certificateAuthority: mysuperlongcertificatexyzabc
    cidr: 10.100.0.0/16

--BOUNDARY
Content-Type: text/x-shellscript;

#!/bin/bash
set -ex
systemctl enable kubelet.service
systemctl disable nm-cloud-setup.timer
systemctl disable nm-cloud-setup.service
reboot

--BOUNDARY--

Resulting user data script created by eksctl in the node group launch template:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=478b56b7f407b2f8102862b68821d558cacbdf7575b0163bf3b5b98566a8

--478b56b7f407b2f8102862b68821d558cacbdf7575b0163bf3b5b98566a8
Content-Type: text/x-shellscript
Content-Type: charset="us-ascii"

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: application/node.eks.aws

---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: rhel-eks
    apiServerEndpoint: https://myclusterapi.gr7.us-gov-east-1.eks.amazonaws.com
    certificateAuthority: mysuperlongcertificatexyzabc
    cidr: 10.100.0.0/16

--BOUNDARY
Content-Type: text/x-shellscript;

#!/bin/bash
set -ex
systemctl enable kubelet.service
systemctl disable nm-cloud-setup.timer
systemctl disable nm-cloud-setup.service
reboot

--BOUNDARY--

--478b56b7f407b2f8102862b68821d558cacbdf7575b0163bf3b5b98566a8--

As you can hopefully see, eksctl is generating its own multipart script with its own uniquely generated boundaries and embedding the user-defined script inside a single text/x-shellscript part. This prevents the user-defined boundaries from being respected.
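
To make the nesting visible, the sketch below lists the content types of the top-level MIME parts only. It is a rough illustration and not nodeadm's actual parser: it assumes that a consumer looks for an application/node.eks.aws part at the top level of the user data, and the function name top_level_types is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the Content-Type of each top-level part of a
# MIME multipart document read from stdin. Multipart documents nested inside
# a part body are deliberately not descended into.
top_level_types() {
  awk '
    # The first boundary= parameter seen belongs to the outermost document.
    boundary == "" && match($0, /boundary=/) {
      b = substr($0, RSTART + RLENGTH)
      gsub(/"/, "", b)
      boundary = "--" b
      next
    }
    # An exact match of the outer delimiter line starts a new top-level part.
    boundary != "" && $0 == boundary { in_headers = 1; next }
    # Report only the first Content-Type header of that part.
    in_headers && /^Content-Type:/ {
      sub(/^Content-Type:[ \t]*/, "")
      sub(/;.*$/, "")
      print
      in_headers = 0
    }
    # A blank line ends the part headers.
    in_headers && $0 == "" { in_headers = 0 }
  '
}
```

Fed the user-defined script, this prints application/node.eks.aws and text/x-shellscript. Fed the script generated by eksctl, it prints only text/x-shellscript, because the NodeConfig part is now buried in the body of that shell script part.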

How to reproduce it?

Run the following zsh script, passing positional arguments in the same order as the variables defined at the top of the script:

#!/bin/zsh

EKS_CLUSTER=$1
AMI_ID=$2
MANAGED_NODE_GROUP=$3
AWS_REGION=$4
KEY_PAIR=$5
INSTANCE_TYPE=$6
MIN_SIZE=$7
DESIRED_SIZE=$8
MAX_SIZE=$9
API_ENDPOINT=${10}
CIDR=${11}
CERTIFICATE=${12}
DATE_TIME=$(date +'%Y%m%d%H%M')

cat > managednodegroup-$DATE_TIME.yaml << EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: $EKS_CLUSTER
  region: $AWS_REGION

managedNodeGroups:
  - name: $MANAGED_NODE_GROUP
    minSize: $MIN_SIZE
    desiredCapacity: $DESIRED_SIZE
    maxSize: $MAX_SIZE
    ami: $AMI_ID
    amiFamily: AmazonLinux2023
    instanceType: $INSTANCE_TYPE
    labels:
      role: worker
    tags:
      nodegroup-name: $MANAGED_NODE_GROUP
    privateNetworking: true

    overrideBootstrapCommand: |
      MIME-Version: 1.0
      Content-Type: multipart/mixed; boundary="BOUNDARY"

      --BOUNDARY
      Content-Type: application/node.eks.aws

      ---
      apiVersion: node.eks.aws/v1alpha1
      kind: NodeConfig
      spec:
        cluster:
          name: $EKS_CLUSTER
          apiServerEndpoint: $API_ENDPOINT
          certificateAuthority: $CERTIFICATE
          cidr: $CIDR

      --BOUNDARY
      Content-Type: text/x-shellscript;

      #!/bin/bash
      set -ex
      systemctl enable kubelet.service
      systemctl disable nm-cloud-setup.timer
      systemctl disable nm-cloud-setup.service
      reboot

      --BOUNDARY--
EOF

eksctl create nodegroup --config-file=managednodegroup-$DATE_TIME.yaml --cfn-disable-rollback
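
Once the stack has created the launch template, the generated user data can be inspected directly. The following is a minimal sketch, assuming the AWS CLI is configured and a base64 that accepts --decode (GNU coreutils and macOS both do). The function names are hypothetical, and the launch template ID and region are placeholders supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: decode base64-encoded user data read from stdin.
decode_user_data() {
  base64 --decode
}

# Hypothetical helper: print the decoded user data stored in the latest
# version of a launch template.
#   $1 = launch template ID (placeholder, supplied by the caller)
#   $2 = AWS region
show_launch_template_user_data() {
  aws ec2 describe-launch-template-versions \
    --launch-template-id "$1" \
    --region "$2" \
    --versions '$Latest' \
    --query 'LaunchTemplateVersions[0].LaunchTemplateData.UserData' \
    --output text | decode_user_data
}
```

Calling show_launch_template_user_data with the ID of the launch template created for the node group and the cluster's region should print the multipart document that the instances actually receive.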

Logs
2024-07-18 08:51:16 [ℹ] will use version 1.29 for new nodegroup(s) based on control plane version
2024-07-18 08:51:18 [ℹ] nodegroup "rhel-eks-nodeadmn-new" will use "ami-095c7b500f70da3d0" [AmazonLinux2/1.29]
2024-07-18 08:51:18 [ℹ] 2 existing nodegroup(s) (rhel-eks-github,rhel-eks-nodeadm) will be excluded
2024-07-18 08:51:18 [ℹ] 1 nodegroup (rhel-eks-nodeadmn-new) was included (based on the include/exclude rules)
2024-07-18 08:51:18 [ℹ] will create a CloudFormation stack for each of 1 managed nodegroups in cluster "rhel-eks"
2024-07-18 08:51:19 [ℹ]
2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "rhel-eks-nodeadmn-new" } }
}
2024-07-18 08:51:19 [ℹ] checking cluster stack for missing resources
2024-07-18 08:51:19 [ℹ] cluster stack has all required resources
2024-07-18 08:51:19 [ℹ] building managed nodegroup stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:51:20 [ℹ] deploying stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:51:20 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:51:50 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:52:42 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:54:03 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:55:08 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:56:09 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:56:59 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:58:12 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 08:59:49 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:00:26 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:01:30 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:02:30 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:03:54 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:05:07 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:06:54 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:08:02 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:09:12 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:10:27 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:12:07 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:13:38 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:14:53 [ℹ] waiting for CloudFormation stack "eksctl-rhel-eks-nodegroup-rhel-eks-nodeadmn-new"
2024-07-18 09:14:53 [ℹ] 1 error(s) occurred and nodegroups haven't been created properly, you may wish to check CloudFormation console
2024-07-18 09:14:53 [ℹ] to cleanup resources, run 'eksctl delete nodegroup --region=us-gov-east-1 --cluster=rhel-eks --name=' for each of the failed nodegroup
2024-07-18 09:14:53 [✖] waiter state transitioned to Failure
Error: failed to create nodegroups for cluster "rhel-eks"

Anything else we need to know?
OS: macOS
Authentication: SSO through AWS CLI and Okta

Versions
0.187.0

$ eksctl info
