-
Notifications
You must be signed in to change notification settings - Fork 40
Pass API Endpoint & CA Cert directly to bootstrap.sh #309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass API Endpoint & CA Cert directly to bootstrap.sh #309
Conversation
Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: Eytan Avisror <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #309 +/- ##
==========================================
+ Coverage 70.05% 70.17% +0.11%
==========================================
Files 19 19
Lines 3156 3168 +12
==========================================
+ Hits 2211 2223 +12
Misses 804 804
Partials 141 141
Continue to review full report at Codecov.
|
Signed-off-by: Eytan Avisror <[email protected]>
| sb.WriteString("--use-max-pods false ") | ||
| } | ||
| if state.Cluster != nil { | ||
| sb.WriteString(fmt.Sprintf("--b64-cluster-ca %v ", aws.StringValue(state.Cluster.CertificateAuthority.Data))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some tricky behavior in the EKS AL2 AMI that we might want to solve for before merging this change.
See https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh#L332
Basically, if the cluster ca and api server endpoint are defined, the AMI bypasses calling EKS - which is what we want! Unfortunately, that call also grabs the serviceIpv4Cidr field, which is used later in the bootstrap script to calculate the default ip that should be used for DNS (the kube-dns/core-dns service ip basically). We probably want to replicate this logic and inject the dns ip address in case the user is relying on this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, will look into this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@backjo take a look again, I added discovery of cluster ip in the case of AL2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look pretty good! If I might suggest one more change - looking up the IPv4 CIDR range of the node that IM is running on adds a somewhat implicit assumption that Instance Manager is running in the same VPC - which I think is generally a fair assumption, though it could make debugging with a remote cluster from an IDE a bit more difficult :P
I think a cleaner approach to determine the VPC CIDR range for the nodes might be to lookup the VPC attached to the EKS cluster - and use the CIDR range associated with it for the fallback logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.. technically we can remove the fallback all together, since reconcile depends on getting cluster information.
I think it's safe to assume we will always have the ServiceCIDR and remove the metadata fallback logic. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with it - the only open question I have is whether ServiceCIDR is always defined - I know that it was only added in October 2020 - and all of our clusters are newer than that - so could use some help verifying that even for older clusters, the value is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm I have clusters from 2019 which have this populated.
Makes sense when they add a new API to make sure it's populated across all objects otherwise this would be impossible to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also removed the ec2metadata discovery fallback since we guarantee to always have the cluster payload or fail the reconcile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - sounds good!
Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: Eytan Avisror [email protected]
Fixes #306
This PR uses the payload from DescribeCluster to feed CA & API endpoint directly to userdata in order to avoid making those calls and potentially getting throttled on user-data.
This also fixes the argument passing in the case of Windows so that we wrap the Kubelet arguments same as we do for Linux
TBD