
feat: allow cluster creation without node pool#2520

Merged
apeabody merged 12 commits into terraform-google-modules:main from m0ps:main
Jan 23, 2026

Conversation

@m0ps (Contributor) commented Jan 13, 2026

#2408 seems stuck, so I created this PR to allow creating GKE clusters without any node pools defined. This should help resolve #2407 (which was closed as not planned due to lack of attention).
I've also created a test case (which leverages the simple_regional_private example) to validate the functionality.

Closes: #2408
Fixes: #2407

@gemini-code-assist (bot)

Summary of Changes

Hello @m0ps, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the GKE Terraform module by allowing users to provision a GKE cluster without requiring an initial node pool definition. This change provides greater flexibility in cluster setup, particularly for scenarios where node pools might be managed separately or added post-creation. The implementation involves conditional logic to handle the absence of node pool configurations and is validated with a new dedicated example.

Highlights

  • Optional Node Pool Creation: The module now supports creating a GKE cluster without explicitly defining any node pools, addressing a previous limitation where a default node pool was always created.
  • Conditional Node Pool Referencing: References to the first node pool (var.node_pools[0]) have been updated to use a new local variable (local.head_node_pool), which is an empty object if no node pools are provided, ensuring graceful handling of optional node pools.
  • Module Version Update: Numerous examples across the repository have been updated to reference the new module version ~> 43.0.
  • New Example and Test Case: A new example (simple_regional_private_no_pool) and corresponding integration tests have been added to demonstrate and validate the functionality of creating a GKE cluster without node pools.
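
The conditional described in the highlights can be sketched in HCL. The expression below uses only names mentioned in this thread (var.node_pools, local.head_node_pool); the exact form in the module may differ:

```hcl
locals {
  # If no node pools are supplied, fall back to an empty object so that
  # lookup(...) calls return their defaults instead of failing on [0].
  head_node_pool = length(var.node_pools) > 0 ? var.node_pools[0] : {}
}

# Example consumer: lookup(var.node_pools[0], ...) would error on an
# empty list, whereas the local degrades gracefully to the default:
#   image_type = lookup(local.head_node_pool, "image_type", "COS_CONTAINERD")
```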

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a valuable feature to allow GKE cluster creation without any initial node pools. The implementation is solid, using a local.head_node_pool to gracefully handle cases where var.node_pools is empty. This change is consistently applied across the module's templates and generated files. The logic for deriving the cluster name has also been refactored to be more robust by not depending on the presence of node pools. A new example and corresponding integration tests have been added to validate this new capability. My review includes a couple of suggestions for the new example to ensure its correctness and consistency with the rest of the codebase.

@m0ps (Author) commented Jan 13, 2026

/gemini review

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a valuable feature to allow the creation of GKE clusters without any predefined node pools. The changes are well-implemented across the various modules and templates.

Key changes include:

  • A new local variable local.head_node_pool is introduced to safely handle an empty var.node_pools list, preventing errors when a cluster is created without node pools.
  • The logic for deriving the cluster name (cluster_name_computed) has been refactored to rely on the cluster ID instead of a node pool ID, making it more robust.
  • Dependencies on node pool resources in outputs have been correctly removed.
  • A new example simple_regional_private_no_pool has been added, along with a corresponding integration test, which effectively demonstrates and validates the new functionality.

The code is clean, consistent, and the changes are applied correctly throughout the repository. The version bumps across many examples are noted but seem appropriate for this feature release. Overall, this is a solid contribution that improves the module's flexibility.
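
A plausible shape for the refactored name derivation, assuming the cluster resource is named google_container_cluster.primary (the actual resource name in the module may differ):

```hcl
locals {
  # The cluster ID has the form
  #   projects/<project>/locations/<location>/clusters/<name>,
  # so the last path segment is the cluster name. Deriving it this way
  # no longer depends on any node pool resource existing.
  cluster_name_computed = element(
    split("/", google_container_cluster.primary.id),
    length(split("/", google_container_cluster.primary.id)) - 1,
  )
}
```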

@m0ps (Author) commented Jan 13, 2026

Here is another report of the same issue: #2303

@apeabody (Collaborator)

/gcbrun

@m0ps (Author) commented Jan 13, 2026

Could you please share the details of the failed job, @apeabody?
I've run the tests on my local machine and everything is green. 🤔

@apeabody (Collaborator)

Thanks @m0ps - this is just a test fixture mismatch; let me re-run. We might need to skip validating the status path.

Step #22 - "verify simple-regional-private-no-pool-local":         	Error:      	Not equal: 
Step #22 - "verify simple-regional-private-no-pool-local":         	            	expected: "RUNNING"
Step #22 - "verify simple-regional-private-no-pool-local":         	            	actual  : "RECONCILING"
Step #22 - "verify simple-regional-private-no-pool-local":         	            	
Step #22 - "verify simple-regional-private-no-pool-local":         	            	Diff:
Step #22 - "verify simple-regional-private-no-pool-local":         	            	--- Expected
Step #22 - "verify simple-regional-private-no-pool-local":         	            	+++ Actual
Step #22 - "verify simple-regional-private-no-pool-local":         	            	@@ -1 +1 @@
Step #22 - "verify simple-regional-private-no-pool-local":         	            	-RUNNING
Step #22 - "verify simple-regional-private-no-pool-local":         	            	+RECONCILING
Step #22 - "verify simple-regional-private-no-pool-local":         	Test:       	TestSimpleRegionalPrivateNoPool
Step #22 - "verify simple-regional-private-no-pool-local":         	Messages:   	For path "status" expected "RECONCILING" to match fixture "RUNNING"

@m0ps (Author) commented Jan 14, 2026

Ah… when I tested this in my local environment, one out of ~10 attempts failed with a similar issue. 🤷‍♂️

@m0ps (Author) commented Jan 14, 2026

Hm... still failing with the same issue, @apeabody?

@apeabody (Collaborator)

Yes @m0ps

Step #22 - "verify simple-regional-private-no-pool-local":         	Error:      	Not equal: 
Step #22 - "verify simple-regional-private-no-pool-local":         	            	expected: "RUNNING"
Step #22 - "verify simple-regional-private-no-pool-local":         	            	actual  : "RECONCILING"
Step #22 - "verify simple-regional-private-no-pool-local":         	            	
Step #22 - "verify simple-regional-private-no-pool-local":         	            	Diff:
Step #22 - "verify simple-regional-private-no-pool-local":         	            	--- Expected
Step #22 - "verify simple-regional-private-no-pool-local":         	            	+++ Actual
Step #22 - "verify simple-regional-private-no-pool-local":         	            	@@ -1 +1 @@
Step #22 - "verify simple-regional-private-no-pool-local":         	            	-RUNNING
Step #22 - "verify simple-regional-private-no-pool-local":         	            	+RECONCILING
Step #22 - "verify simple-regional-private-no-pool-local":         	Test:       	TestSimpleRegionalPrivateNoPool
Step #22 - "verify simple-regional-private-no-pool-local":         	Messages:   	For path "status" expected "RECONCILING" to match fixture "RUNNING"

I suggest removing the status path check from https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/2520/files#diff-a2f9e2d8dadefba8ea6a4e5b856baa3c458b81dc6aa5b3e921c33d09bfba0c0aR49 and instead using a custom validation similar to:

assert.Contains([]string{"RUNNING", "RECONCILING"}, cluster.Get("status").String())
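
The suggested one-liner (testify's assert.Contains) is a set-membership check: a freshly created cluster may legitimately report either state. A self-contained sketch in plain Go, with illustrative helper names that are not from the repository:

```go
package main

import (
	"fmt"
	"slices"
)

// allowedStatuses relaxes the fixture check: a cluster created without
// node pools may still report RECONCILING when the test inspects it.
var allowedStatuses = []string{"RUNNING", "RECONCILING"}

// validateStatus returns an error for any status outside the allowed
// set, mirroring what assert.Contains would report in the test.
func validateStatus(status string) error {
	if !slices.Contains(allowedStatuses, status) {
		return fmt.Errorf("unexpected cluster status %q", status)
	}
	return nil
}

func main() {
	fmt.Println(validateStatus("RECONCILING") == nil) // true
	fmt.Println(validateStatus("ERROR") == nil)       // false
}
```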

@m0ps (Author) commented Jan 15, 2026

Thanks for the hint, @apeabody. It makes sense: this test creates a cluster without a node pool, similar to Autopilot clusters, and the Autopilot test also accepts both "RUNNING" and "RECONCILING" states. I've pushed the change.

Additionally, I noticed that one of the test clusters can't be deleted because its deletion_policy is undefined (I guess it was just a typo), so I've added the following change to allow a clean shutdown of the test environment:

deletion_policy = "DELETE"

https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/2520/changes#diff-e6f894e57c98c30aaae85b0bdfd549419cf36774082cfe2371103b3db3b5b2dfR124

@apeabody (Collaborator)

/gcbrun

@m0ps (Author) commented Jan 16, 2026

All 4 checks are green now. Please let me know if something else is required, @apeabody. If not, I hope it will be ok to merge.

@apeabody (Collaborator)

All 4 checks are green now. Please let me know if something else is required, @apeabody. If not, I hope it will be ok to merge.

Hi @m0ps - What's the behavior when upgrading an existing cluster to this new version? Are any migration notes required?

@m0ps (Author) commented Jan 19, 2026

Thanks for the question, @apeabody. I forgot to clarify this important point: this change doesn't affect existing clusters.

Frankly, I haven't tested all the examples with this change; I've only tested about 5 randomly selected ones. The nature of the change is pretty simple:
Currently, if someone doesn't want to manage any node pools (and just wants to rely on Custom Compute Classes instead), it's not possible, because much of the modules' logic relies on the spec of the first node pool, e.g. lookup(var.node_pools[0], "image_type", "COS_CONTAINERD"). If var.node_pools is an empty list, lookup can't access the first element and can't fall back to the default value. Introducing local.head_node_pool solves this by returning an empty map when there are no node pools (length(var.node_pools) > 0 ? var.node_pools[0] : {}), so lookup can properly pick up the default value.

The only caveat is that if someone wants to create a brand-new cluster without a node pool, the default compute class should be enabled: enable_default_compute_class = true. Originally, I planned to add validation ensuring that the default CC is enabled when var.node_pools is empty, but I feel that forcing people to have a default CC isn't right, since it's possible to have a cluster without one: users can create (multiple) custom CCs and use them to accommodate workloads.
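
A minimal module call for the no-node-pool case might look like the sketch below. The module source, version constraint, and enable_default_compute_class come from this thread; the remaining input names are illustrative and may differ from the module's actual interface:

```hcl
module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google"
  version = "~> 43.0"

  project_id        = var.project_id
  name              = "no-pool-cluster"
  region            = "us-central1"
  network           = var.network
  subnetwork        = var.subnetwork
  ip_range_pods     = var.ip_range_pods
  ip_range_services = var.ip_range_services

  # No node pools defined; rely on the default compute class instead.
  node_pools                   = []
  enable_default_compute_class = true
}
```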

Heh... you know... I think what's mentioned in the previous paragraph is exactly what you're looking for. I guess it makes sense to mention the following:

Starting with version XX.XX.XX, it's possible to create clusters without any manually defined node pools. To use this, either enable the default compute class or create a custom compute class once the cluster is provisioned.

@m0ps (Author) commented Jan 21, 2026

So... any feedback, @apeabody?

@apeabody (Collaborator) left a comment

Thanks for the contribution @m0ps!

@apeabody apeabody merged commit 8b65cd7 into terraform-google-modules:main Jan 23, 2026
8 of 10 checks passed
@apeabody apeabody self-assigned this Jan 23, 2026
Successfully merging this pull request may close these issues.

Support for GKE clusters with no default node pools
