Skip to content

Conversation

@LAVEEN
Copy link
Contributor

@LAVEEN LAVEEN commented Nov 11, 2025

This PR adds support for the gpu_topology attribute within the placement_policy for VM instances. This is necessary to support the upcoming A4x GPUs, which require a specific GPU topology to be defined for optimal performance.

@LAVEEN LAVEEN added the release-improvements Added to release notes under the "Improvements" heading. label Nov 11, 2025
@LAVEEN LAVEEN marked this pull request as ready for review November 11, 2025 15:33
@LAVEEN LAVEEN requested review from a team and samskillman as code owners November 11, 2025 15:33
gpu_topology = optional(string)
})
default = null
validation {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new type check, the validation blocks are not needed. The type checking will ensure the placement_policy is either null or a map of provided keys with each value of the specified type.

Please ensure that this type validation is working on terraform version 1.5.7 (the ctk supported version). You can remove the valdiation blocks after testing to ensure that wrong values are flagged by the type checker

@cboneti
Copy link
Member

cboneti commented Nov 12, 2025

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for the gpu_topology attribute in the placement_policy for VM instances. The changes are well-structured, including updates to the main Terraform logic, variable definitions, and documentation. I've provided a few suggestions to improve documentation accuracy, code style, and validation logic.

| <a name="input_network_storage"></a> [network\_storage](#input\_network\_storage) | An array of network attached storage mounts to be configured. | <pre>list(object({<br/> server_ip = string,<br/> remote_mount = string,<br/> local_mount = string,<br/> fs_type = string,<br/> mount_options = string,<br/> client_install_runner = map(string)<br/> mount_runner = map(string)<br/> }))</pre> | `[]` | no |
| <a name="input_on_host_maintenance"></a> [on\_host\_maintenance](#input\_on\_host\_maintenance) | Describes maintenance behavior for the instance. If left blank this will default to `MIGRATE` except for when `placement_policy`, spot provisioning, or GPUs require it to be `TERMINATE` | `string` | `null` | no |
| <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy | `any` | `null` | no |
| <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<br/> vm_count (number) </br> availability_domain_count (number)</br> collocation (string), max_distance (number)</br> gpu_topology (string). | `any` | `null` | no |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The documentation for placement_policy has some issues:

  1. The HTML line break tags are incorrect (</br>). They should be <br/>.
  2. The list of supported fields is not very readable. Using a list format would improve clarity.
  3. The type column is still any, but it has been updated to a specific object type in variables.tf. The documentation should reflect this change, which may require regenerating the documentation.
Suggested change
| <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<br/> vm_count (number) </br> availability_domain_count (number)</br> collocation (string), max_distance (number)</br> gpu_topology (string). | `any` | `null` | no |
| <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<ul><li>`vm_count` (number)</li><li>`availability_domain_count` (number)</li><li>`collocation` (string)</li><li>`max_distance` (number)</li><li>`gpu_topology` (string)</li></ul> | <pre>object({<br/> vm_count = optional(number)<br/> availability_domain_count = optional(number)<br/> collocation = optional(string)<br/> max_distance = optional(number)<br/> gpu_topology = optional(string)<br/> })</pre> | `null` | no |

availability_domain_count = try(tostring(var.placement_policy.availability_domain_count), "")
max_distance = try(tostring(var.placement_policy.max_distance), "")
collocation = try(var.placement_policy.collocation, "")
gpu_topology = try(var.placement_policy.gpu_topology, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There is a trailing whitespace at the end of this line. Please remove it to maintain code style consistency.

    gpu_topology              = try(var.placement_policy.gpu_topology, "")

Comment on lines 325 to 331
validation {
condition = var.placement_policy == null ? true : try(keys(var.placement_policy), null) != null
error_message = <<-EOT
The var.placement_policy should be either unset/null or be a map/object with
fields: vm_count (number), availability_domain_count (number), collocation (string), max_distance (number).
fields: vm_count (number), availability_domain_count (number), collocation (string), max_distance (number), gpu_topology (string).
EOT
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since you've changed the type of placement_policy from any to a specific object, this validation block can be improved. The current condition doesn't prevent providing an empty object (e.g., {}), which would cause a failure at apply time because google_compute_resource_policy requires at least one argument in group_placement_policy. A better validation would ensure that if placement_policy is provided, it contains at least one non-null value.

  validation {
    condition     = var.placement_policy == null || length([for v in values(var.placement_policy) : v if v != null]) > 0
    error_message = <<-EOT
    The var.placement_policy should be either unset/null or be an object with at least one non-null field. Supported fields are: vm_count, availability_domain_count, collocation, max_distance, gpu_topology.
    EOT
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-improvements Added to release notes under the "Improvements" heading.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants