-
Notifications
You must be signed in to change notification settings - Fork 259
Adding A4X and Future NPI Support to JBVM #4834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
| gpu_topology = optional(string) | ||
| }) | ||
| default = null | ||
| validation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new type check, the validation blocks are not needed. The type checking will ensure the placement_policy is either null or a map of provided keys with each value of the specified type.
Please ensure that this type validation is working on terraform version 1.5.7 (the ctk supported version). You can remove the valdiation blocks after testing to ensure that wrong values are flagged by the type checker
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for the gpu_topology attribute in the placement_policy for VM instances. The changes are well-structured, including updates to the main Terraform logic, variable definitions, and documentation. I've provided a few suggestions to improve documentation accuracy, code style, and validation logic.
| | <a name="input_network_storage"></a> [network\_storage](#input\_network\_storage) | An array of network attached storage mounts to be configured. | <pre>list(object({<br/> server_ip = string,<br/> remote_mount = string,<br/> local_mount = string,<br/> fs_type = string,<br/> mount_options = string,<br/> client_install_runner = map(string)<br/> mount_runner = map(string)<br/> }))</pre> | `[]` | no | | ||
| | <a name="input_on_host_maintenance"></a> [on\_host\_maintenance](#input\_on\_host\_maintenance) | Describes maintenance behavior for the instance. If left blank this will default to `MIGRATE` except for when `placement_policy`, spot provisioning, or GPUs require it to be `TERMINATE` | `string` | `null` | no | | ||
| | <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy | `any` | `null` | no | | ||
| | <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<br/> vm_count (number) </br> availability_domain_count (number)</br> collocation (string), max_distance (number)</br> gpu_topology (string). | `any` | `null` | no | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation for placement_policy has some issues:
- The HTML line break tags are incorrect (
</br>). They should be<br/>. - The list of supported fields is not very readable. Using a list format would improve clarity.
- The
typecolumn is stillany, but it has been updated to a specificobjecttype invariables.tf. The documentation should reflect this change, which may require regenerating the documentation.
| | <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<br/> vm_count (number) </br> availability_domain_count (number)</br> collocation (string), max_distance (number)</br> gpu_topology (string). | `any` | `null` | no | | |
| | <a name="input_placement_policy"></a> [placement\_policy](#input\_placement\_policy) | Control where your VM instances are physically located relative to each other within a zone.<br/>See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_resource_policy#nested_group_placement_policy<br/><br/>Supported fields:<ul><li>`vm_count` (number)</li><li>`availability_domain_count` (number)</li><li>`collocation` (string)</li><li>`max_distance` (number)</li><li>`gpu_topology` (string)</li></ul> | <pre>object({<br/> vm_count = optional(number)<br/> availability_domain_count = optional(number)<br/> collocation = optional(string)<br/> max_distance = optional(number)<br/> gpu_topology = optional(string)<br/> })</pre> | `null` | no | |
| availability_domain_count = try(tostring(var.placement_policy.availability_domain_count), "") | ||
| max_distance = try(tostring(var.placement_policy.max_distance), "") | ||
| collocation = try(var.placement_policy.collocation, "") | ||
| gpu_topology = try(var.placement_policy.gpu_topology, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| validation { | ||
| condition = var.placement_policy == null ? true : try(keys(var.placement_policy), null) != null | ||
| error_message = <<-EOT | ||
| The var.placement_policy should be either unset/null or be a map/object with | ||
| fields: vm_count (number), availability_domain_count (number), collocation (string), max_distance (number). | ||
| fields: vm_count (number), availability_domain_count (number), collocation (string), max_distance (number), gpu_topology (string). | ||
| EOT | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you've changed the type of placement_policy from any to a specific object, this validation block can be improved. The current condition doesn't prevent providing an empty object (e.g., {}), which would cause a failure at apply time because google_compute_resource_policy requires at least one argument in group_placement_policy. A better validation would ensure that if placement_policy is provided, it contains at least one non-null value.
validation {
condition = var.placement_policy == null || length([for v in values(var.placement_policy) : v if v != null]) > 0
error_message = <<-EOT
The var.placement_policy should be either unset/null or be an object with at least one non-null field. Supported fields are: vm_count, availability_domain_count, collocation, max_distance, gpu_topology.
EOT
}
This PR adds support for the gpu_topology attribute within the placement_policy for VM instances. This is necessary to support the upcoming A4x GPUs, which require a specific GPU topology to be defined for optimal performance.