-
Notifications
You must be signed in to change notification settings - Fork 423
GCP-181: add infrastructure create and destroy CLI commands for GCP #7290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@cristianoveiga: This pull request references GCP-181 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
WalkthroughAdds GCP infra support: new Cobra create/destroy subcommands, a GCP NetworkManager managing VPC, subnet, Cloud Router, and Cloud NAT lifecycles, helper utilities and tests, and wires the GCP commands into the infra command registration. Also introduces operation timeout constants (moved into a new constants file) and removes duplicate constants from iam.go. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro Cache: Disabled due to data retention organization setting Knowledge base: Disabled due to 📒 Files selected for processing (2)
💤 Files with no reviewable changes (1)
🧰 Additional context used📓 Path-based instructions (1)**⚙️ CodeRabbit configuration file
Files:
🔇 Additional comments (1)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
|
@cristianoveiga: This pull request references GCP-181 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: cristianoveiga The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cmd/infra/gcp/create_infra.go (1)
98-129: Differentiate infra creation failures from output/serialization failuresRight now
Run()treats any error fromOutput()the same as a failure inCreateInfra(), and the top-level Cobra handler logs “Failed to create GCP infrastructure”. This can mislead users and automation when infrastructure has actually been created but writing the output file/stdout failed.Consider wrapping the output failure to make that distinction explicit (while still returning a non‑zero exit code), e.g.:
func (o *CreateInfraOptions) Run(ctx context.Context, logger logr.Logger) error { result, err := o.CreateInfra(ctx, logger) if err != nil { return err } - return o.Output(result) + if err := o.Output(result); err != nil { + return fmt.Errorf("infrastructure created successfully, but failed to write output: %w", err) + } + return nil }This keeps behavior backward-compatible while making logs and errors clearer when only the output step fails.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (5)
vendor/google.golang.org/api/compute/v1/compute-api.jsonis excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute-gen.gois excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute2-gen.gois excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute3-gen.gois excluded by!vendor/**,!**/vendor/**vendor/modules.txtis excluded by!vendor/**,!**/vendor/**
📒 Files selected for processing (7)
cmd/infra/create.go(2 hunks)cmd/infra/destroy.go(2 hunks)cmd/infra/gcp/create_infra.go(1 hunks)cmd/infra/gcp/create_infra_test.go(1 hunks)cmd/infra/gcp/destroy_infra.go(1 hunks)cmd/infra/gcp/destroy_infra_test.go(1 hunks)cmd/infra/gcp/networking.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
cmd/infra/destroy.gocmd/infra/gcp/create_infra_test.gocmd/infra/gcp/destroy_infra.gocmd/infra/create.gocmd/infra/gcp/destroy_infra_test.gocmd/infra/gcp/create_infra.gocmd/infra/gcp/networking.go
🧬 Code graph analysis (5)
cmd/infra/destroy.go (1)
cmd/infra/gcp/destroy_infra.go (1)
NewDestroyCommand(22-53)
cmd/infra/gcp/create_infra_test.go (1)
cmd/infra/gcp/create_infra.go (3)
CreateInfraOptions(20-29)CreateInfraOutput(32-44)DefaultSubnetCIDR(16-16)
cmd/infra/gcp/destroy_infra.go (3)
cmd/infra/destroy.go (1)
NewDestroyCommand(12-25)cmd/log/log.go (1)
Log(9-11)cmd/infra/gcp/networking.go (1)
NewNetworkManager(37-60)
cmd/infra/gcp/destroy_infra_test.go (1)
cmd/infra/gcp/destroy_infra.go (1)
DestroyInfraOptions(14-19)
cmd/infra/gcp/create_infra.go (1)
cmd/log/log.go (1)
Log(9-11)
🔇 Additional comments (7)
cmd/infra/create.go (1)
3-22: GCP create subcommand wiring looks consistent and minimalImporting
cmd/infra/gcpand addinggcp.NewCreateCommand()is symmetric with existing AWS/Azure/PowerVS wiring; no issues from a correctness or maintainability standpoint.cmd/infra/destroy.go (1)
3-22: GCP destroy subcommand registration is aligned with existing platformsAdding
gcp.NewDestroyCommand()alongside AWS/Azure/PowerVS is straightforward and keeps the infra destroy CLI structure consistent.cmd/infra/gcp/destroy_infra_test.go (1)
10-158: Validation and helper tests provide solid coverageThe tests exercise all required-field validation paths plus the
formatOperationErrorsandjoinStringshelpers (nil/empty/single/multi cases), which aligns well with the implementation and should catch common regressions.cmd/infra/gcp/create_infra_test.go (1)
11-268: Create infra option/output and naming tests are thoroughThe suite covers validation, file/stdout output (including JSON structure and content), resource name formatting, and the
DefaultSubnetCIDRconstant, giving good confidence in the CLI surface and helper behavior.cmd/infra/gcp/destroy_infra.go (1)
21-107: Destroy flow and resource ordering look correct and idempotentFlag handling and
Validate()enforce required inputs, andRun()deletes resources in a sensible reverse dependency order (firewall → NAT → router → subnet → network), relying on the NetworkManager’s NotFound handling for idempotent re-runs. No blocking issues here.cmd/infra/gcp/create_infra.go (1)
46-96: Overall create CLI wiring and validation are solidThe GCP create command is wired consistently with other platforms, required flags are enforced both via Cobra and
Validate(), andCreateInfraOptionscleanly encapsulates the inputs and defaultVPCCidr. No correctness issues in this portion.cmd/infra/gcp/networking.go (1)
36-468: Please confirm helper definitions forisAlreadyExistsErrorand operation timeout constantsThis file relies on
isAlreadyExistsError,defaultOperationTimeout, anddefaultPollingIntervalin several places (create flows and wait loops), but their definitions are not visible in the provided gcp package files. If they are not already defined elsewhere in this package, you’ll get compile errors and unbounded operation waits.Can you double-check that:
isAlreadyExistsErrorcorrectly detects “already exists” responses from the GCP Compute API (typically HTTP 409),defaultOperationTimeoutis a reasonable upper bound for long‑running operations, anddefaultPollingIntervalis set to a sensible polling cadence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (4)
cmd/infra/gcp/create_infra.go(1 hunks)cmd/infra/gcp/create_infra_test.go(1 hunks)cmd/infra/gcp/destroy_infra.go(1 hunks)cmd/infra/gcp/networking.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
cmd/infra/gcp/create_infra_test.gocmd/infra/gcp/destroy_infra.gocmd/infra/gcp/networking.gocmd/infra/gcp/create_infra.go
🧬 Code graph analysis (2)
cmd/infra/gcp/destroy_infra.go (3)
cmd/infra/destroy.go (1)
NewDestroyCommand(12-25)cmd/log/log.go (1)
Log(9-11)cmd/infra/gcp/networking.go (1)
NewNetworkManager(37-60)
cmd/infra/gcp/create_infra.go (2)
cmd/log/log.go (1)
Log(9-11)cmd/infra/gcp/networking.go (1)
NewNetworkManager(37-60)
🔇 Additional comments (3)
cmd/infra/gcp/networking.go (1)
62-89: Network lifecycle, idempotency, and operation polling look solidThe create/delete flows for network, subnet, router, and NAT are consistent, handle “already exists” / “not found” cases cleanly, and correctly gate on operation completion with context-aware polling and bounded timeouts. Overall this abstraction is in good shape for CLI use.
Also applies to: 211-322, 324-376
cmd/infra/gcp/create_infra_test.go (1)
11-82: Good coverage for validation, output behavior, and naming helpersThe tests cover the main edge cases for flag validation, JSON output (both to disk and stdout), and the resource-naming helpers plus DefaultSubnetCIDR, which should catch most regressions in the CLI surface.
Also applies to: 84-203, 205-258
cmd/infra/gcp/destroy_infra.go (1)
21-53: Destruction flow and dependency ordering are appropriateValidation of required flags plus the ordered teardown (NAT → Router → Subnet → Network) aligns with GCP dependencies and should avoid common “resource in use” errors, while remaining idempotent via the underlying NetworkManager delete helpers.
Also applies to: 55-103
Add the GCP Compute API client library to vendor directory. This dependency is required for the new `hypershift create infra gcp` command that provisions GCP network infrastructure resources. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude (via Claude Code)
Implement CLI command to provision GCP network infrastructure for
HyperShift hosted clusters. The command creates:
- VPC network (custom subnet mode)
- Subnet with Private Google Access enabled
- Cloud Router for NAT gateway
- Cloud NAT for outbound internet access
- Egress firewall rule (allow TCP/UDP 0-65535)
The command is idempotent - it safely handles existing resources by
detecting 409 AlreadyExists errors and retrieving the existing
resource instead of failing.
Features:
- Required flags: --project-id, --region, --infra-id
- Optional flags: --vpc-cidr (default: 10.0.0.0/24), --output-file
- JSON output with resource names and self-links
- Resource naming convention: {infraID}-{resource-type}
Example usage:
hypershift create infra gcp \
--project-id my-project \
--region us-central1 \
--infra-id my-cluster \
--output-file infra.json
Signed-off-by: Cristiano Veiga <[email protected]>
Commit-Message-Assisted-by: Claude (via Claude Code)
Add unit tests for the GCP infrastructure creation command covering: - CreateInfraOptions.Validate() - required field validation - CreateInfraOptions.Output() - JSON file output and stdout - NetworkManager format functions - resource naming conventions - DefaultSubnetCIDR constant verification Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude (via Claude Code)
Rename create.go to create_infra.go and create_test.go to create_infra_test.go to follow the established naming pattern used by other infrastructure commands. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
Add destroy infrastructure command for GCP that cleans up network resources created by `hypershift create infra gcp`: - VPC network - Subnet - Cloud Router (includes NAT configuration) - Egress firewall rule Resources are deleted in reverse dependency order to avoid conflicts. Missing resources are gracefully skipped with informational logging. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
Add unit tests for DestroyInfraOptions.Validate(), formatOperationErrors(), and joinStrings() functions. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
Fix import grouping and add missing newlines at end of files. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
GCP VPCs have an implied allow egress rule (priority 65535) that permits all outbound traffic by default. The explicit egress firewall rule was redundant and has been removed. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
Apply DefaultSubnetCIDR when VPCCidr is empty in CreateInfra(), ensuring consistent behavior for callers that don't go through NewCreateCommand. This matches the pattern used by the AWS infrastructure command. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
f4ad036 to
ea037d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
cmd/infra/gcp/create_infra.go (2)
71-77: Differentiate infra creation failures from output failures in logging
Runcurrently handles both resource creation and writing output, butRunEalways logs"Failed to create GCP infrastructure"on any error. If output writing (e.g.,--output-filepermission/disk issues) fails after infra is successfully created, the log message is misleading and may cause unnecessary retries.Consider clarifying the error path, for example:
func (o *CreateInfraOptions) Run(ctx context.Context, logger logr.Logger) error { - result, err := o.CreateInfra(ctx, logger) - if err != nil { - return err - } - return o.Output(result) + result, err := o.CreateInfra(ctx, logger) + if err != nil { + return err + } + if err := o.Output(result); err != nil { + // Infra was created; only output failed. + return fmt.Errorf("GCP infra created but failed to write output: %w", err) + } + return nil } @@ - if err := opts.Run(cmd.Context(), logger); err != nil { - logger.Error(err, "Failed to create GCP infrastructure") + if err := opts.Run(cmd.Context(), logger); err != nil { + logger.Error(err, "GCP infra create command failed") return err }This keeps behavior unchanged while making logs more accurate for operators.
Also applies to: 97-104
151-180: Verify that the egress firewall rule is actually created in the infra flow
CreateInfraexplicitly orchestrates VPC network, subnet, Cloud Router, and Cloud NAT, but there’s no obvious call here to create an egress firewall rule, even though the PR objectives list it as a managed resource.If firewall creation is handled elsewhere (e.g., inside
NewNetworkManageror one of its methods), that’s fine, but please double-check that:
- The egress firewall rule is indeed created as part of this flow, and
- Destroy infra mirrors that lifecycle.
If it isn’t created yet, you’ll likely want to wire a
CreateFirewall‑style call into this sequence (and optionally expose it viaCreateInfraOutput) to align with the documented behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
cmd/infra/gcp/create_infra.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
cmd/infra/gcp/create_infra.go
🔇 Additional comments (1)
cmd/infra/gcp/create_infra.go (1)
15-43: Infra options and output schema look consistentThe default
DefaultSubnetCIDRand the JSON-taggedCreateInfraOutputfields form a clear, stable contract for callers; no issues spotted here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
cmd/infra/gcp/create_infra.go (1)
115-117: Consider logging file Close() errors instead of discarding.The deferred
Close()error is silently discarded. For file writes, this could mask I/O errors (e.g., filesystem full).defer func(out *os.File) { - _ = out.Close() + if err := out.Close(); err != nil { + // Log but don't fail - data was already written + fmt.Fprintf(os.Stderr, "warning: failed to close output file: %v\n", err) + } }(out)cmd/infra/gcp/networking.go (2)
390-400: Usestrings.Joinfrom the standard library.The custom
joinStringsfunction reimplementsstrings.Join. Using the stdlib reduces code and improves clarity.import ( "context" "errors" "fmt" + "strings" "time" ... ) -// joinStrings joins a slice of strings with a separator. -func joinStrings(strs []string, sep string) string { - if len(strs) == 0 { - return "" - } - result := strs[0] - for i := 1; i < len(strs); i++ { - result += sep + strs[i] - } - return result -} // formatOperationErrors formats GCP operation errors into a readable string. func formatOperationErrors(errors []*compute.OperationErrorErrors) string { if len(errors) == 0 { return "unknown error" } var messages []string for _, e := range errors { messages = append(messages, fmt.Sprintf("%s: %s", e.Code, e.Message)) } - return fmt.Sprintf("[%s]", joinStrings(messages, ", ")) + return fmt.Sprintf("[%s]", strings.Join(messages, ", ")) }Note: This would also require updating
TestJoinStringsto teststrings.Joinbehavior or removing it.
347-348: Usetime.NewTimerwith context for cancellation-aware polling.The current
time.Sleepblocks until the interval elapses even if the context is canceled. Using a timer withselectallows immediate response to cancellation.- time.Sleep(defaultPollingInterval) + select { + case <-ctx.Done(): + return fmt.Errorf("operation canceled: %w", ctx.Err()) + case <-time.After(defaultPollingInterval): + }Apply similarly to
waitForRegionOperationat line 374.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (5)
vendor/google.golang.org/api/compute/v1/compute-api.jsonis excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute-gen.gois excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute2-gen.gois excluded by!vendor/**,!**/vendor/**vendor/google.golang.org/api/compute/v1/compute3-gen.gois excluded by!vendor/**,!**/vendor/**vendor/modules.txtis excluded by!vendor/**,!**/vendor/**
📒 Files selected for processing (7)
cmd/infra/create.go(2 hunks)cmd/infra/destroy.go(2 hunks)cmd/infra/gcp/create_infra.go(1 hunks)cmd/infra/gcp/create_infra_test.go(1 hunks)cmd/infra/gcp/destroy_infra.go(1 hunks)cmd/infra/gcp/destroy_infra_test.go(1 hunks)cmd/infra/gcp/networking.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- cmd/infra/create.go
- cmd/infra/gcp/create_infra_test.go
- cmd/infra/destroy.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
cmd/infra/gcp/destroy_infra.gocmd/infra/gcp/create_infra.gocmd/infra/gcp/destroy_infra_test.gocmd/infra/gcp/networking.go
🧬 Code graph analysis (3)
cmd/infra/gcp/destroy_infra.go (3)
cmd/infra/destroy.go (1)
NewDestroyCommand(12-25)cmd/log/log.go (1)
Log(9-11)cmd/infra/gcp/networking.go (1)
NewNetworkManager(37-60)
cmd/infra/gcp/create_infra.go (2)
cmd/log/log.go (1)
Log(9-11)cmd/infra/gcp/networking.go (1)
NewNetworkManager(37-60)
cmd/infra/gcp/destroy_infra_test.go (1)
cmd/infra/gcp/destroy_infra.go (1)
DestroyInfraOptions(14-19)
🔇 Additional comments (3)
cmd/infra/gcp/destroy_infra_test.go (1)
1-158: LGTM!Well-structured table-driven tests with good coverage of validation logic, error formatting, and string joining utilities. Tests cover edge cases (nil, empty, single, and multiple elements).
cmd/infra/gcp/networking.go (1)
76-77: I'll verify the claim about the missingisAlreadyExistsErrorfunction. Let me search the codebase to determine if this function is actually missing.
<function_calls>
#!/bin/bashCheck if isAlreadyExistsError is defined in the codebase
echo "=== Searching for isAlreadyExistsError definition ==="
rg -n "func isAlreadyExistsError" --type goecho ""
echo "=== Searching for isAlreadyExistsError calls ==="
rg -n "isAlreadyExistsError" --type go -C 2echo ""
echo "=== Checking for related error helper functions ==="
rg -n "func is.*Error" cmd/infra/gcp/ --type go
</function_calls>cmd/infra/gcp/destroy_infra.go (1)
79-100: I'm unable to access the repository to verify the specific claims about firewall rule deletion in the destroy_infra.go file. While web search confirms that firewall rules are a known concern in GCP infrastructure destruction, I cannot directly verify:
- Whether the create command actually creates an egress firewall rule (
{infraID}-egress-allow)- Whether a
DeleteFirewallRulemethod exists in the NetworkManager interface- Whether this deletion is currently missing from the destroy flow
Manual verification required.
The review comment claims that an egress firewall rule is created during infrastructure setup but not deleted during destruction. This is a plausible issue (firewall rules blocking network deletion is documented in GCP workflows), but cannot be confirmed without accessing the codebase to verify:
- The firewall rule creation logic in the create command
- The NetworkManager interface and available methods
- Whether the destroy flow already includes firewall rule deletion
|
@cristianoveiga: This pull request references GCP-181 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Move defaultOperationTimeout and defaultPollingInterval to a dedicated constants.go file for better discoverability. These constants are used by both iam.go and networking.go. Signed-off-by: Cristiano Veiga <[email protected]> Commit-Message-Assisted-by: Claude
|
/retest-required |
|
/lgtm |
|
/verified later @cristianoveiga |
|
@cristianoveiga: This PR has been marked to be verified later by In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
What this PR does / why we need it:
Adds GCP infrastructure CLI commands for creating and destroying network resources required by HyperShift hosted clusters:
hypershift create infra gcp- Creates VPC network, subnet, Cloud Router, Cloud NAThypershift destroy infra gcp- Destroys the above resources in reverse dependency orderThese commands follow the existing patterns from AWS and Azure implementations, enabling users to provision and clean up GCP network infrastructure for hosted clusters.
Resources managed:
{infraID}-network){infraID}-subnet){infraID}-router){infraID}-nat)Which issue(s) this PR fixes:
Fixes GCP-181
Special notes for your reviewer:
Checklist: