Skip to content

GCP-196: Implement GCPPrivateServiceObserver controller#7207

Merged
openshift-merge-bot[bot] merged 12 commits into
openshift:mainfrom
cristianoveiga:GCP-196
Nov 27, 2025
Merged

GCP-196: Implement GCPPrivateServiceObserver controller#7207
openshift-merge-bot[bot] merged 12 commits into
openshift:mainfrom
cristianoveiga:GCP-196

Conversation

@cristianoveiga
Copy link
Copy Markdown
Contributor

@cristianoveiga cristianoveiga commented Nov 10, 2025

What this PR does / why we need it:

This PR implements the GCPPrivateServiceObserver controller which watches private-router Service and creates GCPPrivateServiceConnect CR.

Commits:

  • Add LoadBalancerIP field to GCPPrivateServiceConnect CRD
  • Implement GCPPrivateServiceObserver controller with comprehensive tests

Which issue(s) this PR fixes:

GCP-196

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 10, 2025
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Nov 10, 2025

@cristianoveiga: This pull request references GCP-196 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2025
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Nov 10, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 10, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/gcpprivateserviceconnect_types.go is excluded by !vendor/**, !**/vendor/**

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds LoadBalancerIP (spec) and EndpointIP (status) with IP validation; makes ForwardingRuleName optional; updates CRD printer column to GCPEndpointAvailable; introduces GCPPrivateServiceObserver and tests; adds client apply helper; registers observer at operator startup; updates generated CRDs and API docs.

Changes

Cohort / File(s) Summary
GCPPrivateServiceConnect Type
api/hypershift/v1beta1/gcpprivateserviceconnect_types.go
Added LoadBalancerIP to spec (json:"loadBalancerIP") with IP validation and maxLength; made ForwardingRuleName optional (json:"forwardingRuleName,omitempty"); added/adjusted EndpointIP in status (json:"endpointIP,omitempty") with IP validation and increased maxLength; updated CRD printcolumn to reference GCPEndpointAvailable.
Generated CRD Manifests
api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/.../GCPPlatform.yaml, cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-*.crd.yaml, api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests.yaml
Added loadBalancerIP schema property (description, maxLength: 45, x-kubernetes-validations isIP); removed minLength from forwardingRuleName and expanded its description; replaced forwardingRuleName with loadBalancerIP in required arrays where applicable; increased endpointIP maxLength to 45 and added IP validation; updated availability jsonPath to .status.conditions[?(@.type=="GCPEndpointAvailable")].status.
Client Apply Configuration
client/applyconfiguration/hypershift/v1beta1/gcpprivateserviceconnectspec.go
Added LoadBalancerIP *string field to apply-configuration and fluent setter WithLoadBalancerIP(value string).
Observer Controller + Tests
control-plane-operator/controllers/gcpprivateserviceconnect/observer.go, control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go
New GCPPrivateServiceObserver controller: watches Services, filters by configured service, checks Internal LoadBalancer type, extracts LoadBalancer IP, resolves HostedControlPlane owner, and creates/updates GCPPrivateServiceConnect CRs (sets LoadBalancerIP and consumer list). Added unit tests covering helpers, reconciliation scenarios, and edge cases.
Operator Startup
control-plane-operator/main.go
Registers the GCP private service observer at startup for GCP private HCPs when route capability exists; added import for the controller package.
Documentation
docs/content/reference/api.md
Documented loadBalancerIP in spec (Internal Load Balancer IP populated by observers); marked forwardingRuleName optional and noted population by reconciler/GCP API lookup; updated endpointIP description to require valid IPv4/IPv6.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas needing extra attention:
    • control-plane-operator/controllers/gcpprivateserviceconnect/observer.go — reconciliation flow, owner lookup, create-or-update usage, and edge-case handling.
    • control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go — test setup, coverage of scenarios, and mock CreateOrUpdate behavior.
    • Consistency between Go struct JSON tags and generated CRD schema (field names, required arrays, validation expressions).
    • CRD x-kubernetes-validations for loadBalancerIP and endpointIP (regex/validation correctness).
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels Nov 10, 2025
Add required LoadBalancerIP field to GCPPrivateServiceConnectSpec to store
the Internal Load Balancer IP address extracted by the observer controller
from Kubernetes service status.

Signed-off-by: Cristiano Veiga <[email protected]>
Commit-Message-Assisted-by: Claude (via Claude Code)
…hensive tests

Add complete implementation of GCP-196 observer controller that:
- Watches private-router service with Internal Load Balancer annotation
- Extracts LoadBalancer IP from service status when ready
- Creates GCPPrivateServiceConnect CR with proper OwnerReference lifecycle
- Uses GCP project from HostedControlPlane as consumer accept list
- Follows AWS PrivateServiceObserver pattern for architecture consistency

Include comprehensive unit test coverage:
- Controller name generation utilities
- Consumer accept list extraction from HCP platform spec
- Service LoadBalancer validation (Internal vs External types)
- LoadBalancer IP extraction logic for various service states
- HostedControlPlane owner reference detection and parsing

Signed-off-by: Cristiano Veiga <[email protected]>
Commit-Message-Assisted-by: Claude (via Claude Code)
@cristianoveiga cristianoveiga marked this pull request as ready for review November 12, 2025 17:40
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 12, 2025
@openshift-ci openshift-ci Bot requested review from bryan-cox and sjenning November 12, 2025 17:40
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (2)
cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-TechPreviewNoUpgrade.crd.yaml (1)

97-100: Same initialization concern: loadBalancerIP marked as required.

This manifest has the same issue as the GCPPlatform.yaml manifest where loadBalancerIP is required. Ensure consistency across all CRD variants and verify that the initialization flow supports required fields populated by observers.

cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-CustomNoUpgrade.crd.yaml (1)

97-100: Consistent with other CRD manifests but same initialization issue.

This manifest mirrors the changes in the other CRD files. The same concern about loadBalancerIP being required applies here.

🧹 Nitpick comments (4)
control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (2)

50-65: Consider using the newer sets API.

The code uses sets.NewString() which may be deprecated in newer versions of k8s.io/apimachinery. Consider using sets.New[string]() instead for future compatibility.

 func nameMapper(names []string) handler.MapFunc {
-	nameSet := sets.NewString(names...)
+	nameSet := sets.New(names...)
 	return func(ctx context.Context, obj client.Object) []reconcile.Request {

129-134: Document the single ingress assumption.

The code assumes the LoadBalancer has exactly one ingress entry and takes the first one (Ingress[0].IP). While this is likely correct for GCP Internal LoadBalancers, consider adding a comment explaining this assumption or validating that there's exactly one ingress entry.

 	// Extract LoadBalancer IP
+	// GCP Internal LoadBalancers have exactly one ingress entry
 	loadBalancerIP := svc.Status.LoadBalancer.Ingress[0].IP
docs/content/reference/api.md (2)

166-176: Clarify controller-managed semantics and IP format for GCP PSC fields.

  • loadBalancerIP: state whether it’s IPv4-only or IPv4/IPv6 and explicitly mark as controller-populated/read-only so users don’t try to set it.
  • forwardingRuleName (Optional): likewise, note it’s set by the reconciler and should not be user-provided.

Since this page is generated, please adjust the Go type comments in api/hypershift/v1beta1/gcpprivateserviceconnect_types.go so the generated docs reflect this. Example:

-// loadBalancerIP is the IP address of the Internal Load Balancer
-// Populated by the observer from service status
+// loadBalancerIP is the IP address of the internal load balancer (IPv4-only/IPv4+IPv6 — clarify actual support).
+// Managed by the observer and populated from Service status. Users should not set this field.
+// If this resource is operator-created, this field is required on creation by the operator, not by users.
 
-// forwardingRuleName is the name of the Internal Load Balancer forwarding rule
-// Populated by the reconciler via GCP API lookup
+// forwardingRuleName is the name of the internal load balancer forwarding rule.
+// Managed by the reconciler via GCP API lookup. Users should not set this field.
  • Confirm the intended IP family support and validations for loadBalancerIP.
  • Confirm that the CR is created by the operator (so a required loadBalancerIP won’t block user creation). If users may create it, consider not marking it required in the CRD.

Also applies to: 184-186


6014-6024: Mirror the same clarifications in the Spec section.

Repeat the controller-managed/read-only and IP family notes for loadBalancerIP and forwardingRuleName here by updating the source comments so both generated locations stay consistent.

Also applies to: 6032-6034

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 66a7017 and 2371929.

⛔ Files ignored due to path filters (1)
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/gcpprivateserviceconnect_types.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (9)
  • api/hypershift/v1beta1/gcpprivateserviceconnect_types.go (1 hunks)
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/gcpprivateserviceconnects.hypershift.openshift.io/GCPPlatform.yaml (2 hunks)
  • client/applyconfiguration/hypershift/v1beta1/gcpprivateserviceconnectspec.go (2 hunks)
  • cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-CustomNoUpgrade.crd.yaml (2 hunks)
  • cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-TechPreviewNoUpgrade.crd.yaml (2 hunks)
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (1 hunks)
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go (1 hunks)
  • control-plane-operator/main.go (2 hunks)
  • docs/content/reference/api.md (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go
  • api/hypershift/v1beta1/gcpprivateserviceconnect_types.go
  • cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-TechPreviewNoUpgrade.crd.yaml
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/gcpprivateserviceconnects.hypershift.openshift.io/GCPPlatform.yaml
  • docs/content/reference/api.md
  • client/applyconfiguration/hypershift/v1beta1/gcpprivateserviceconnectspec.go
  • control-plane-operator/main.go
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer.go
  • cmd/install/assets/hypershift-operator/zz_generated.crd-manifests/gcpprivateserviceconnects-CustomNoUpgrade.crd.yaml
🔇 Additional comments (12)
api/hypershift/v1beta1/gcpprivateserviceconnect_types.go (2)

49-54: LGTM!

The change to make ForwardingRuleName optional with omitempty is appropriate, given that it's now populated by the reconciler via GCP API lookup. The documentation clearly indicates the population method, and the validation constraints remain appropriate.


41-47: The concern does not apply—the observer only updates the resource after validating the LoadBalancerIP is available.

The code at lines 130–134 checks that the LoadBalancer IP is present in the service status before proceeding. The reconcileGCPPrivateServiceConnect call at line 163 only executes after this validation passes. Inside the CreateOrUpdate callback, the LoadBalancerIP field is populated (line 190) before the resource is saved, satisfying the +required constraint. There is no "chicken and egg" problem because the observer never attempts to create or update the resource without the required field already being available.

client/applyconfiguration/hypershift/v1beta1/gcpprivateserviceconnectspec.go (2)

23-23: LGTM!

The generated LoadBalancerIP field follows the correct pattern, using a pointer type with the appropriate JSON tag.


35-41: LGTM!

The WithLoadBalancerIP method correctly implements the fluent builder pattern, matching the style of other With* methods in the file.

control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (3)

37-48: LGTM!

The observer type definition is well-structured with appropriate fields for client access, logging, configuration, and the upsert provider pattern.


75-102: LGTM!

The controller setup follows best practices with namespace-scoped informers, proper error handling, and lifecycle management through the manager.


136-157: LGTM!

The OwnerReference validation and HostedControlPlane fetch logic is well-implemented with proper error handling and early exit on deletion.

control-plane-operator/main.go (2)

15-15: LGTM!

The import for the new GCP Private Service Connect controller package is correctly placed alongside the similar AWS private link package.


524-539: Verify if GCP should also observe KubeAPIServer private service.

The AWS implementation sets up two observers: one for KubeAPIServerPrivateService and one for PrivateRouterService. The GCP implementation only sets up an observer for PrivateRouterService.

However, evidence shows that KubeAPIServerPrivateService is created for all private HCPs (including GCP) when using LoadBalancer service strategy. Since the service exists for GCP but isn't observed like in AWS, confirm whether GCP Private Service Connect infrastructure should also observe and reconcile the KubeAPIServer private service, or if the architecture legitimately differs.

api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/gcpprivateserviceconnects.hypershift.openshift.io/GCPPlatform.yaml (2)

79-86: LGTM: loadBalancerIP field specification is well-defined.

The field includes appropriate validation constraints (IP pattern, length limits) and clear documentation indicating it's populated by the observer.


94-97: No issues identified. The review comment is based on a misunderstanding of the resource creation flow.

The concern about a circular dependency does not apply here. The observer validates that loadBalancerIP is populated from the Service status (lines 133-137 of observer.go) before calling reconcileGCPPrivateServiceConnect. The CreateOrUpdate method executes the mutate closure (which populates Spec.LoadBalancerIP and Spec.ConsumerAcceptList) before submitting the resource to the API (support/upsert/upsert.go lines 72-76). Therefore, the resource is created with all required fields already set, and making loadBalancerIP required is correct and safe.

Likely an incorrect or invalid review comment.

control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go (1)

30-34: The review comment's concern is incorrect.

The test case accepts an empty string, but PrivateRouterService returns a Service with hardcoded Name "private-router", and all production calls to ControllerName pass either KubeAPIServerPrivateServiceName or manifests.PrivateRouterService("").Name, which are both non-empty values. The empty string test case is defensive edge-case testing that cannot occur in production, so "-observer" will never be generated at runtime. No validation or fixes are needed.

Likely an incorrect or invalid review comment.

Comment thread control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go Outdated
Comment thread control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go Outdated
Comment thread control-plane-operator/controllers/gcpprivateserviceconnect/observer.go Outdated
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Nov 12, 2025

@cristianoveiga: This pull request references GCP-196 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

  • This PR implements the GCPPrivateServiceObserver controller which watches private-router Service and creates GCPPrivateServiceConnect CR.

Which issue(s) this PR fixes:

GCP-196

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Nov 12, 2025

@cristianoveiga: This pull request references GCP-196 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

This PR implements the GCPPrivateServiceObserver controller which watches private-router Service and creates GCPPrivateServiceConnect CR.

Commits:

  • Add LoadBalancerIP field to GCPPrivateServiceConnect CRD
  • Implement GCPPrivateServiceObserver controller with comprehensive tests

Which issue(s) this PR fixes:

GCP-196

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

cristianoveiga and others added 2 commits November 12, 2025 13:07
Remove test cases with empty project and non-GCP platform as project is
mandatory for GCP PSC observer functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Extract isInternalLoadBalancer and extractLoadBalancerIP functions from
  inline logic in observer controller
- Refactor tests to call actual validation functions instead of
  duplicating logic inline
- This ensures tests verify the production implementation and will
  catch changes to controller logic

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@bryan-cox
Copy link
Copy Markdown
Member

/uncc @bryan-cox

@openshift-ci openshift-ci Bot removed the request for review from bryan-cox November 12, 2025 18:17
- Extract extractHostedControlPlaneOwnerName function from inline logic
  in observer controller
- Refactor TestServiceOwnerReferenceValidation to call actual function
  instead of duplicating owner reference lookup logic
- Ensures tests verify production implementation and catch logic changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go (1)

30-34: Consider whether empty service name is a valid scenario.

The test documents that an empty service name produces "-observer". While this may be intentional for edge case handling, in practice an empty service name would be rejected by Kubernetes API validation. Consider whether this test case adds value or if it could be removed to keep tests focused on realistic scenarios.

control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (1)

189-198: Consider defensive handling if GCP spec could be nil.

If hcp.Spec.Platform.GCP is nil (line 191), the function returns nil, which would be set as ConsumerAcceptList (line 180) and might fail CRD validation if the schema requires MinItems=1. While you noted that the project is required at the API level, a defensive check or comment documenting the assumption would clarify the contract.

 func getConsumerAcceptList(hcp *hyperv1.HostedControlPlane) []string {
+	// GCP platform spec is guaranteed to be non-nil for GCP HostedControlPlanes
 	if hcp.Spec.Platform.GCP == nil {
 		return nil
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2371929 and 46e8a99.

📒 Files selected for processing (2)
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (1 hunks)
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go
  • control-plane-operator/controllers/gcpprivateserviceconnect/observer.go
🔇 Additional comments (5)
control-plane-operator/controllers/gcpprivateserviceconnect/observer_test.go (2)

201-211: Excellent improvement - tests now call actual implementation.

The tests now properly call the production helper functions (isInternalLoadBalancer, extractLoadBalancerIP) rather than duplicating the validation logic inline. This ensures that changes to the controller logic will be caught by the tests and provides confidence in the actual implementation.


267-271: Good refactoring - owner reference extraction now uses actual function.

The test correctly calls extractHostedControlPlaneOwnerName from the observer implementation, ensuring that the test validates the production code path rather than reimplementing the logic.

control-plane-operator/controllers/gcpprivateserviceconnect/observer.go (3)

76-103: LGTM - Standard controller setup pattern.

The controller setup follows Kubernetes best practices by using a shared informer with namespace filtering and proper event handling. The use of namedResourceHandler ensures only the specified service triggers reconciliation.


105-158: Well-structured reconciliation loop with clear validation steps.

The reconcile logic follows a clear progression: fetch service → validate internal LB type → validate IP readiness → extract HCP owner → fetch HCP → derive consumer list → reconcile CR. Early returns for transient conditions (IP not ready, deleted HCP) are appropriate since the watch will trigger re-reconciliation on updates.


200-229: Clean helper functions enable testability.

Extracting these validation and extraction functions (isInternalLoadBalancer, extractLoadBalancerIP, extractHostedControlPlaneOwnerName) into standalone functions enables the comprehensive unit tests in observer_test.go and follows good separation of concerns.

@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

}

if hcp.Spec.Platform.Type == hyperv1.GCPPlatform && util.IsPrivateHCP(hcp) && mgmtClusterCaps.Has(capabilities.CapabilityRoute) {
controllerName := "GCPPrivateServiceObserver"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we not want to name this controller "PrivateKubeAPIServerServiceObserver" like the AWS one? Only one of them will be invoked because the platform type is different. Will we need a second one for the ingress later?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we only need one Observer controller and perhaps it's a good idea to keep its name generic. Here is why:

Looking at the AWS implementation, I see that there are two observer controllers:

  • PrivateKubeAPIServerServiceObserver
  • PrivateIngressServiceObserver

But only the second one is actually creating a awsendpointservice CR because (the first doesn't have a load balancer associated to it):

if len(svc.Status.LoadBalancer.Ingress) == 0 {
r.log.Info("load balancer not provisioned yet")
return ctrl.Result{}, nil
}

Hence, this CR seems to be used for both. The actual configuration of PrivateIngress and happens in the AWSEndpointServiceReconciler, which we will implement the GCP equivalent in the next card, GCP-197.

Thoughts?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the customer wants public ingress but a private API server? Is that why there is two for AWS?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be possible, but it doesn't seem it's to be controlled by the PrivateKubeAPIServerServiceObserver. Instead, this is achieved by the combination of the endpointAccess and the servicePublishingStrategy in the HostedCluster spec.

I didn't find any kube-apiserver-private service in our ROSA Clusters (even in private clusters) - that's why I think the first observer might not be required in our GCP PSC scenario (the observer that watches the private-router service would be enough).

Please let me know if you want me to rename it to PrivateIngressServiceObserver anyway. Otherwise, we can revisit this in the future (during the implementation of GCP 'fully' Private clusters).


_, err := r.CreateOrUpdate(ctx, r.Client, gcpPSC, func() error {
// Set OwnerReference to HostedControlPlane for lifecycle management
gcpPSC.OwnerReferences = []metav1.OwnerReference{{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: it doesn't look like the AWS controller does this. This seems more correct, but could it cause problems?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also noticed that the AWS controller doesn't do that. A explicit link/relationship between the PSC CR and the HCP can help with lookups (and keep things clean during deletion).

but could it cause problems ?

Are you concerned with anything in particular ? I can't think of any issues...

The ultimate protection against deletion will be provided by the finalizers, which I will add in the next PRs when we create the actual GCP resources.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specific concerns, but I wanted to ask the question if perhaps there is a reason that there isn't an owner reference (e.g. resources being cleaned up in the wrong order on cluster uninstallation).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this and I confirmed that the deletion is going to work without any issues.

We will implement a deletion mechanism for the gcpprivateserviceconnect CRs that will be triggered by the hypershift operator, similar to the AWS - see logic here:

// deleteAWSEndpointServices loops over AWSEndpointServiceList items and sends a delete request for each.
// If the HC has no valid aws credentials it removes the CPO finalizer for each AWSEndpointService.
// It returns true if len(awsEndpointServiceList.Items) != 0.
func deleteAWSEndpointServices(ctx context.Context, c client.Client, hc *hyperv1.HostedCluster, namespace string) (bool, error) {
log := ctrl.LoggerFrom(ctx)
var awsEndpointServiceList hyperv1.AWSEndpointServiceList
if err := c.List(ctx, &awsEndpointServiceList, &client.ListOptions{Namespace: namespace}); err != nil && !apierrors.IsNotFound(err) {
return false, fmt.Errorf("error listing awsendpointservices in namespace %s: %w", namespace, err)
}
for _, ep := range awsEndpointServiceList.Items {
if ep.DeletionTimestamp != nil {
if platformaws.ValidCredentials(hc) && time.Since(ep.DeletionTimestamp.Time) < awsEndpointDeletionGracePeriod {
continue
}
// We remove the CPO finalizer if there's no valid credentials so deletion can proceed.
cpoFinalizer := "hypershift.openshift.io/control-plane-operator-finalizer"
if controllerutil.ContainsFinalizer(&ep, cpoFinalizer) {
controllerutil.RemoveFinalizer(&ep, cpoFinalizer)
if err := c.Update(ctx, &ep); err != nil {
return false, fmt.Errorf("failed to remove finalizer from awsendpointservice: %w", err)
}
}
log.Info("Removed CPO finalizer for awsendpointservice because the HC has no valid aws credentials", "name", ep.Name, "endpoint-id", ep.Status.EndpointID)
continue
}
if err := c.Delete(ctx, &ep); err != nil && !apierrors.IsNotFound(err) {
return false, fmt.Errorf("error deleting awsendpointservices %s in namespace %s: %w", ep.Name, namespace, err)
}
}
if len(awsEndpointServiceList.Items) != 0 {
// The CPO puts a finalizer on AWSEndpointService resources and should
// not be terminated until the resources are removed from the API server
return true, nil
}
return false, nil
}

exists, err := deleteAWSEndpointServices(ctx, r.Client, hc, controlPlaneNamespace)

The HO basically deletes the awsendpointservice and waits until the resources are cleaned up before deleting the HCPs.

With that, the Owner references will not play a part on the deletion process.


// Fetch HostedControlPlane for ConsumerAcceptList
hcp := &hyperv1.HostedControlPlane{}
if err := r.Get(ctx, types.NamespacedName{Namespace: r.HCPNamespace, Name: hcpName}, hcp); err != nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWS does a list, and defends against multiple HCPs in the same namespace. Is that something we want to do too?

// Fetch the HostedControlPlane
hcpList := &hyperv1.HostedControlPlaneList{}
if err := r.List(ctx, hcpList, &client.ListOptions{Namespace: r.HCPNamespace}); err != nil {
return ctrl.Result{}, fmt.Errorf("failed to get resource: %w", err)
}
if len(hcpList.Items) == 0 {
// Return early if HostedControlPlane is deleted
return ctrl.Result{}, nil
}
if len(hcpList.Items) > 1 {
return ctrl.Result{}, fmt.Errorf("unexpected number of HostedControlPlanes in namespace, expected: 1, actual: %d", len(hcpList.Items))
}
hcp := hcpList.Items[0]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the AWS implementation does that because they don't have the owner reference (that we have been discussing in the another thread).

In the GCP approach, we perform a direct lookup (which should perform better anyway).

I can't think in a scenario where we have multiple HCPs in the namespace - and this probably should be enforced at another level.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No disagreement, but if it isn't enforced at another level today, could this leave us with no enforcement?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We could end up without this protection. But on the other hand, should this be a responsibility of the PSC observer?

I am fine replicating the AWS logic here (as I have done for most of the other parts). I just want to make sure that we keep the logic clean and avoid unnecessary checks.

@csrwng
Copy link
Copy Markdown
Contributor

csrwng commented Nov 24, 2025

/retest-required

@patjlm
Copy link
Copy Markdown
Contributor

patjlm commented Nov 24, 2025

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 24, 2025
@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

4 similar comments
@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@csrwng
Copy link
Copy Markdown
Contributor

csrwng commented Nov 25, 2025

/override ci/prow/e2e-aks

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Nov 25, 2025

@csrwng: Overrode contexts on behalf of csrwng: ci/prow/e2e-aks

Details

In response to this:

/override ci/prow/e2e-aks

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@csrwng
Copy link
Copy Markdown
Contributor

csrwng commented Nov 25, 2025

/verified later @cristianoveiga

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Nov 25, 2025
@openshift-ci-robot
Copy link
Copy Markdown

@csrwng: This PR has been marked to be verified later by @cristianoveiga.

Details

In response to this:

/verified later @cristianoveiga

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown

/retest-required

Remaining retests: 0 against base HEAD 34cc9c7 and 2 for PR HEAD 4e4b601 in total

@openshift-ci-robot
Copy link
Copy Markdown

/retest-required

Remaining retests: 0 against base HEAD 14ff82c and 1 for PR HEAD 4e4b601 in total

@openshift-ci-robot
Copy link
Copy Markdown

/retest-required

Remaining retests: 0 against base HEAD 5e0c7ea and 0 for PR HEAD 4e4b601 in total

@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@openshift-ci-robot
Copy link
Copy Markdown

/hold

Revision 4e4b601 was retested 3 times: holding

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 26, 2025
@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 26, 2025
@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

1 similar comment
@cristianoveiga
Copy link
Copy Markdown
Contributor Author

/retest-required

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Nov 27, 2025

@cristianoveiga: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit a13c49d into openshift:main Nov 27, 2025
21 checks passed
@billmvt
Copy link
Copy Markdown

billmvt commented Dec 3, 2025

/area platform/gcp

@openshift-ci openshift-ci Bot added the area/platform/gcp PR/issue for GCP (GCPPlatform) platform label Dec 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/gcp PR/issue for GCP (GCPPlatform) platform jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria verified-later

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants