generated from kubernetes/kubernetes-template-project
Adding GEP 746: Replace Cert Refs on HTTPRoute with Cross Namespace Refs from Gateway #749
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,265 @@ | ||
| # GEP-746: Replace Cert Refs on HTTPRoute with Cross Namespace Refs from Gateway | ||
|
|
||
| * Issue: [#746](https://github.com/kubernetes-sigs/gateway-api/issues/746) | ||
| * Status: Implementable | ||
|
|
||
| ## TLDR | ||
|
|
||
| This GEP proposes that we should remove TLS Certificate references from | ||
| HTTPRoute and replace them with Cross Namespace Certificate references from | ||
| Gateways. Although that is not a complete replacement on its own, this GEP shows | ||
| how a controller could provide the rest of the functionality with this approach. | ||
|
|
||
| ## Goals | ||
|
|
||
| * Remove a confusing and underspecified part of the API - cert refs on | ||
| HTTPRoute. | ||
| * Add the ability to reference certificates in other namespaces from Gateways | ||
| to replace much of the functionality that was enabled by cert refs on | ||
| HTTPRoute. | ||
| * Describe how a controller could automate self service cert attachment to | ||
| Gateway listeners. | ||
|
|
||
| ## Non-Goals | ||
|
|
||
| * Actually provide a core implementation of a controller that can enable self | ||
| service cert attachment. This may be worth considering at a later point, but | ||
| is out of scope for this GEP. | ||
|
|
||
| ## Introduction | ||
|
|
||
| TLS Certificate references on HTTPRoute have always been a confusing part of the | ||
| Gateway API. In the v1alpha2 release, we should consider removing this feature | ||
| while we still can. This GEP proposes an alternative that is simpler to work | ||
| with and understand, while also leaving sufficient room to enable all the same | ||
| capabilities that certificate references on HTTPRoute enabled. | ||
|
|
||
| ### Attaching TLS Certificates with Routes is Confusing | ||
| One of the most confusing parts of the Gateway API is how certificates can be | ||
| attached to Routes. There are a variety of different factors that lead to | ||
| confusion here: | ||
|
|
||
| * It can be natural to assume that a certificate attached to a Route only | ||
| applies to that Route. In reality, it applies to the entire listener(s) | ||
| associated with that Route. | ||
| * This means that a Route can affect any other Routes attached to the same | ||
| Gateway Listener. By attaching a Route to a Gateway Listener, you’re | ||
| implicitly trusting all other Routes attached to that Gateway Listener. | ||
| * When multiple Routes specify a certificate for the same Listener, it’s | ||
| possible that they will conflict and create more confusion. | ||
|
|
||
| ### Why We Did It | ||
| To understand how we ended up with the ability to attach TLS certificates with | ||
| Routes, it’s helpful to look at the use cases for this capability: | ||
|
|
||
| 1. Some users want Route owners to be able to attach arbitrary domains and certs | ||
| to a Gateway listener. | ||
| [#103](https://github.com/kubernetes-sigs/gateway-api/issues/103) | ||
| 1. Some users want Route owners to control certs for their applications. | ||
|
|
||
| ### Alternative Solutions | ||
|
|
||
| #### 1. Automation with tools like Cert-Manager | ||
| When automation is acceptable, the first use case is entirely possible with | ||
| tools like cert-manager that can watch Routes, generate certs for them, and | ||
| attach them to a Gateway. | ||
|
|
||
| #### 2. Cross Namespace Cert Direct References from Gateways | ||
| With the already established ReferencePolicy concept, we have established a safe | ||
| way to reference resources across namespaces. Although this would require some | ||
| coordination between Gateway and App owners, it would enable App owners to | ||
| retain full control of the certs used by their app without the extra confusion | ||
| that certs in HTTPRoute have led to. | ||
|
|
||
| ### Enabling Self-Service Certificate Attachment for App Owners | ||
| Although this dramatically simplifies the API, it does not completely replace | ||
| the functionality that certs attached to HTTPRoutes enabled. Most notably, it | ||
| would be difficult to attach arbitrary self-provided certificates to a Gateway | ||
| listener without requiring manual changes from a Gateway admin. | ||
|
|
||
| There are a couple of potential solutions here: | ||
|
|
||
| #### 1. Implement a selector for cert references instead of direct references | ||
| Although the simplicity of this approach is nice, it ends up with many of the | ||
| same problems as certificates attached to Routes have and feels inconsistent | ||
| with how Routes attach to Gateways. | ||
|
|
||
| #### 2. Implement a controller that attaches certificates to Gateway listeners | ||
| Similar to cert-manager, it could be possible to implement a controller that | ||
| watches for Secrets with a certain label, and attaches those to the specified | ||
| Gateway. Although it's out of scope for this GEP to completely define what a | ||
| controller like this could look like, it would likely need to include at least | ||
| one of the following safeguards: | ||
|
|
||
| 1. A way to configure which namespaces could attach certificates for each | ||
| domain. | ||
| 2. A way to configure which namespaces could attach certificates to each | ||
| Gateway (or Listener). | ||
| 3. A way to use ReferencePolicy to indicate where references from Secrets to | ||
| Gateways were trusted from and to. | ||
|
|
||
| ## API | ||
|
|
||
| The API changes proposed here are quite small, mostly removing fields. | ||
|
|
||
| ### Changes | ||
| 1. The `LocalObjectReference` used for the `CertificateRef` field in | ||
| `GatewayTLSConfig` would be replaced with an `ObjectReference`. | ||
| 1. `ReferencePolicy` would be updated to note that references from Gateways to | ||
| Secrets were part of the Core support level. | ||
|
|
||
| ### Removals | ||
|
|
||
| From HTTPRouteSpec: | ||
| ```go | ||
| // TLS defines the TLS certificate to use for Hostnames defined in this | ||
| // Route. This configuration only takes effect if the AllowRouteOverride | ||
| // field is set to true in the associated Gateway resource. | ||
| // | ||
| // Collisions can happen if multiple HTTPRoutes define a TLS certificate | ||
| // for the same hostname. In such a case, conflict resolution guiding | ||
| // principles apply, specifically, if hostnames are same and two different | ||
| // certificates are specified then the certificate in the | ||
| // oldest resource wins. | ||
| // | ||
| // Please note that HTTP Route-selection takes place after the | ||
| // TLS Handshake (ClientHello). Due to this, TLS certificate defined | ||
| // here will take precedence even if the request has the potential to | ||
| // match multiple routes (in case multiple HTTPRoutes share the same | ||
| // hostname). | ||
| // | ||
| // Support: Core | ||
| // | ||
| // +optional | ||
| TLS *RouteTLSConfig `json:"tls,omitempty"` | ||
| ``` | ||
|
|
||
| And the associated struct: | ||
| ```go | ||
| // RouteTLSConfig describes a TLS configuration defined at the Route level. | ||
| type RouteTLSConfig struct { | ||
| // CertificateRef is a reference to a Kubernetes object that contains a TLS | ||
| // certificate and private key. This certificate is used to establish a TLS | ||
| // handshake for requests that match the hostname of the associated HTTPRoute. | ||
| // The referenced object MUST reside in the same namespace as HTTPRoute. | ||
| // | ||
| // CertificateRef can reference a standard Kubernetes resource, i.e. Secret, | ||
| // or an implementation-specific custom resource. | ||
| // | ||
| // Support: Core (Kubernetes Secrets) | ||
| // | ||
| // Support: Implementation-specific (Other resource types) | ||
| // | ||
| CertificateRef LocalObjectReference `json:"certificateRef"` | ||
| } | ||
| ``` | ||
|
|
||
| From GatewayTLSConfig: | ||
| ```go | ||
| // RouteOverride dictates if TLS settings can be configured | ||
| // via Routes or not. | ||
| // | ||
| // CertificateRef must be defined even if `routeOverride.certificate` is | ||
| // set to 'Allow' as it will be used as the default certificate for the | ||
| // listener. | ||
| // | ||
| // Support: Core | ||
| // | ||
| // +optional | ||
| // +kubebuilder:default={certificate:Deny} | ||
| RouteOverride *TLSOverridePolicy `json:"routeOverride,omitempty"` | ||
| ``` | ||
|
|
||
| And the associated types: | ||
| ```go | ||
| type TLSRouteOverrideType string | ||
|
|
||
| const ( | ||
| // Allows the parameter to be configured from all routes. | ||
| TLSROuteOVerrideAllow TLSRouteOverrideType = "Allow" | ||
|
|
||
| // Prohibits the parameter from being configured from any route. | ||
| TLSRouteOverrideDeny TLSRouteOverrideType = "Deny" | ||
| ) | ||
|
|
||
| // TLSOverridePolicy defines a schema for overriding TLS settings at the Route | ||
| // level. | ||
| type TLSOverridePolicy struct { | ||
| // Certificate dictates if TLS certificates can be configured | ||
| // via Routes. If set to 'Allow', a TLS certificate for a hostname | ||
| // defined in a Route takes precedence over the certificate defined in | ||
| // Gateway. | ||
| // | ||
| // Support: Core | ||
| // | ||
| // +optional | ||
| // +kubebuilder:default=Deny | ||
| Certificate *TLSRouteOverrideType `json:"certificate,omitempty"` | ||
| } | ||
| ``` | ||
|
|
||
| ## Prior Art | ||
|
|
||
| OpenShift already supports configuring TLS certificates on Routes. Although | ||
| largely similar to the Gateway API approach, there are some notable differences: | ||
|
|
||
| * Each Route can specify a maximum of 1 hostname | ||
| * When a Route is attached to a hostname, newer Routes can't use the same | ||
| hostname unless all of the following are true: | ||
| * The Routes are in the same namespace or the Router is configured to allow | ||
| sharing hostnames across namespaces | ||
| * The Routes have unique, non-overlapping paths specified | ||
| * The Routes are not TCP or TLS routes | ||
|
|
||
| A typical configuration would involve a Router with `*.example.com` that has a | ||
| wildcard cert. Routes could be attached within those constraints without the | ||
| need for a cert. Routes can also use a different hostname if they also provide a | ||
| cert. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| ### 1. Improved Documentation + Extended Support Level | ||
| My first attempt to improve this was to create a | ||
| [PR](https://github.com/kubernetes-sigs/gateway-api/pull/739) that would clarify | ||
| the documentation around how this works and lower the support level to extended. | ||
|
|
||
| Trying to improve the documentation around this feature made it clear how easy | ||
| it would be to get confused by how it worked. It would be only natural to assume | ||
| that a cert attached to a Route would only apply to that Route. The conflict | ||
| resolution semantics associated with this were both complicated and difficult to | ||
| surface to a user through status or other means. | ||
|
|
||
| Lowering the support level from core to extended also didn't make sense. | ||
| Although some implementers were uncomfortable with supporting this feature due | ||
| to the potential for vulnerabilities, that was not a sufficient reason to lower | ||
| the support level. An extended support level should only be used for features | ||
| that cannot be universally supported. That was not the case here. Instead there | ||
| were just very real questions around the safety of the feature. | ||
|
|
||
| The combination of those 2 factors led me to believe that this feature was not | ||
| well thought out and should be removed. Since this was essentially just a | ||
| shortcut to attaching certificates to a Gateway listener from different sources, | ||
| it seemed like there had to be a way that was both safer and easier to | ||
| understand. That led to this proposal. | ||
|
|
||
| ### 2. Implement Hostname Restrictions | ||
| Similar to the OpenShift approach described above, we could enforce the | ||
| following: | ||
|
|
||
| 1. Only a single hostname may be specified for HTTPRoutes with a certificate | ||
| reference. | ||
| 1. The oldest HTTPRoute to attach a certificate to a hostname would effectively | ||
| own that hostname. No other HTTPRoutes could be attached with the same | ||
| hostname unless they were explicitly allowed by that HTTPRoute. | ||
|
|
||
| The second condition would be difficult to validate. As we've seen elsewhere in | ||
| the API, it's difficult to determine which resource was first to claim a | ||
| hostname or path. Instead we have to rely on the oldest resource, which can | ||
| result in some weird and potentially breaking changes if an older resource | ||
| chooses to claim a hostname. | ||
|
|
||
| ## References | ||
|
|
||
| Docs: | ||
|
|
||
| * [Gateway API: Replacing TLS Certificates in Routes](https://docs.google.com/document/d/1Cv95XFCL6S_9pIyS0drnsDLsfinWc2tHOFl_x3-_SWI/edit) | ||
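
As a rough illustration of the change proposed in the GEP's "Changes" section, the sketch below models a Gateway-to-Secret certificate reference that may cross namespaces, together with the ReferencePolicy check that would gate it. The type and field names here (`ObjectReference`, `ReferencePolicy`, `PolicyFrom`, `PolicyTo`, `certRefAllowed`) are simplified, hypothetical stand-ins and are not the actual Gateway API types.

```go
package main

// ObjectReference is a simplified stand-in for the reference type that would
// replace LocalObjectReference on the Gateway's CertificateRef field.
// An empty Namespace means "same namespace as the Gateway".
type ObjectReference struct {
	Kind      string
	Name      string
	Namespace string
}

// PolicyFrom names a kind and namespace that references may originate from.
type PolicyFrom struct {
	Kind      string
	Namespace string
}

// PolicyTo names a kind that may be referenced.
type PolicyTo struct {
	Kind string
}

// ReferencePolicy is a hypothetical, simplified model of the concept: it
// lives in the namespace of the referenced object and lists who may
// reference what.
type ReferencePolicy struct {
	Namespace string
	From      []PolicyFrom
	To        []PolicyTo
}

// certRefAllowed reports whether a Gateway in gatewayNamespace may use ref.
// Same-namespace references are always allowed; cross-namespace references
// need a matching policy in the target namespace.
func certRefAllowed(gatewayNamespace string, ref ObjectReference, policies []ReferencePolicy) bool {
	if ref.Namespace == "" || ref.Namespace == gatewayNamespace {
		return true
	}
	for _, p := range policies {
		if p.Namespace != ref.Namespace {
			continue
		}
		fromOK := false
		for _, f := range p.From {
			if f.Kind == "Gateway" && f.Namespace == gatewayNamespace {
				fromOK = true
				break
			}
		}
		if !fromOK {
			continue
		}
		for _, t := range p.To {
			if t.Kind == ref.Kind {
				return true
			}
		}
	}
	return false
}
```

Under these assumptions, a Gateway in `infra` referencing a Secret in `app` would be accepted only if a policy in `app` lists Gateways from `infra` as a trusted source and Secrets as a permitted target, which keeps the App owner in control of who may use the cert.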
This may end up being necessary for scalability. Using direct references, if I have a gateway with 1000 httproutes, and each httproute has a certificate, adding a direct reference to each one would be cumbersome—or am I missing an alternative here?
I think we're going to be limited to a cert per domain, or maybe a cert per domain per gateway. If you had 1000 httproutes with their own cert, would they also have unique hostnames? The end result in any case is the same, these certs need to be attached to a Gateway. The question is how we do that. It could be a direct ref from Route -> Cert, or as proposed here, a direct ref from Gateway Listener -> Cert.
I think automation could potentially help here. A controller could be configured to trust certs for *.example.com from foo namespace, and then could automatically attach them to a Gateway specified by some kind of label. That seems like it could be slightly easier than either Gateway or Route attachment when dealing with 1000+ domains. Though hopefully at that scale, cert generation can be automated with a tool like cert-manager.
Any way we do this, we'll need some kind of link between a Gateway and a cert. HTTPRoute provided one potential way to make that link, but I'm not sure it provided any significant scalability/usability improvements.
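
The trust check mentioned above (a controller trusting certs for `*.example.com` from the `foo` namespace) could look roughly like the following. This is only a sketch: `TrustRule`, `hostnameMatches`, and `mayAttach` are hypothetical names, and the wildcard handling is a simplification that accepts any number of leading labels.

```go
package main

import "strings"

// TrustRule is a hypothetical controller setting: Secrets in Namespace may
// provide certificates for hostnames matching HostnamePattern.
type TrustRule struct {
	Namespace       string
	HostnamePattern string // exact hostname, or a "*.example.com" style wildcard
}

// hostnameMatches reports whether hostname is covered by pattern.
// Simplification: a wildcard matches one or more leading labels, and never
// the bare parent domain.
func hostnameMatches(pattern, hostname string) bool {
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // ".example.com"
		return len(hostname) > len(suffix) && strings.HasSuffix(hostname, suffix)
	}
	return pattern == hostname
}

// mayAttach reports whether a Secret in secretNamespace is trusted to supply
// a certificate for hostname under the given rules.
func mayAttach(rules []TrustRule, secretNamespace, hostname string) bool {
	for _, r := range rules {
		if r.Namespace == secretNamespace && hostnameMatches(r.HostnamePattern, hostname) {
			return true
		}
	}
	return false
}
```

With a single rule `{Namespace: "foo", HostnamePattern: "*.example.com"}`, a Secret in `foo` could supply a cert for `app.example.com`, while the same Secret in another namespace, or a cert for `example.com` itself, would be rejected.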
I might be missing some nuance. Should "domain" be read as "hostname" here?
Yes.
I'm less concerned about automating the attachment and more concerned about fanout and the idea that the gateway would need 1000+ direct references to certificates. I suppose a controller could batch updates to mitigate contention issues, but it would be a bit inconvenient (especially for humans) to work with a gateway definition that had to list out explicitly every linked certificate.
Oh, I just realized that gateway's `spec.listener` has `// +kubebuilder:validation:MaxItems=64`, and each listener can have only one certificate reference.
Yep, I should've just said "hostname"
There's some nuance here in that a single resource may be linked per listener, that resource could potentially contain more than one cert (not saying that's a good idea though).
I think one of the problems with certs being attached to Routes was that there was a bit too much magic going on. It's difficult/impossible to see which certs are actually attached to a given Gateway listener. Although certs appeared to be attached to Routes, they were being attached to the listeners attached to those Routes, and that was just being hidden. As an implementation you'd still need to have some way to represent that many certs being attached to your listeners. I'd rather increase the max number of listeners and/or turn certificate refs into a list because I think those changes would be clearer and easier for users to understand.
I agree that that's never going to be a good experience. Would Gateway merging help here? The idea that compatible Gateways can be merged together by implementations?
The godoc implies that the resource can have exactly one certificate+key pair:
I was thinking through this possibility. Gateway merging might be the best approach here. Either the cluster admin could give users permission to create gateways in their own namespaces, which would allow users to attach routes and certificates within their own respective namespaces (or others' namespaces, but only if permitted by some referencepolicy); or the controller that automated certificate management could do some sort of sharding with up to 64 certificates per shard/gateway. (I personally like the idea of allowing users to manage their own gateways, but I need to think through all the implications of that approach.)
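
The sharding idea can be sketched as a simple chunking step. This is a minimal illustration only, assuming one certificate per listener and the 64-listener limit noted earlier in the thread; `shardCertificates` and `maxListenersPerGateway` are hypothetical names, and a real controller would also need stable shard assignment so certs do not move between Gateways on every change.

```go
package main

// maxListenersPerGateway mirrors the MaxItems=64 limit on Gateway listeners
// mentioned in this thread; with one certificate reference per listener it
// also caps the number of certificates per Gateway.
const maxListenersPerGateway = 64

// shardCertificates splits certificate names into groups of at most
// maxPerGateway, one group per Gateway. It returns nil if maxPerGateway is
// not positive.
func shardCertificates(certs []string, maxPerGateway int) [][]string {
	if maxPerGateway <= 0 {
		return nil
	}
	var shards [][]string
	for start := 0; start < len(certs); start += maxPerGateway {
		end := start + maxPerGateway
		if end > len(certs) {
			end = len(certs)
		}
		shards = append(shards, certs[start:end])
	}
	return shards
}
```

For the 1000-certificate case discussed above, this would yield 16 Gateways: 15 full shards of 64 certificates and a final shard of 40.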
To be clear, I'm still on board with this GEP given how attaching a certificate to a route using the API as it stands without this GEP doesn't actually bind the certificate to the route, but the self-service use case still needs some clarification (not necessarily in this GEP though).
Yeah I personally really like the idea of more smaller Gateways than a single massive one. I'm really hoping that pattern will work for most organizations. I think it's worth digging into this self service model a bit more, especially around the scale aspect. I'd primarily been thinking in terms of ~dozens of certs per Gateway, not 1000+.
Sounds good, and agree that we need a clearer understanding of how self-service could work. Maybe @maelvls or @jakexks would have some insights here around how we can build a more scalable API around this.
Given the v1alpha2 timeline, I think I'm going to create a follow up issue to continue the discussion around this so we can get this change in.
Created #763 to track this.