
@bexxmodd
Contributor

@bexxmodd bexxmodd commented Jul 16, 2025

What type of PR is this?
/area conformance-test

What this PR does / why we need it:
To guide conformance test users in the right direction, we want to discourage manually supplying supported features for conformance tests; those features should instead be inferred from the GatewayClass (GWC) status. This change blocks report generation when that rule is not followed.

Does this PR introduce a user-facing change?:

To generate a conformance report for Gateway features, flags like `--supported-features` or `--exempt-features` are no longer needed; the conformance suite automatically reads the features from the GWC status. This is the behavior we want to encourage, to reduce manual intervention in report generation. Manually supplied (or excluded) features should only be used for local development and testing.
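
As a rough, self-contained sketch of the rule described above (not the PR's actual code), the check can be reduced to one function. The names `featureSource`, `sourceInferred`, `sourceManual`, and `checkReportAllowed` are hypothetical and exist only for this illustration; the two boolean parameters stand in for the mesh-feature and mesh-profile checks.

```go
package main

import (
	"errors"
	"fmt"
)

// featureSource records where the suite's supported features came from.
// Hypothetical type, introduced only for this sketch.
type featureSource int

const (
	sourceInferred featureSource = iota // read from GatewayClass status
	sourceManual                        // supplied through flags
)

// errManualFeatures is returned when a Gateway report is requested with
// manually supplied features.
var errManualFeatures = errors.New("can't generate report: Gateway's supported features should be read from Status and not supplied through flags")

// checkReportAllowed blocks report generation only when features were
// supplied manually and the run involves no mesh features or mesh profiles.
func checkReportAllowed(src featureSource, hasMeshFeatures, hasMeshProfile bool) error {
	if src == sourceManual && !hasMeshFeatures && !hasMeshProfile {
		return errManualFeatures
	}
	return nil
}

func main() {
	fmt.Println(checkReportAllowed(sourceInferred, false, false)) // inferred features
	fmt.Println(checkReportAllowed(sourceManual, false, false))   // manual Gateway features
	fmt.Println(checkReportAllowed(sourceManual, true, false))    // manual, but mesh features present
}
```

Under these assumptions, mesh runs stay exempt because there is currently no status-based way to list mesh features.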

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. area/conformance-test Issues or PRs related to Conformance tests. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 16, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bexxmodd
Once this PR has been reviewed and has the lgtm label, please assign liorlieberman for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 16, 2025
@k8s-ci-robot
Contributor

Hi @bexxmodd. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 16, 2025
@bexxmodd
Contributor Author

/cc @LiorLieberman @robscott

if suite.supportedFeaturesSource == supportedFeaturesSourceManual &&
	!hasMeshFeatures(suite.SupportedFeatures) &&
	!suite.conformanceProfiles.HasAny(MeshHTTPConformanceProfileName, MeshGRPCConformanceProfileName) {
	return nil, fmt.Errorf("can't generate report: Gateway's supported features should be read from Status and not supplied through flags")
Contributor

This means it's impossible to run the report if your gateway doesn't support the SupportedFeatures feature, which is a brand new feature. This seems far too early to do, if ever.

Member

Agree that we should not block in v1.4. I think the corresponding GEP said that while this would be the default/encouraged path for v1.4+, it would only become required in v1.5. Of course mesh features will also have a longer extension here because there's currently no alternative way of listing supported features.

Contributor Author

@bexxmodd bexxmodd Jul 17, 2025

Given that we decided to give a grace period for Supported Features until the v1.5 release, my intention with this PR was to merge it after Jul 22 so it falls under v1.5, when we'll start proactively blocking manually specified reports.

Member

@robscott robscott left a comment

Thanks @bexxmodd!

defer suite.lock.RUnlock()

if suite.supportedFeaturesSource == supportedFeaturesSourceManual &&
	!hasMeshFeatures(suite.SupportedFeatures) &&
Member

A bit orthogonal to this, but I think it would be useful to proactively ignore mesh features published in GatewayClass status so that it's clear that those should be reported separately and aren't really tied to the GatewayClass. Since it seems like a Mesh resource is ~imminent, it seems cleaner for Mesh features to be reported there in the future instead of both resources reporting all features.
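
One way to picture "proactively ignoring" mesh features is a filter applied to the list read from GatewayClass status. This is only a sketch: `isMeshFeature` and `splitFeatures` are hypothetical helpers, and recognizing mesh features by a `Mesh` name prefix is an assumption made for the example, not how the suite necessarily classifies them.

```go
package main

import (
	"fmt"
	"strings"
)

// isMeshFeature is a hypothetical classifier. The "Mesh" prefix rule is an
// assumption for illustration only.
func isMeshFeature(name string) bool {
	return strings.HasPrefix(name, "Mesh")
}

// splitFeatures separates features published in GatewayClass status into the
// Gateway features to keep and the mesh features to ignore.
func splitFeatures(published []string) (gateway, ignoredMesh []string) {
	for _, f := range published {
		if isMeshFeature(f) {
			ignoredMesh = append(ignoredMesh, f)
			continue
		}
		gateway = append(gateway, f)
	}
	return gateway, ignoredMesh
}

func main() {
	gw, mesh := splitFeatures([]string{"Gateway", "HTTPRoute", "MeshClusterIPMatching"})
	fmt.Println("gateway features:", gw)
	fmt.Println("ignored mesh features:", mesh)
}
```

Returning the ignored names separately would let the suite log them, so an implementer can see that they were dropped on purpose.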

Since I think this would exclusively affect Istio, would appreciate some feedback from that side on this idea:

/cc @howardjohn @mikemorris

Contributor

That makes sense to me, agreed clearing this up sooner sounds good.

Contributor Author

@robscott currently, if Mesh features are read from the GWC, we throw an error.

Are you suggesting we add an extra check here?
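
For comparison with silently ignoring them, the error-on-read behavior mentioned here might look roughly like the following. `validateStatusFeatures` is a hypothetical name, the `Mesh` prefix test is an assumption, and the error text is invented for the example rather than taken from the suite.

```go
package main

import (
	"fmt"
	"strings"
)

// validateStatusFeatures returns an error when any feature read from
// GatewayClass status looks like a mesh feature. The prefix test is an
// assumption for illustration only.
func validateStatusFeatures(published []string) error {
	var mesh []string
	for _, f := range published {
		if strings.HasPrefix(f, "Mesh") {
			mesh = append(mesh, f)
		}
	}
	if len(mesh) > 0 {
		return fmt.Errorf("mesh features must not be published in GatewayClass status: %s",
			strings.Join(mesh, ", "))
	}
	return nil
}

func main() {
	fmt.Println(validateStatusFeatures([]string{"Gateway", "HTTPRoute"}))
	fmt.Println(validateStatusFeatures([]string{"Gateway", "MeshConsumerRoute"}))
}
```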

Contributor

That impl seems good to me (and easy to catch/fix as an implementer) @bexxmodd

@robscott
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 17, 2025
@shaneutt shaneutt self-assigned this Aug 25, 2025
@shaneutt shaneutt moved this to Review in Release v1.4.0 Aug 25, 2025
@shaneutt shaneutt added this to the v1.4.0 milestone Aug 25, 2025
@shaneutt shaneutt linked an issue Aug 25, 2025 that may be closed by this pull request
@shaneutt shaneutt modified the milestones: v1.4.0, v1.5.0 Aug 25, 2025
@shaneutt shaneutt removed their assignment Aug 25, 2025
@k8s-ci-robot
Contributor

PR needs rebase.


@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 20, 2025
@kflynn
Contributor

kflynn commented Oct 7, 2025

There was a lot of discussion about this on Slack back in July that never made it to this PR. Summarizing from that:

  1. I'm very much in favor of publishing supportedFeatures on Gateway or Mesh resources, depending on what the implementation supports.
  2. I'm very much in favor of having the conformance code default to reading which features to test from supportedFeatures for whatever profiles are being requested.
  3. I'm in favor of letting conformance detect mismatches between supportedFeatures and what tests pass or fail, and I'm in favor of somehow telling Ana when she tries to use a feature not in supportedFeatures. (I don't know that either of these is really practical to implement, but I'd be in favor of them.)
  4. I'm very, very strongly opposed to trying to block reports that use manually-specified features. There's no such thing as zero-cost code, nor zero-error code, and in every case I've seen similar things in the past, the costs have outweighed any benefits.

In this particular case, the benefit is particularly unclear, but the cost is not, so I'm opposed to including this at all. (I might be OK, much much later, with relying completely on features read from supportedFeatures -- but not yet.)
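
The mismatch detection from point 3 could be approximated by comparing two sets: the features declared in supportedFeatures and the features whose tests passed. The sketch below assumes both are available as plain string slices, which may not match how the suite tracks results; `diffFeatures` and `toSet` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet converts a slice of feature names into a set.
func toSet(items []string) map[string]bool {
	set := make(map[string]bool, len(items))
	for _, item := range items {
		set[item] = true
	}
	return set
}

// diffFeatures reports features that are declared but did not pass, and
// features that passed but were never declared. Both results are sorted.
func diffFeatures(declared, passing []string) (failing, undeclared []string) {
	declaredSet := toSet(declared)
	passingSet := toSet(passing)
	for f := range declaredSet {
		if !passingSet[f] {
			failing = append(failing, f)
		}
	}
	for f := range passingSet {
		if !declaredSet[f] {
			undeclared = append(undeclared, f)
		}
	}
	sort.Strings(failing)
	sort.Strings(undeclared)
	return failing, undeclared
}

func main() {
	failing, undeclared := diffFeatures(
		[]string{"Gateway", "HTTPRoute"},
		[]string{"Gateway", "GRPCRoute"},
	)
	fmt.Println("declared but not passing:", failing)
	fmt.Println("passing but not declared:", undeclared)
}
```

A check like this only warns about inconsistencies; it does not block a report.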

Contributor

@howardjohn howardjohn left a comment

I strongly agree with Flynn that this is not the right approach for the project

@robscott
Member

robscott commented Oct 7, 2025

> I'm very, very strongly opposed to trying to block reports that use manually-specified features. There's no such thing as zero-cost code, nor zero-error code, and in every case I've seen similar things in the past, the costs have outweighed any benefits.

This seems to be the crux of the case against this change, but I don't think I fully understand it. It seems like we're saying that code can have errors, and therefore we shouldn't make changes to our code?

> In this particular case, the benefit is particularly unclear, but the cost is not, so I'm opposed to including this at all

I think the benefit is quite clear - we ensure that all conformant implementations are consistently publishing the features they support. I'm not really sure what the cost is here other than that there may be bugs in our implementation, which seems true of anything we could possibly do?

@howardjohn
Contributor

Would you think it was acceptable to block an entire conformance report if some single feature wasn't supported? If not, why should this one do that?

We already have only like 8/40 implementations not submitting any conformance reports, so why artificially punish some of those 8 just because they don't support one feature (that is, IMO, the least useful feature in the API)?

additionally, requiring this on Mesh which is an experimental feature, would be unacceptable as well

@bexxmodd
Contributor Author

bexxmodd commented Oct 7, 2025

> additionally, requiring this on Mesh which is an experimental feature, would be unacceptable as well

This one is specific to Gateway features.

@kflynn
Contributor

kflynn commented Oct 7, 2025

@robscott, if you default to reading supported features from supportedFeatures and leave everything else the same, you'll get to just about the same place as this idea that we need to police manually-supplied features. We simply don't need to solve this "problem" with code: the pressure that will push implementations to be accurate in what they claim to support is that the users of the implementation will either demand features that they don't see listed as supported, or yell if features that they do see listed as supported don't work.
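
The "default to supportedFeatures, leave everything else the same" idea can be sketched as a simple resolution step with no blocking. All names below (`featureSource`, `resolveFeatures`) are hypothetical, and treating a non-empty flag list as "manually supplied" is an assumption for this example.

```go
package main

import "fmt"

// featureSource records which input the resolved features came from.
// Hypothetical type, introduced only for this sketch.
type featureSource string

const (
	sourceStatus featureSource = "status"
	sourceManual featureSource = "manual"
)

// resolveFeatures uses manually supplied features when present and otherwise
// defaults to the features read from status. It never returns an error.
func resolveFeatures(fromFlags, fromStatus []string) ([]string, featureSource) {
	if len(fromFlags) > 0 {
		return fromFlags, sourceManual
	}
	return fromStatus, sourceStatus
}

func main() {
	features, src := resolveFeatures(nil, []string{"Gateway", "HTTPRoute"})
	fmt.Println(src, features)

	features, src = resolveFeatures([]string{"Gateway"}, []string{"Gateway", "HTTPRoute"})
	fmt.Println(src, features)
}
```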

Furthermore, the proposed code won't solve this problem (that we don't need to solve) anyway, since reports can still be edited by hand before their PR goes up. But it will require maintenance -- and adding code that has to be maintained in order to not solve a problem that doesn't need solving anyway is 100% incurring costs for no tangible benefit.

Neither is it hypothetical to talk about maintenance costs, because right now at this very moment, code introduced to (not) solve this problem (that we don't need to solve) is preventing me from running conformance at all for Linkerd... which is a thing that was never supposed to be affected anyway, but was.

At this point, Beka and I have already both separately spent time on this, so we are already racking up more costs for no wins -- and it's not fixed yet, either, so the costs will go higher still. If that doesn't resonate with you personally, I can start billing you per my normal hourly consulting rate for time I spend on this... 🙂

@kflynn
Contributor

kflynn commented Oct 7, 2025

/hold

just to make sure this isn't accidentally merged while we're discussing it.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 7, 2025

Labels

area/conformance-test Issues or PRs related to Conformance tests. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GEP: SupportedFeatures

7 participants