-
-
Notifications
You must be signed in to change notification settings - Fork 371
fix: kube-core::Schema hoisting logic #1839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Fixes kube Schema conversion since schemars Schema's have changed. - Add tests for a variety of enum use-cases: - Tagged vs Untagged - Unit vs Tuple vs Structural variants - With and without doc-comments (descriptions) - Rewrite the hoisting logic - This is annotated with dev-comments to help understand intend and to ease future schemars changes. This also fixes other issues: - Untagged enum variant doc-comments were being applied to field descriptions. - Additional `null` entry added to enums. --------- Co-authored-by: Nick Larsen <[email protected]> Co-authored-by: Sebastian Bernauer <[email protected]> Co-authored-by: Techassi <[email protected]> Signed-off-by: Nick Larsen <[email protected]>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1839 +/- ##
=======================================
+ Coverage 74.7% 75.3% +0.6%
=======================================
Files 84 90 +6
Lines 7938 8133 +195
=======================================
+ Hits 5927 6120 +193
- Misses 2011 2013 +2
🚀 New features to boost your workflow:
|
|
I'm unsure what to do about the CI error: For me, I get [email protected]. Should I pin the dependency? Line 50 in 7c63f56
|
|
Created #1840 to address the |
|
I've just synced this PR branch to include the CI fixes (done in #1840). This PR should now be ready for review. |
|
(Also a small draft PR with a fix popped up: #1841) |
|
Heyyy, sorry not had much time to deeply look at things the last month. This is definitely on the larger size, so only done a quick pass now, but the testing rigor is definitely super helpful! Thanks a lot for all this. One thing that stands out as different in the approaches here is the use of As a quick check first. Would you be OK with rebasing (possibly also splitting to expedite) this if I were to merge #1853 first? That's a pretty digestible PR, although it does seem to clash with your |
No problem. I know how it gets.
Yeah, we are aware of that. We started off with very minimal changes (much like the other PR(s)) but after creating a set of tests we found at least three bugs which lead to a larger refactor after seeing a more general transformation pattern. The other PRs appear to only be solving one at a time and I believe they would not pass the test suite in this PR.
I just didn't think of that, but I think we can just make multiple transformers and move the chained calls inside the existing one into each new transformer. I think I can spend a little bit of time to see how easy it is to convert.
I guess I'd prefer not to, partly because of the possible merge conflicts, but also because I think they only solve part of the problem without seeing the broader issue. Given that, if you are not convinced to hold off merging the other PR(s), I understand. |
Signed-off-by: Nick Larsen <[email protected]>
77eb176 to
50d1b89
Compare
|
I have updated the implementation to split the single transformer into multiple transformers. The original I did try moving the body of the transform functions into the Transform impl but then the unit tests would need rewriting (I can try that is you prefer). Anyway, I will leave it as is and await your feedback. |
I think this makes the intent between the various functionality a bit clearer at the very least. 👍
yeah. that is a concern I have as well. i need to look a bit more tomorrow to understand the various test cases here, but afaikt the tests are fairly isolated as is, and each transformer now looks like something that can be reviewed independently, but maybe there's a subtletly i am missing. i'll get back with more tomorrow. |
Yeah, that is fair. Will hold off.
I like the idea of this if it doesn't make the testing too difficult. At least then we are testing the actual interface and not only all our internal functions/structs (which i at some point hoped we could reduce). Maybe it can make the tests a little easier to read as well if we try stuff like |
clux
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some initial comments on various bits. suggestions on some code-minimisation.
feels like i mostly want more use of slice-patterns (to remove some implicit panics), more let-else use, more comments on the invariants that cause us to panic, and possibly (possibly also testing of the interface direct as per previous comment).
otherwise, i think this approach is nice. it is very thorough, and you're dealing with a more complicated codebase at stackable than i have to deal with so happy to take changes like this from someone with a bit of skin in the game.
| // Apply this (self) transform to all subschemas | ||
| schemars::transform::transform_subschemas(self, transform_schema); | ||
|
|
||
| let mut schema: SchemaObject = match serde_json::from_value(transform_schema.clone().to_value()).ok() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this match might be able to be a let-else;
let mut Some(schema) = serde_json::... { else return; }if the compiler allows the mut in there with a type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 14457d1
| if any_of.len() != 2 { | ||
| return; | ||
| } | ||
|
|
||
| let entry_is_null: [bool; 2] = any_of | ||
| .iter() | ||
| .map(|schema| serde_json::to_value(schema).expect("schema should be able to convert to JSON")) | ||
| .map(|value| value == *NULL_SCHEMA) | ||
| .collect::<Vec<_>>() | ||
| .try_into() | ||
| .expect("there should be exactly 2 elements. We checked earlier"); | ||
|
|
||
| // Get the `any_of` subschema that isn't the null entry | ||
| let subschema_to_hoist = match entry_is_null { | ||
| [true, false] => &any_of[1], | ||
| [false, true] => &any_of[0], | ||
| _ => return, | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically 3 panic-using apis here; array indexing + 2 expects. indexing is checked, but would be nice to use slice-patterns to avoid the implicit linkage between the length check code and the indexing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first expect should not happen unless everything is broken (a SchemaObject should be able to convert to a serde_json::Value).
The second expect should not be reachable due to the early return.
I'm happy to change things around to make this more compact. Could you elaborate what you mean about the slice-pattern? I tried a few things, but that introduced more conversions and clones.
| // TODO (@NickLarsenNZ): Return errors instead of panicking, leave panicking up to the | ||
| // infallible schemars::Transform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
been thinking about the panicing here a bit, and i think at least the panicing here would generally apply to "setup code". even if it's not, i would rather have a panic here than to accidentally do a runtime "error" log that people miss and as a result perhaps modify a schema incorrectly (making people potentially have to do migrations).
the tradeoff here is that we could panic a controller that does something with schemas live, but controllers should be idempotent and resilient against temporary failures like a controller crashing..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really unfortunate that the schemars::Transform is infallible.
As for the comment I left in the code, that was more about making the functions return errors (easier to tests too) and then panic on error inside the transform.
Would you like me to remove the comment?
| // TODO (@NickLarsenNZ): Return errors instead of panicking, leave panicking up to the | |
| // infallible schemars::Transform |
| /// | ||
| /// This will return early without modifications unless: | ||
| /// - There are `oneOf` subschemas (not empty). | ||
| /// - Each subschema contains an `enum` or `const`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly this can be reduced in scope if we use the existing ReplaceConstValue? EDIT: probably not; it doesn't do a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah. Should I leave as-is then?
On that note, Kubernetes might eventually support const: kubernetes/kubernetes#134684
| // TODO (@NickLarsenNZ): At this point, we are seeing duplicate `title` when printing schema as json. | ||
| // This is because `title` is specified under both `extensions` and `other`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to concat the titles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean remove it from other when it is already in extensions?
| if types.any(|r#type| r#type != variant_type) { | ||
| panic!("oneOf variants must all have the same type"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably a sane unwrap for now. might need another line about why this is true in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added extra explanation in 77af8e5
| if subschemas.is_empty() { | ||
| return; | ||
| } | ||
|
|
||
| // Ensure we aren't looking at the subschema with a null, as that is hoisted by another | ||
| // transformer. | ||
| if subschemas.len() == 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these two checks could be a slice pattern match perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quickly tried both:
match subschemas.len() { ...}
// and
match subschemas.as_slice() { .. }But it didn't make things any simpler.
Would you mind giving a rough example as a hint?
| assert-json-diff.workspace = true | ||
| runtime-macros.workspace = true | ||
| prettyplease.workspace = true | ||
| tokio = { workspace = true, features = ["rt", "macros"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be a workspace dep
Danil-Grigorev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'd prefer not to, partly because of the possible merge conflicts, but also because I think they only solve part of the problem without seeing the broader issue.
I believe the #1821 is only mentioning the specific issue with optional enums, and we need to fix that one directly, and not "oh btw I also fixed that". Having separate issues for this PR addresses would help.
| // TODO (@NickLarsenNZ): Return errors instead of panicking, leave panicking up to the | ||
| // infallible schemars::Transform | ||
| let Schema::Object(to_hoist) = subschema_to_hoist else { | ||
| panic!("the non-null anyOf subschema is a bool. That is not expected here"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not panic in this code. This produces no linter hints, and gives no trace on which field actually failed. The only correct approach with the chosen path by schemars is to split code into more domain specific parts, and log errors instead of panicking here.
| use super::*; | ||
|
|
||
| #[test] | ||
| fn optional_tagged_enum_with_unit_variants() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this representable with a CRD schema? While pure json seems clear here, it probably does not make sense to test internals of the generated code, as this may change over time. It may also be unrepresentable in a CRD form, in which case we shouldn't try to test such things.
With a CRD based approach, the test is focused on functionality first, so in my opinion this should follow the pattern of crd_schema_test.rs or newly added crd_complex_enum_tests.rs
| #[derive(Debug, Clone)] | ||
| pub struct AnyOfSchemaRewriter; | ||
|
|
||
| impl Transform for AnyOfSchemaRewriter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this method has no functional differences with implementation in #1853 and therefore it will pass the tests. Yet it is split in two transforms - OptionalEnumSchemaRewriter and AnyOfSchemaRewriter instead of addressing it in one place.
| /// NOTE: The trailing null is removed because it's not needed by Kubernetes | ||
| /// and makes the CRD more compact by removing redundant information. | ||
| #[derive(Debug, Clone)] | ||
| pub struct OptionalEnumSchemaRewriter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems strange to have AnyOfSchemaRewriter and OptionalEnumSchemaRewriter split like this. Also OptionalEnumSchemaRewriter already overrides null containing enums with top level enum. Why is this needed? Can this be produced/tested by some CRD schema?
| return; | ||
| } | ||
|
|
||
| // Ensure we aren't looking at the subschema with a null, as that is hoisted by another |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be driven by the order of transformers, not the manual if->return.
| .expect("there should be exactly 2 elements. We checked earlier"); | ||
|
|
||
| // Get the `any_of` subschema that isn't the null entry | ||
| let subschema_to_hoist = match entry_is_null { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless there is a scenario where the null enum produced first in the generated array by CRD definition, it doesn't make sense to test permutations like this.
NOTE: the kube_schema metadata does need dropping, but that happens implicitly when we replace kube_schema with the subschema
Fixes #1821
This updates the kube Schema hoisting since schemars Schema's have changed.
This also fixes other issues that appeared along the way:
nullentry added to enums.See also: stackabletech#1, where this work was originally done.
Motivation
Since kube 1.0.0, schemas became invalid when optional enums contained comments on the variants.
Solution