feat: Add strict (RFC-9562-compliant) parsing and validation functions #192
#37 added parsing support for non-RFC-9562-compliant UUIDs of the form `{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}` or `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`. (Actually, the curly braces are allowed to be any byte, which is even more permissive - see also #60.)

We at nicheinc would like to be able to opt out of this more lenient parsing, to ensure that a successfully parsed UUID is compliant with RFC 9562. Concretely, `uuid`'s default lax parsing has caused issues for us during fuzz testing, leading to us writing our own stricter `parseUUID` function wrapping `uuid.Parse`. However, it would be nice if strict parsing were available out of the box.

To that end, this PR adds a corresponding "strict" version for each parsing function:

| Existing function | Strict version |
| --- | --- |
| `Parse` | `StrictParse` |
| `ParseBytes` | `StrictParseBytes` |
| `MustParse` | `MustStrictParse` |
| `Validate` | `StrictValidate` |

`Strict` is the first prefix I thought of, but there may be a clearer name - happy to change it if so.

Note that I've chosen to implement the existing lenient functions in terms of the new strict functions. They could instead be kept totally independent, at the cost of code duplication.
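To make the strict/lenient split concrete, here is a minimal, standard-library-only sketch of the idea. It is not the code in this PR: the `UUID` stand-in type and the `parseStrict` and `parseLenient` helpers are hypothetical names, and the sketch assumes that "strict" means accepting only the 36-character hyphenated form.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// UUID is a stand-in for the package's 16-byte UUID type (assumption).
type UUID [16]byte

var errInvalidFormat = errors.New("invalid UUID format")

// parseStrict is a hypothetical helper that accepts only the 36-character
// form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
func parseStrict(s string) (UUID, error) {
	var u UUID
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return u, errInvalidFormat
	}
	// Drop the four hyphens, leaving 32 characters that must all be hex digits.
	digits := s[0:8] + s[9:13] + s[14:18] + s[19:23] + s[24:36]
	if _, err := hex.Decode(u[:], []byte(digits)); err != nil {
		return UUID{}, errInvalidFormat
	}
	return u, nil
}

// parseLenient is a hypothetical helper built on top of parseStrict. It also
// accepts the 32-digit form without hyphens and the 38-character form with
// one arbitrary byte on each side (for example curly braces).
func parseLenient(s string) (UUID, error) {
	switch len(s) {
	case 38:
		// Strip the surrounding bytes without inspecting them.
		s = s[1:37]
	case 32:
		// Re-insert hyphens so the strict parser can validate the digits.
		s = s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
	}
	return parseStrict(s)
}

func main() {
	inputs := []string{
		"123e4567-e89b-12d3-a456-426614174000",
		"{123e4567-e89b-12d3-a456-426614174000}",
		"123e4567e89b12d3a456426614174000",
	}
	for _, in := range inputs {
		_, strictErr := parseStrict(in)
		_, lenientErr := parseLenient(in)
		fmt.Printf("%-40q strict ok: %-5t lenient ok: %t\n", in, strictErr == nil, lenientErr == nil)
	}
}
```

Layering the lenient function over the strict one keeps the validation logic in a single place, which is the trade-off against code duplication mentioned above.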
Alternative Solutions
[…] `Parse`. Another option here is to create a new major version of `uuid`, where `Parse`, etc. are strict by default, with lenient parsing being moved to new functions. However, I don't think that's worth a whole new major version by itself. It could be worth considering for `uuid/v2`, if one were ever planned.

Strict Unmarshaling
This PR does not provide strict versions of the `UnmarshalText` and `UnmarshalBinary` methods in `marshal.go` since their purpose is to implement the `encoding.TextUnmarshaler` and `encoding.BinaryUnmarshaler` interfaces, and a type can only have one implementation of an interface. 😕

The only two approaches I can think of to enable strict unmarshaling are:

- […] `uuid` that toggles the behavior of `UnmarshalText` and `UnmarshalBinary`. This kind of invisible, package-wide configuration is in my opinion an antipattern and not worth pursuing.
- […] `uuid.UUID` that has strict `UnmarshalText` and `UnmarshalBinary` methods. This option is pretty noisy and unergonomic.

I'd love to hear if anyone can think of a more reasonable approach. Regardless, I think it would be beneficial for `uuid` to provide standard-compliant parsing functions even if it can't include unmarshaling.

Edit (2025-07-18): Using `encoding/json/v2`, it's possible to pass `*json.Unmarshalers` to `json.Unmarshal` to override unmarshaling behavior for particular types. As an extension of this PR, it should be possible for `uuid` to provide a `*json.Unmarshalers` that enforces strict parsing, corresponding to `StrictParse`. That seems to me like the most ergonomic way to support strict unmarshaling in a backward-compatible way.