-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Port Compile API extensions from EOPA #7887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
8a37c1f to
41f6db1
Compare
.github/workflows/pull-request.yaml
Outdated
| - name: Prep e2e prisma tests | ||
| run: npm ci | ||
| working-directory: v1/test/e2e/compile/prisma | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this because the e2e compile tests also cover the ucast > prisma path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These "checks" are post-PE, checking if the partial results are within the fragment that's translatable to SQL or UCAST. Each target comes with its own constraints, like, some can do startswith and some cannot. Also, when you want your Rego to be translatable into multiple target formats, the intersection of each target's constraints is what matters, so there's machinery here to build that intersection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file collects some stuff that the server handler uses. It's put here to not be public (yet), and because in EOPA, we had another caller: the rego.compile() builtin. If/when that's ported is open for debate, but having these methods here shouldn't hurt.
| var ( | ||
| sqlBuiltins = allBuiltins | ||
|
|
||
| // sqlite doesn't support startswith/endswith/contains | ||
| sqlSQLiteBuiltins = ucastBuiltins.Clone().Add("internal.member_2") | ||
|
|
||
| ucastBuiltins = NewSet( | ||
| "eq", | ||
| "neq", | ||
| "lt", | ||
| "lte", | ||
| "gt", | ||
| "gte", | ||
| ) | ||
| allBuiltins = ucastBuiltins.Clone().Add( | ||
| "internal.member_2", | ||
| // "nin", // TODO: deal with NOT IN | ||
| "startswith", | ||
| "endswith", | ||
| "contains", | ||
| ) | ||
| ucastLINQBuiltins = ucastBuiltins.Clone().Add( | ||
| "internal.member_2", | ||
| // "nin", // TODO: deal with NOT IN | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the built-in functions supported by the different targets. It's a prime spot for extensions and new cool features, expanding the fragment of rego policies we can translate into SQL etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file contains the UCAST conversion: It's the IR used between PE and SQL filters, and directly consumed by our ORM helper packages for Prisma (JS/TS), LINQ (C#).
| case "startswith": | ||
| if dialect == "sqlite-internal" { | ||
| return cond.Var(sqlbuilder.Build("internal_startswith($?, $?)", sqlbuilder.Raw(field), value)), nil | ||
| } | ||
| pattern, err := prefix(value) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return cond.Like(field, pattern), nil | ||
| case "endswith": | ||
| if dialect == "sqlite-internal" { | ||
| return cond.Var(sqlbuilder.Build("internal_endswith($?, $?)", sqlbuilder.Raw(field), value)), nil | ||
| } | ||
| pattern, err := suffix(value) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return cond.Like(field, pattern), nil | ||
| case "contains": | ||
| if dialect == "sqlite-internal" { | ||
| return cond.Var(sqlbuilder.Build("internal_contains($?, $?)", sqlbuilder.Raw(field), value)), nil | ||
| } | ||
| pattern, err := infix(value) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return cond.Like(field, pattern), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This internal dialect was used to support testing filter policies. It's also related to rego.compile() and filter.helper(), and not made use of in this slice of code. We could remove it, but since the target is never advertised, I think it might be OK to leave it in: Nobody will benefit from (ab)using this, as it'll generate SQL filters that not-modified-by-us SQLite instances don't understand.
| CompilePrepPartial = "compile_prep_partial" | ||
| CompileEvalConstraints = "compile_eval_constraints" | ||
| CompileTranslateQueries = "compile_translate_queries" | ||
| CompileExtractAnnotationsUnknowns = "compile_extract_annotations_unknowns" | ||
| CompileExtractAnnotationsMask = "compile_extract_annotations_mask" | ||
| CompileEvalMaskRule = "compile_eval_mask_rule" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some new metrics, prefixed accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file collects the code added to the v1/server package. It extends (s *Server) a bit, but I thought that's OK, maybe better than putting more into v1/server/server.go. But if you have strong feelings about the tidyness of this, I'm happy to adjust.
| hooks hooks.Hooks | ||
|
|
||
| compileUnknownsCache *lru.Cache[string, []*ast.Term] | ||
| compileMaskingRulesCache *lru.Cache[string, ast.Ref] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cache things that are looked up from annotations: unknowns and the masking_rule used for a filter target (i.e. a rule of a policy written with filtering in mind).
When
- there is no masking-rule or unknowns in the payload,
- and the caches yield nothing,
we'll clone the server's compiler, parse the modules with annotation processing enabled, and put the result of the lookup into the cache (throwing away the cloned compiler).
When the server's reload method is called with an event that touched a policy, the caches are purged (see below)
| opts = append(opts, rego.Resolver(entrypoint, r)) | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From hereon down, we're just passing no custom fields to DL log calls.
| ndbCache builtins.NDBCache, | ||
| err error, | ||
| m metrics.Metrics, | ||
| custom map[string]any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
☝️ see above.
| f := &fixture{ | ||
| server: server, | ||
| recorder: httptest.NewRecorder(), | ||
| t: t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't used
c327587 to
18bbd06
Compare
18bbd06 to
ab10356
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ℹ️ I've moved this into e2e/api/compile because I have the intention of moving our testscript checks here. That'll become e2e/cli/.
c4f9ac7 to
c8f322a
Compare
johanfylling
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool stuff coming in from EOPA, that definitely deserves a special place in the release notes!
I apologize if I'm commenting on stuff that has already been through review in EOPA 🙂.
| continue | ||
| } | ||
|
|
||
| closestStrings := levenshtein.ClosestStrings(maxDistanceForHint, miss, slices.Values(candidates)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could there be a situation where miss is in candidates? Resulting in something like "foo is undefined, did you mean foo?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's in candidates, it wouldn't be a miss. The lookup in partial eval would result in a saved expression, not in a fail.
| @@ -0,0 +1,6 @@ | |||
| # github.com/open-policy-agent/opa/e2e | |||
|
|
|||
| This Go module is for end-to-end tests that come with dependencies we don't otherwise need for OPA, to avoid bloat in OPA's Go module's deps. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great dependency isolation here, thank you!
| } | ||
| ruleNames = append(ruleNames, rule.Module.Package.Path.String()+"."+rule.Head.Name.String()) | ||
| } | ||
| closest := levenshtein.ClosestStrings(65536, input, slices.Values(ruleNames)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does 65536 signify in this context? Big number, so generous fuz?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm sounds about right, but I'm not sure. @philipaconrad? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! So, under the hood, OPA is using the github.com/agnivade/levenshtein package, and they have a note about the max window size that the library supports.
We use the maximum 65536 value, because in this case, we absolutely do not care about how long the string is, we just want a match, even if it's thousands of characters long. If we knew in advance how long the longest string might be in runes, we could select a much lower context window cap.
|
|
||
| // Compile does all the steps needed to generate `*compile.Filters` from a | ||
| // prepared state (`*compile.Prepared`). | ||
| func (p *Prepared) Compile(ctx context.Context, eo ...rego.EvalOption) (*Filters, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only accessible as a lib and through the server, right? Would be neat if it could also be used as an opa eval cmd --target, if only for demo purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup that's a good idea. We have opa eval --partial, so we could add it there.... In EOPA, we've got this re-ified as a builtin, rego.compile(). But I actually think the CLI option would be nice, too. 💡
v1/rego/compile/compile.go
Outdated
| return n | ||
| } | ||
|
|
||
| // Prepare evaluates as much as possible without known the (known) input yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit]
| // Prepare evaluates as much as possible without known the (known) input yet | |
| // Prepare evaluates as much as possible without knowing the (known) input yet |
| } | ||
|
|
||
| if ref, ok := t.Value.(ast.Ref); ok && ref.HasPrefix(ast.DefaultRootRef) { | ||
| // TODO(sr): point to rule with else -- but we don't have the full rego yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the wider context of this assumption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what I mean is that the location of the error should point to the rule that (likely) has an else... but we only check things based on the expressions remaining in PE results, so we don't know the location of that rule (or if it really has an else)...
| switch { | ||
| case unknownMustBeFirst: | ||
| if _, ok := e.Operand(0).Value.(ast.Ref); !ok { | ||
| return err(loc, "rhs of %v must be known", op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this assumption about the rhs by checking the lhs? If this doesn't always hold, the error message could be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At that point, we have made it through
for i := range 2 {
if err := checkOperand(c, op, e.Operand(i)); err != nil {
return err
}
}
So every operand (LHS/RHS) is either a valid unknown (like input.fruits.name), or known. If it's known, and the RHS is known, then the call wouldn't be in the PE results. If LHS is unknown, RHS must be known for this to work. At that point, if the RHS is a ref, it's unknown.
It's a bit spaghetti-ish, I'm afraid 🤔
| } | ||
|
|
||
| // all our allowed builtins have two operands | ||
| for i := range 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something to assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but how would you do that? Right now, the list of allowed builtins is encoded in this switch,
switch {
case op0 == ast.Equality.Name ||
op0 == ast.NotEqual.Name ||
op0 == ast.LessThan.Name ||
op0 == ast.LessThanEq.Name ||
op0 == ast.GreaterThan.Name ||
op0 == ast.GreaterThanEq.Name:
twoRefsOK = true
case op0 == ast.StartsWith.Name ||
op0 == ast.EndsWith.Name ||
op0 == ast.Contains.Name ||
op0 == ast.Member.Name:
unknownMustBeFirst = trueI suppose we could make that a proper list...and have a test.... I'll make a note for when we're adding to this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't mean asserting that all allowed built-ins have two args; there I think it's enough to just leave a comment where those approvals are declared that this piece of code needs updating if this promise is ever broken (if even that, given that it's within the same func).
I meant that we could assert that e really has exactly two operands before iterating over them. But I suppose that is already implicit from that above switch.
56f2e19 to
96d4438
Compare
Also adds e2e tests: These include coverage for ucast in the prisma setting, and thus require some JS runtime. Signed-off-by: Stephan Renatus <[email protected]>
...for macos runs, and for the go-compat suites. Signed-off-by: Stephan Renatus <[email protected]>
Signed-off-by: Stephan Renatus <[email protected]>
When running the tests in a loop for a while, I would see values of 0ns for this metric. However, comparing with its non-zero values, which are often 41 or 42ns, it seems like this is just not happening in this code path. So if "almost nothing" actually goes below 1ns, it's OK. Signed-off-by: Stephan Renatus <[email protected]>
…t/go-mssqldb Signed-off-by: Stephan Renatus <[email protected]>
Signed-off-by: Stephan Renatus <[email protected]>
…use true) Signed-off-by: Stephan Renatus <[email protected]>
Signed-off-by: Stephan Renatus <[email protected]>
Signed-off-by: Stephan Renatus <[email protected]>
Signed-off-by: Stephan Renatus <[email protected]>
96d4438 to
aa9157b
Compare
This PR ports the work done in EOPA to extend the Compile API with support for filters into OPA. Similar to the existing
v1/compileAPI, it'll run partial evaluation (PE), but with a bunch of twists:The new endpoint is
/v1/compile/{path}(similar to the Data API), and it expects the correct "Accept" header for selection the filter format. It furthermore accepts a request payload to influence the translation target.Docs
Docs will follow:
Differences
The Accept header values have gotten new names, like
application/vnd.opa.sql.postgresql+json. We can discuss the naming scheme, since existing (EOPA) users will need to make adjustments anyhow.When squashing, let's "Co-authored-by" all the contributors from the EOPA code, first and foremost @philipaconrad.