-
Notifications
You must be signed in to change notification settings - Fork 237
TEP-0021: Results API #217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This TEP proposes a Results API to store long term Tekton results independent of on-cluster runtime data stored in etcd.
|
/lgtm for proposed state |
|
@skaegi Sounded like you had use cases that align well here. Would love any feedback! |
teps/0021-results-api.md
Outdated
| ``` | ||
|
|
||
| `Execution`s contain Tekton execution types, namely TaskRuns or | ||
| PipelineRuns. They may also contain opaque types, which can be useful for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about calling these Runs considering all the types we want to store are Runs? (TaskRuns, PipelineRuns, I'm guessing custom tasks aka Runs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel too strongly about this, but something I like about executions is that it's intentionally not a Run and can have broader meaning beyond the existing Run types (e.g. Custom Tasks like you mentioned, but also DSLs, and other types we don't natively support), even if we expect Runs to be the primary type we handle for most Tekton users.
That said, I'm fine with either. Was there similar discussion when naming Task/PipelineRuns? If so, why was run chosen over execution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update on this, we decided to change this to Events to be a bit broader. The intent is still the same.
Add more alternatives: `to get the process started: what if the API was in a different format? (e.g. not gRPC) what if there was no results store? what if there was no generic API?`
There was some confusion around what data could be stored within an execution, particularly for meta-configs/DSLs that get mapped to TaskRuns (e.g. you could have multiple executions, even if only 1 thing was actually ran). This has the side effect of treating Tekton execution types just like any other user extensible data type, which simiplifies how users need to think about the different subtypes, and how API Server implementations need to handle this data / filter queries.
vdemeester
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(oh.. I never sent my review 😓 )
| know that this has succeeded? | ||
| --> | ||
|
|
||
| - Define a Result API spec to store and query Tekton results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to define what "results" are here. Is this limited to the results type in Pipeline/Task or the status of the execution or is it more ; like any artifacts generated by a pipeline (tests reports, archive, …)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some definition here, as well as some discussion in Alternatives for how this relates to Task output results.
This is intentionally vague so that we can store broad data about TaskRuns and PipelineRuns. This will generally include the entire TaskRun/PipelineRun object (e.g. both spec and status), but we also see this evolving to include other information (input events, post-run receipts of GitHub status updates / Cloud Event publishing).
This is intended to be extensible to allow for future types which could include the artifacts you mentioned, though we are not looking to introduce new types as part of this proposal.
teps/0021-results-api.md
Outdated
| --> | ||
|
|
||
| Our goal is to make results have minimal impact on core Pipeline execution. By | ||
| running a separate controller, while this does add some more overhead, this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What overhead do we talk about here ? (I guess on the operation side)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primarily wanted to acknowledge that at minimum this would run as a separate pod, which likely has some additional cost as opposed to bundling this in the Pipeline controller itself (but in exchange we gain component modularity).
I suspect the operational overhead would vary depending on the cluster environment (e.g. multi-tenant clusters may require per tenant controllers? - not 100% sure. I instinctively suspect that there would be a result controller for each Tekton Pipeline controller), but these details are likely too implementation/environment specific to detail in this doc.
Address comments on how this proposal relates to similiarly named Task output results, and alternative names considered. Also adds some minor changes for auto-delete future work, resource overhead, and removes mentions of already completed work being in-progress.
bobcatfish
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 🎉 🎉
| - "@wlynch" | ||
| creation-date: 2020-09-23 | ||
| last-updated: 2020-10-26 | ||
| status: proposed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after the discussion in today's API working group, i want to float the idea that we merge this as proposed very quickly (without everything from "proposal" down)
(np if that doesn't help and want to continue the discussion as is)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| - Service providers can choose to provide their own controller and server | ||
| implementations to upload Results with additional service specific metadata or | ||
| implementation specific validation / field formats, so long as it conforms to | ||
| the Results API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great requirements 👍 i think these can be good examples to point to if folks are trying to distinguish b/w goals + requirements in other TEPs
teps/0021-results-api.md
Outdated
| Results are intended to be this abstraction over execution types (i.e. | ||
| PipelineRun, TaskRun) so that we can group this data for users and provide a | ||
| mechanism to include additional metadata that does not fit neatly into TaskRuns | ||
| or PipelineRuns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would think we'd still store the individual TaskRuns along with the PipelineRun tho right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! Exactly. Rephrased to make this a bit clearer.
| We may choose to provide a facade of the Task/Pipeline APIs in the future for | ||
| convenience, but this would be a layer in addition to the Results API. | ||
|
|
||
| ### REST/Open API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 👍 👍 Thanks for exploring this alternative, seems reasonable to me.
| [initial mention](https://github.com/tektoncd/pipeline/issues/1273#issuecomment-546494832) | ||
| of the Task results field (named results to not conflict with PipelineResource | ||
| outputs) was suggested the same week as the original Results API design doc. | ||
| Both of these predated the current TEP process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣 🤭
fwiw THIS particular feature has been called "result" since at least tektoncd/pipeline#454 (jan 2019)
| the project from Task output results. | ||
| - Event API : While event is general enough to fill a similar role, this takes | ||
| away our usage of event as a field of the current Result resource, which then | ||
| leaves us with a different naming problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"stuff that happened"
The only other name that seems appealing to me is something like "history"? "Tekton History"?
afrittoli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, great work!
I really like the idea of abstracting away from Tekton specific types, and be open to store other "results", like events, manifests from DSLs or else.
The main question (concern?) I have is about how the various Event are attached to a Result. Due to the distributed nature of tekton workflows (even a relatively simple one like git hosting -> triggers -> pipeline -> notifications), I feel it should be possible to add Events to a Result incrementally. Is that the way it is planned to work?
AFAIU, the controller would be responsible to create the initial Result and attach extra Events to it as they are eventually discovered. Would the controller be the only one responsible for talking to the API to write objects, or would it be possible for other parties to push events directly to the API?
Since not everything is a CRD, it may be difficult otherwise for the controller to discover all relevant events.
Since a result is a collection of events, does it have to be statically defined as a result, or could it be dynamically collected from available events based on a set of selection criteria?
Results could be basically views on set of events, that can be stored and queried via the API.
Anyways, I'm totally in favour of the overall idea; I hope we can be flexible still in the definition of the API, so it may change from what is in the doc today.
/approve
|
|
||
| // The etag for this result. | ||
| // If this is provided on update, it must match the server's etag. | ||
| string etag = 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened with 5 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Various edits/reordering. 😅 I'll reorder the numbers when we create the API for real.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: afrittoli The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Yes! We've been working on some modifications to the proposal to change events to a more explicit subresource, which would make it easy to do just that. I've been hesitant to add this addition to this proposal, so that we can get this in and make progress. Even in the current API, you could do this by updating a current result, using the
I think we'll want to be agnostic to the source of the initial event creation. Even within Tekton, there might be different controllers creating results (e.g. it seems reasonable for the trigger controller to also create results to record the input event before handing off to pipelines for execution). As long as the client is authorized, the API server shouldn't care.
We will probably want an explicit Result resource. While I agree with you that the most relevant bits of data are in the Events, having a container for logical groupings of CI runs is a large benefit we get by having a Result resource. That said, I think it would be reasonable to be able to query across events, where the Events then link back to the Results they are contained in. This is another thing that is enabled with the subresource change mentioned above! ^_^ Let me know if you're okay with defering to another PR for that change, or if you'd like to see it reflected here. |
|
/cc @sbwsg |
|
/lgtm |

This TEP proposes a Results API to store long term Tekton results independent of
on-cluster runtime data stored in etcd.
/cc @imjasonh @afrittoli