Add optional inference objective #1995
This PR looks OK, but I think it's missing something. It creates a single InferenceObjective whose name matches the Helm release name. As I understand it, an InferenceObjective is referenced by the x-gateway-inference-objective header sent with each request, so it is tied to the request rather than to the release. I would expect to be able to create several InferenceObjectives, each with a different name and a different priority. |
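To make the per-request nature of the objective concrete, here is a minimal sketch of how a client might attach the x-gateway-inference-objective header mentioned above. Only the header name comes from this thread; the helper name `build_request`, the objective name `high-priority`, the model name, and the JSON body shape are assumptions for illustration.

```python
import json

# Header name taken from the review comment above.
OBJECTIVE_HEADER = "x-gateway-inference-objective"


def build_request(objective_name, prompt, model="example-model"):
    """Return (headers, body) for a request that names an InferenceObjective.

    The body shape and the model name are hypothetical; only the header
    name is taken from the discussion.
    """
    headers = {
        "Content-Type": "application/json",
        # Each request picks its own objective by name.
        OBJECTIVE_HEADER: objective_name,
    }
    body = json.dumps({"model": model, "prompt": prompt})
    return headers, body


headers, body = build_request("high-priority", "Hello")
```

Because the objective is chosen per request, two requests to the same pool can name different objectives, which is why a single objective named after the Helm release is limiting.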
|
Good point. I will update the implementation so that users can define all the inference objectives they want to associate with the inference pool. |
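As a rough sketch of what "several objectives for one pool" could look like once rendered, the snippet below expands a list of name/priority pairs into one manifest per objective. It is not the chart's actual template: the function name `render_objectives`, the pool and objective names, and the assumed manifest fields (`apiVersion`, `spec.priority`, `spec.poolRef.name`) are illustrative and should be checked against the InferenceObjective CRD.

```python
def render_objectives(pool_name, objectives):
    """Build one InferenceObjective manifest (as a dict) per list entry.

    Assumption: each objective carries a name and a priority and points
    at the pool through spec.poolRef.name.
    """
    manifests = []
    for obj in objectives:
        manifests.append({
            "apiVersion": "inference.networking.x-k8s.io/v1alpha2",  # assumed version
            "kind": "InferenceObjective",
            "metadata": {"name": obj["name"]},
            "spec": {
                "priority": obj["priority"],
                "poolRef": {"name": pool_name},
            },
        })
    return manifests


# Hypothetical values: two objectives bound to the same pool.
manifests = render_objectives(
    "my-pool",
    [{"name": "critical", "priority": 10}, {"name": "batch", "priority": 1}],
)
```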
|
/ok-to-test |
Force-pushed from ca76a99 to ff97818.
|
Can you please discuss the motivation for this? I see some value, but InferenceObjectives are resources that will be created, updated, and deleted after the InferencePool is created, so new objectives will likely be added or removed later. |
Force-pushed from 6751dd6 to db76251.
I saw the value in automating their creation and deletion: they get created and cleaned up along with the Helm chart. That is not to say that others cannot add more out of band. I started on this in preparation for the Flow Control integration work, with an LLM-D guide in mind that could showcase it. |
|
OK, I can see the value in cases where the objectives are largely known in advance and mostly static. |
|
Agreed with the other comments here. As long as we communicate clearly that InferenceObjectives do not need to be tied to the pool at creation time, this all seems reasonable to me. |
|
@Gregory-Pereira the PR overall lgtm. |
Signed-off-by: greg pereira <grpereir@redhat.com>
… over inference objectives Signed-off-by: greg pereira <grpereir@redhat.com>
Force-pushed from db76251 to b2ac7ee.
|
I think this is ready for review again if you have cycles, @kfswain, @nirrozenbaum, or @ahg-g. Sorry it took me so long to get back to this. |
|
/approve
LGTM, but I would like another pair of eyes to help catch anything I may have missed. |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Gregory-Pereira, kfswain |
|
/lgtm |
* enable creating inferenceObjective via the inferencepool helm chart
* updating readme + linting
* allow array of inferencepools
* move inferenceObjective to top level and cleanup template
* remaining cleanup removing the checking of apiVersion when iterating over inference objectives
* document use-case for the inference-objective values field

Signed-off-by: greg pereira <grpereir@redhat.com>
What type of PR is this?
/kind cleanup
/kind feature
What this PR does / why we need it:
Enable utilization of the InferenceObjective CR we already have
Does this PR introduce a user-facing change?:
NONE; this simply exposes the InferenceObjective in the inferencepool Helm chart.