InferencePool: Signal Backend Protocol #1273

@danehans

Description

Backend applications can signal their desired protocol through the appProtocol field of a Service port. Because inference backends, such as model servers, are not expected to be fronted by a Service, InferencePool should expose a similar field that Gateway implementations can use to select the appropriate protocol when routing a request.

vLLM: Supports HTTP/1.1 (xref).
Triton: Supports HTTP/1.1 (REST) and gRPC inference protocols (xref).
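
To illustrate the request, here is a minimal sketch of how a Gateway implementation might pick a backend protocol from such a field. Everything in it is an assumption made for illustration: the `app_protocol` field name, the `InferencePoolSpec` class, the `backend_protocol` helper, the accepted values, and the fallback to HTTP/1.1 are hypothetical and are not part of the current InferencePool API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed default when the pool does not signal a protocol (sketch only).
DEFAULT_PROTOCOL = "http/1.1"

# Hypothetical mapping from an appProtocol-style value to the protocol a
# gateway would speak to the backend. The keys are illustrative; the real
# set of accepted values would be decided by the API design.
KNOWN_PROTOCOLS = {
    "http": "http/1.1",
    "kubernetes.io/h2c": "h2c",  # HTTP/2 over cleartext, e.g. for gRPC backends
}


@dataclass
class InferencePoolSpec:
    """Hypothetical, trimmed-down stand-in for an InferencePool spec."""
    name: str
    app_protocol: Optional[str] = None  # hypothetical field, mirrors Service appProtocol


def backend_protocol(spec: InferencePoolSpec) -> str:
    """Return the protocol to use when routing to the pool's endpoints."""
    if spec.app_protocol is None:
        return DEFAULT_PROTOCOL
    # Unknown values fall back to the default in this sketch; a real
    # implementation might instead report an unsupported-protocol condition.
    return KNOWN_PROTOCOLS.get(spec.app_protocol, DEFAULT_PROTOCOL)


if __name__ == "__main__":
    print(backend_protocol(InferencePoolSpec(name="vllm-pool")))
    print(backend_protocol(InferencePoolSpec(name="triton-pool", app_protocol="kubernetes.io/h2c")))
```

With a field like this, a pool of vLLM servers could leave the value unset, while a pool of Triton servers exposing gRPC could signal a cleartext HTTP/2 protocol.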

Metadata

    Labels

    triage/accepted: Indicates an issue or PR is ready to be actively worked on.
