Backend applications can signal their desired protocol through the appProtocol field of a Service port. Because inference backends such as model servers are not expected to use a Service, InferencePool should have a similar field that Gateway implementations can use to select the appropriate protocol when routing a request.
vLLM: Supports HTTP/1.1 (xref).
Triton: Supports HTTP/1.1 (REST) and gRPC inference protocols (xref).
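As a rough illustration of how a Gateway implementation might consume such a field, the sketch below models the decision in Python. Everything here is an assumption made for the example: the `app_protocol` field name, the `InferencePoolSpec` class, the `backend_protocol` helper, and the port numbers are hypothetical and are not part of the current InferencePool API. The value `kubernetes.io/h2c` is the Kubernetes standard appProtocol value for HTTP/2 over cleartext, used here to stand in for a gRPC backend; an actual design may choose different values.

```python
from dataclasses import dataclass
from typing import Optional

# Kubernetes standard appProtocol value for HTTP/2 over cleartext.
# Using it to mark a gRPC backend is an assumption of this sketch.
H2C = "kubernetes.io/h2c"


@dataclass
class InferencePoolSpec:
    """Hypothetical, simplified stand-in for an InferencePool spec."""
    target_port: int
    # Hypothetical field mirroring Service ports[].appProtocol.
    app_protocol: Optional[str] = None


def backend_protocol(pool: InferencePoolSpec) -> str:
    """Pick the protocol a gateway would speak to the pool's backends.

    Falls back to HTTP/1.1 when the field is unset or unrecognized.
    """
    if pool.app_protocol == H2C:
        return "HTTP/2"
    return "HTTP/1.1"


# A vLLM-style pool that leaves the field unset (HTTP/1.1).
vllm_pool = InferencePoolSpec(target_port=8000)
# A Triton-style pool serving gRPC, which requires HTTP/2.
triton_grpc_pool = InferencePoolSpec(target_port=8001, app_protocol=H2C)

print(backend_protocol(vllm_pool))
print(backend_protocol(triton_grpc_pool))
```

Defaulting to HTTP/1.1 when the field is absent keeps pools that never set it working unchanged, which matches how an unset appProtocol on a Service is typically treated.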