Commit 8278aa7

Update README.md
1 parent fc20338 commit 8278aa7

1 file changed: +5 -6 lines changed

config/charts/inferencepool/README.md

Lines changed: 5 additions & 6 deletions
@@ -83,18 +83,18 @@ $ helm install triton-llama3-8b-instruct \
 
 To deploy the EndpointPicker in a high-availability (HA) active-passive configuration, you can enable leader election. When enabled, the EPP deployment will have multiple replicas, but only one "leader" replica will be active and ready to process traffic at any given time. If the leader pod fails, another pod will be elected as the new leader, ensuring service continuity.
 
-To enable HA, set `inferenceExtension.enableLeaderElection` to `true` and increase the number of replicas in your `values.yaml` file:
+To enable HA, set `inferenceExtension.flags.has-enable-leader-election` to `true` and increase the number of replicas in your `values.yaml` file:
 
 ```yaml
 inferenceExtension:
   replicas: 3
-  enableLeaderElection: true
+  has-enable-leader-election: true
 ```
 
 Then apply it with:
 
 ```txt
-helm install vllm-llama3-8b-instruct ./config/charts/inferencepool -f values.yaml \
+helm install vllm-llama3-8b-instruct ./config/charts/inferencepool -f values.yaml
 ```
 
 ## Uninstall
@@ -122,10 +122,9 @@ The following table list the configurable parameters of the chart.
 | `inferenceExtension.env` | List of environment variables to set in the endpoint picker container as free-form YAML. Defaults to `[]`. |
 | `inferenceExtension.extraContainerPorts` | List of additional container ports to expose. Defaults to `[]`. |
 | `inferenceExtension.extraServicePorts` | List of additional service ports to expose. Defaults to `[]`. |
-| `inferenceExtension.flags` | List of flags which are passed through to endpoint picker. |
+| `inferenceExtension.flags` | List of flags which are passed through to endpoint picker. Example flags, enable-pprof, grpc-port etc. Refer [runner.go](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/cmd/epp/runner/runner.go) for complete list. |
+| `inferenceExtension.flags.has-enable-leader-election` | Enable leader election for high availability. When enabled, only one EPP pod (the leader) will be ready to serve traffic. |
 | `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
-| `inferenceExtension.flags.has-enable-leader-election` | Enable leader election for high availability. When enabled, only one EPP pod (the leader) will be ready to serve traffic. It is recommended to set `inferenceExtension.replicas` to a value greater than 1 when this is set to `true`. Defaults to `false`. |
-
 
 ## Notes
 
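Note on the values layout: the prose and the parameter table in this diff refer to the setting by the dotted path `inferenceExtension.flags.has-enable-leader-election`, while the YAML example places the key directly under `inferenceExtension`. If `flags` is a map keyed by flag name (an assumption; this diff does not show the chart's `values.yaml` schema), a `values.yaml` matching the dotted path might look roughly like this:

```yaml
# Sketch only: assumes `flags` is a map of flag name to value,
# which this diff does not confirm.
inferenceExtension:
  replicas: 3                           # more than one replica so a standby can take over
  flags:
    has-enable-leader-election: true    # nested under `flags`, matching the dotted path
```

It is worth checking the chart's own `values.yaml` before relying on either layout.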