Commit 142adc0

Update inference perf chart to match inf perf repo.
1 parent 77a5cb6 commit 142adc0

12 files changed

Lines changed: 295 additions & 86 deletions

benchmarking/benchmark-values.yaml

Lines changed: 18 additions & 10 deletions
@@ -1,9 +1,9 @@
 job:
   image:
     repository: quay.io/inference-perf/inference-perf
-    tag: "latest" # Defaults to .Chart.AppVersion
-  serviceAccountName: ""
+    tag: "" # Defaults to .Chart.AppVersion
   nodeSelector: {}
+  serviceAccountName: ""
   # Example resources:
   # resources:
   # requests:
@@ -18,19 +18,27 @@ logLevel: INFO
 
 # A GCS bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/gcs-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/gcs-dataset.json.
+# at /gcsDataset/gcs-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /gcsDataset/gcs-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 gcsPath: ""
 
-# A S3 bucket path that points to the dataset file.
+# An S3 bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/s3-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/s3-dataset.json.
+# at /s3Dataset/s3-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /s3Dataset/s3-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 s3Path: ""
 
-# hfToken optionally creates a secret with the specified token.
-# Can be set using helm install --set hftoken=<token>
-hfToken: ""
+# Optional Token configuration for Hugging Face authentication.
+# hfSecret: Configures a pre-existing Kubernetes Secret.
+# hfToken: Creates a new kubernetes secret with the specified token.
+# If both specified, 'hfSecret' takes precedence over 'hfToken'.
+token:
+  hfSecret:
+    name: "" # The name of the existing Secret (e.g., 'my-hf-secret').
+    key: "" # The key within the Secret that holds the token value (e.g., 'token' or 'hf-token').
+  hfToken: ""
 
 config:
   load:
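
The value layout introduced in this file can be exercised with a small override file. The following is a minimal sketch and not part of the commit: the file name `my-values.yaml`, the bucket path, and the Secret name and key are hypothetical placeholders. It assumes the Secret already exists in the target namespace, and it omits any other `config.data` fields the tool may require.

```yaml
# my-values.yaml (hypothetical override file)

# Bucket path without a scheme; the Job template in this commit prepends gs:// itself.
gcsPath: "my-bucket/datasets/dataset.json"

token:
  hfSecret:
    name: "my-hf-secret"   # assumed pre-existing Secret
    key: "token"           # assumed key inside that Secret
  hfToken: ""              # left empty; hfSecret takes precedence when both are set

config:
  data:
    # Must match the path the init container copies the file to.
    path: /gcsDataset/gcs-dataset.json
```

Note that both token settings sit under the top-level `token:` key, so a command-line override would be addressed as `token.hfToken` or `token.hfSecret.name` rather than as a bare `hfToken`.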

benchmarking/inference-perf/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ name: inference-perf
 description: A Helm chart for running inference-perf benchmarking tool
 type: application
 version: 0.2.0
-appVersion: "0.2.0"
+appVersion: "v0.2.0"

benchmarking/inference-perf/README.md

Lines changed: 37 additions & 4 deletions
@@ -19,11 +19,22 @@ Make sure you have the following tools installed and configured:
 
 Before deployment, navigate to the **`deploy/inference-perf`** directory and edit the **`values.yaml`** file to customize your deployment and the benchmark parameters.
 
-#### Optional Parameters
+#### Optional Token Parameters
+Hugging Face token can be provided either by providing a value (`hfToken`) or by referencing an existing Kubernetes Secret (`hfSecret.Name` and `hfSecret.Key`).
+
+> If both `hfToken` and the `hfSecret` parameters are provided, the chart logic is configured to prioritize the `hfSecret` reference.
 
 | Key | Description | Default |
 | :--- | :--- | :--- |
 | `hfToken` | Hugging Face API token. If provided, a Kubernetes `Secret` named `hf-token-secret` will be created for authentication. | `""` |
+| `hfSecret.name` | The name of a pre-existing Kubernetes Secret that contains a Hugging Face API token. | `""` |
+| `hfSecret.key` | The key within the pre-existing Kubernetes Secret that holds the token value. | `""` |
+---
+
+#### Optional Job Parameters
+
+| Key | Description | Default |
+| :--- | :--- | :--- |
 | `serviceAccountName` | Standard Kubernetes `serviceAccountName`. If not provided, default service account is used. | `""` |
 | `nodeSelector` | Standard Kubernetes `nodeSelector` map to constrain pod placement to nodes with matching labels. | `{}` |
 | `resources` | Standard Kubernetes resource requests and limits for the main `inference-perf` container. | `{}` |
@@ -54,7 +65,29 @@ The identity executing the workload (e.g., the associated Kubernetes Service Acc
 
 | Key | Description | Default |
 | :--- | :--- | :--- |
-| `gcsPath` | A GCS URI pointing to the dataset file (e.g., `gs://my-bucket/dataset.json`). The file will be automatically copied to the running pod during initialization. | `""` |
+| `gcsPath` | A GCS bucket name pointing to the dataset file (e.g., `<my-bucket-path-to-file>/dataset.json`). The file will be automatically copied to the running pod during initialization. The file will be copied to `gcsDataset/dataset.json` | `""` |
+
+---
+
+#### AWS Specific Parameters
+
+This section details the necessary configuration and permissions for using an S3 path to manage your dataset, typical for deployments on AWS EKS.
+
+##### Required IAM Permissions
+
+The identity executing the workload (e.g., the associated Kubernetes Service Account, often configured via IRSA - IAM Roles for Service Accounts) must possess an associated AWS IAM Policy that grants the following S3 Actions on the target S3 bucket for data transfer:
+
+* **S3 Read/Download (Object Access)**
+  * Action: `s3:GetObject` (Required to download the input dataset from S3).
+  * Action: `s3:ListBucket` (Often required to check for the file's existence and list bucket contents).
+
+* **S3 Write/Upload (Object Creation)**
+  * Action: `s3:PutObject` (Required to upload benchmark results back to S3).
+
+
+| Key | Description | Default |
+| :--- | :--- | :--- |
+| `s3Path` | An S3 bucket name pointing to the dataset file (e.g., `<my-bucket-path-to-file>/dataset.json`). The file will be automatically copied to the running pod during initialization. The file will be copied to `s3Dataset/dataset.json` | `""` |
 
 ---
 
@@ -80,6 +113,6 @@ Use the **`helm install`** command from the **`deploy/inference-perf`** director
 ### 4. Cleanup
 
 To remove the benchmark deployment.
-```bash
+```bash
 helm uninstall test
-```
+```

benchmarking/inference-perf/templates/job.yaml

Lines changed: 30 additions & 8 deletions
@@ -23,19 +23,19 @@ spec:
       initContainers:
         - name: fetch-gcs-dataset
           image: google/cloud-sdk:latest
-          command: ["sh", "-c", "gsutil cp {{ .Values.gcsPath }} /dataset/gcs-dataset.json"]
+          command: ["sh", "-c", "gsutil cp gs://{{ .Values.gcsPath }} /gcsDataset/gcs-dataset.json"]
           volumeMounts:
-            - name: dataset-volume
-              mountPath: /dataset
+            - name: gcs-dataset-volume
+              mountPath: /gcsDataset
       {{- end }}
       {{- if .Values.s3Path}}
       initContainers:
         - name: fetch-s3-dataset
           image: google/cloud-sdk:latest
-          command: ["sh", "-c", "aws s3 cp s3://{{ .Values.s3Path }} /dataset/s3-dataset.json"]
+          command: ["sh", "-c", "aws s3 cp s3://{{ .Values.s3Path }} /s3Dataset/s3-dataset.json"]
           volumeMounts:
-            - name: dataset-volume
-              mountPath: /dataset
+            - name: s3-dataset-volume
+              mountPath: /s3dataset
       {{- end }}
       containers:
         - name: inference-perf-container
@@ -47,20 +47,42 @@ spec:
             - "--log-level"
             - {{ .Values.logLevel }}
           env:
-          {{- if .Values.hfToken }}
+          {{- if and .Values.token.hfSecret.name .Values.token.hfSecret.key }}
+            - name: HF_TOKEN
+              valueFrom:
+                secretKeyRef:
+                  name: {{ .Values.token.hfSecret.name }}
+                  key: {{ .Values.token.hfSecret.key }}
+          {{- else if .Values.token.hfToken }}
             - name: HF_TOKEN
               valueFrom:
                 secretKeyRef:
                   name: {{ include "inference-perf.hfSecret" . }}
                   key: {{ include "inference-perf.hfKey" . }}
-          {{- end }}
+          {{- end }}
           volumeMounts:
             - name: config-volume
               mountPath: {{ include "inference-perf.configMount" . }}
               readOnly: true
+          {{- if .Values.gcsPath}}
+            - name: gcs-dataset-volume
+              mountPath: /gcsDataset
+          {{- end }}
+          {{- if .Values.s3Path}}
+            - name: s3-dataset-volume
+              mountPath: /s3Dataset
+          {{- end }}
           resources:
             {{- toYaml .Values.job.resources | nindent 12 }}
       volumes:
         - name: config-volume
           configMap:
             name: {{ include "inference-perf.fullname" . }}-config
+      {{- if .Values.gcsPath}}
+        - name: gcs-dataset-volume
+          emptyDir: {}
+      {{- end }}
+      {{- if .Values.s3Path}}
+        - name: s3-dataset-volume
+          emptyDir: {}
+      {{- end }}

benchmarking/prefix-cache-aware/high-cache-values.yaml

Lines changed: 17 additions & 11 deletions
@@ -2,9 +2,9 @@
 job:
   image:
     repository: quay.io/inference-perf/inference-perf
-    tag: "0.2.0" # Defaults to .Chart.AppVersion
-  serviceAccountName: ""
+    tag: "" # Defaults to .Chart.AppVersion
   nodeSelector: {}
+  serviceAccountName: ""
   # Example resources:
   # resources:
   # requests:
@@ -19,19 +19,27 @@ logLevel: INFO
 
 # A GCS bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/dataset.json.
+# at /gcsDataset/gcs-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /gcsDataset/gcs-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 gcsPath: ""
 
 # An S3 bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/s3-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/s3-dataset.json.
+# at /s3Dataset/s3-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /s3Dataset/s3-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 s3Path: ""
 
-# hfToken optionally creates a secret with the specified token.
-# Can be set using helm install --set hftoken=<token>
-hfToken: ""
+# Optional Token configuration for Hugging Face authentication.
+# hfSecret: Configures a pre-existing Kubernetes Secret.
+# hfToken: Creates a new kubernetes secret with the specified token.
+# If both specified, 'hfSecret' takes precedence over 'hfToken'.
+token:
+  hfSecret:
+    name: "" # The name of the existing Secret (e.g., 'my-hf-secret').
+    key: "" # The key within the Secret that holds the token value (e.g., 'token' or 'hf-token').
+  hfToken: ""
 
 config:
   load:
@@ -44,8 +52,6 @@ config:
         duration: 30
       - rate: 500
         duration: 30
-      - rate: 700
-        duration: 30
     worker_max_concurrency: 1000
   api:
     type: completion

benchmarking/prefix-cache-aware/low-cache-values.yaml

Lines changed: 17 additions & 11 deletions
@@ -2,9 +2,9 @@
 job:
   image:
     repository: quay.io/inference-perf/inference-perf
-    tag: "0.2.0" # Defaults to .Chart.AppVersion
-  serviceAccountName: ""
+    tag: "" # Defaults to .Chart.AppVersion
   nodeSelector: {}
+  serviceAccountName: ""
   # Example resources:
   # resources:
   # requests:
@@ -19,19 +19,27 @@ logLevel: INFO
 
 # A GCS bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/dataset.json.
+# at /gcsDataset/gcs-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /gcsDataset/gcs-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 gcsPath: ""
 
 # An S3 bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/s3-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/s3-dataset.json.
+# at /s3Dataset/s3-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /s3Dataset/s3-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 s3Path: ""
 
-# hfToken optionally creates a secret with the specified token.
-# Can be set using helm install --set hftoken=<token>
-hfToken: ""
+# Optional Token configuration for Hugging Face authentication.
+# hfSecret: Configures a pre-existing Kubernetes Secret.
+# hfToken: Creates a new kubernetes secret with the specified token.
+# If both specified, 'hfSecret' takes precedence over 'hfToken'.
+token:
+  hfSecret:
+    name: "" # The name of the existing Secret (e.g., 'my-hf-secret').
+    key: "" # The key within the Secret that holds the token value (e.g., 'token' or 'hf-token').
+  hfToken: ""
 
 config:
   load:
@@ -44,8 +52,6 @@ config:
         duration: 30
       - rate: 500
         duration: 30
-      - rate: 700
-        duration: 30
     worker_max_concurrency: 1000
   api:
     type: completion

benchmarking/single-workload/decode-heavy-values.yaml

Lines changed: 17 additions & 9 deletions
@@ -2,9 +2,9 @@
 job:
   image:
     repository: quay.io/inference-perf/inference-perf
-    tag: "0.2.0" # Defaults to .Chart.AppVersion
-  serviceAccountName: ""
+    tag: "" # Defaults to .Chart.AppVersion
   nodeSelector: {}
+  serviceAccountName: ""
   # Example resources:
   # resources:
   # requests:
@@ -19,19 +19,27 @@ logLevel: INFO
 
 # A GCS bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/dataset.json.
+# at /gcsDataset/gcs-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /gcsDataset/gcs-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 gcsPath: ""
 
 # An S3 bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/s3-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/s3-dataset.json.
+# at /s3Dataset/s3-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /s3Dataset/s3-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 s3Path: ""
 
-# hfToken optionally creates a secret with the specified token.
-# Can be set using helm install --set hftoken=<token>
-hfToken: ""
+# Optional Token configuration for Hugging Face authentication.
+# hfSecret: Configures a pre-existing Kubernetes Secret.
+# hfToken: Creates a new kubernetes secret with the specified token.
+# If both specified, 'hfSecret' takes precedence over 'hfToken'.
+token:
+  hfSecret:
+    name: "" # The name of the existing Secret (e.g., 'my-hf-secret').
+    key: "" # The key within the Secret that holds the token value (e.g., 'token' or 'hf-token').
+  hfToken: ""
 
 config:
   load:

benchmarking/single-workload/prefill-heavy-values.yaml

Lines changed: 17 additions & 9 deletions
@@ -2,9 +2,9 @@
 job:
   image:
     repository: quay.io/inference-perf/inference-perf
-    tag: "0.2.0" # Defaults to .Chart.AppVersion
-  serviceAccountName: ""
+    tag: "" # Defaults to .Chart.AppVersion
   nodeSelector: {}
+  serviceAccountName: ""
   # Example resources:
   # resources:
   # requests:
@@ -19,19 +19,27 @@ logLevel: INFO
 
 # A GCS bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/dataset.json.
+# at /gcsDataset/gcs-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /gcsDataset/gcs-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 gcsPath: ""
 
 # An S3 bucket path that points to the dataset file.
 # The file will be copied from this path to the local file system
-# at /dataset/s3-dataset.json for use during the run.
-# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /dataset/s3-dataset.json.
+# at /s3Dataset/s3-dataset.json for use during the run.
+# NOTE: For this dataset to be used, config.data.path must also be explicitly set to /s3Dataset/s3-dataset.json.
+# Format: bucket-name/folder/to/dataset/file
 s3Path: ""
 
-# hfToken optionally creates a secret with the specified token.
-# Can be set using helm install --set hftoken=<token>
-hfToken: ""
+# Optional Token configuration for Hugging Face authentication.
+# hfSecret: Configures a pre-existing Kubernetes Secret.
+# hfToken: Creates a new kubernetes secret with the specified token.
+# If both specified, 'hfSecret' takes precedence over 'hfToken'.
+token:
+  hfSecret:
+    name: "" # The name of the existing Secret (e.g., 'my-hf-secret').
+    key: "" # The key within the Secret that holds the token value (e.g., 'token' or 'hf-token').
+  hfToken: ""
 
 config:
   load:
