benchmarking/inference-perf/README.md
37 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -19,11 +19,22 @@ Make sure you have the following tools installed and configured:
19
19
20
20
Before deployment, navigate to the **`deploy/inference-perf`** directory and edit the **`values.yaml`** file to customize your deployment and the benchmark parameters.
21
21
22
-
#### Optional Parameters
22
+
#### Optional Token Parameters
23
+
A Hugging Face token can be supplied either directly as a value (`hfToken`) or by referencing an existing Kubernetes Secret (`hfSecret.name` and `hfSecret.key`).
24
+
25
+
> If both `hfToken` and the `hfSecret` parameters are provided, the chart prioritizes the `hfSecret` reference.
23
26
24
27
| Key | Description | Default |
25
28
| :--- | :--- | :--- |
26
29
|`hfToken`| Hugging Face API token. If provided, a Kubernetes `Secret` named `hf-token-secret` will be created for authentication. |`""`|
30
+
|`hfSecret.name`| The name of a pre-existing Kubernetes Secret that contains a Hugging Face API token. |`""`|
31
+
|`hfSecret.key`| The key within the pre-existing Kubernetes Secret that holds the token value. |`""`|
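
As an illustration of the Secret-reference option in the table above, a `values.yaml` fragment might look like the following. This is a minimal sketch: the Secret name `my-hf-secret` and the key `token` are hypothetical placeholders, and it assumes the dotted keys `hfSecret.name` and `hfSecret.key` map to a nested `hfSecret` block, as is conventional in Helm.

```yaml
# values.yaml (sketch). "my-hf-secret" and "token" are hypothetical;
# substitute the name and key of your own pre-existing Secret.
hfToken: ""            # left empty here; hfSecret is prioritized if both are set
hfSecret:
  name: my-hf-secret   # name of the pre-existing Kubernetes Secret
  key: token           # key within that Secret that holds the token value
```

Referencing an existing Secret keeps the token value out of `values.yaml` itself.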
32
+
---
33
+
34
+
#### Optional Job Parameters
35
+
36
+
| Key | Description | Default |
37
+
| :--- | :--- | :--- |
27
38
|`serviceAccountName`| Standard Kubernetes `serviceAccountName`. If not provided, the default service account is used. |`""`|
28
39
|`nodeSelector`| Standard Kubernetes `nodeSelector` map to constrain pod placement to nodes with matching labels. |`{}`|
29
40
|`resources`| Standard Kubernetes resource requests and limits for the main `inference-perf` container. |`{}`|
@@ -54,7 +65,29 @@ The identity executing the workload (e.g., the associated Kubernetes Service Acc
54
65
55
66
| Key | Description | Default |
56
67
| :--- | :--- | :--- |
57
-
| `gcsPath` | A GCS URI pointing to the dataset file (e.g., `gs://my-bucket/dataset.json`). The file will be automatically copied to the running pod during initialization. | `""` |
68
+
| `gcsPath` | A GCS bucket path pointing to the dataset file (e.g., `<my-bucket-path-to-file>/dataset.json`). During initialization, the file is automatically copied to the running pod at `gcsDataset/dataset.json`. | `""` |
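
For example, a `values.yaml` entry for a GCS-hosted dataset might look like the sketch below. It reuses the placeholder from the table rather than a real path, and it assumes `gcsPath` sits at the top level of `values.yaml`, as the key column suggests; check the chart's own `values.yaml` for the actual nesting.

```yaml
# values.yaml (sketch). The value is the placeholder from the table above;
# replace it with the path to your own dataset file.
# Assumption: gcsPath is a top-level key in this chart's values.yaml.
gcsPath: "<my-bucket-path-to-file>/dataset.json"
```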
69
+
70
+
---
71
+
72
+
#### AWS Specific Parameters
73
+
74
+
This section details the configuration and permissions needed to load your dataset from an S3 path, as is typical for deployments on Amazon EKS.
75
+
76
+
##### Required IAM Permissions
77
+
78
+
The identity executing the workload (e.g., the associated Kubernetes Service Account, often configured via IRSA, IAM Roles for Service Accounts) must have an associated AWS IAM policy that grants the following S3 actions on the target S3 bucket for data transfer:
79
+
80
+
* **S3 Read/Download (Object Access)**
81
+
* Action: `s3:GetObject` (Required to download the input dataset from S3).
82
+
* Action: `s3:ListBucket` (Often required to check for the file's existence and list bucket contents).
83
+
84
+
* **S3 Write/Upload (Object Creation)**
85
+
* Action: `s3:PutObject` (Required to upload benchmark results back to S3).
86
+
87
+
88
+
| Key | Description | Default |
89
+
| :--- | :--- | :--- |
90
+
| `s3Path` | An S3 bucket path pointing to the dataset file (e.g., `<my-bucket-path-to-file>/dataset.json`). During initialization, the file is automatically copied to the running pod at `s3Dataset/dataset.json`. | `""` |
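
For example, a `values.yaml` entry for an S3-hosted dataset might look like the sketch below. It reuses the placeholder from the table rather than a real path, and it assumes `s3Path` sits at the top level of `values.yaml`, as the key column suggests; check the chart's own `values.yaml` for the actual nesting.

```yaml
# values.yaml (sketch). The value is the placeholder from the table above;
# replace it with the path to your own dataset file.
# Assumption: s3Path is a top-level key in this chart's values.yaml.
s3Path: "<my-bucket-path-to-file>/dataset.json"
```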
58
91
59
92
---
60
93
@@ -80,6 +113,6 @@ Use the **`helm install`** command from the **`deploy/inference-perf`** director