Commit 5291e0b

docs: add file permissions section to Artifact walkthrough page (#14997)
Signed-off-by: Elliot Gunton <[email protected]>
1 parent: c9cba3b

29 files changed: +365 −220 lines

api/jsonschema/schema.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

api/openapi-spec/swagger.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

docs/executor_swagger.md

Lines changed: 2 additions & 2 deletions

@@ -229,7 +229,7 @@ It will marshall back to string - marshalling is not symmetric.
 | globalName | string| `string` | | | GlobalName exports an output artifact to the global scope, making it available as</br>'{{workflow.outputs.artifacts.XXXX}} and in workflow.status.outputs.artifacts | |
 | hdfs | [HDFSArtifact](#h-d-f-s-artifact)| `HDFSArtifact` | | | | |
 | http | [HTTPArtifact](#http-artifact)| `HTTPArtifact` | | | | |
-| mode | int32 (formatted integer)| `int32` | | | mode bits to use on this file, must be a value between 0 and 0777</br>set when loading input artifacts. | |
+| mode | int32 (formatted integer)| `int32` | | | mode bits to use on this file, must be a value between 0 and 0777.</br>Set when loading input artifacts. It is recommended to set the mode value</br>to ensure the artifact has the expected permissions in your container. | |
 | name | string| `string` | | | name of the artifact. must be unique within a template's inputs/outputs. | |
 | optional | boolean| `bool` | | | Make Artifacts optional, if Artifacts doesn't generate or exist | |
 | oss | [OSSArtifact](#o-s-s-artifact)| `OSSArtifact` | | | | |

@@ -330,7 +330,7 @@ of a single workflow step, which the executor will use as a default location to
 | globalName | string| `string` | | | GlobalName exports an output artifact to the global scope, making it available as</br>'{{workflow.outputs.artifacts.XXXX}} and in workflow.status.outputs.artifacts | |
 | hdfs | [HDFSArtifact](#h-d-f-s-artifact)| `HDFSArtifact` | | | | |
 | http | [HTTPArtifact](#http-artifact)| `HTTPArtifact` | | | | |
-| mode | int32 (formatted integer)| `int32` | | | mode bits to use on this file, must be a value between 0 and 0777</br>set when loading input artifacts. | |
+| mode | int32 (formatted integer)| `int32` | | | mode bits to use on this file, must be a value between 0 and 0777.</br>Set when loading input artifacts. It is recommended to set the mode value</br>to ensure the artifact has the expected permissions in your container. | |
 | name | string| `string` | | | name of the artifact. must be unique within a template's inputs/outputs. | |
 | optional | boolean| `bool` | | | Make Artifacts optional, if Artifacts doesn't generate or exist | |
 | oss | [OSSArtifact](#o-s-s-artifact)| `OSSArtifact` | | | | |

docs/fields.md

Lines changed: 2 additions & 2 deletions

@@ -2288,7 +2288,7 @@ Artifact indicates an artifact to place at a specified path
 |`globalName`|`string`|GlobalName exports an output artifact to the global scope, making it available as '{{io.argoproj.workflow.v1alpha1.outputs.artifacts.XXXX}} and in workflow.status.outputs.artifacts|
 |`hdfs`|[`HDFSArtifact`](#hdfsartifact)|HDFS contains HDFS artifact location details|
 |`http`|[`HTTPArtifact`](#httpartifact)|HTTP contains HTTP artifact location details|
-|`mode`|`integer`|mode bits to use on this file, must be a value between 0 and 0777 set when loading input artifacts.|
+|`mode`|`integer`|mode bits to use on this file, must be a value between 0 and 0777. Set when loading input artifacts. It is recommended to set the mode value to ensure the artifact has the expected permissions in your container.|
 |`name`|`string`|name of the artifact. must be unique within a template's inputs/outputs.|
 |`optional`|`boolean`|Make Artifacts optional, if Artifacts doesn't generate or exist|
 |`oss`|[`OSSArtifact`](#ossartifact)|OSS contains OSS artifact location details|

@@ -4798,7 +4798,7 @@ ArtifactPaths expands a step from a collection of artifacts
 |`globalName`|`string`|GlobalName exports an output artifact to the global scope, making it available as '{{io.argoproj.workflow.v1alpha1.outputs.artifacts.XXXX}} and in workflow.status.outputs.artifacts|
 |`hdfs`|[`HDFSArtifact`](#hdfsartifact)|HDFS contains HDFS artifact location details|
 |`http`|[`HTTPArtifact`](#httpartifact)|HTTP contains HTTP artifact location details|
-|`mode`|`integer`|mode bits to use on this file, must be a value between 0 and 0777 set when loading input artifacts.|
+|`mode`|`integer`|mode bits to use on this file, must be a value between 0 and 0777. Set when loading input artifacts. It is recommended to set the mode value to ensure the artifact has the expected permissions in your container.|
 |`name`|`string`|name of the artifact. must be unique within a template's inputs/outputs.|
 |`optional`|`boolean`|Make Artifacts optional, if Artifacts doesn't generate or exist|
 |`oss`|[`OSSArtifact`](#ossartifact)|OSS contains OSS artifact location details|
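Editor's note, not part of the commit: the field description above constrains `mode` to the range 0 through 0777 (octal). A minimal Python sketch of that check, using a hypothetical `validate_mode` helper (Argo performs its own validation server-side):

```python
def validate_mode(mode: int) -> int:
    # `mode` must be between 0 and 0777 octal (0-511 decimal), per the field docs.
    if not 0 <= mode <= 0o777:
        raise ValueError(f"mode {mode:#o} is outside the range 0-0777")
    return mode

print(validate_mode(0o755))  # the octal literal 0o755 is 493 in decimal
```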

docs/walk-through/artifacts.md

Lines changed: 71 additions & 18 deletions

@@ -1,11 +1,15 @@
 # Artifacts

 !!! Note
-    You will need to [configure an artifact repository](../configure-artifact-repository.md) to run this example.
+    You will need to [configure an artifact repository](../configure-artifact-repository.md) to run artifact examples.

-When running workflows, it is very common to have steps that generate or consume artifacts. Often, the output artifacts of one step may be used as input artifacts to a subsequent step.
+## Basic Example

-The below workflow spec consists of two steps that run in sequence. The first step named `generate-artifact` will generate an artifact using the `hello-world-to-file` template that will be consumed by the second step named `print-message-from-file` that then consumes the generated artifact.
+When running workflows, it is very common to have steps that generate or consume artifacts.
+Often, the output artifacts of one step may be used as input artifacts to a subsequent step.
+
+The below workflow spec consists of two steps that run in sequence.
+The first step named `generate-artifact` will generate an artifact using the `hello-world-to-file` template that will be consumed by the second step named `print-message-from-file` that then consumes the generated artifact.

 ```yaml
 apiVersion: argoproj.io/v1alpha1
@@ -53,13 +57,19 @@ spec:
       args: ["cat /tmp/message"]
 ```

-The `hello-world-to-file` template uses the `echo` command to generate a file named `/tmp/hello-world.txt`. It then `outputs` this file as an artifact named `hello-art`. In general, the artifact's `path` may be a directory rather than just a file. The `print-message-from-file` template takes an input artifact named `message`, unpacks it at the `path` named `/tmp/message` and then prints the contents of `/tmp/message` using the `cat` command.
-The `artifact-example` template passes the `hello-art` artifact generated as an output of the `generate-artifact` step as the `message` input artifact to the `print-message-from-file` step. DAG templates use the tasks prefix to refer to another task, for example `{{tasks.generate-artifact.outputs.artifacts.hello-art}}`.
+In this Workflow:
+
+- The `hello-world-to-file` template uses the `echo` command to generate a file named `/tmp/hello-world.txt`.
+- The `hello-world-to-file` template then `outputs` this file as an artifact named `hello-art`.
+  In general, the artifact's `path` may be a directory rather than just a file.
+- The `print-message-from-file` template takes an input artifact named `message`, unpacks it at the `path` named `/tmp/message` and then prints the contents of `/tmp/message` using the `cat` command.
+- The `artifact-example` template passes the `hello-art` artifact generated as an output of the `generate-artifact` step as the `message` input artifact to the `print-message-from-file` step.
+  DAG templates use the tasks prefix to refer to another task, for example `{{tasks.generate-artifact.outputs.artifacts.hello-art}}`.

 Optionally, for large artifacts, you can set `podSpecPatch` in the workflow spec to increase the resource request for the init container and avoid any Out of memory issues.

 ```yaml
-<... snipped ...>
+# <... snipped ...>
 - name: large-artifact
   # below patch gets merged with the actual pod spec and increases the memory
   # request of the init container.
@@ -78,13 +88,19 @@ Optionally, for large artifacts, you can set `podSpecPatch` in the workflow spec
       image: alpine:latest
       command: [sh, -c]
       args: ["cat /tmp/large-file"]
-<... snipped ...>
+# <... snipped ...>
 ```

-Artifacts are packaged as Tarballs and gzipped by default. You may customize this behavior by specifying an archive strategy, using the `archive` field. For example:
+## Setting File Behavior
+
+### Archive Strategy
+
+Artifacts are packaged as Tarballs and gzipped by default.
+You may customize this behavior by specifying an archive strategy, using the `archive` field.
+For example:

 ```yaml
-<... snipped ...>
+# <... snipped ...>
 outputs:
   artifacts:
     # default behavior - tar+gzip default compression.
@@ -108,14 +124,43 @@ Artifacts are packaged as Tarballs and gzipped by default. You may customize thi
       tar:
         # no compression (also accepts the standard gzip 1 to 9 values)
         compressionLevel: 0
-<... snipped ...>
+# <... snipped ...>
 ```
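Editor's note, not part of the commit: the `tar` + `compressionLevel` options above correspond to ordinary tar+gzip packaging, where level 0 stores without compression and 1-9 are the standard gzip levels. A small Python sketch of that behavior:

```python
import io
import tarfile

def tar_gzip(name: str, data: bytes, compresslevel: int) -> bytes:
    # Package a single file as tar+gzip, mirroring Argo's default archive
    # strategy; compressionLevel 0 means no compression, 1-9 are gzip levels.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz", compresslevel=compresslevel) as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

payload = b"hello world\n" * 1000
stored = tar_gzip("message.txt", payload, compresslevel=0)  # no compression
packed = tar_gzip("message.txt", payload, compresslevel=9)  # max compression
```

Level 0 is useful when the artifact is already compressed (images, archives) and you only want the tar container format.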

+### File Permissions (`mode`)
+
+It is good practice to specify the `mode` of the input file, to ensure your container code can always interact with it (reading/writing/executing) as expected.
+[This article](https://www.redhat.com/en/blog/linux-file-permissions-explained) explains file permissions and octal values.
+
+For example, to allow the user to execute `/bin/kubectl`, we set `mode: 0755`.
+
+```yaml
+# <... snipped ...>
+templates:
+  - name: executable-artifact
+    inputs:
+      artifacts:
+        # Download kubectl 1.8.0 and place it at /bin/kubectl
+        - name: kubectl
+          path: /bin/kubectl
+          mode: 0755
+          http:
+            url: https://storage.googleapis.com/kubernetes-release/release/v1.8.0/bin/linux/amd64/kubectl
+# <... snipped ...>
+```
+
+!!! Note
+    `0755` is a YAML-formatted octal value (`0o755` is also allowed in YAML spec 1.2).
+    When the Kubernetes API receives and validates the octal `mode` value, it will be stored in decimal.
+    Therefore you will see `mode: 493` in YAML on the cluster, and therefore also in the Argo UI.
+    The file permissions will still be correct as 493 in decimal is 0755 in octal.
 ## Artifact Garbage Collection

 As of version 3.4 you can configure your Workflow to automatically delete Artifacts that you don't need (visit [artifact repository capability](../configure-artifact-repository.md) for the current supported store engine).

-Artifacts can be deleted `OnWorkflowCompletion` or `OnWorkflowDeletion`. You can specify your Garbage Collection strategy on both the Workflow level and the Artifact level, so for example, you may have temporary artifacts that can be deleted right away but a final output that should be persisted:
+Artifacts can be deleted `OnWorkflowCompletion` or `OnWorkflowDeletion`.
+You can specify your Garbage Collection strategy on both the Workflow level and the Artifact level, so for example, you may have temporary artifacts that can be deleted right away but a final output that should be persisted:

 ```yaml
 apiVersion: argoproj.io/v1alpha1
@@ -153,12 +198,16 @@ spec:

 ### Artifact Naming

-Consider parameterizing your S3 keys by {{workflow.uid}}, etc (as shown in the example above) if there's a possibility that you could have concurrent Workflows of the same spec. This would be to avoid a scenario in which the artifact from one Workflow is being deleted while the same S3 key is being generated for a different Workflow.
+Consider parameterizing your S3 keys by {{workflow.uid}}, etc (as shown in the example above) if there's a possibility that you could have concurrent Workflows of the same spec.
+This would be to avoid a scenario in which the artifact from one Workflow is being deleted while the same S3 key is being generated for a different Workflow.

-In the case of having a whole directory as S3 key, please pay attention to the key value. Here are two examples:
+In the case of having a whole directory as S3 key, please pay attention to the key value.
+Here are two examples:

-- (A) When changing the default archive option to none, it is important that it ends with a "/". Otherwise, the directory will be created in S3 but the GC pod won't be able to remove it.
-- (B) When keeping the default archive option to `.tgz`, in this case, it is important that it does NOT end with "/". Otherwise, Argo will fail to create the archive file.
+- (A) When changing the default archive option to none, it is important that it ends with a "/".
+  Otherwise, the directory will be created in S3 but the GC pod won't be able to remove it.
+- (B) When keeping the default archive option to `.tgz`, in this case, it is important that it does NOT end with "/".
+  Otherwise, Argo will fail to create the archive file.

 Example (A) without packaging as `.tgz`
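Editor's note, not part of the commit: the two key-naming rules above can be sketched as a tiny helper. `artifact_key` and the `my-prefix` bucket prefix are hypothetical, for illustration only:

```python
def artifact_key(workflow_uid: str, name: str, archived: bool) -> str:
    # Parameterize keys by the workflow UID so concurrent Workflows of the
    # same spec never garbage-collect each other's artifacts, then follow
    # the trailing-slash rules: (A) directory keys (archive: none) must end
    # with "/"; (B) .tgz keys must NOT end with "/".
    base = f"my-prefix/{workflow_uid}/{name}"
    return f"{base}.tgz" if archived else f"{base}/"

print(artifact_key("abc-123", "on-completion", archived=True))
print(artifact_key("abc-123", "on-completion", archived=False))
```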

@@ -230,7 +279,9 @@ spec:

 ### Service Accounts and Annotations

-Does your S3 bucket require you to run with a special Service Account or IAM Role Annotation? You can either use the same ones you use for creating artifacts or generate new ones that are specific for deletion permission. Generally users will probably just have a single Service Account or IAM Role to apply to all artifacts for the Workflow, but you can also customize on the artifact level if you need that:
+Does your S3 bucket require you to run with a special Service Account or IAM Role Annotation?
+You can either use the same ones you use for creating artifacts or generate new ones that are specific for deletion permission.
+Generally users will probably just have a single Service Account or IAM Role to apply to all artifacts for the Workflow, but you can also customize on the artifact level if you need that:

 ```yaml
 apiVersion: argoproj.io/v1alpha1
@@ -307,13 +358,15 @@ rules:
   - patch
 ```

-This is the `artifactgc` role if you installed using one of the quick-start manifest files. If you installed with the `install.yaml` file for the release then the same permissions are in the `argo-cluster-role`.
+This is the `artifactgc` role if you installed using one of the quick-start manifest files.
+If you installed with the `install.yaml` file for the release then the same permissions are in the `argo-cluster-role`.

 If you don't use your own `ServiceAccount` and are just using `default` ServiceAccount, then the role needs a RoleBinding or ClusterRoleBinding to `default` ServiceAccount.

 ### What happens if Garbage Collection fails?

-If deletion of the artifact fails for some reason (other than the Artifact already having been deleted which is not considered a failure), the Workflow's Status will be marked with a new Condition to indicate "Artifact GC Failure", a Kubernetes Event will be issued, and the Argo Server UI will also indicate the failure. For additional debugging, the user should find 1 or more Pods named `<wfName>-artgc-*` and can view the logs.
+If deletion of the artifact fails for some reason (other than the Artifact already having been deleted which is not considered a failure), the Workflow's Status will be marked with a new Condition to indicate "Artifact GC Failure", a Kubernetes Event will be issued, and the Argo Server UI will also indicate the failure.
+For additional debugging, the user should find 1 or more Pods named `<wfName>-artgc-*` and can view the logs.

 If the user needs to delete the Workflow and its child CRD objects, they will need to patch the Workflow to remove the finalizer preventing the deletion:
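Editor's note, not part of the commit: the diff ends before showing the patch itself. The finalizer removal is a JSON Patch `remove` operation; the exact `kubectl` invocation below is an assumption, not taken from this commit:

```python
import json

# Assumed command shape (not from this diff):
#   kubectl patch workflow <name> --type=json -p "$PATCH"
patch = json.dumps([{"op": "remove", "path": "/metadata/finalizers"}])
print(patch)
```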
