@kMutagene Can you explain the Sample input example in more detail? Are you talking about the ARC scaffold spec? It would then still be an example for the RO-Crate being stricter than the ARC scaffold, right?
-
Another case, the PropertyValue: In RO-Crate, both the Consider the following example table.
With the current RO-Crate profile, the Characteristic can't be properly depicted for the sample "MyIn2", as the value is missing in the tabular structure, but a value is required in RO-Crate. So either we set a random value (which I think is semantically problematic here) or we leave the Characteristic out in RO-Crate. The second option would be problematic if we regard RO-Crate as a true exchange format, as the information gets lost on write and read. My suggestion here would be to set the
-
JFC, here is the now closed epic for the ISA RO-Crate profiles: nfdi4plants/ARC-specification#135. We are doing the same for the workflow/run part here: nfdi4plants/ARC-specification#150
-
Workflow and Run ARC Scaffold/RO-Crate-harmonization
In general, the discussion was about how and where to depict the information from the Workflow Run Crates in the ARC Scaffold. Here we named two different types of information:
Conceptual explanations and ideas for representing these two kinds of metadata in the ARC Scaffold follow:
Documentary metadata
The
Actionable metadata
The
But now we have the situation that the ARC RO-Crate will not only be produced and used by us, and other user groups (like Galaxy) prefer workflow languages other than CWL (Galaxy Workflows, Nextflow etc.). The Workflow-Run-Crate (and by extension our profile) is unbiased towards the choice of workflow language. Therefore, there is a strong incentive to also handle those cases where an ARC RO-Crate contains non-CWL workflows. This includes mapping the ARC RO-Crate to the ARC Scaffold, having the workflows runnable with methods native to the ARC Scaffold, and mapping back to the ARC RO-Crate in a way that does not enforce usage of CWL in the RO-Crate representation. To this effect, here's the concept of wrapper CWL:
Wrapper CWL and other workflow languages
Wrapper CWLs are minimal CWL files which are used to wrap other workflow languages in the ARC Scaffold representation. They are assimilated into an
These wrapper CWLs are NOT meant as generic wrappers for a group of wrapped workflows. Instead, each wrapper CWL is specific to the workflow it wraps, i.e. it contains the specific parameter names.
Example1
Let's say we have the following reduced
{
"@id": "Galaxy-Workflow-Hello_World.ga",
"@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
"programmingLanguage": {"@id": "https://w3id.org/workflowhub/workflow-ro-crate#galaxy"},
"input": [
{"@id": "#simple_input"}
],
"output": [
{"@id": "#reversed"}
]
},
In the ARC Scaffold representation, we would map the
flowchart TD
subgraph "Workflow Folder"
ga1[Galaxy-Workflow-Hello_World.ga]
w1[workflow.cwl]
end
w1 --wraps--> ga1
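As a rough illustration of the "specific parameter names" point, the sketch below derives a wrapper description from the ComputationalWorkflow entity of Example1. It is only an assumption about what a wrapper could contain: `make_wrapper_cwl` is a hypothetical helper, the `Any` types are placeholders, and the result has not been validated against a CWL runner or the ARC specification.

```python
import json


def make_wrapper_cwl(workflow_entity: dict) -> dict:
    """Build a minimal, workflow-specific wrapper description as a plain dict.

    Hypothetical helper: the real wrapper layout is not defined here.
    """

    def param_name(ref: dict) -> str:
        # "#simple_input" -> "simple_input"
        return ref["@id"].lstrip("#")

    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        # Record which foreign workflow file this wrapper belongs to.
        "doc": f"Wrapper for {workflow_entity['@id']}",
        # Placeholder types; real ones would come from the FormalParameter entities.
        "inputs": {param_name(r): {"type": "Any"} for r in workflow_entity.get("input", [])},
        "outputs": {param_name(r): {"type": "Any"} for r in workflow_entity.get("output", [])},
    }


entity = {
    "@id": "Galaxy-Workflow-Hello_World.ga",
    "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    "input": [{"@id": "#simple_input"}],
    "output": [{"@id": "#reversed"}],
}

wrapper = make_wrapper_cwl(entity)
print(json.dumps(wrapper, indent=2))
```

Because the parameter names are copied from the wrapped workflow, each foreign workflow gets its own wrapper rather than sharing a generic one.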
Example2: Orchestrated foreign workflows
Now, if the situation in the RO-Crate is more complex and the foreign workflows orchestrate each other, we might end up with the following ARC Scaffold:
flowchart TD
subgraph Workflow1
ga1[step1.ga]
w1[workflow.cwl]
end
subgraph Workflow2
ga2[step2.ga]
w2[workflow.cwl]
end
subgraph Run
ga3[run.ga]
y[run.yml]
r[run.cwl]
end
ga3 --"orchestrates"--> ga2
w1 --wraps--> ga1
w2 --wraps--> ga2
r --wraps--> ga3
ga3 --"orchestrates"--> ga1
Notice three things:
Handling process and workflow inputs
Another part of the
If we take a look at the (reduced) example from the Workflow Run Crate specs again and already add the double typing with
{
"@id": "#wfrun-5a5970ab-4375-444d-9a87-a764a66e3a47",
"@type": ["CreateAction", "LabProcess"],
"name": "Galaxy workflow run 5a5970ab-4375-444d-9a87-a764a66e3a47",
"instrument": {"@id": "Galaxy-Workflow-Hello_World.ga"},
"object": [
{"@id": "#verbose-pv"},
{"@id": "MyInputFile.csv"}
]
},
{
"@id": "MyInputFile.csv",
"@type": "File"
},
{
"@id": "#verbose-pv",
"@type": "PropertyValue",
"exampleOfWork": {"@id": "#verbose-param"},
"name": "verbose",
"value": "True"
}
Notice the
(Creating this example I noticed a possible issue with this approach. In the original example, the
TLDR
-
This is a necessary discussion as part of the ongoing efforts to implement the 'other side of the coin' representation of the ARC - the ARC RO-Crate.
Preface on mandatory information
The ARC ecosystem was initially created and grown with the user-facing ARC scaffold representation. To ease the process of manually creating an ever-evolving FAIR digital object, the specification for this ARC representation only specifies the mandatory structure of the information, without formulating requirements for the information itself. Any further requirements - e.g. ARCs in which information must satisfy domain-specific requirements - are formulated via the pull-model of ARC validation with regard to a target.
Now, the other side of the coin - the ARC RO-crate - is an increasing focus of development, as it will facilitate machine interaction with ARCs. We provide both sides of the coin by attaching the ARC RO-crate metadata as a json file to the ARC scaffold hosted on the DataHUB. However, there are differences in the information these formats require that need to be addressed.
Mandatory fields for being a valid RO-crate
In the JSON-LD world of RO-crates, there are some mandatory properties that MUST be present. Aside from technicalities, the relevant ones are:
• name
• description
• datePublished
• license
We must either find 1-to-1 mappings in the ARC scaffold for these or provide sensible default values. This is done for all other than license IIRC, and here is the respective discussion
Examples for maybe being too strict
agent for each process. This is not the case for the ARC spec and may be unnecessary.
Examples for maybe being too lax
I encourage others to have a look at both specs and give input on where to be more strict and where to be more lax:
cc @HLWeil @floWetzels @muehlhaus @Brilator
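To make the "1-to-1 mapping or sensible default" idea for the four mandatory properties (name, description, datePublished, license) more concrete, here is a minimal sketch. The helper names and the default values are hypothetical and only illustrate the mechanism; they are not taken from the ARC or RO-Crate tooling.

```python
from datetime import date

# The four mandatory properties named in this post.
REQUIRED = ("name", "description", "datePublished", "license")


def missing_root_properties(root: dict) -> list[str]:
    """Return the mandatory properties that are absent or empty."""
    return [key for key in REQUIRED if not root.get(key)]


def with_defaults(root: dict, defaults: dict) -> dict:
    """Return a copy of the root entity with available defaults filled in."""
    filled = dict(root)
    for key in missing_root_properties(root):
        if key in defaults:
            filled[key] = defaults[key]
    return filled


# Hypothetical root data entity as it might come out of an ARC scaffold.
root = {"@id": "./", "@type": "Dataset", "name": "My ARC"}

# Illustrative defaults only; no default is given for license on purpose.
defaults = {
    "description": "No description provided",
    "datePublished": date.today().isoformat(),
}

filled = with_defaults(root, defaults)
print(missing_root_properties(root))
print(missing_root_properties(filled))
```

With these assumed defaults only license stays unresolved, which mirrors the open question about license above.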