Refactor parallel subpackage to use dataframes and grouping #887
Merged
* Adds a new term, rotation. We previously used frame, but frame is used separately, so a new term made sense
* Replaced plantbarcode with barcode to fit a broader range of applications
Replaces "%Y-%m-%d %H:%M:%S.%f" with "%Y-%m-%dT%H:%M:%S.%fZ"
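To make the difference between the two format strings concrete, here is a minimal sketch using only the standard library. The timestamp value is made up for illustration. Note that the trailing `Z` in the new format is a literal character, so `strptime` returns a naive datetime and UTC has to be attached explicitly.

```python
from datetime import datetime, timezone

OLD_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
NEW_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Hypothetical timestamp, used only for illustration
stamp = datetime(2022, 6, 27, 14, 30, 5, 123000)

old_text = stamp.strftime(OLD_FORMAT)  # space-separated date and time
new_text = stamp.strftime(NEW_FORMAT)  # ISO 8601 style with "T" and "Z"

# "Z" is matched as plain text, so the parsed value carries no timezone;
# attach UTC explicitly if an aware datetime is needed.
parsed = datetime.strptime(new_text, NEW_FORMAT).replace(tzinfo=timezone.utc)

print(old_text)
print(new_text)
print(parsed.isoformat())
```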
Replaces metadata_parser with a new modular workflow that parses three types of datasets and uses a dataframe structure to do metadata filtering
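The idea of filtering metadata and then grouping images can be sketched as follows. This is a plain-Python stand-in, not the dataframe code from this PR: the records, the metadata keys, and the helper names `filter_records` and `group_records` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical metadata records; keys and values are illustrative only
records = [
    {"filepath": "A1_VIS_0.png", "barcode": "A1", "imgtype": "VIS", "rotation": 0},
    {"filepath": "A1_NIR_0.png", "barcode": "A1", "imgtype": "NIR", "rotation": 0},
    {"filepath": "B2_VIS_0.png", "barcode": "B2", "imgtype": "VIS", "rotation": 0},
    {"filepath": "B2_VIS_90.png", "barcode": "B2", "imgtype": "VIS", "rotation": 90},
]


def filter_records(rows, **criteria):
    """Keep rows whose metadata match every key=value criterion."""
    return [row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())]


def group_records(rows, keys):
    """Collect file paths that share the same values for the grouping keys."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in keys)].append(row["filepath"])
    return dict(groups)


selected = filter_records(records, rotation=0)
grouped = group_records(selected, keys=["barcode"])
for group_key, paths in grouped.items():
    print(group_key, paths)
```

Each group would then become the set of image inputs for one workflow job; a dataframe expresses the same two steps as a row filter followed by a group-by.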
The workflow configuration template needed to be updated to match updates to WorkflowConfig
* Add a new module for standardizing and implementing workflow command-line and notebook input arguments
* Update job_builder to plug inputs into the new argument framework
* Update multiprocess tests
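A rough sketch of what a shared argument framework could look like is shown below. It is not the module added in this PR: the class `NotebookInputs`, the function `build_parser`, and the argument names are hypothetical, chosen only to show how a notebook object and a command-line parser can expose the same attributes.

```python
import argparse


class NotebookInputs:
    """Hypothetical stand-in: holds the attributes a parsed command line would."""

    def __init__(self, images, names, result, outdir="."):
        if not isinstance(images, list):
            raise TypeError("images must be a list of file paths")
        name_list = names.split(",")
        if len(name_list) != len(images):
            raise ValueError("names and images must have the same length")
        # One attribute per named image, e.g. self.image1
        for name, path in zip(name_list, images):
            setattr(self, name, path)
        self.result = result
        self.outdir = outdir


def build_parser(names):
    """Hypothetical parser factory: one required option per named image."""
    parser = argparse.ArgumentParser(description="Illustrative workflow inputs")
    for name in names.split(","):
        parser.add_argument(f"--{name}", required=True, help=f"Path for {name}")
    parser.add_argument("--result", required=True, help="Results file")
    parser.add_argument("--outdir", default=".", help="Output directory")
    return parser


# Notebook-style inputs
nb_args = NotebookInputs(images=["plant.png"], names="image1", result="out.json")
# Script-style inputs, parsed from an explicit argument list
cli_args = build_parser("image1").parse_args(
    ["--image1", "plant.png", "--result", "out.json"]
)
print(vars(nb_args) == vars(cli_args))
```

Because both objects carry the same attribute names, workflow code written against one can run unchanged against the other, which is the migration path the commit message describes.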
Still needs work to reduce complexity
Codecov Report
@@            Coverage Diff            @@
##               4.x      #887   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          159       160    +1
  Lines         6738      6714   -24
=========================================
- Hits          6738      6714   -24
Mock sys.argv to run test on argparse function
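For readers unfamiliar with the technique, a minimal sketch of mocking `sys.argv` around an `argparse`-based function follows. The function `options` and its arguments are hypothetical, not the ones tested in this PR. `parse_args()` called without arguments reads `sys.argv[1:]` at call time, so patching `sys.argv` inside a context manager is enough.

```python
import argparse
import sys
from unittest import mock


def options():
    """Hypothetical argparse function that reads the real command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--image1", required=True)
    parser.add_argument("--debug", default="None")
    return parser.parse_args()  # reads sys.argv[1:]


def test_options():
    fake_argv = ["workflow.py", "--image1", "plant.png"]
    # sys.argv is replaced only inside the with-block and restored afterwards
    with mock.patch.object(sys, "argv", fake_argv):
        args = options()
    assert args.image1 == "plant.png"
    assert args.debug == "None"


test_options()
```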
Error handling for "images" input in Workflow Image when input is not a list.
HaleySchuhl
approved these changes
Jun 27, 2022
New version breaks something in acute_vertex that we can figure out later
@JorgeGtz found an issue with the I think I can refactor
Describe your changes
This PR makes several major changes to the parallel subpackage.
* `.github/workflows/continuous-integration.yml`:
    * […] `pip` to install PlantCV instead of `setup.py`.
* `plantcv/parallel/__init__.py`:
    * […] `convert_datetime_to_unixtime` and `check_date_range`.
    * […] `workflow_inputs` function and `WorkflowInputs` class. These come from a new module for handling workflow inputs in Jupyter notebooks and scripts, which was added to make it easier to migrate from Jupyter to a parallel workflow script.
    * The `WorkflowConfig` default timestamp format was updated to an ISO 8601 UTC datetime.
    * […] `WorkflowConfig` `coprocess` attribute and replace with `groupby` and `group_name` attributes. The new attributes are used to group images in the new dataframe-based metadata parser framework and to name the image inputs to parallel workflows.
    * A `rotation` metadata attribute was added to `WorkflowConfig`.
    * The `plantbarcode` metadata attribute in `WorkflowConfig` was renamed to `barcode` to be more general.
* `plantcv/parallel/parsers.py`:
    * […] `phenodata`) was added to the parsers module.
* `plantcv/parallel/job_builder.py`:
    * […] `workflow_inputs`-based `argparse` framework.
    * […] `image1`, `image2`, etc.)
* `plantcv/parallel/workflow_inputs.py`:
    * The `WorkflowInputs` class is used to set Jupyter notebook input variables in a framework that is compatible with the command-line arguments used in parallel workflow scripts.
    * The `workflow_inputs` function creates a standardized `argparse` command-line argument parser for workflows.
* `plantcv/parallel/process_results.py`:
    * […]
* `plantcv/utils/converters.py`:
    * The `json2csv` util function was updated to handle grouped output data.
    * `json2csv` now only outputs a single CSV file in long format.

Additionally, relevant tests were added/updated. Documentation was updated where necessary.
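As an illustration of what "a single CSV file in long format" means, the sketch below flattens nested results into one row per sample and trait. The input structure, the column names, and the helper `to_long_rows` are assumptions for this example and do not reflect the actual `json2csv` input or output schema.

```python
import csv
import io

# Hypothetical results: one entry per sample, each with named observations
results = {
    "plant1": {"area": 1500, "height": 320},
    "plant2": {"area": 1720, "height": 341},
}


def to_long_rows(data):
    """Yield one row per (sample, trait) pair instead of one column per trait."""
    for sample, observations in data.items():
        for trait, value in observations.items():
            yield {"sample": sample, "trait": trait, "value": value}


buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["sample", "trait", "value"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(to_long_rows(results))
csv_text = buffer.getvalue()
print(csv_text)
```

In long format, adding a new trait adds rows rather than columns, so a single file can hold outputs with different sets of traits.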
Type of update
Is this a: New feature or feature enhancement
Associated issues
Closes #474
Closes #423
Closes #538
Replaces #759