forked from OSOceanAcoustics/echopop
Update biodata ingestion #9
Merged
brandynlucca added a commit that referenced this pull request Apr 29, 2025

* Adjust validator to accommodate single biodata file input
* Create test script for scratch-coding
* Enable ship user-designations
* Adjust survey data loading functions to appropriately filter biodata upon ingestion
* Pre-commit fixes
* Changes to `pytest` to incorporate updated biodata ingestion functionality
* Drop extraneous `test_script.py`.
brandynlucca added a commit that referenced this pull request Apr 29, 2025

* Create test script
* Compatibility adjustments for 2017
* Reindexing change required for 2021 dataset
* Fix column name parsing when `regex=False` in `pandera` `DataFrame` validators
* Fix `*.csv` filtering
* Limit options for `Lat_S` and `Lon_S`
* Fix to empty/whitespace in `*.xlsx` files
* Upload support files for backwards compatibility testing
* Index by `"stratum_inpfc"` AND `"transect_num"` to resolve cases where some transects crossed multiple INPFC strata. This provides greater specificity and avoids duplicate transect indices that would otherwise raise errors.
* Ensure that `transect_summary` index is properly reset and resolve `pandas` `FutureWarning` associated with the `groupby` operations for `Survey.stratified_analysis(dataset="kriging")`
* Diagnosing 2013 survey issues
* Address mismatched stratum indices across hauls
* Overhaul to Echoview export ingestion step that correctly reads in group-specific transect-region-haul mapping files
* Variety of commits for fixing years 2015-2023
* Update configuration files
* Change to imported stratum-haul mapping for INPFC
* Stratification fixes
* Update to `test_survey` script used for testing each year
* Add `haul ` as a possible column name for translation
* Testing file update for tracking success
* Update to 2013 configuration
* Further update for test file
* Changes to address odd 2012 virtual transect issue
* Adjustment to how INPFC sheet is read in
* Geostrata latitude sorting fixes monotonic issue
* Compatibility improvements for 2011
* Initial spot check to column name validator
* Changes to validators to enable flexible column naming schemes
* Fix to f-string
* Fix to backslash in f-string
* Add triple-quoting to f-string Literal
* Changes to address large abundance estimate deviation
* Enable non-grouped transect region file mapping in configuration YAML
* Adjustments to transect region file that is called for the 2011 survey year
* Adjustment to convex hull cropping method
* Fixes to kriged abundance report values
* Stored changes to file investigation script
* Enable .xls file extension
* Revert .xls functionality (temporarily)
* Enable 2011 transect interval filter file ingestion
* Update biodata ingestion (#9)
  * Adjust validator to accommodate single biodata file input
  * Create test script for scratch-coding
  * Enable ship user-designations
  * Adjust survey data loading functions to appropriately filter biodata upon ingestion
  * Pre-commit fixes
  * Changes to `pytest` to incorporate updated biodata ingestion functionality
  * Drop extraneous `test_script.py`.
* Fix to `CSVFile` validator
* Revert consolidated biodata sheet
* Modified test script and config
* Bug hotfixes to data loading
* Test for 2011
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Fix to `test_survey`
* Pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Biological data are currently ingested by reading in multiple `*.xlsx` files. The goal is to consolidate these into a single `*.xlsx` file (across multiple sheets).
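
For illustration only, here is a minimal sketch of what single-workbook ingestion with ship filtering could look like using `pandas`. It is not the code in this PR: the sheet names in `SHEET_MAP`, the `ship_id` column, and both helper functions are hypothetical and would need to match the actual configuration. Reading `*.xlsx` files with `pandas.read_excel` also assumes an Excel engine such as `openpyxl` is installed.

```python
import pandas as pd

# Hypothetical mapping of dataset name -> worksheet name in the single workbook.
SHEET_MAP = {
    "length": "biodata_length",
    "specimen": "biodata_specimen",
    "catch": "biodata_catch",
}


def read_biodata_workbook(path, sheet_map=SHEET_MAP):
    """Read every biodata sheet from one workbook in a single call.

    Passing a list to `sheet_name` makes `pandas.read_excel` return a
    dictionary of DataFrames keyed by worksheet name.
    """
    sheets = pd.read_excel(path, sheet_name=list(sheet_map.values()))
    # Re-key the result by dataset name rather than worksheet name.
    return {dataset: sheets[sheet] for dataset, sheet in sheet_map.items()}


def filter_by_ship(frames, ship_ids, ship_column="ship_id"):
    """Keep only rows recorded by the user-designated ships.

    `ship_column` is an assumed column name. Sheets that lack the
    column are passed through unchanged (as copies).
    """
    filtered = {}
    for name, df in frames.items():
        if ship_column in df.columns:
            keep = df[ship_column].isin(ship_ids)
            filtered[name] = df[keep].reset_index(drop=True)
        else:
            filtered[name] = df.copy()
    return filtered
```

A call such as `filter_by_ship(read_biodata_workbook("biodata.xlsx"), ship_ids=[160])` would then return one filtered DataFrame per dataset; the file name and ship identifier here are placeholders.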