@brandynlucca
Owner

Biological data are currently ingested by reading in multiple *.xlsx files. The goal is to consolidate these into a single *.xlsx file, with each dataset stored on its own sheet.
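A minimal sketch of what single-workbook ingestion could look like with `pandas`. This is not the implementation merged in this pull request: the sheet names, the `ship_id` column, and both helper functions are hypothetical and are used only to illustrate reading several sheets at once and filtering each one by ship on ingestion.

```python
import pandas as pd

# Hypothetical sheet names; the real workbook layout may differ.
BIODATA_SHEETS = ["specimen", "length", "catch"]


def read_biodata_workbook(path, sheets=BIODATA_SHEETS):
    """Read several sheets from one workbook into a dict of DataFrames."""
    # Passing a list to `sheet_name` makes `read_excel` return a dict
    # keyed by sheet name (reading *.xlsx requires `openpyxl`).
    return pd.read_excel(path, sheet_name=list(sheets))


def filter_biodata(sheets, ship_id, ship_column="ship_id"):
    """Keep only the rows belonging to one ship in every sheet."""
    filtered = {}
    for name, df in sheets.items():
        # Normalize headers so that "Ship_ID " and "ship_id" match
        # (assumed convention: stripped, lowercase column names).
        df = df.rename(columns=lambda c: str(c).strip().lower())
        filtered[name] = df.loc[df[ship_column] == ship_id].reset_index(drop=True)
    return filtered
```

Under these assumptions, usage would be `filter_biodata(read_biodata_workbook("biodata.xlsx"), ship_id=160)`, where the file name and ship identifier are placeholders.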

@brandynlucca brandynlucca self-assigned this Apr 23, 2025
@brandynlucca brandynlucca merged commit 8a54aad into main Apr 23, 2025
4 checks passed
brandynlucca added a commit that referenced this pull request Apr 29, 2025
* Adjust validator to accommodate single biodata file input

* Create test script for scratch-coding

* Enable ship user-designations

* Adjust survey data loading functions to appropriately filter biodata upon ingestion

* Pre-commit fixes

* Changes to `pytest` to incorporate updated biodata ingestion functionality

* Drop extraneous `test_script.py`.
brandynlucca added a commit that referenced this pull request Apr 29, 2025
* Create test script

* Compatibility adjustments for 2017

* Reindexing change required for 2021 dataset

* Fix column name parsing when `regex=False` in `pandera` `DataFrame` validators

* Fix `*.csv` filtering

* Limit options for `Lat_S` and `Lon_S`

* Fix to empty/whitespace in `*.xlsx` files

* Upload support files for backwards compatibility testing

* Index by `"stratum_inpfc"` AND `"transect_num"` to
resolve cases where some transects crossed multiple INPFC strata. This
provides greater specificity and avoids duplicate transect indices that
would otherwise raise errors.

* Ensure that `transect_summary` index is properly reset and resolve `pandas` `FutureWarning` associated with the `groupby` operations for `Survey.stratified_analysis(dataset="kriging")`

* Diagnosing 2013 survey issues

* Address mismatched stratum indices across hauls

* Overhaul to Echoview export ingestion step that correctly reads in group-specific transect-region-haul mapping files

* Variety of commits for fixing years 2015-2023

* Update configuration files

* Change to imported stratum-haul mapping for INPFC

* Stratification fixes

* Update to `test_survey` script used for testing each year

* Add `haul ` as a possible column name for translation

* Testing file update for tracking success

* Update to 2013 configuration

* Further update for test file

* Changes to address odd 2012 virtual transect issue

* Adjustment to how INPFC sheet is read in

* Geostrata latitude sorting fixes monotonic issue

* Compatibility improvements for 2011

* Initial spot check to column name validator

* Changes to validators to enable flexible column naming schemes

* Fix to f-string

* Fix to backslash in f-string

* Add triple-quoting to f-string Literal

* Changes to address large abundance estimate deviation

* Enable non-grouped transect region file mapping in configuration YAML

* Adjustments to transect region file that is called for the 2011 survey year

* Adjustment to convex hull cropping method

* Fixes to kriged abundance report values

* Stored changes to file investigation script

* Enable .xls file extension

* Revert .xls functionality (temporarily)

* Enable 2011 transect interval filter file ingestion

* Update biodata ingestion (#9)

* Adjust validator to accommodate single biodata file input

* Create test script for scratch-coding

* Enable ship user-designations

* Adjust survey data loading functions to appropriately filter biodata upon ingestion

* Pre-commit fixes

* Changes to `pytest` to incorporate updated biodata ingestion functionality

* Drop extraneous `test_script.py`.

* Fix to `CSVFile` validator

* Revert consolidated biodata sheet

* Modified test script and config

* Bug hotfixes to data loading

* Test for 2011

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix to `test_survey`

* Pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
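
The commit message above notes that indexing by both `"stratum_inpfc"` and `"transect_num"` avoids duplicate transect indices when a transect crosses more than one INPFC stratum. The sketch below illustrates that idea with `pandas`. The data values and the `distance` column are made up for illustration, and this is not the code from the commit itself.

```python
import pandas as pd

# Made-up example: transect 10 crosses INPFC strata 1 and 2.
transects = pd.DataFrame(
    {
        "transect_num": [9, 10, 10, 11],
        "stratum_inpfc": [1, 1, 2, 2],
        "distance": [12.0, 5.0, 7.5, 9.0],  # hypothetical column
    }
)

# Indexing by transect alone yields a duplicated label (10 appears twice).
by_transect = transects.set_index("transect_num")
print(by_transect.index.is_unique)  # False

# Indexing by stratum AND transect makes every row uniquely addressable.
by_both = transects.set_index(["stratum_inpfc", "transect_num"])
print(by_both.index.is_unique)  # True

# A single (stratum, transect) pair now selects exactly one row.
print(by_both.loc[(2, 10), "distance"])  # 7.5
```

With a unique composite index, operations that require unique labels (such as `reindex`) no longer fail on the repeated transect number.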

2 participants