NASC ingestion refactoring #349
@brandynlucca: testing structure looks great! Thanks!
leewujung
left a comment
Hey @brandynlucca : Thanks for the PR!
The flow you have under the main functions looks good so I didn't go through all the new functions carefully. I trusted that you have verified the details.
I only have a few suggestions/questions below:
- For the old tests, you can put `pytestmark = pytest.mark.skip(reason="Temporarily disable this module")` at the top of a .py file to disable all of its tests in lieu of commenting the code out.
- Why are setting column names to lowercase and `impute_bad_coordinates` needed in `read_nasc_file`? I was thinking `df_nasc_all_ages` (or `df_nasc_no_age1`) would be a "final" product so that people could just load and use those directly. Are the formatting and data in `df_nasc_all_ages` kept in a specific way to accommodate something else?
- Is `filter_transect_intervals` related to the mesh polygon creation downstream? Or is it just so that people have some control over which parts of the geographical regions are included?
- I see many unit tests, but couldn't find integration tests. How about adding those for the functions that are exposed in `feat_hake.py`?
- Mirror `feat_hake.py` to have a notebook as a gateway for people to interactively try out the code?
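For the first point, a minimal sketch of what the top of an old test module could look like. The test function name below is hypothetical and only there for illustration; `pytestmark` and `pytest.mark.skip` are the standard pytest mechanisms.

```python
import pytest

# Module-level mark: pytest applies it to every test collected from this file,
# so the whole module is reported as skipped instead of being commented out.
pytestmark = pytest.mark.skip(reason="Temporarily disable this module")


def test_legacy_nasc_ingestion():
    # Hypothetical old test; the body is never executed while the skip is active.
    raise AssertionError("should not run while the module is skipped")
```

The skipped tests still show up in the pytest summary with the given reason, which makes it easier to remember to re-enable them later.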
Oh, also just noticed the merge conflict. It seems to be just small things from my PRs that were added after you branched off, so they ended up not in the commit history here.
This mostly ensures backwards compatibility with previous FEAT survey years where the column name schemes are somewhat inconsistent. This is somewhat in anticipation of incorporating the validation step(s), which would incorporate any required formatting changes.
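To illustrate the idea only (this is not the code in the PR), here is a rough sketch of lowercasing column names so that files from different survey years line up. The helper `normalize_column_names` and the column names are assumptions made up for this example.

```python
import pandas as pd


def normalize_column_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with stripped, lowercase column names."""
    return df.rename(columns=lambda name: str(name).strip().lower())


# Hypothetical exports from two survey years with inconsistent header casing.
older_year = pd.DataFrame({"Transect": [1], "NASC": [10.5]})
newer_year = pd.DataFrame({"transect": [2], "nasc": [20.0]})

# After normalization the frames share one schema and can be stacked directly.
combined = pd.concat(
    [normalize_column_names(older_year), normalize_column_names(newer_year)],
    ignore_index=True,
)
```

Without the normalization step, `pd.concat` would treat `Transect` and `transect` as different columns and fill the gaps with NaN.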
This relates to removing off-effort transect intervals.
This draft PR includes refactored changes to the NASC ingestion, docstrings, and associated tests.