Skip to content

Conversation

@brandynlucca
Copy link
Collaborator

This draft PR includes refactored changes to the NASC ingestion, docstrings, and associated tests.

@brandynlucca brandynlucca requested a review from leewujung May 8, 2025 20:29
@leewujung
Copy link
Member

@brandynlucca : testing structure looks great! thanks!

@brandynlucca brandynlucca marked this pull request as ready for review May 15, 2025 20:21
Copy link
Member

@leewujung leewujung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @brandynlucca : Thanks for the PR!

The flow you have under the main functions looks good so I didn't go through all the new functions carefully. I trusted that you have verified the details.

I only have a few suggestions/questions below:

  • For the old tests, you can use pytestmark = pytest.mark.skip(reason="Temporarily disable this module") on top of a .py to disable all tests in lieu of commenting the code out.
  • Why are setting column names to lowercase and impute_bad_coordinates needed in read_nasc_file? I was thinking df_nasc_all_ages (or df_nasc_no_age1) would be a "final" product so that people could just load and use those directly. Are the formatting and data in df_nasc_all_ages kept in a specific way to accommodate something else?
  • Is filter_transect_intervals related to the mesh polygon creation downstream? Or is it just so that people have some control over what part of the geographical regions are included?
  • I see many unit tests, but couldn't find integration tests. How about adding those for the functions that are exposed in feat_hake.py?
  • Mirrorfeat_hake.py to have a notebook as a gateway for people to interactively try out the code?

@leewujung
Copy link
Member

Oh also just noticed the merge conflict - seems just small things from my PRs added after your branched out this, so they ended up not in the commit history here.

@brandynlucca
Copy link
Collaborator Author

Why are setting column names to lowercase and impute_bad_coordinates needed in read_nasc_file? I was thinking df_nasc_all_ages (or df_nasc_no_age1) would be a "final" product so that people could just load and use those directly. Are the formatting and data in df_nasc_all_ages kept in a specific way to accommodate something else?

This mostly ensures backwards compatibility with previous FEAT survey years where the column name schemes are somewhat inconsistent. This is somewhat in anticipation of incorporating the validation step(s), which would incorporate any required formatting changes.

Is filter_transect_intervals related to the mesh polygon creation downstream? Or is it just so that people have some control over what part of the geographical regions are included?

This relates to removing off-effort transect intervals.

@brandynlucca brandynlucca merged commit 33d72c6 into OSOceanAcoustics:main Jun 4, 2025
6 checks passed
@brandynlucca brandynlucca mentioned this pull request Jun 5, 2025
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants