-
Notifications
You must be signed in to change notification settings - Fork 5
Report generation #301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report generation #301
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
for more information, see https://pre-commit.ci
…chopop into report_generation
leewujung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @brandynlucca : Thanks for the PR! Exciting to see these all in place!
I have a minor inline comment and a few structural ones below that hopefully are straightforward.
Module name
Change FEATReport.py to feat_report.py?
Report output folder
What do you think about making save_directory a required argument in .generate()? The current default is a separate folder report under the data_root_dir in *_config.yml, and there are a few other folders under this same root directory. I feel it is usually good practice to keep data and outputs separately located, so forcing people to provide a path to a folder may help facilitate that.
Report generation API
I think the current implementation with FEATReport being a separate class in echopop.extensions is a great arrangement. I only have one comment in the calling API -- Going through the example notebook, I felt there is a jump to have to bring in a separate FEATReports class after doing survey.SOME_FUNCTION multiple times.
What about keeping the same API format and creating a hook under the hood so that users can do something:
survey.generate_report(kind="FEAT", save_directory="REPORT_PATH", ...)underneath this function you could pass self and all argument to the current FEATReport class to work in the same way.
Since the FEATReport only has 1 outward-facing function .generate(), I feel this would reduce one less thing users have to worry about, and allows other report types to be flexibly added in the future using the some survey.generate_report() shell but with completely function/class structures.
__METHOD_REFERENCE__
I think using lambda v: v.data to pass self.data (or one of the dict entries) as argument data in functions like aged_length_haul_counts_report, kriged_length_age_biomass_report, etc in __METHOD_REFERENCE__ can make maintenance down the road more difficult. The main thing is that it would require additional effort to see that some of the data argument are just FEATReport.data and some of the data argument are a dict entry of FEATReport.data, and FEATReport.data is nothing but the entire survey object.
Instead of this structure, I suggest the following:
- remove the
dataargument and just get everything needed fromself.datawithin each of the class methods (those specified as"function"in the current__METHOD_REFERENCE__), to avoid passing the same thing twice (the currentselfanddataarguments). Things likedata(self).input["biology"]["specimen_df"],is be pretty confusing -- isn't it the same asself.input["biology"]["specimen_df"]? - since
Path(v.save_directory)is used in all"filename"entries in the current__METHOD_REFERENCE__, you can passsave_directoryinto each function
Take the first entry aged_length_haul_counts for example, you can have:
"aged_length_haul_counts": {
"function": aged_length_haul_counts_report,
"title": "Aged Length-Haul Counts ({SEX})",
"filename": "aged_length_haul_counts_table.xlsx",
},and in aged_length_haul_counts_report have the function signature
def aged_length_haul_counts_report(self, save_directory, title, filename)
# assemble report path under this function from save_directory and filename
When calling it, the line can be modified slightly:
report_files = [v.get("function")(self, save_directory, **v) for _, v in report_methods.items()]
|
Creating a tracking checklist:
|
This PR adds:
echopop.extensions.FEATReports) which comprises a series of helper functions that all feed into a centralFEATReportsclass*.xlsxfiles are produced at the defined directory path viaFEATReports.generate(), which can be tuned to produce a subset of the reports if neededFEATReportsis used for generating these reports