addition of item similarity measure - python version #1522
miguelgfierro
left a comment
Yan, this is really nice. I have one point for discussion.
Codecov Report
@@            Coverage Diff             @@
##           staging    #1522     +/-   ##
===========================================
+ Coverage    62.03%   62.20%   +0.17%
===========================================
  Files           84       84
  Lines         8397     8441     +44
===========================================
+ Hits          5209     5251     +42
- Misses        3188     3190      +2
miguelgfierro
left a comment
This is really good Yan, you have changed the code super quickly.
The only thing I see is that these changes also affect the notebooks, right? But the tests didn't fail, so are we not testing the diversity notebook?
The diversity notebook only uses the Spark version as an example. We do not have an example diversity notebook for the Python version.
miguelgfierro
left a comment
this is super good Yan
# diversity metrics
class PythonDiversityEvaluation:
    """Python Diversity Evaluator"""
def check_column_dtypes_diversity_serendipity(func):
Shouldn't we have this decorator private (with an '_' at the front of the name)?
We did not use _ in the existing code, e.g. the function "check_column_dtypes" does not have _ in front of its name. Therefore, I did not add _ here, to be consistent.
No, I meant adding just an underscore, without any quotation marks. Currently these methods show up in the readthedocs pages, where they are probably not needed.
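For illustration, here is a minimal sketch of the underscore convention being discussed. It is not the code in this PR: `_check_columns_present` and `count_recommended_items` are hypothetical names, and the inputs are plain dicts of lists rather than DataFrames to keep the example self-contained. The leading underscore marks the decorator as internal, which Sphinx autodoc skips by default unless private members are explicitly requested.

```python
from functools import wraps


def _check_columns_present(func):
    """Hypothetical private decorator: validate inputs before calling func."""

    @wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(train, reco, *, col_user="userID", col_item="itemID"):
        # Both tables must contain the user and item columns.
        for name, table in (("train", train), ("reco", reco)):
            missing = {col_user, col_item} - set(table)
            if missing:
                raise ValueError(f"{name} is missing columns: {sorted(missing)}")
        return func(train, reco, col_user=col_user, col_item=col_item)

    return wrapper


@_check_columns_present
def count_recommended_items(train, reco, *, col_user="userID", col_item="itemID"):
    """Hypothetical public function: number of distinct recommended items."""
    return len(set(reco[col_item]))


train = {"userID": [1, 2], "itemID": [10, 20]}
reco = {"userID": [1, 2], "itemID": [20, 20]}
distinct_items = count_recommended_items(train, reco)
```

Only `count_recommended_items` is meant to be part of the public surface here; the decorator stays usable inside the module but is signalled as an implementation detail.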
There is another issue with the docstrings. The args are missing from the docstrings of the Python functions, e.g. here. Before, they were inside the encapsulating class, but now they are required in each function.
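As a rough sketch of what a standalone function with a documented `Args` section could look like (the name `catalog_coverage_sketch`, its signature, and the use of plain iterables are assumptions for illustration, not the functions in this PR):

```python
def catalog_coverage_sketch(train_items, reco_items):
    """Fraction of catalog items that appear in at least one recommendation.

    Args:
        train_items (iterable): Item IDs seen in the training data. The
            distinct values define the catalog.
        reco_items (iterable): Item IDs that were recommended.

    Returns:
        float: Share of catalog items that were recommended, in [0, 1].

    Raises:
        ValueError: If the catalog is empty.
    """
    catalog = set(train_items)
    if not catalog:
        raise ValueError("train_items must contain at least one item")
    # Only count recommended items that belong to the catalog.
    recommended = set(reco_items) & catalog
    return len(recommended) / len(catalog)


coverage = catalog_coverage_sketch([1, 2, 3, 4], [2, 2, 3])
```

Because the function no longer lives in a class, every input it needs is listed in its own docstring rather than in a constructor docstring.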
* Serendipity - The "unusualness" or "surprise" of recommendations to a user. When 'col_relevance' is used, it indicates how "pleasant surprise" of recommendations is to a user.

The metric definitions/formulations are based on the following references with modification:
def check_column_dtypes_novelty_coverage(func):
Similarly, should this be private?
There are some long lines (caught by flake8).
I used "black" to format the files.
Thanks @anargyri for catching many issues!
Description
Related Issues
Checklist:
staging branch and not to main branch.