tfidf runtime enhancement changes #1571
miguelgfierro merged 3 commits into recommenders-team:staging from AdityaSoni19031997:AdityaSoni19031997/tfidf_runtime_enhancements
Conversation
cc @miguelgfierro Thanks!
@AdityaSoni19031997 approved!
miguelgfierro left a comment
Really good contribution, thanks!
Thanks Miguel! Looking forward to exploring bits and pieces of this repository!
cc @miguelgfierro I am just curious, why doesn't the repository use PyTorch?
Hehe, good question. Parts of the code in this repo are old. Microsoft Research used to use TF, and now they are moving to PyTorch. In future releases you will probably see more code in PyTorch, but we will keep supporting TF.
Haha, not a TF fan! I like PyTorch more than TF. Is there any branch where people are working on porting the snippets to torch? Thanks.
We are not planning to port the current code to PyTorch; future code will be developed in PyTorch.
Description
The minimal changes made to the tf_idf_utils file in this PR help reduce the overall runtime by avoiding looping over and slicing the pandas DataFrame.
The PR helps resolve the issue raised in #1568.
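For readers unfamiliar with the pattern, here is a minimal sketch of the general idea. It is not the actual diff in this PR: the toy DataFrame, its column names, and both helper functions are hypothetical and only illustrate replacing a per-row loop that slices the DataFrame with a single column-wise operation.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "title": ["Alpha", "Beta", "Gamma"],
        "abstract": ["first text", "second text", "third text"],
    }
)


def combine_with_loop(frame):
    """Slow pattern: slice the DataFrame once per row inside a Python loop."""
    combined = []
    for i in range(len(frame)):
        row = frame.iloc[i]  # per-row slicing, repeated len(frame) times
        combined.append(row["title"] + " " + row["abstract"])
    return combined


def combine_vectorized(frame):
    """Faster pattern: operate on whole columns, then convert to a list once."""
    return (frame["title"] + " " + frame["abstract"]).tolist()


loop_result = combine_with_loop(df)
vectorized_result = combine_vectorized(df)
```

Both helpers return the same list of strings; the second simply avoids the repeated row slicing, which is where most of the time tends to go on larger frames.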
Checklist:
This PR is being made to the staging branch and not to the main branch.
Looking forward to the feedback on the PR and any other changes needed.
Best,
Aditya.