Large refactor #1086
Merged
Changes from 16 commits
67 commits
c47fe4d
rename and refactor :boom:
miguelgfierro 5551d4a
rename and refactor :boom:
miguelgfierro f6b0453
refact
miguelgfierro 6db4148
scenarios
miguelgfierro 485c1f1
retail
miguelgfierro 02bec05
retail
miguelgfierro 9c35983
retail
miguelgfierro 3e3756c
retail
miguelgfierro 7864b8e
retail
miguelgfierro 44772c7
comments @yueguoguo
miguelgfierro d79f878
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro c6c20c5
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro 5db328f
advance
miguelgfierro 8eb19fa
advance
miguelgfierro f3ddcae
advance
miguelgfierro f01dcb6
review
miguelgfierro c1baf1e
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro 1e78d52
scenarios
miguelgfierro 60d9587
structure change
miguelgfierro 7f44a9d
glossary
miguelgfierro 61923c7
:boom:
miguelgfierro 78986b7
readme
miguelgfierro 36ed9e6
rewrite of retail readme for readability.
5b007f7
format
e1a5f51
glossary
miguelgfierro 33c6e5e
:doc:
miguelgfierro 40560c3
:doc:
miguelgfierro 65dd13c
:doc:
miguelgfierro 573e004
Update README.md
wutaomsft f42e8f5
wip
miguelgfierro 63930e5
Merge branch 'miguel/burn_and_destroy' of github.com:microsoft/recomm…
miguelgfierro a442096
glossary
miguelgfierro 47f9d25
glossary
miguelgfierro f572dcf
kg
miguelgfierro 4930065
fix links
miguelgfierro 97672a9
readme
miguelgfierro f022427
fix paths
miguelgfierro a46b18f
fix paths
miguelgfierro f156c0b
fix paths
miguelgfierro 4f77506
rename
miguelgfierro 5dd3f68
fix :bug: and paths
miguelgfierro 123a737
tests
miguelgfierro 14d7c50
fixing tests
miguelgfierro 1ee91fa
:bug:
miguelgfierro d90c9a3
:bug:
miguelgfierro 39705c1
typo
miguelgfierro e1bbd2a
fix :bug: test lightfm
miguelgfierro 9d7c661
papers
miguelgfierro 44b4843
papers
miguelgfierro da7cdbf
typo
miguelgfierro a6e441e
fixed :bug: with pymanopt
miguelgfierro 57b0c8a
long tail
miguelgfierro c0185c1
spark
miguelgfierro b0f8a59
ignore
miguelgfierro 841fc49
mmlspark lgb criteo
miguelgfierro 871ef72
:bug:
miguelgfierro 1881066
java8
miguelgfierro 24b6ba9
benchmark
miguelgfierro 4e9263a
retail
miguelgfierro 16baaed
spark 2.4.3
miguelgfierro fd1eb0b
Update README.md
anargyri a281478
lightgcn
miguelgfierro d4a5244
fix :bug: in readme
miguelgfierro 845964a
readms
miguelgfierro d5ae933
update authors
miguelgfierro f4c1f4d
merge staging
miguelgfierro 930427f
:bug:
miguelgfierro
File renamed without changes. (46 files)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # Recommendation System Scenarios | ||
|
|
||
| This section lists a number of business scenarios that are common in recommendation systems. | ||
|
|
||
| The scenarios are: | ||
|
|
||
| * [Ads](ads) | ||
| * [Entertainment](entertainment) | ||
| * [Food and restaurants](food_and_restaurants) | ||
| * [News and document]() | ||
| * [Retail](retail) | ||
| * [Travel](travel) | ||
|
|
||
|
|
||
|
|
||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Recommendation systems for Advertisement |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Recommendation systems for Entertainment |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Recommendation systems for Food and Restaurants |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Recommendation systems for News |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| # Recommendation systems for Retail | ||
|
|
||
| An increasing number of online companies are utilizing recommendation systems (RS) to increase user interaction and enrich shopping potential. Use cases of recommendation systems have been expanding rapidly across many aspects of eCommerce and online media over the last 4-5 years, and we expect this trend to continue. | ||
|
|
||
| Companies across many different areas of enterprise are beginning to implement recommendation systems in an attempt to enhance their customers' online purchasing experience, increase sales and retain customers. Business owners are recognizing potential in the fact that recommendation systems allow the collection of a huge amount of information relating to users' behavior and their transactions within an enterprise. This information can then be systematically stored within user profiles to be used for future interactions. | ||
|
|
||
| ## Typical Business Scenarios in Recommendation Systems for Retail | ||
|
|
||
| The most common scenarios companies use are: | ||
|
|
||
| * Others you may like (also called similar items): The "Others you may like" recommendation predicts the next product that a user is most likely to engage with or purchase. The prediction is based on both the entire shopping or viewing history of the user and the candidate product's relevance to a current specified product. | ||
|
||
|
|
||
| * Frequently bought together (shopping cart expansion): The "Frequently bought together" recommendation predicts items frequently bought together for a specific product within the same shopping session. If a list of products is being viewed, then it predicts items frequently bought with that product list. This recommendation is useful when the user has already indicated an intent to purchase a particular product (or list of products), and you are looking to recommend complements (as opposed to substitutes). This recommendation is commonly displayed on the "add to cart" page, or on the "shopping cart" or "registry" pages (for shopping cart expansion). | ||
|
|
||
| * Recommended for you: The "Recommended for you" recommendation predicts the next product that a user is most likely to engage with or purchase, based on the shopping or viewing history of that user. This recommendation is typically used on the home page. | ||
|
|
||
|
|
||
| ## Data in Recommendation Systems | ||
|
|
||
| ### Data types | ||
|
|
||
| In RS for retail there are typically the following types of data: | ||
|
|
||
| * Explicit interactions: When a user explicitly rates an item, typically on a scale from 1 to 5, the user is stating how much they like the item. In retail, this kind of data is not very common. | ||
|
|
||
| * Implicit interactions: Implicit interactions are views or clicks that show a certain interest of the user in a specific item. This kind of data is more common, but it doesn't define the intention of the user as clearly as explicit data. | ||
|
|
||
| * User features: These include all information that defines the user. Some examples are name, address, email, demographics, etc. | ||
|
|
||
| * Item features: These include information about the item. Some examples are SKU, description, brand, price, etc. | ||
|
|
||
| * Knowledge graph data: ... | ||
|
|
||
| ### Considerations about data size | ||
|
|
||
| The size of the data is important when designing the system... | ||
|
|
||
| ### Cold start scenarios | ||
|
|
||
| Personalized recommender systems take advantage of users' past history to make predictions. The cold start problem concerns personalized recommendations for users with little or no past history (new users). Providing recommendations to such users is difficult for collaborative filtering (CF) models because their learning and predictive ability is limited. Much research has addressed this problem using hybrid models, which use auxiliary information (multimodal information, side information, etc.) to overcome the cold start problem. | ||
|
|
||
| ### Long tail products | ||
|
|
||
| Typically, item interactions in retail follow a long tail distribution [1,2]. | ||
|
|
||
| ## Measuring Recommendation performance | ||
|
|
||
| ### Machine learning metrics (offline metrics) | ||
|
|
||
| Offline metrics in RS are based on rating, ranking, classification or diversity. To learn more about offline metrics, see the [definitions available in Recommenders repository](../../examples/03_evaluate). | ||
|
|
||
| ### Business success metrics (online metrics) | ||
|
|
||
| Below are some of the potential benefits of recommendation systems in business, and the metrics that are typically used: | ||
|
|
||
| * Click-through rate (CTR): Optimizing for CTR emphasizes engagement; you should optimize for CTR when you want to maximize the likelihood that the user interacts with the recommendation. | ||
|
|
||
| * Revenue per order: The revenue per order optimization objective is the default optimization objective for the "Frequently bought together" recommendation model type. This optimization objective cannot be specified for any other recommendation model type. | ||
|
|
||
| * Conversion rate: Optimizing for conversion rate maximizes the likelihood that the user purchases the recommended item; if you want to increase the number of purchases per session, optimize for conversion rate. | ||
|
|
||
| ### Relationship between online and offline metrics in retail | ||
|
|
||
| There is some literature about the relationship between offline and online metrics... | ||
|
|
||
|
|
||
| ### A/B testing | ||
|
|
||
| ### Advanced A/B testing: online learning with VW | ||
|
|
||
| ... | ||
|
|
||
| ## Examples of end-to-end recommendation scenarios with Microsoft Recommenders | ||
|
|
||
| From a technical perspective, RS can be grouped in these categories [1]: | ||
|
|
||
| * Collaborative filtering: This type of recommendation system makes predictions of what might interest a person based on the taste of many other users. It assumes that if person X likes Snickers, and person Y likes Snickers and Milky Way, then person X might like Milky Way as well. See the [list of examples in Recommenders repository](../../examples/02_model_collaborative_filtering). | ||
|
|
||
| * Content-based filtering: This type of recommendation system focuses on the products themselves and recommends other products that have similar attributes. Content-based filtering relies on the characteristics of the products themselves, so it doesn’t rely on other users to interact with the products before making a recommendation. See the [list of examples in Recommenders repository](../../examples/02_model_content_based_filtering). | ||
|
|
||
| * Hybrid filtering: This type of recommendation system can implement a combination of the two systems above. See the [list of examples in Recommenders repository](../../examples/02_model_hybrid). | ||
|
|
||
| * Knowledge-base: ... | ||
|
|
||
| In the repository we have the following examples that can be used in retail: | ||
|
|
||
| | Scenario | Description | Algorithm | Implementation | | ||
| |----------|-------------|-----------|----------------| | ||
| | Collaborative Filtering with explicit interactions in Spark environment | Matrix factorization algorithm for explicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability | Alternating Least Squares (ALS) | [pyspark notebook using Movielens dataset](https://github.com/microsoft/recommenders/blob/staging/notebooks/00_quick_start/als_movielens.ipynb) | | ||
| | Content-Based Filtering for content recommendation in Spark environment | Gradient Boosting Tree algorithm for fast training and low memory usage in content-based problems | LightGBM/MMLSpark | [spark notebook using Criteo dataset](https://github.com/microsoft/recommenders/blob/staging/notebooks/02_model/mmlspark_lightgbm_criteo.ipynb) | | ||
|
|
||
|
|
||
|
|
||
| ## References and resources | ||
|
|
||
| [1] Aggarwal, Charu C. Recommender systems. Vol. 1. Cham: Springer International Publishing, 2016. | ||
| [2] Park, Yoon-Joo, and Alexander Tuzhilin. "The long tail of recommender systems and how to leverage it." In Proceedings of the 2008 ACM conference on Recommender systems, pp. 11-18. 2008. [Link to paper](http://people.stern.nyu.edu/atuzhili/pdf/Park-Tuzhilin-RecSys08-final.pdf). | ||
| [3] Armstrong, Robert. "The long tail: Why the future of business is selling less of more." Canadian Journal of Communication 33, no. 1 (2008). [Link to paper](https://www.cjc-online.ca/index.php/journal/article/view/1946/3141). | ||
|
|
||
| sources: [1](https://emerj.com/ai-sector-overviews/use-cases-recommendation-systems/), [2](https://cloud.google.com/recommendations-ai/docs/placements), [3](https://www.researchgate.net/post/Can_anyone_explain_what_is_cold_start_problem_in_recommender_system) | ||
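The collaborative filtering bullet in the retail README above uses the Snickers / Milky Way example. A minimal sketch of that idea in plain Python follows. It is a toy overlap-based scorer, not the ALS or any other algorithm from the Recommenders repository; the `likes` data and the `recommend` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data mirroring the README example (not taken from the PR).
likes = {
    "person_x": {"Snickers"},
    "person_y": {"Snickers", "Milky Way"},
}


def recommend(user, likes):
    """Rank items the user has not seen by overlap with like-minded users."""
    seen = likes[user]
    scores = defaultdict(int)
    for other, items in likes.items():
        if other == user:
            continue
        overlap = len(seen & items)  # number of items both users like
        if overlap == 0:
            continue  # no shared taste, so this user contributes nothing
        for item in items - seen:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


recommendations = recommend("person_x", likes)
```

Here `person_x` shares Snickers with `person_y`, so Milky Way is the only candidate surfaced for `person_x`, while `person_y` receives no recommendation because there is nothing new to suggest. A scorer like this also shows why the cold start problem arises: a user with an empty history has no overlap with anyone.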
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| # Recommendation systems for Travel |
File renamed without changes. (6 files)