Skip to content

Conversation

@mstfbl
Copy link
Contributor

@mstfbl mstfbl commented May 13, 2020

Fixes #4874

This PR updates the libmf submodule, where its recent changes in libmf PR #41 and PR #42 address the shuffling of values in a given matrix.

The C++ function void random_shuffle(_RanIt _First, _RanIt _Last) is implemented differently on Windows vs. MacOS vs. Linux. This resulted in inconsistent results on MacOS and more-predictable yet still inconsistent results on Linux. As a result, a given matrix factorization problem, even with a constant seed, produced differing MSEs between each run on each system. The libmf codebase has been updated to prevent this, and baseline MSE values in the unit test MatrixFactorizationSimpleTrainAndPredict() have been updated to reflect this.

@mstfbl mstfbl marked this pull request as ready for review May 13, 2020 17:46
@mstfbl mstfbl requested a review from a team as a code owner May 13, 2020 17:46
Copy link
Contributor

@frank-dong-ms-zz frank-dong-ms-zz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@@ -54,7 +54,6 @@ public void MatrixFactorization_Estimator()
}

[MatrixFactorizationFact]
Copy link
Contributor

@harishsk harishsk May 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is MatrixFactorizationFact needed? #Resolved

Copy link
Contributor Author

@mstfbl mstfbl May 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it is not! Currently, MatrixFactorizationFact is just an EnvironmentSpecificFactAttribute that always returns true with IsEnvironmentSupported(). In a seperate PR, I'll be marking these MatrixFactorizationFact marked tests as Fact, and if possible, remove the MatrixFactorizationFact from the codebase all together. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already done in below PR: #5122


In reply to: 424743738 [](ancestors = 424743738)

@codecov
Copy link

codecov bot commented May 13, 2020

Codecov Report

Merging #5121 into master will increase coverage by 0.00%.
The diff coverage is 63.63%.

@@           Coverage Diff           @@
##           master    #5121   +/-   ##
=======================================
  Coverage   75.55%   75.55%           
=======================================
  Files         995      995           
  Lines      179316   179318    +2     
  Branches    19298    19298           
=======================================
+ Hits       135479   135482    +3     
- Misses      38572    38575    +3     
+ Partials     5265     5261    -4     
Flag Coverage Δ
#Debug 75.55% <63.63%> (+<0.01%) ⬆️
#production 71.50% <ø> (+<0.01%) ⬆️
#test 88.63% <63.63%> (-0.01%) ⬇️
Impacted Files Coverage Δ
...ests/TrainerEstimators/MatrixFactorizationTests.cs 97.07% <63.63%> (-0.52%) ⬇️
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0.00%> (-20.52%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 88.93% <0.00%> (-0.21%) ⬇️
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs 85.16% <0.00%> (+0.84%) ⬆️
src/Microsoft.ML.Sweeper/AsyncSweeper.cs 73.97% <0.00%> (+1.36%) ⬆️
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs 86.55% <0.00%> (+6.72%) ⬆️

@mstfbl mstfbl merged commit 8e5f7b4 into dotnet:master May 14, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MatrixFactorizationSimpleTrainAndPredict() on MacOS is broken

4 participants