@@ -146,33 +146,33 @@ describe('Real World Matching Performance', () => {
    10,
How often do we match currently? If we match too often, the score does not have any effect at all ...
> How often do we match currently?
The Angebote team does it on Fridays at EOD.
> If we match too often, the score does not have any effect at all ...
Could you elaborate a bit on this? I'm worried I might be misunderstanding how the matching algorithm works / missing something extremely obvious.
For example, at the moment, there are around 70 pupils in the pool. Most of them need help in similar subjects, and there aren’t many students available right now. Our intention with this change is to prioritise pupils who have been waiting for weeks, rather than those who have just entered the pool with a similar matching profile.
Well, if you have one pupil and one student that can be matched under the constraints, then the score is entirely useless, as we will match those two and do not have any other option. If, on the other hand, you take all pupils and all students, then there are many, many possible matches and the score becomes highly relevant. Thus there is a big trade-off between waiting time and match quality, which is also reflected in the small simulation we do here. You can see the results here for adding pupils and students as they arrived in the match pool in our historical data, and then running the matching algorithm every 1 day, every 10 days or every 1000 days. As you can see we get the best results if we match every 1000 days (many many users in the match pool, but the average waiting time is 500 days), quite okay results if we match every 10 days, and terrible results if we match on a daily basis.
So in general, it holds that match_quality ~ possible_matches ~ pupils_to_match * students_to_match. If we have an unbalanced pool, as we currently do, it is actually quite okay to match very often, as even for pupils_to_match = 70 && students_to_match = 1 we have an okayish number of combinations.
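To make the proportionality concrete, here is a minimal sketch that only counts candidate pairs. `possibleMatches` is a hypothetical helper, not part of the codebase, and it gives an upper bound: the real constraints will remove some of these pairs.

```typescript
// Hypothetical helper: upper bound on candidate pairs, before any constraints are applied.
function possibleMatches(pupilsToMatch: number, studentsToMatch: number): number {
    return pupilsToMatch * studentsToMatch;
}

// One pupil and one student: a single option, so the score cannot change the outcome.
console.log(possibleMatches(1, 1));

// Unbalanced pool as described above: 70 pupils and 1 student still leave 70 options to rank.
console.log(possibleMatches(70, 1));

// A larger pool on both sides grows the number of options multiplicatively.
console.log(possibleMatches(70, 10));
```

The more pairs there are to choose from, the more room the score has to pick a good one, which is why matching less often tends to improve quality at the cost of waiting time.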
You generally have the following trade-offs:
- More constraints reduce match quality according to the score (as fewer matches are actually possible) vs. more constraints are enforced
- A bigger match pool improves match quality according to the score, at the expense of waiting time
Using the score to balance out the waiting time (i.e. let users wait a bit longer by not matching too often, but prevent them from waiting for a long time through the score) is generally a good idea, which I already tried before (the commented-out code), but I wasn't too happy with the results, so I abandoned it. It seems you've found a heuristic which is okayish, although the actual impact is also quite moderate (according to our historical data).
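A rough sketch of how a waiting term in the score can pull a long-waiting pupil ahead of a newcomer with a better profile. The weights are the ones from the diff under review. `waitingBonus` and its 28-day saturation are assumptions made for illustration only; the actual `requestWaitingBonus` may be computed differently.

```typescript
// Weights as they appear in the diff under review.
const WEIGHTS = { subject: 0.3, language: 0.1, waiting: 0.6 };

// Assumption: the bonus grows linearly with waiting time and saturates at 1
// after `saturationDays`. This is NOT taken from the codebase.
function waitingBonus(daysWaiting: number, saturationDays: number = 28): number {
    return Math.min(Math.max(daysWaiting, 0) / saturationDays, 1);
}

// Hypothetical combined score using the weights above.
function score(subjectBonus: number, languageBonus: number, daysWaiting: number): number {
    return (
        WEIGHTS.subject * subjectBonus +
        WEIGHTS.language * languageBonus +
        WEIGHTS.waiting * waitingBonus(daysWaiting)
    );
}

// A pupil with a perfect profile who just entered the pool ...
const newcomer = score(1.0, 1.0, 0);
// ... versus a pupil with a weaker profile who has waited four weeks.
const longWaiter = score(0.5, 0.0, 28);

console.log({ newcomer, longWaiter, longWaiterWins: longWaiter > newcomer });
```

With a waiting weight of 0.6, the long-waiting pupil outranks the newcomer whenever both compete for the same student. If only one match is possible under the constraints, the ranking has nothing to decide.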
> As you can see we get the best results if we match every 1000 days (many many users in the match pool, but the average waiting time is 500 days), quite okay results if we match every 10 days, and terrible results if we match on a daily basis
Ah, in the current state one can already see the impact of the constraints: all three variants are currently pretty much equal in the simulation, so the score has basically no real-world impact.
const score = 0.3 * subjectBonus + 0.1 * languageBonus + 0.6 * requestWaitingBonus;

// TODO: Fix retention for matches with only few subjects (e.g. both helper and helpee only have math as subject)
// in that case the score is not so high, and thus they are retained for a long time, although the match is perfect
Maybe while we're at it, it makes sense to revisit this, with something like:
const subjectBonus = matchingSubjects / Math.max(request.subjects.length, offer.subjects.length);
Do you mean change this one?
const subjectBonus = sigmoid((matchingSubjects - 1) * 2);
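For comparison, a small sketch of the two candidate formulas from this thread, evaluated on the case from the TODO (both sides only have math). It assumes `sigmoid` is the standard logistic function; the function names and the guard against empty subject lists are added here for illustration.

```typescript
// Assumption: sigmoid is the standard logistic function 1 / (1 + e^-x).
function sigmoid(x: number): number {
    return 1 / (1 + Math.exp(-x));
}

// Formula currently in the code, as quoted above.
function currentSubjectBonus(matchingSubjects: number): number {
    return sigmoid((matchingSubjects - 1) * 2);
}

// Ratio-based formula suggested in the review.
// The zero guard is an addition to avoid dividing by zero.
function proposedSubjectBonus(
    matchingSubjects: number,
    requestSubjectCount: number,
    offerSubjectCount: number
): number {
    const larger = Math.max(requestSubjectCount, offerSubjectCount);
    return larger === 0 ? 0 : matchingSubjects / larger;
}

// Both helper and helpee only have math: one matching subject out of one each.
console.log(currentSubjectBonus(1));        // 0.5
console.log(proposedSubjectBonus(1, 1, 1)); // 1

// Two matching subjects, but the offer lists four: the ratio penalises the partial overlap.
console.log(proposedSubjectBonus(2, 2, 4)); // 0.5
```

Under the current formula a perfect single-subject match only reaches 0.5, which fits the retention problem described in the TODO. The ratio gives that case the full bonus, but it scores matches by overlap fraction and no longer by the absolute number of shared subjects.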
Ticket
https://github.com/corona-school/project-user/issues/1589
What was done?