Description
Currently there are only a few simple, if not naive, approaches to report validation. We previously proposed a Z-test mechanism and implemented it on the previous backend. However, due to the MongoDB evaluation bottleneck on that backend, we unfortunately had to disable the feature because of its high performance cost.
The backend-next project now has both the flexibility and the performance headroom to let us relaunch this mechanism for checking reports.
Moreover, the DropInfo section is currently decided somewhat artificially, which might not be suitable for the first several hundred reports, since we cannot predict in advance what the finite set of possible drops actually is. There have previously been several issues where DropInfo was not applied properly at first, potentially introducing deviations into the dataset. Although we have been actively fixing those by hand, doing so is time-consuming and far from an optimal solution. Therefore, there could also be a mechanism by which DropInfo itself adapts continuously as the report dataset grows. However, the implementation details of such adaptation remain a large topic to discuss.
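As one possible direction for that discussion, the sketch below shows a drop set that grows with the data: an item is only promoted into the accepted DropInfo once it has been observed in at least a minimum number of reports, so a single mistaken or malicious report cannot add a bogus drop. All names and thresholds here are assumptions for illustration, not the actual backend-next schema.

```go
package main

import "fmt"

// AdaptiveDropSet is a hypothetical DropInfo set that adapts with the report
// dataset: an item ID graduates from pending to accepted only after it has
// been observed in at least minObservations reports.
type AdaptiveDropSet struct {
	accepted        map[string]bool
	pending         map[string]int
	minObservations int
}

func NewAdaptiveDropSet(minObservations int) *AdaptiveDropSet {
	return &AdaptiveDropSet{
		accepted:        make(map[string]bool),
		pending:         make(map[string]int),
		minObservations: minObservations,
	}
}

// Observe records that itemID appeared in a report; once it has been seen
// often enough it is promoted into the accepted set.
func (d *AdaptiveDropSet) Observe(itemID string) {
	if d.accepted[itemID] {
		return
	}
	d.pending[itemID]++
	if d.pending[itemID] >= d.minObservations {
		d.accepted[itemID] = true
		delete(d.pending, itemID)
	}
}

// IsKnown reports whether itemID is part of the accepted drop set.
func (d *AdaptiveDropSet) IsKnown(itemID string) bool {
	return d.accepted[itemID]
}

func main() {
	drops := NewAdaptiveDropSet(3) // hypothetical promotion threshold
	for i := 0; i < 3; i++ {
		drops.Observe("item_30012") // seen in three reports: accepted
	}
	drops.Observe("item_once") // seen once: still pending
	fmt.Println(drops.IsKnown("item_30012"), drops.IsKnown("item_once"))
}
```

Open questions this sketch deliberately leaves out include how to handle reports that arrived before an item was accepted (re-validate retroactively?) and whether promotion should also weigh account/IP diversity.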
Just to note down here that these statistics-based tests are all quite susceptible to attacks in which an attacker submits several hundred to a thousand false reports in the very first moments after a stage opens, causing the dataset to converge to a skewed result. Any genuine reports submitted afterwards would then be considered invalid and rejected. Such an attack could be mitigated by randomly sampling reports across different accounts and IPs, and by carefully designing the threshold at which the Z-test kicks in, to minimize the effect such an attack could have.
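One simple form the "carefully designed threshold" could take is a diversity gate: the Z-test stays disabled until the early reports come from enough distinct accounts and IPs, so a single attacker flooding the first moments of a stage cannot define the baseline alone. The sketch below is illustrative; field names and thresholds are assumptions.

```go
package main

import "fmt"

// ActivationGate keeps the statistical check disabled until reports have
// arrived from enough distinct accounts and IPs (hypothetical thresholds).
type ActivationGate struct {
	minAccounts int
	minIPs      int
	accounts    map[string]bool
	ips         map[string]bool
}

func NewActivationGate(minAccounts, minIPs int) *ActivationGate {
	return &ActivationGate{
		minAccounts: minAccounts,
		minIPs:      minIPs,
		accounts:    make(map[string]bool),
		ips:         make(map[string]bool),
	}
}

// Record notes which account and IP a report came from.
func (g *ActivationGate) Record(accountID, ip string) {
	g.accounts[accountID] = true
	g.ips[ip] = true
}

// Active reports whether the dataset is diverse enough to trust the Z-test.
func (g *ActivationGate) Active() bool {
	return len(g.accounts) >= g.minAccounts && len(g.ips) >= g.minIPs
}

func main() {
	gate := NewActivationGate(3, 2)
	// an attacker spamming from one account/IP never activates the test alone
	for i := 0; i < 1000; i++ {
		gate.Record("attacker", "10.0.0.1")
	}
	fmt.Println(gate.Active()) // still inactive: 1 account, 1 IP
	gate.Record("alice", "10.0.0.2")
	gate.Record("bob", "10.0.0.3")
	fmt.Println(gate.Active()) // now diverse enough
}
```

Random sampling across accounts (e.g. keeping at most one baseline report per account) would compose naturally with this gate, but is left out here for brevity.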