Sample ratio mismatches can be detected using a chi-squared test.[3] Using methods to detect SRM can help non-experts avoid making decisions using biased data.[4] If the sample size is large enough, even a small discrepancy between the observed and expected group sizes can invalidate the results of an experiment.[5][6]
Example
Suppose we run an A/B test in which we randomly assign 1000 users to equally sized treatment and control groups (a 50–50 split). The expected size of each group is 500. However, the actual sizes of the treatment and control groups are 600 and 400.
Using Pearson's chi-squared goodness-of-fit test, we find a sample ratio mismatch with a p-value of 2.54 × 10⁻¹⁰. In other words, if the assignment of users were truly random, the probability of observing a discrepancy at least as large as this one by chance is 2.54 × 10⁻¹⁰.[7]
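The calculation in this example can be reproduced with a short Python sketch. It is an illustration only, not a reference implementation: the function name `srm_p_value` and its parameters are hypothetical, and the sketch assumes exactly two groups, so that the test has one degree of freedom and the chi-squared tail probability can be written with the complementary error function from the standard library.

```python
import math


def srm_p_value(observed_treatment, observed_control, expected_ratio=0.5):
    """Pearson chi-squared goodness-of-fit test for a two-group split.

    expected_ratio is the intended share of users in the treatment group
    (0.5 for a 50-50 split). Returns the test statistic and the p-value.
    """
    total = observed_treatment + observed_control
    expected_treatment = total * expected_ratio
    expected_control = total * (1 - expected_ratio)

    # Pearson statistic: sum over groups of (observed - expected)^2 / expected
    chi2 = (
        (observed_treatment - expected_treatment) ** 2 / expected_treatment
        + (observed_control - expected_control) ** 2 / expected_control
    )

    # With two groups there is 1 degree of freedom, and the upper tail of the
    # chi-squared(1) distribution at x equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = srm_p_value(600, 400)
print(f"chi-squared = {chi2:.1f}, p-value = {p:.3g}")
# chi-squared = 40.0, p-value = 2.54e-10
```

Each group contributes (100)² / 500 = 20 to the statistic, giving a total of 40. For experiments with more than two groups, the closed-form tail used here no longer applies and a general chi-squared routine from a statistics library would be needed.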