In one-sample applications, $F^*$ is the theoretical distribution and $F_n$ is the empirically observed distribution. Alternatively, both distributions can be empirically estimated; this is called the two-sample case.
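Let $x_1, x_2, \ldots, x_n$ be the observed values, in increasing order, and let $F$ denote the theoretical cumulative distribution function. Then the statistic is[3]: 1153 [5]
$$T = n\omega^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[ \frac{2i-1}{2n} - F(x_i) \right]^2 .$$
If this value is larger than the tabulated value, then the hypothesis that the data came from the distribution $F$ can be rejected.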
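As an illustration, the one-sample statistic can be computed directly from its usual computational form, $T = \frac{1}{12n} + \sum_{i=1}^{n}\left[\frac{2i-1}{2n} - F(x_i)\right]^2$. The following is a minimal sketch, not a reference implementation; the function name `cvm_one_sample` and the helper `uniform_cdf` are hypothetical names chosen for this example, and the uniform distribution on [0, 1] is assumed as the hypothesized $F$.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (assumed hypothesized F)."""
    return min(1.0, max(0.0, x))


def cvm_one_sample(data, cdf):
    """Cramér–von Mises statistic T for a sample and a fully specified CDF."""
    xs = sorted(data)  # the formula requires the values in increasing order
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += ((2 * i - 1) / (2.0 * n) - cdf(x)) ** 2
    return total


# A sample sitting exactly on the points (2i - 1) / (2n) makes every squared
# term vanish, so only the constant 1 / (12n) remains.
sample = [0.125, 0.375, 0.625, 0.875]
t_value = cvm_one_sample(sample, uniform_cdf)
```

The resulting value would still have to be compared against a tabulated critical value for the chosen significance level; the sketch does not compute p-values.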
Watson test
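A modified version of the Cramér–von Mises test is the Watson test,[6] which uses the statistic $U^2$, where[5]
$$U^2 = T - n\left(\bar{F} - \tfrac{1}{2}\right)^2 ,$$
where $T$ is the one-sample Cramér–von Mises statistic and
$$\bar{F} = \frac{1}{n} \sum_{i=1}^{n} F(x_i) .$$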
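A short sketch of the Watson statistic, assuming the form $U^2 = T - n(\bar{F} - \tfrac{1}{2})^2$ with $\bar{F}$ the mean of the values $F(x_i)$, may help make the correction term concrete. The function name `watson_u2` is hypothetical, and the hypothesized CDF is passed in as a plain Python callable.

```python
def watson_u2(data, cdf):
    """Watson's U^2: the Cramér–von Mises T minus n * (mean(F(x_i)) - 1/2)^2."""
    xs = sorted(data)
    n = len(xs)
    f_values = [cdf(x) for x in xs]
    # One-sample Cramér–von Mises statistic T.
    t = 1.0 / (12.0 * n)
    for i, f in enumerate(f_values, start=1):
        t += ((2 * i - 1) / (2.0 * n) - f) ** 2
    # Correction term based on the mean of the CDF values.
    f_bar = sum(f_values) / n
    return t - n * (f_bar - 0.5) ** 2


def identity_cdf(u):
    """Treats the data as already transformed to CDF values (an assumption)."""
    return u


centered = [0.125, 0.375, 0.625, 0.875]
shifted = [u + 0.1 for u in centered]  # same spacing, shifted by a constant
u2_centered = watson_u2(centered, identity_cdf)
u2_shifted = watson_u2(shifted, identity_cdf)
```

In this example the two values of $U^2$ agree up to rounding, because adding a constant to every $F(x_i)$ changes $T$ and the correction term by the same amount.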
Cramér–von Mises test (two samples)
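Let $x_1, x_2, \ldots, x_N$ and $y_1, y_2, \ldots, y_M$ be the observed values in the first and second sample respectively, in increasing order. Let $r_1, r_2, \ldots, r_N$ be the ranks of the $x$s in the combined sample, and let $s_1, s_2, \ldots, s_M$ be the ranks of the $y$s in the combined sample. Anderson[3]: 1149 shows that
$$T = \frac{NM}{N+M}\,\omega^2 = \frac{U}{NM(N+M)} - \frac{4MN - 1}{6(M+N)} ,$$
where $U$ is defined as
$$U = N \sum_{i=1}^{N} (r_i - i)^2 + M \sum_{j=1}^{M} (s_j - j)^2 .$$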
If the value of $T$ is larger than the tabulated values,[3]: 1154–1159 the hypothesis that the two samples come from the same distribution can be rejected. (Some books give critical values for $U$, which is more convenient, as it avoids the need to compute $T$ via the expression above. The conclusion will be the same.)
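The two-sample computation can be sketched as follows, assuming the rank-based form $U = N\sum_{i}(r_i - i)^2 + M\sum_{j}(s_j - j)^2$ and $T = \frac{U}{NM(N+M)} - \frac{4MN-1}{6(M+N)}$. The function name `cvm_two_sample` is hypothetical, and the sketch deliberately rejects tied values rather than handling them.

```python
def cvm_two_sample(x, y):
    """Return (U, T) for two samples without tied values."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    combined = sorted(xs + ys)
    if len(set(combined)) != len(combined):
        raise ValueError("this sketch assumes no duplicate values")
    # Rank of each value = its 1-based position in the combined sorted sample.
    rank = {value: k for k, value in enumerate(combined, start=1)}
    u = n * sum((rank[v] - i) ** 2 for i, v in enumerate(xs, start=1))
    u += m * sum((rank[v] - j) ** 2 for j, v in enumerate(ys, start=1))
    t = u / (n * m * (n + m)) - (4 * m * n - 1) / (6 * (m + n))
    return u, t


# Two perfectly interleaved samples of size 2.
u_value, t_value = cvm_two_sample([1, 3], [2, 4])
```

For these interleaved samples the ranks are 1 and 3 for the first sample and 2 and 4 for the second, giving $U = 2(0 + 1) + 2(1 + 4) = 12$ and $T = 12/16 - 15/24 = 0.125$.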
The above assumes there are no duplicates in the $x$, $y$, and $r$ sequences. So $x_i$ is unique, and its rank is $i$ in the sorted list $x_1, \ldots, x_N$. If there are duplicates, and $x_i$ through $x_j$ are a run of identical values in the sorted list, then one common approach is the midrank[7] method: assign each duplicate a "rank" of $(i+j)/2$. In the above equations, in the expressions $(r_i - i)^2$ and $(s_j - j)^2$, duplicates can modify all four variables $r_i$, $i$, $s_j$, and $j$.
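The midrank assignment itself is easy to sketch: a run of identical values occupying sorted positions $i$ through $j$ receives the rank $(i+j)/2$. The helper name `midranks` below is hypothetical, and the sketch only illustrates the rank assignment, not a full tie-corrected test.

```python
def midranks(values):
    """Return the midrank of each element, in the original order of `values`."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values equal to the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # 1-based positions start + 1 .. end + 1 all share their average.
        mid = ((start + 1) + (end + 1)) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = mid
        start = end + 1
    return ranks


tied_ranks = midranks([10, 20, 20, 30])
```

Here the two copies of 20 occupy positions 2 and 3, so each is assigned the midrank 2.5, while the untied values keep their ordinary ranks 1 and 4.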
References
Cramér, H. (1928). "On the Composition of Elementary Errors". Scandinavian Actuarial Journal. 1928 (1): 13–74. doi:10.1080/03461238.1928.10416862.
von Mises, R. E. (1928). Wahrscheinlichkeit, Statistik und Wahrheit. Julius Springer.
Kolmogorov, A. N. (1933). "Sulla determinazione empirica di una legge di distribuzione". Giorn. Ist. Ital. Attuari, 4, pp. 83–91.
Pearson, E. S.; Hartley, H. O. (1972). Biometrika Tables for Statisticians, Volume 2. CUP. ISBN 0-521-06937-8 (page 118 and Table 54).
Watson, G. S. (1961). "Goodness-Of-Fit Tests on a Circle". Biometrika, 48 (1/2), 109–114. JSTOR 2333135.
Ruymgaart, F. H. (1980). "A unified approach to the asymptotic distribution theory of certain midrank statistics". In: Statistique non Parametrique Asymptotique, 1–18, J. P. Raoult (Ed.), Lecture Notes on Mathematics, No. 821, Springer, Berlin.
Stephens, M. A. (1986). "Tests Based on EDF Statistics". In D'Agostino, R. B.; Stephens, M. A. (eds.). Goodness-of-Fit Techniques. New York: Marcel Dekker. ISBN 0-8247-7487-6.