Kolmogorov-Smirnov test

In statistics, the Kolmogorov-Smirnov test (KS test) is used to determine whether two samples are drawn from different distributions, or whether the empirical distribution of a sample differs from a theoretical distribution.

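The empirical cumulative distribution function for n observations y_i is defined as E(x) = (1/n) Σ_i 1[y_i ≤ x], that is, the fraction of observations less than or equal to x. The two one-sided Kolmogorov-Smirnov test statistics are given by

D_n^+ = sup_x (E(x) - F(x))
D_n^- = sup_x (F(x) - E(x))
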
where F(x) is the hypothesized distribution or another empirical distribution. The probability distributions of these two statistics, given that the null hypothesis of equality of distributions is true, do not depend on what the hypothesized distribution is, as long as it is continuous. Knuth (The Art of Computer Programming, Volume 2) gives a detailed description of how to analyze the significance of this pair of statistics. Many people use the two-sided statistic max(D_n^+, D_n^-) instead, but the distribution of this statistic is more difficult to deal with.
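
As an illustration of the definitions above, here is a minimal Python sketch that computes the two one-sided statistics for a sample against a continuous hypothesized distribution. The function names (ks_one_sided, one_sided_pvalue_asymptotic) are made up for this example. The significance estimate uses the standard large-sample approximation P(D_n^+ ≥ d) ≈ exp(-2 n d²), which is an assumption added here and not necessarily the procedure Knuth describes; it is rough for small n.

```python
import math


def ks_one_sided(sample, cdf):
    """Return (D_plus, D_minus) for a sample against a continuous CDF.

    `cdf` is any callable giving the hypothesized F(x).
    """
    ys = sorted(sample)
    n = len(ys)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # E(x) is a step function that rises from i/n to (i+1)/n at the
    # (i+1)-th smallest observation, so both suprema are reached at
    # (or just below) the observed points.
    d_plus = max((i + 1) / n - cdf(y) for i, y in enumerate(ys))
    d_minus = max(cdf(y) - i / n for i, y in enumerate(ys))
    return d_plus, d_minus


def one_sided_pvalue_asymptotic(d, n):
    """Large-n approximation to P(D_n^+ >= d) under the null hypothesis.

    The same formula applies to D_n^-. Assumption: n is large enough
    for the asymptotic form to be useful.
    """
    return math.exp(-2.0 * n * d * d)


# Three observations tested against the uniform distribution on [0, 1].
d_plus, d_minus = ks_one_sided([0.1, 0.4, 0.7], lambda x: x)
print(d_plus, d_minus)  # roughly 0.3 and 0.1
print(one_sided_pvalue_asymptotic(d_plus, 3))
```

Sorting the sample first keeps the computation at O(n log n), and evaluating only at the observations is enough because F is non-decreasing between the jumps of E.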

When the underlying independent variable is cyclic, as with day of the year or day of the week, Kuiper's test is more appropriate; Numerical Recipes is a good source of information on this. Furthermore, the Kolmogorov-Smirnov test is more sensitive near the median of the distribution than in its tails. The Anderson-Darling test gives more weight to the tails and so provides comparable sensitivity there.
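
To make the remark about cyclic variables concrete, the following Python sketch computes Kuiper's statistic V = D_n^+ + D_n^-, the sum of the two one-sided statistics. The function name kuiper_statistic is hypothetical, and the example assumes the cyclic variable has been rescaled to [0, 1) and is tested against a uniform distribution. It shows only the statistic, not its significance levels.

```python
def kuiper_statistic(sample, cdf):
    """Return Kuiper's V = D_plus + D_minus for a sample and a continuous CDF."""
    ys = sorted(sample)
    n = len(ys)
    if n == 0:
        raise ValueError("sample must be non-empty")
    d_plus = max((i + 1) / n - cdf(y) for i, y in enumerate(ys))
    d_minus = max(cdf(y) - i / n for i, y in enumerate(ys))
    return d_plus + d_minus


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1) (illustrative null hypothesis)."""
    return x


# A cyclic variable rescaled to [0, 1), e.g. fraction of the year.
original = [0.1, 0.4, 0.7]
# The same data with the starting point of the cycle moved by half a period.
shifted = [(y + 0.5) % 1.0 for y in original]

v_original = kuiper_statistic(original, uniform_cdf)
v_shifted = kuiper_statistic(shifted, uniform_cdf)
print(v_original, v_shifted)  # the two values agree up to rounding
```

In this example the individual values of D_n^+ and D_n^- change when the origin of the cycle is moved, but their sum does not, which is why the choice of starting day does not affect Kuiper's test.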

External links

http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm - A lovely explanation of the one-sided KS test
http://www.io.com/~ritter/JAVASCRP/NORMCHIK.HTM - JavaScript code that implements both the one-sided and two-sided tests
http://www.nr.com/nronline_switcher.html - A discussion of Numerical Recipes (ISBN 0521431085), a prime resource for this sort of thing