
Completeness (statistics)

Suppose a random variable X (which may be a sequence (X1, ..., Xn) of scalar-valued random variables) has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued. A function g(X) is an unbiased estimator of zero if the expectation E(g(X)) remains zero regardless of the value of the parameter θ. Then X is a complete statistic precisely if it admits no such unbiased estimator of zero other than a function that is zero with probability one.
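The definition can be written symbolically; a standard formulation (the notation P_θ for the probability under parameter θ is introduced here, not in the text above):

```latex
% Completeness of the statistic X with respect to the family indexed by theta:
\mathbb{E}_{\theta}\!\left[\, g(X) \,\right] = 0 \quad \text{for all } \theta
\qquad \Longrightarrow \qquad
P_{\theta}\!\left(\, g(X) = 0 \,\right) = 1 \quad \text{for all } \theta .
```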

For example, suppose X1, X2 are independent, identically distributed random variables, normally distributed with expectation θ and variance 1. Then X1 − X2 is an unbiased estimator of zero, since its expectation is θ − θ = 0 for every value of θ. Therefore the pair (X1, X2) is not a complete statistic. On the other hand, the sum X1 + X2 can be shown to be a complete statistic. That means that there is no non-zero function g such that

    E(g(X1 + X2))

remains zero regardless of changes in the value of θ. That fact may be seen as follows. The probability distribution of X1 + X2 is normal with expectation 2θ and variance 2. Its probability density function is therefore

    f(x) = (1/(2√π)) exp(−(x − 2θ)²/4).
The expectation above would therefore be a constant times

    ∫ g(x) exp(−(x − 2θ)²/4) dx.

A bit of algebra (expanding the square in the exponent) reduces this to

    exp(−θ²) ∫ h(x) exp(xθ) dx,   where h(x) = g(x) exp(−x²/4).
As a function of θ this is a two-sided Laplace transform of h(x), and cannot be identically zero unless h(x) is zero almost everywhere, and hence unless g is zero almost everywhere.
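The non-completeness of the pair (X1, X2) can be illustrated numerically. The following sketch (not part of the original article; it uses only the standard library) checks by simulation that the expectation of the unbiased estimator of zero, X1 − X2, stays near zero no matter which θ generates the data:

```python
import random

def mean_of_difference(theta, n=200_000, seed=0):
    """Monte Carlo estimate of E(X1 - X2) with X1, X2 i.i.d. N(theta, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x1 = rng.gauss(theta, 1.0)
        x2 = rng.gauss(theta, 1.0)
        total += x1 - x2
    return total / n

# The estimate is close to zero for every choice of theta,
# so g(X1, X2) = X1 - X2 is a non-zero unbiased estimator of zero.
for theta in (-3.0, 0.0, 5.0):
    print(theta, mean_of_difference(theta))
```

By contrast, the argument above shows that no such non-zero g exists as a function of the sum X1 + X2 alone.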

One reason for the importance of the concept is the Lehmann-Scheffé theorem, which states that an unbiased estimator that is a function of a complete sufficient statistic is the best unbiased estimator, i.e., the one that has the smallest mean squared error among all unbiased estimators, or, more generally, the smallest expected loss for any convex loss function.
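In the normal example above, the sample mean (X1 + X2)/2 is unbiased and a function of the complete sufficient statistic X1 + X2, so by the theorem it should beat any other unbiased estimator, such as the (also unbiased) single observation X1. A small simulation sketch of that comparison (illustrative, standard library only):

```python
import random

def mse(estimator, theta, n_obs=2, reps=100_000, seed=1):
    """Monte Carlo mean squared error of an estimator of theta,
    given n_obs i.i.d. N(theta, 1) observations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        total += (estimator(xs) - theta) ** 2
    return total / reps

sample_mean = lambda xs: sum(xs) / len(xs)  # function of X1 + X2
first_obs = lambda xs: xs[0]                # unbiased, but ignores X2

theta = 2.0
print(mse(sample_mean, theta))  # near 0.5, the variance of the mean of two obs
print(mse(first_obs, theta))    # near 1.0, the variance of one observation
```

Both estimators are unbiased, but only the sample mean is a function of the complete sufficient statistic, and its mean squared error is about half as large.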