Completeness (statistics)
Suppose a random variable X (which may be a sequence (X_{1}, ..., X_{n}) of scalar-valued random variables) has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued. A function g(X) is an unbiased estimator of zero if the expectation E(g(X)) remains zero regardless of the value of the parameter θ. Then X is a complete statistic precisely if the only such unbiased estimator of zero is a function g that is zero almost everywhere.
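The definition can be stated compactly in symbols; here E_θ and Pr_θ denote expectation and probability when the parameter value is θ:

```latex
% X is complete for the family parametrized by theta when the only
% unbiased estimator of zero based on X is (almost surely) the zero function:
\[
  \operatorname{E}_{\theta}\bigl(g(X)\bigr) = 0 \ \text{ for all } \theta
  \quad\Longrightarrow\quad
  \Pr{}_{\theta}\bigl(g(X) = 0\bigr) = 1 \ \text{ for all } \theta .
\]
```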
For example, suppose X_{1}, X_{2} are independent, identically distributed random variables, normally distributed with expectation θ and variance 1. Then X_{1} − X_{2} is an unbiased estimator of zero: its expectation is θ − θ = 0 for every θ. Therefore the pair (X_{1}, X_{2}) is not a complete statistic. On the other hand, the sum X_{1} + X_{2} can be shown to be a complete statistic. That means that there is no nonzero function g such that

E(g(X_{1} + X_{2}))

remains zero regardless of changes in the value of θ.
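As a quick numerical check (a sketch, not part of the argument; the function name, sample size, and seed are illustrative choices), the following Python simulation estimates E(X_{1} − X_{2}) for several values of θ and finds it near zero in every case:

```python
import random

def mean_of_difference(theta, n=100_000, seed=0):
    """Monte Carlo estimate of E(X1 - X2) with X1, X2 i.i.d. N(theta, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += rng.gauss(theta, 1.0) - rng.gauss(theta, 1.0)
    return total / n

# The estimate stays near 0 whatever theta is, so g(X1, X2) = X1 - X2
# is an unbiased estimator of zero and the pair (X1, X2) is not complete.
for theta in (-3.0, 0.0, 5.0):
    print(f"theta = {theta:+.1f}: estimated E(X1 - X2) = {mean_of_difference(theta):+.4f}")
```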
That fact may be seen as follows. The probability distribution of X_{1} + X_{2} is normal with expectation 2θ and variance 2. Its probability density function is therefore

f(x) = (1/(2√π)) exp(−(x − 2θ)²/4).

The expectation above would therefore be a constant times

∫_{−∞}^{∞} g(x) exp(−(x − 2θ)²/4) dx.

A bit of algebra reduces this to

k(θ) ∫_{−∞}^{∞} h(x) e^{xθ} dx,

where k(θ) is a nowhere-zero function of θ and h(x) = g(x) e^{−x²/4}. As a function of θ this is a two-sided Laplace transform of h(x), and it cannot be identically zero unless h(x), and hence g(x), is zero almost everywhere.
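The algebra invoked above amounts to expanding the square in the exponent so that the θ-dependence factors out of the integrand; written out:

```latex
% Expanding the square separates the theta-dependence from the x-dependence:
\[
  e^{-(x-2\theta)^2/4}
  = e^{-(x^2 - 4x\theta + 4\theta^2)/4}
  = e^{-\theta^2}\, e^{-x^2/4}\, e^{x\theta},
\]
% so the expectation is a constant times
\[
  e^{-\theta^2} \int_{-\infty}^{\infty} g(x)\, e^{-x^2/4}\, e^{x\theta}\, dx
  = k(\theta) \int_{-\infty}^{\infty} h(x)\, e^{x\theta}\, dx,
\]
% with k(theta) = e^{-theta^2} nowhere zero and h(x) = g(x) e^{-x^2/4}.
```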
One reason for the importance of the concept is the Lehmann–Scheffé theorem, which states that an unbiased estimator that is a function of a complete, sufficient statistic is the best unbiased estimator, i.e., the one whose mean squared error is no larger than that of any other unbiased estimator, or, more generally, whose expected loss is no larger for any convex loss function.
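A small simulation can illustrate the theorem's conclusion for the normal example above (a sketch; the estimators, sample size, and repetition count are illustrative choices). The sample mean is a function of the complete sufficient statistic X_{1} + ... + X_{n}, and its mean squared error comes out below that of another unbiased estimator of θ, such as the sample median:

```python
import random
import statistics

def mse(estimator, theta, n=25, reps=20_000, seed=1):
    """Monte Carlo mean squared error of an estimator of theta,
    computed from n i.i.d. N(theta, 1) observations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(theta, 1.0) for _ in range(n)]
        total += (estimator(sample) - theta) ** 2
    return total / reps

mse_mean = mse(statistics.fmean, theta=2.0)     # theory: 1/n = 0.04
mse_median = mse(statistics.median, theta=2.0)  # theory: about pi/(2n)
print(f"MSE of sample mean:   {mse_mean:.4f}")
print(f"MSE of sample median: {mse_median:.4f}")
```

For the normal location family the median is also unbiased by symmetry, so the comparison is between two unbiased estimators, as the theorem requires.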