In probability theory and statistics, the covariance between two real-valued random variables X and Y, with expected values E(X) = μ and E(Y) = ν, is defined as:

    cov(X, Y) = E((X − μ)·(Y − ν))

This is equivalent to the following formula, which is commonly used in actual calculations:

    cov(X, Y) = E(X·Y) − μ·ν
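
As an illustration, the definition E((X − μ)·(Y − ν)) and the computational form E(X·Y) − μ·ν can be compared on a small finite distribution. The joint distribution and the helper name `expect` below are hypothetical, chosen only for this sketch; exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> probability (illustrative values only).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}

def expect(f, dist):
    """Expected value of f(x, y) under a finite joint distribution."""
    return sum(p * f(x, y) for (x, y), p in dist.items())

mu = expect(lambda x, y: x, joint)  # E(X)
nu = expect(lambda x, y: y, joint)  # E(Y)

# Definition: E((X - mu)(Y - nu))
cov_definition = expect(lambda x, y: (x - mu) * (y - nu), joint)

# Computational form: E(XY) - mu * nu
cov_shortcut = expect(lambda x, y: x * y, joint) - mu * nu

print(cov_definition, cov_shortcut)
```

Both expressions evaluate to the same value for this distribution, as the algebraic equivalence requires.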

For column-vector-valued random variables X and Y with respective expected values μ and ν, and with n and m scalar components respectively, the covariance is defined to be the n×m matrix

    cov(X, Y) = E((X − μ)·(Y − ν)^T)

where ^T denotes the transpose.
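
A minimal sketch of the matrix case, assuming a finite set of equally likely outcomes: entry (i, j) of the result is the scalar covariance between component i of X and component j of Y, so a vector with n = 2 components paired with one with m = 3 components gives a 2×3 matrix. The data and the function names `mean_vector` and `cross_cov` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical equally likely outcomes: each entry is (X, Y), X in R^2, Y in R^3.
outcomes = [
    ((1, 0), (1, 2, 0)),
    ((0, 1), (0, 1, 1)),
    ((1, 1), (2, 0, 1)),
    ((2, 0), (1, 1, 2)),
]

def mean_vector(vectors):
    """Component-wise mean of equally likely vectors."""
    count = len(vectors)
    return [Fraction(sum(column), count) for column in zip(*vectors)]

def cross_cov(pairs):
    """n x m matrix whose (i, j) entry is E((X_i - mu_i)(Y_j - nu_j))."""
    mu = mean_vector([x for x, _ in pairs])
    nu = mean_vector([y for _, y in pairs])
    count = len(pairs)
    return [
        [sum((x[i] - mu[i]) * (y[j] - nu[j]) for x, y in pairs) / count
         for j in range(len(nu))]
        for i in range(len(mu))
    ]

c = cross_cov(outcomes)
print(len(c), "x", len(c[0]))  # number of rows x number of columns
```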

If X and Y are independent, then their covariance is zero. This follows because under independence, E(X·Y) = E(X)·E(Y). The converse, however, is not true: it is possible that X and Y are not independent, yet their covariance is zero.
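
A standard counterexample for the converse, sketched here under the assumption that X takes the values −1, 0 and 1 with equal probability and that Y = X·X. Y is completely determined by X, so the two are not independent, yet their covariance is zero.

```python
from fractions import Fraction

third = Fraction(1, 3)

# Joint distribution of (X, Y) with X uniform on {-1, 0, 1} and Y = X * X.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

e_x = sum(p * x for (x, y), p in joint.items())       # E(X)
e_y = sum(p * y for (x, y), p in joint.items())       # E(Y)
e_xy = sum(p * x * y for (x, y), p in joint.items())  # E(XY)
covariance = e_xy - e_x * e_y

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print(covariance, p_x0_y0, p_x0 * p_y0)
```

The covariance is zero, while the joint probability of (X = 0, Y = 0) differs from the product of the marginal probabilities, which rules out independence.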

If X and Y are real-valued random variables and c is a constant ("constant", in this context, means non-random), then the following facts are a consequence of the definition of covariance:

    cov(X, X) = var(X)
    cov(X, Y) = cov(Y, X)
    cov(cX, Y) = c·cov(X, Y)
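
Standard consequences of the definition, such as cov(X, X) = var(X), the symmetry cov(X, Y) = cov(Y, X), and cov(cX, Y) = c·cov(X, Y) for a constant c, can be checked on a concrete case. The sketch below uses a hypothetical finite joint distribution and illustrative helper names; it verifies the identities for one example and is not a proof.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> probability (illustrative values only).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}

def expect(f):
    """E(f(X, Y)) under the joint distribution above."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def cov(f, g):
    """Covariance of f(X, Y) and g(X, Y), computed as E(fg) - E(f)E(g)."""
    return expect(lambda x, y: f(x, y) * g(x, y)) - expect(f) * expect(g)

def first(x, y):   # the random variable X
    return x

def second(x, y):  # the random variable Y
    return y

c = 5  # an arbitrary non-random constant

def scaled_first(x, y):  # the random variable cX
    return c * x

# Variance of X computed directly as E((X - E(X))^2).
var_x = expect(lambda x, y: (x - expect(first)) ** 2)

checks = {
    "cov(X, X) = var(X)": cov(first, first) == var_x,
    "cov(X, Y) = cov(Y, X)": cov(first, second) == cov(second, first),
    "cov(cX, Y) = c cov(X, Y)": cov(scaled_first, second) == c * cov(first, second),
}
print(checks)
```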

For vector-valued random variables, cov(X, Y) and cov(Y, X) are each other's transposes.
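
The transpose relationship can be checked on a small example. The sketch below assumes equally likely outcomes and computes each matrix in the form E(A·B^T) − E(A)·E(B)^T; the data and the function name `cov_matrix` are hypothetical.

```python
from fractions import Fraction

# Hypothetical equally likely outcomes: each entry is (X, Y), X in R^2, Y in R^3.
outcomes = [
    ((1, 0), (1, 2, 0)),
    ((0, 1), (0, 1, 1)),
    ((1, 1), (2, 0, 1)),
    ((2, 0), (1, 1, 2)),
]

def cov_matrix(a_vectors, b_vectors):
    """cov(A, B) as E(A B^T) - E(A) E(B)^T for equally likely outcomes."""
    count = len(a_vectors)
    mean_a = [Fraction(sum(column), count) for column in zip(*a_vectors)]
    mean_b = [Fraction(sum(column), count) for column in zip(*b_vectors)]
    return [
        [Fraction(sum(a[i] * b[j] for a, b in zip(a_vectors, b_vectors)), count)
         - mean_a[i] * mean_b[j]
         for j in range(len(mean_b))]
        for i in range(len(mean_a))
    ]

xs = [x for x, _ in outcomes]
ys = [y for _, y in outcomes]

cov_xy = cov_matrix(xs, ys)  # 2 x 3 matrix
cov_yx = cov_matrix(ys, xs)  # 3 x 2 matrix

# Transposing cov(Y, X) should reproduce cov(X, Y).
transpose_of_yx = [list(row) for row in zip(*cov_yx)]
print(cov_xy == transpose_of_yx)
```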

The covariance is sometimes called a measure of "linear dependence" between the two random variables. That phrase does not mean the same thing here as it does in a more formal linear algebraic setting (see linear dependence), although the two meanings are related. The correlation is a closely related concept used to measure the degree of linear dependence between two variables.
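
To illustrate how the two quantities relate, the sketch below uses the usual definition of the correlation coefficient, cov(X, Y) divided by the product of the standard deviations of X and Y. The data are hypothetical: X is uniform on {0, 1, 2, 3} and Y = 2X + 1, an exactly linear relationship. The covariance depends on the scale of the variables, while the correlation of an increasing linear relationship is 1.

```python
import math

# Hypothetical data: X uniform on {0, 1, 2, 3}, and Y = 2X + 1 (exactly linear in X).
xs = [0, 1, 2, 3]
ys = [2 * x + 1 for x in xs]

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """E((A - E(A))(B - E(B))) for equally likely paired outcomes."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)

cov_xy = covariance(xs, ys)
sd_x = math.sqrt(covariance(xs, xs))  # standard deviation of X
sd_y = math.sqrt(covariance(ys, ys))  # standard deviation of Y
correlation = cov_xy / (sd_x * sd_y)

print(cov_xy, correlation)
```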