Confidence interval

In statistics, confidence intervals are the most prevalent form of interval estimation. If U and V are statistics (i.e., "observable" random variables) whose probability distribution depends on some unobservable parameter θ, and

$$P(U < \theta < V) = 0.9$$

for every value of θ, then the random interval (U, V) is a "90% confidence interval for θ".

How to misunderstand confidence intervals

It is very tempting to misunderstand this statement in the following way. We used capital letters U and V for random variables; it is conventional to use lower-case letters u and v for their observed values in a particular instance. The misunderstanding is the conclusion that

$$P(u < \theta < v) = 0.9,$$

so that after the data has been observed, a conditional probability distribution of θ, given the data, is inferred.

For example, suppose X is normally distributed with expected value θ and variance 1. (It is grossly unrealistic to take the variance to be known while the expected value must be inferred from the data, but it makes the example simple.) The random variable X is observable. (The random variable X − θ is an example of one that is not observable, since its value depends on θ.) Then X − θ is normally distributed with expectation 0 and variance 1; therefore

$$P(-1.645 < X - \theta < 1.645) = 0.9.$$

Consequently

$$P(X - 1.645 < \theta < X + 1.645) = 0.9,$$

so the interval from X − 1.645 to X + 1.645 is a 90% confidence interval for θ. But when X = 82 is observed, can we then say that

$$P(82 - 1.645 < \theta < 82 + 1.645) = 0.9?$$

This conclusion does not follow from the laws of probability because θ is not a "random variable"; i.e., no probability distribution has been assigned to it. Confidence intervals are generally a frequentist method, i.e., employed by those who interpret "90% probability" as "occurring in 90% of all cases". Suppose, for example, that θ is the mass of the planet Neptune, and that the randomness in our measurement error is such that 90% of the time our statement that the mass is between this number and that number will be correct. The mass is not what is random. Therefore, given that we have measured it to be 82 units, we cannot say that in 90% of all cases, the mass is between 82 − 1.645 and 82 + 1.645. There are no such cases; there is, after all, only one planet Neptune.
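
The frequentist reading can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function name `coverage`, the number of trials, and the seed are hypothetical choices made for illustration, and the value θ = 82 is used as the fixed "true" parameter purely for convenience. The parameter stays fixed while the measurement X is redrawn many times, and the code counts how often the random interval from X − 1.645 to X + 1.645 contains it.

```python
import random
from statistics import NormalDist


def coverage(theta, trials=100_000, level=0.90, seed=0):
    """Estimate how often the interval (X - z, X + z) contains a fixed theta.

    Assumption for illustration: X is normal with mean theta and variance 1,
    as in the example in the text. theta never changes; only X is random.
    """
    rng = random.Random(seed)
    # Two-sided critical value: the 95th percentile of the standard normal
    # distribution when level is 0.90 (approximately 1.645).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(trials):
        x = rng.gauss(theta, 1.0)      # one new measurement
        if x - z < theta < x + z:      # did this interval catch theta?
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("critical value:", NormalDist().inv_cdf(0.95))
    print("estimated coverage:", coverage(82.0))
```

The estimated coverage should come out close to 0.90. What the simulation varies is the interval, not θ, which is exactly the sense in which the "90%" describes the procedure rather than any single observed interval such as the one around 82.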

But if probabilities are construed as degrees of belief rather than as relative frequencies of occurrence of random events, i.e., if we are Bayesians rather than frequentists, can we then say we are 90% sure that the mass is between 82 − 1.645 and 82 + 1.645? Many answers to this question have been proposed, and are philosophically controversial. The answer will not be a mathematical theorem, but a philosophical tenet.

[I will add an example of a "recognizable subset" here; i.e., a case in which the data themselves make the epistemic conclusion dubious.]

Concrete practical examples

Here is one of the most familiar realistic examples. Suppose X₁, ..., Xₙ are an independent sample from a normally distributed population with mean μ and variance σ². Let

$$\bar{X} = \frac{X_1 + \cdots + X_n}{n}, \qquad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

be the sample mean and the sample variance. Then

$$T = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

has a Student's t-distribution with n − 1 degrees of freedom. Note that this distribution does not depend on the values of the unobservable parameters μ and σ²; i.e., T is a pivotal quantity. If c is the 95th percentile of this distribution, then

$$P(-c < T < c) = 0.9.$$

(Note: "95" and "90" are both correct: by symmetry, 5% of the probability lies above c and 5% lies below −c, leaving 90% in between. This is a frequent occasion for careless mistakes.)

Consequently

$$P\left(\bar{X} - \frac{cS}{\sqrt{n}} < \mu < \bar{X} + \frac{cS}{\sqrt{n}}\right) = 0.9,$$

and we have a 90% confidence interval for μ.
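
As a rough sketch of how this interval might be computed and checked numerically, the code below uses only the Python standard library. Several things here are assumptions made for illustration and are not part of the text above: the function names `t_interval_90` and `t_coverage`, the sample size n = 10, and the hard-coded constant 1.833, which is the standard table value for the 95th percentile of Student's t-distribution with 9 degrees of freedom and is therefore valid only when n = 10.

```python
import math
import random

# Assumption: table value of the 95th percentile of Student's t with
# 9 degrees of freedom. It applies only to samples of size n = 10.
T_95_DF9 = 1.833


def t_interval_90(sample, c=T_95_DF9):
    """Return the endpoints (mean - c*S/sqrt(n), mean + c*S/sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the n - 1 divisor, matching S^2 in the text.
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    half_width = c * s / math.sqrt(n)
    return mean - half_width, mean + half_width


def t_coverage(mu, sigma, n=10, trials=20_000, seed=0):
    """Estimate how often the interval contains mu over repeated samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = t_interval_90(sample)
        if lo < mu < hi:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(t_interval_90([4.1, 5.3, 6.0, 4.8, 5.5, 5.1, 4.4, 6.2, 5.0, 4.9]))
    print(t_coverage(mu=5.0, sigma=2.0))
    print(t_coverage(mu=-1.0, sigma=0.5))
```

Because the quantity being bounded is pivotal, the estimated coverage should be close to 0.90 for either choice of μ and σ. For another sample size, the constant c would have to be replaced by the 95th percentile of the t-distribution with n − 1 degrees of freedom.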