Negative binomial distribution

In probability theory, the negative binomial distribution is the probability distribution of the number of trials needed to get a fixed (i.e., non-random) number of successes in a Bernoulli process. If the random variable X is the number of trials needed to get r successes in a series of trials where each trial has success probability p, then X follows the negative binomial distribution with parameters r and p.

Formulas

Parameters: r (number of successes) is an integer with r ≥ 1; the special case r = 1 gives the geometric distribution.

p (probability of success on each trial) is a real number with 0 < p < 1.

Support (domain where probability mass > 0) = set of all integers ≥ r.

Probability mass function f(x) = P(X = x) = probability that the rth success occurs on the xth trial

= C(x − 1, r − 1) p^r (1 − p)^{x − r} for x = r, r + 1, r + 2, ... (see binomial coefficient).

Cumulative distribution function F(x) = P(X ≤ x) = probability that the rth success occurs on or before the xth trial: this is a finite sum of the mass function with no simpler elementary closed form, but it can be computed via the regularized incomplete beta function, as with the binomial distribution: F(x) = I_p(r, x − r + 1) for integers x ≥ r.

Expected value E[X] = r/p.

Variance var(X) = σ² = r(1 − p)/p².
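The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function names (nb_pmf, nb_cdf, nb_mean, nb_variance) are made up for this sketch, and the cumulative distribution function is computed by direct summation rather than through the incomplete beta function.

```python
from math import comb


def nb_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): probability that the r-th success occurs on trial x."""
    if x < r:
        return 0.0  # the r-th success cannot occur before trial r
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def nb_cdf(x: int, r: int, p: float) -> float:
    """P(X <= x), obtained by summing the mass function from r to x."""
    return sum(nb_pmf(k, r, p) for k in range(r, x + 1))


def nb_mean(r: int, p: float) -> float:
    """Expected number of trials, r / p."""
    return r / p


def nb_variance(r: int, p: float) -> float:
    """Variance of the number of trials, r (1 - p) / p^2."""
    return r * (1 - p) / p**2
```

Summing x · nb_pmf(x, r, p) over a long enough range of x reproduces nb_mean(r, p) to within rounding error, which is a quick sanity check on the expected-value formula.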

Example

(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)

Johnny, a sixth grader at Honey Creek Middle School in Terre Haute, Indiana, is required to sell candy bars in his neighborhood to raise money for the sixth-grade field trip. There are thirty homes in his neighborhood, and his father has told him not to return home until he has sold five candy bars. So the boy goes door to door, selling candy bars. At each home he visits, he has a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.

What's the probability mass function for selling the last candy bar at the xth house?

f(x) = C(x − 1, 4) 0.4^5 0.6^{x − 5} = 0.01024 C(x − 1, 4) 0.6^{x − 5} for x = 5, 6, 7, ...

What's the probability that he finishes on the tenth house?

f(10) = 0.01024 × C(9, 4) × 0.6^5 = 0.01024 × 126 × 0.07776 ≈ 0.100

What's the probability that he finishes on or before reaching the eighth house?

Answer: To finish on or before the eighth house, he must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:

f(5) = 0.0102; f(6) = 0.0307; f(7) = 0.0553; f(8) = 0.0774; sum(f(j), j=5..8) = 0.1737

What's the probability that he exhausts all houses in the neighborhood, gives up, and then goes to live on the streets?

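Answer: He gives up only if he has sold fewer than five candy bars after visiting all thirty houses, so the probability is 1 − F(30) ≈ 0.0015.

Moral: Negative binomial distributions don't turn our children out on the streets; bad parenting does.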
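The figures in this example can be checked numerically. The sketch below assumes only the values stated in the problem (five candy bars, thirty houses, a 0.4 chance of a sale at each house); the variable names are chosen for illustration.

```python
from math import comb

R = 5        # candy bars Johnny must sell
P = 0.4      # probability of a sale at each house
HOUSES = 30  # houses in the neighborhood


def f(x: int) -> float:
    """Probability that the fifth candy bar is sold at house x."""
    if x < R:
        return 0.0
    return comb(x - 1, R - 1) * P**R * (1 - P) ** (x - R)


finish_at_10 = f(10)
finish_by_8 = sum(f(x) for x in range(R, 9))
# Not finished after every house: fewer than five sales in thirty visits.
not_finished = 1 - sum(f(x) for x in range(R, HOUSES + 1))

print(f"finishes at the tenth house:     {finish_at_10:.4f}")  # 0.1003
print(f"finishes by the eighth house:    {finish_by_8:.4f}")   # 0.1737
print(f"runs out of houses (all thirty): {not_finished:.4f}")  # 0.0015
```

The last quantity is the tail probability P(X > 30), the chance that the fifth sale has still not happened after the thirtieth house.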

Properties

If X_r is a random variable following the negative binomial distribution with parameters r and p, then X_r is a sum of r independent variables following the geometric distribution with parameter p. As a result of the central limit theorem, X_r is therefore approximately normal for sufficiently large r.
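The sum-of-geometrics property can be checked exactly, without simulation, by convolving the geometric mass function with itself r times. The sketch below is one way to do this; the helper names are hypothetical, and the geometric distribution is taken here to count the trial on which the first success occurs (support 1, 2, 3, ...).

```python
from math import comb


def geometric_pmf(p: float, n_max: int) -> list[float]:
    """Mass function of the trial of the first success, indexed by x = 0..n_max."""
    return [0.0] + [p * (1 - p) ** (x - 1) for x in range(1, n_max + 1)]


def convolve(a: list[float], b: list[float], n_max: int) -> list[float]:
    """Mass function of the sum of two independent variables, kept up to n_max."""
    out = [0.0] * (n_max + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += a_i * b_j
    return out


def sum_of_geometrics_pmf(r: int, p: float, n_max: int) -> list[float]:
    """Mass function of the sum of r independent geometric variables."""
    pmf = geometric_pmf(p, n_max)
    for _ in range(r - 1):
        pmf = convolve(pmf, geometric_pmf(p, n_max), n_max)
    return pmf


def nb_pmf(x: int, r: int, p: float) -> float:
    """Negative binomial mass function: r-th success on trial x."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


r, p, n_max = 5, 0.4, 30
summed = sum_of_geometrics_pmf(r, p, n_max)
largest_gap = max(abs(summed[x] - nb_pmf(x, r, p)) for x in range(n_max + 1))
print(largest_gap < 1e-12)  # True: the two mass functions agree up to rounding
```

Truncating at n_max does not affect the values at or below n_max, because a sum of positive integers that equals x only involves terms no larger than x.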

Furthermore, if Y_s is a random variable following the binomial distribution with parameters s and p, then

Pr[X_r ≤ s] = Pr[Y_s ≥ r] = Pr["after s trials, there are at least r successes"]

In this sense, the negative binomial distribution is the "inverse" of the binomial distribution. Every question about probabilities of negative binomial variables can be translated into an equivalent one about binomial variables.
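The identity Pr[X_r ≤ s] = Pr[Y_s ≥ r] is easy to confirm numerically. The following sketch compares the two sides for a range of s; the function names are illustrative only.

```python
from math import comb


def nb_cdf(s: int, r: int, p: float) -> float:
    """Pr[X_r <= s]: the r-th success occurs on or before trial s."""
    return sum(
        comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r) for x in range(r, s + 1)
    )


def binomial_at_least(r: int, s: int, p: float) -> float:
    """Pr[Y_s >= r]: at least r successes in s trials."""
    return sum(comb(s, k) * p**k * (1 - p) ** (s - k) for k in range(r, s + 1))


r, p = 5, 0.4
gaps = [abs(nb_cdf(s, r, p) - binomial_at_least(r, s, p)) for s in range(0, 31)]
print(max(gaps) < 1e-12)  # True: both sides agree for every s from 0 to 30
```

For s < r both sides are zero, since r successes cannot occur in fewer than r trials.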

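The negative binomial distribution also arises as a continuous mixture of Poisson distributions in which the Poisson parameter λ is itself drawn from a Gamma distribution. The mixture is supported on 0, 1, 2, ..., so in the notation of this article it is the distribution of the number of failures X − r rather than of the number of trials X.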
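The mixture can be worked out directly. The derivation below is a sketch under stated assumptions: K denotes the number of failures (K = X − r in the notation of this article), and the Gamma distribution is given shape r and rate β, a parameterization chosen here for convenience. The final line uses the fact that Γ(k + r)/(k! Γ(r)) equals the binomial coefficient when r is a positive integer.

```latex
\begin{aligned}
\Pr[K = k]
  &= \int_0^\infty \frac{\lambda^{k} e^{-\lambda}}{k!}\,
     \frac{\beta^{r} \lambda^{r-1} e^{-\beta\lambda}}{\Gamma(r)}\, d\lambda \\
  &= \frac{\beta^{r}}{k!\,\Gamma(r)}
     \int_0^\infty \lambda^{k+r-1} e^{-(1+\beta)\lambda}\, d\lambda \\
  &= \frac{\Gamma(k+r)}{k!\,\Gamma(r)}
     \left(\frac{\beta}{1+\beta}\right)^{r}
     \left(\frac{1}{1+\beta}\right)^{k} \\
  &= \binom{k+r-1}{k}\, p^{r} (1-p)^{k},
     \qquad p = \frac{\beta}{1+\beta}.
\end{aligned}
```

Substituting k = x − r gives C(x − 1, r − 1) p^r (1 − p)^{x − r}, the mass function of the number of trials.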

Explanation of the name

Suppose X is a random variable with a negative binomial distribution with parameters r and p. The statement that the sum from x = r to infinity, of the probability Pr[X = x], is equal to 1, can be shown by a bit of algebra to be equivalent to the statement that p^{− r} = (1 − (1 − p))^{− r} is what Newton's binomial theorem says it should be.

Suppose Y is a random variable with a binomial distribution with parameters n and p. The statement that the sum from y = 0 to n, of the probability Pr[Y = y], is equal to 1, says that 1 = (p + (1 − p))ⁿ is what the strictly finitary binomial theorem of high-school algebra says it should be.

Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial theorem that the binomial distribution bears to the positive-integer-exponent case.
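The "bit of algebra" for the negative binomial case can be spelled out as follows. This is a sketch using the mass function of this article, with the substitution k = x − r and the identity C(−r, k) (−1)^k = C(k + r − 1, k) for the generalized binomial coefficient.

```latex
\begin{aligned}
\sum_{x=r}^{\infty} \binom{x-1}{r-1}\, p^{r} (1-p)^{x-r}
  &= p^{r} \sum_{k=0}^{\infty} \binom{k+r-1}{k} (1-p)^{k}
     && (k = x - r) \\
  &= p^{r} \sum_{k=0}^{\infty} \binom{-r}{k} \bigl(-(1-p)\bigr)^{k} \\
  &= p^{r}\, \bigl(1 - (1-p)\bigr)^{-r}
     && \text{(Newton's binomial theorem)} \\
  &= p^{r}\, p^{-r} = 1 .
\end{aligned}
```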