Numerical integration

In scientific computing and numerical analysis, the term numerical integration is used to describe a broad family of algorithms for calculating the numerical value of a definite integral, and by extension, the term is also sometimes used to describe numerical algorithms for solving differential equations. Basically, numerical integration seeks an approximate solution to some integral:

Table of contents

1 Reasons for Numerical Integration
2 Methods

2.1 Quadrature
2.2 Adaptive algorithms
2.3 Conservative (a priori) error estimation

Reasons for Numerical Integration

Numerical integration is performed for three main reasons:

The function is only known in certain discrete points, such as obtained by sampling. Several embedded systems and other computer applications may need a lot of numerical integration for this reason.
We know the function , but it is impossible to calculate the integral analytically, because a primitive function, which is needed for the integration, is impossible to obtain. (One such function is the probability density function of the normal distribution.)
We know the function , but it is is too hard to solve it analytically, and we want to fall back on approximation.

Methods

The most primitive numerical integration method is suggested by the Riemann integral:

(*)

The sum on the right is termed a Riemann sum, since it looks like the integral of a step function (see the section on the Riemann integral.) Of course, it may not be possible in general to construe the sum on the right as being either a lower, or upper sum, however, if one can build actual lower and upper sums, then this algorithm can be made to yield an interval of certainty.

The disadvantage of this algorithm is that it has extremely poor convergence properties. In order to get any precision at all, one needs n to be extremely large, even for extremely well behaved functions. For instance, with

the integral on the left of (*) is 1/2 while the sum on the right of (*) is 1/2 + 1/(2n). Hence, if we use a sum with n terms in it, the error we obtain in our numerical approximation is 1/(2n). To obtain, say, 9 digits of precision, we must use 10⁹ terms in the sum, which has two problems. The first problem is that roundoff (see numerical analysis) will begin to dominate our calculations. The second is that the running time for 10⁹ iterations of any algorithm is usually too expensive; it takes too long to run.

We note at this point that the step function in (*) was built by approximating f(x) over each interval by the value of f at x=(k+1)/n, that is, the right-hand side of the interval. Had we chosen the middle of the interval instead, for a large class of functions (at least all twice differentiable functions) the error term of the algorithm would have been at least one order of magnitude better.

Quadrature

Using the function value in the middle of two equidistant points and multiply it by the length of the interval is referred to as the midpoint rule or rectangular rule in numerical integration. In (*) this would mean evaluating the function at f(0.5) and multiply by 1. This is a common method.

Another very simple choice to approximate the integral on the left hand side of (*) would be to evaluate f(0) and f(1), then to approximate f with a linear polynomial between 0 and 1, and then calculate the exact area under this linear polynomial. This yields

This is called the trapezoid rule of numerical integration. Instead of using a single trapezoid, we can use multiple trapezoids placed at regular intervals, in which case we obtain

The preceding formula is called the iterated trapezoid rule.

Instead of using linear polynomials, one can use polynomials of any degree, these algorithms are called quadrature algorithms.

Interpolation with polynomials on equally-spaced intervals, i.e. where the integration span is divided into equidistant subintervals, correspond to Newton-Cotes formulas. In this group you will find the midpoint rule with polynomial index 0, the trapezoid rule with index 1 and at index 2 a quadratic polynomial interpolating between three points at a time. Using quadratic polynomials leads to Simpson's rule.

If we lose the criteria of equally spaced intervals, we find another group of quadrature formulas, the best of which are known as Gaussian quadrature. (Another such quadrature formula is to interpolate using Chebyshev polynomials.) The error for such quadrature can be shown to be very small, if the function f is very smooth (if it has many derivatives.)

Adaptive algorithms

If f does not have many derivatives at all points, or if the derivatives become large, then Gaussian quadrature is often insufficient. In this case, an algorithm similar to the following will perform better:

   // This algorithm calculates the definite integral of a function
   // from 0 to 1, adaptively, by choosing smaller steps near
   // problematic points.
   // Set initial_h to the initial step size.
   x:=0
   h:=initial_h
   accumulator:=0
   WHILE x<1.0 DO
       IF x+h>1.0 THEN
           h=1.0-x
       END IF
       IF error from quadrature over [x,x+h] for f is too large THEN
           Make h smaller
       ELSE
           accumulator:=accumulator + quadrature of f over [x,x+h]
           x:=x+h
           IF error from quadrature over [x,x+h] is very small THEN
               Make h larger
           END IF
       END IF
   END WHILE
   RETURN accumulator

Some details of the algorithm require careful thought. For many cases, estimating the error from quadrature over an interval for a function f isn't obvious. One popular solution is to use two different rules of quadrature, and use their difference as an estimate of the error from quadrature. The other problem is deciding what "too large" or "very small" signify. A possible criterion for "too large" is that the quadrature error should not be larger than th where t, a real number, is the tolerance we wish to set for global error. Then again, if h is already tiny, it may not be worthwhile to make it even smaller even if the quadrature error is apparently large. This type of error analysis is usually called "a posteriori" since we compute the error after having computed the approximation.

Of course, more sophisticated heuristics are possible.

Conservative (a priori) error estimation

Let f have a bounded first derivative over [a,b]. The mean value theorem for f, where , gives

for some y_x in [a,x] depending on x. If we integrate in x from a to b on both sides and the take absolute values, we obtain

We can further approximate the integral on the right-hand side by bringing the absolute value into the integrand, and replacing the term in f' by an upper bound:

(**)

(See supremum.) Hence, if we approximate the integral ∫_a^bf(x)dx by the quadrature rule (b-a)f(a) our error is no greater than the right hand side of (**). We can convert this into an error analysis for the Riemann sum (*), giving an upper bound of

for the error term of that particular approximation. (Note that this is precisely the error we calculated for the example .) Using more derivatives, and by tweaking the quadrature, we can do a similar error analysis using a Taylor series (using a partial sum with remainder term) for f. This error analysis gives a strict upper bound on the error, if the derivatives of f are available.

This integration method can be combined with interval arithmetic to produce computer proofs and verified calculations.

See also: Runge-Kutta method, Numerical partial differential equations, Finite element method, Monte Carlo simulation.