To explain why this has a chance of working, consider a two-dimensional case. Suppose we have a function f(x,y) to maximise, subject to g(x,y) = c, where c is a given constant. We can visualise 'contours' given by f(x,y) = d for various values of d. The constraint is to stay on one contour, given by fixing the value of g. Suppose we are walking along that contour, and taking note of the various contours f = d we cross. If they cross the contour transversally, we can increase the value of f by walking 'uphill': that is following the direction leading to higher values of f. Only if the contour f = d we are on actually touches the contour g = c we are confined to will this not be possible. At a constrained maximum of f, that must be true.

A familiar example can be obtained from weather maps, with their contours for temperature and pressure: the constrained extrema will occur where the superposed maps show touching lines (isopleths).

Geometrically we translate the tangency condition to saying that the gradients of f and g are parallel vectors at the maximum. Introducing an unknown scalar λ, the gradient of f -λg is then 0 for some value of λ. This in geometrical form is the Lagrange multiplier argument: f -λg must be stationary, where the multiplier λ is a new variable, at a local extremum.

Table of contents |

2 Example 3 External links |

Let *f* (*x*_{1}, *x*_{2}, … *x*_{n}) be a function defined on an *n*-dimensional open set.
We suppose *f* has continuous partial derivatives on that set.
A constraint is defined by the hypersurface which satisfies *g* (*x*_{1}, *x*_{2}, … *x*_{n}) = *C* within the open set; where *C* is a constant.
Additionally, *g* ≠ **0** everywhere on the curve (where is the gradient).

Given that there is a local maximum (or minimum) of *f*(** x**) on the set {

- .

In general one can apply the reasoning to several constraints, by introducing multipliers λ, μ, ... for each.

The constraint *g* (** x**) =

Suppose we wish to find the discrete probability distribution with maximal information entropy. Then

The method of Lagrange multipliers is generalized by the Kuhn-Tucker Theorem.