Lagrange multiplier
Lagrange multipliers are a method for solving certain optimization problems.
The class of problems to be solved deals with finding local extrema of multidimensional functions with one-dimensional contraints in the form of a curve.
The method reduces the constrained problem to an unconstrained problem which can be solved by the usual gradient method.
Let f (x1, x2, … xn) be a function defined on an n-dimensional open set.
f has to have continuous partial derivatives on that set.
The constraint is defined by the curve which satisfies g (x1, x2, … xn) = C which is contained in the open set (C is a constant).
Additionally,
Suppose we wish to find the discrete probability distribution with maximal information entropy. Then
To find the point of maximum entropy (depending on the probabilities), we can use the Lagrange multiplier.
We set:
For another example, see also Derivation of the partition function.
Formalism of the method of Lagrange multipliers
g ≠ 0 everywhere on the curve (where
is the gradient).
The task is now to find a local maximum (or minimum) of f(x) on the set {x ∈ Rn | g (x) = C}.
By setting the derivative of f along the curve g (x) = C to zero, one can show that there exists a real number λ (the Lagrange multiplier) such that
or - written differently
for each x that is such a local extremum.
Because the solution of this equation still gives values outside of the curve, the constraint g (x) = C has to be used again.
This results in all possible solutions.
Example
Of course, the sum of these probabilities equals 1, so our constraint function is
Carrying out the differentation of these n equations, we get
This shows that all pi are equal (because they depend on λ only).
By using the constraint ∑k pk = 1 again, we get the uniform distribution: