Skip to content Skip to navigation

Connexions

You are here: Home » Content » Constrained Optimization

Navigation

Content Actions

  • Download module PDF
  • Add to ...
    Add the module to:
    • My Favorites
    • A lens
    • An external social bookmarking service
    • My Favorites (What is 'My Favorites'?)
      'My Favorites' is a special kind of lens which you can use to bookmark modules and collections directly in Connexions. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need a Connexions account to use 'My Favorites'.
    • A lens (What is a lens?)

      Definition of a lens

      Lenses

      A lens is a custom view of Connexions content. You can think of it as a fancy kind of list that will let you see Connexions through the eyes of organizations and people you trust.

      What is in a lens?

      Lens makers point to Connexions materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

      Who can create a lens?

      Any individual Connexions member, a community, or a respected organization.

    • External bookmarks
  • E-mail the author

Recently Viewed

Constrained Optimization

Module by: Don Johnson

Constrained optimization is the minimization of an objective function subject to constraints on the possible values of the independent variable. Constraints can be either equality constraints or inequality constraints. Because the scalar-variable case follows easily from the vector one, only the latter is discussed in detail here.

Equality Constraints

The typical constrained optimization problem has the form minx{fx}   subject to   gx=0 x f x   subject to   g x 0 where f· f · is the scalar-valued objective function and g· g · is the vector-valued constraint function. Strict convexity of the objective function is not sufficient to guarantee a unique minimum; in addition, each component of the constraint must be strictly convex to guarantee that the problem has a unique solution. Because of the constraint, stationary points of f· f · alone may not be solutions to the constrained problem: they may not satisfy the constraints. In fact, solutions to the constrained problem are often not stationary points of the objective function. Consequently, the ad hoc technique of searching for all stationary points of the objective function that also satisfy the constraint do not work.

The classical approach to solving constrained optimization problems is the method of Lagrange multipliers. This approach converts the constrained optimization problem into an unconstrained one, thereby allowing use of the techniques described in the previous section. The Lagrangian of a constrained optimization problem is defined to be the scalar-valued function Lxλ=fx+λTgx L x λ f x λ g x Essentially, the following theorem states that stationary points of the Lagrangian are potential solutions of the constrained optimization problem: as always, each candidate solution must be tested to determine which minimizes the objective function.

theorem 1

Let x x denote a local solution to the constrained optimization problem given above where the gradients x g 1 x x g 1 x , …, x g M x x g M x of the constraint function's components are linearly independent. There then exists a unique vector λ λ such that xLxλ=0 x L x λ 0 Furthermore, the quadratic form yTx2Lxλy y x 2 L x λ y is non-negative for all y y satisfying xgxTy=0 x g x y 0 .

The latter result in the theorem says that the Hessian of the Lagrangian evaluated at its stationary points is non-negative definite with respect to all vectors orthogonal to the gradient of the constraint. This result generalizes the notion of a positive definite Hessian in unconstrained problems.

The rather abstract result of the preceding theorem has a simple geometric interpretation. As shown in Figure 1, the constraint corresponds to a contour in the x x plane.

Figure 1: The thick line corresponds to the contour of the values of xx satisfying the constraint equation gx=0 g x 0 . The thinner lines are contours of constant values of the objective function fx f x . The contour corresponding to the smallest value of the objective function just tangent to the constraint contour is the solution to the optimization problem with equality constraints.
Geometric interpretation of Lagrange multipliers.

	    Geometric interpretation of Lagrange multipliers.  
	   (conopt.jpg)
A contour map of the objective function indicates those values of x x for which fx=c f x c . In this figure, as c c becomes smaller, the contours shrink to a small circle in the center of the figure. The solution to the constrained optimization problem occurs when the smallest value of cc is chosen for which the contour just touches the constraint contour. At that point, the gradient of the objective function and of the constraint contour are proportional to each other. This proportionality vector is λ λ , the so-called Lagrange multiplier. The Lagrange multiplier's exact value must be such that the constraint is exactly satisfied. Note that the constraint can be tangent to the objective function's contour map for larger values of c c. These potential, but erroneous, solutions can be discarded only by evaluating the objective function.

Example 1

A typical problem arising in signal processing is to minimize xTAx x A x subject to the linear constraint cTx=1 c x 1 . AA is a positive definite, symmetric matrix (a correlation matrix) in most problems. Clearly, the minimum of the objective function occurs at x=0 x 0 , but his solution cannot satisfy the constraint. The constraint gx=cTx-1 g x c x 1 is a scalar-valued one; hence the theorem of Lagrange applies as there are no multiple components in the constraint forcing a check of linear independence. The Lagrangian is Lxλ=xTAx+λcTx-1 L x λ x A x λ c x 1 Its gradient is 2Ax+λc 2 A x λ c with a solution x=- λ A-1c2 x λ A c 2 . To find the value of the Lagrange multiplier, this solution must satisfy the constraint. Imposing the constraint, λ cTA-1c=-2 λ c A c -2 ; thus, λ =-2cTA-1c λ -2 c A c and the total solution is x=A-1ccTA-1c x A c c A c

When the independent variable is complex-valued, the Lagrange multiplier technique can be used if care is taken to make the Lagrangian real. If it is not real, we cannot use the theorem that permits computation of stationary points by computing the gradient with respect to z¯ z alone. The Lagrangian may not be real-valued even when the constraint is real. Once insured real, the gradient of the Lagrangian with respect to the conjugate of the independent vector can be evaluated and the minimization procedure remains as before.

Example 2

Consider slight variations to the previous example: let the vector z z be complex so that the objective function is zHAz z A z where A A is a positive definite, Hermitian matrix and let the constraint be linear, but vector-valued ( Cz=c C z c ). The Lagrangian is formed from the objective function and the real part of the usual constraint term. Lzλ=zHAz+λHCz-c+λTC¯z¯-c¯ L z λ z A z λ C z c λ C z c For the Lagrange multiplier theorem to hold, the gradients of each component of the constraint must be linearly independent. As these gradients are the columns of C C, their mutual linear independence means that each constraint vector must not be expressible as a linear combination of the others. We shall assume this portion of the problem statement true. Evaluating the gradient with respect to z¯ z , keeping z z a constant, and setting the result equal to zero yields Az+CHλ=0 A z C λ 0 The solution is z z is -A-1CHλ A C λ . Applying the constraint, we find that CA-1CHλ=-c C A C λ c . Solving for the Lagrange multiplier and substituting the result into the solution, we find that the solution to the constrained optimization problem is z=A-1CHCA-1CH-1c z A C C A C c The indicated matrix inverses always exist: A A is assumed invertible and CA-1CH C A C is invertible because of the linear independence of the constraints.

Inequality Constraints

When some of the constraints are inequalities, the Lagrange multiplier technique can be used, but the solution must be checked carefully in its details. But first, the optimization problem with equality and inequality constraints is formulated as minx{fx}   subject to   gx=0   and   hx0 x f x   subject to   g x 0   and   h x 0 As before, f· f · is the scalar-valued objective function and g· g · is the equality constraint function; h· h · is the inequality constraint function.

The key result which can be used to find the analytic solution to this problem is to first form the Lagrangian in the usual way as Lxλμ=fx+λTgx+μThx L x λ μ f x λ g x μ h x . The following theorem is the general statement of the Lagrange multiplier technique for constrained optimization problems.

theorem 2

Let x x be a local minimum for the constrained optimization problem. If the gradients of gg's components and the gradients of those components of h· h · for which h i x=0 h i x 0 are linearly independent, then xLxλμ=0 x L x λ μ 0 where μ0 μ 0 and μi h i x=0 μ i h i x 0

The portion of this result dealing with the inequality constraint differs substantially from that concerned with the equality constraint. Either a component of the constraint equals its maximum value (zero in this case) and the corresponding component of its Lagrange multiplier is non-negative (and is usually positive) or a component is less than the constraint and its component of the Lagrange multiplier is zero. This latter result means that some components of the inequality constraint are not as stringent as others and these lax ones do not affect the solution.

The rationale behind this theorem is a technique for converting the inequality constraint into an equality constraint: h i x0 h i x 0 is equivalent to h i x+ s i 2=0 h i x s i 2 0 . Since the new term, called a slack variable, is non-negative, the constraint must be non-positive. With the inclusion of slack variables, the equality constraint theorem can be used and the above theorem results. To prove the theorem, not only does the gradient with respect to x x need to be considered, but also with respect to the vector s s of slack variables. The ith ith component of the gradient of the Lagrangian with respect to s s at the stationary point is 2 μ i s i =0 2 μ i s i 0 . If in solving the optimization problem, s i =0 s i 0 , the inequality constraint was in reality an equality constraint and that component of the constraint behaves accordingly. As s i =- h i x s i h i x , s i =0 s i 0 implies that that component of the inequality constraint must equal zero. On the other hand, if s i 0 s i 0 , the corresponding Lagrange multiplier must be zero.

Example 3

Consider the problem of minimizing a quadratic form subject to a linear equality constraint and an inequality constraint on the norm of the linear constraint vector's variation. minx{xTAx}   subject to   c+δTx=1   and   δ2ε x x A x   subject to   c δ x 1   and   δ 2 ε This kind of problem arises in robust estimation. One seeks a solution where one of the "knowns" of the problem, c c in this case, is, in reality, only approximately specified. The independent variables are x x and δ δ. The Lagrangian for this problem is Lxδλμ=xTAx+λc+δTx-1+μδ2-ε L x δ λ μ x A x λ c δ x 1 μ δ 2 ε Evaluating the gradients with respect to the independent variables yields 2Ax+ λ c+δ=0 2 A x λ c δ 0 λ x+2 μ δ=0 λ x 2 μ δ 0 The latter equation is key. Recall that either μ =0 μ 0 or the inequality constraint is satisfied with equality. If μ μ is zero, that implies that x x must be zero which will not allow the equality constraint to be satisfied. The inescapable conclusion is that δ2=ε δ 2 ε and that δ δ is parallel to x x : δ=- λ 2 μ x δ λ 2 μ x . Using the first equation, x x is found to be x=- λ 2A- λ 24μI-1c x λ 2 A λ 2 4 μ I c Imposing the constraints on this solution results in a pair of equations for the Lagrange multipliers. 1/4 λ 2 μ 2cTA-1/4 λ 2 μ I-2c=ε 14 λ 2 μ 2 c A 14 λ 2 μ I -2 c ε cTA-1/4 λ 2 μ I-1c=-2 λ -2ε μ λ 2 c A 14 λ 2 μ I c 2 λ 2 ε μ λ 2 Multiple solutions are possible and each must be checked. The rather complicated completion of this example is left to the (numerically oriented) reader.

Comments, questions, feedback, criticisms?

Send feedback