Skip to content Skip to navigation

Connexions

You are here: Home » Content » Constrained Least Squares (CLS) problem

Navigation

Recently Viewed

This feature requires Javascript to be enabled.
 

Constrained Least Squares (CLS) problem

Module by: Ricardo Vargas. E-mail the author

One of the common obstacles to innovation occurs when knowledge settles on a particular way of dealing with problems. While new ideas keep appearing suggesting innovative approaches to design digital filters, it is all too common in practice that l2l2 and ll dominate error criteria specifications. This section is devoted to exploring a different way of thinking about digital filters. It is important to note that up to this point we are not discussing an algorithm yet. The main concern being brought into play here is the specification (or description) of the design problem. Once the Constrained Least Squares (CLS) problem formulation is introduced, we will present an IRLS implementation to solve it, and will justify our approach over other existing approaches. It is the author's belief that under general conditions one should always use our IRLS implementation over other methods, especially when considering the associated management of transition regions.

The CLS problem was introduced in (Reference) and is repeated here for clarity,

min h D ( ω ) - H ( ω ; h ) 2 subject to | D ( ω ) - H ( ω ; h ) | τ min h D ( ω ) - H ( ω ; h ) 2 subject to | D ( ω ) - H ( ω ; h ) | τ
(1)

To the best of our knowledge this problem was first introduced in the context of filter design by John Adams [1] in 1991. The main idea consists in approximating iteratively a desired frequency response in a least squares sense except in the event that any frequency exhibits an error larger than a specified tolerance ττ. At each iteration the problem is adjusted in order to reduce the error on offending frequencies (i.e. those which do not meet the constraint specifications). Ideally, convergence is reached when the altered least squares problem has a frequency response whose error does not exceed constraint specifications. As will be shown below, this goal might not be attained depending on how the problem is posed.

Adams and some collaborators have worked in this problem and several variations [2]. However his main (and original) problem was illustrated in [1] with the following important assumption: the definition of a desired frequency response must include a fixed non-zero width transition band. His method uses Lagrange multiplier theory and alternation methods to find frequencies that exceed constraints and minimize the error at such locations, with an overall least squares error criterion.

Burrus, Selesnick and Lang [3] looked at this problem from a similar perspective, but relaxed the design specifications so that only a transition frequency needs to be specified. The actual transition band does indeed exist, and it centers itself around the specified transition frequency; its width adjusts as the algorithm iterates (constraint tolerances are still specified). Their solution method is similar to Adams' approach, and explicitly uses the Karush-Kuhn-Tucker (KKT) conditions together with an alternation method to minimize the least squares error while constraining the maximum error to meet specifications.

C. S. Burrus and the author of this work have been working on the CLS problem using IRLS methods with positive results. This document is the first thorough presentation of the method, contributions, results and code for this approach, and constitutes one of the main contributions of this work. It is crucial to note that there are two separate issues in this problem: on one hand there is the matter of the actual problem formulation, mainly depending on whether a transition band is specified or not; on the other hand there is the question of how the selected problem description is actually met (what algorithm is used). Our approach follows the problem description by Burrus et al. shown in [3] with an IRLS implementation.

Two problem formulations

As mentioned in (Reference), one can address problem Equation 1 in two ways depending on how one views the role of the transition band in a CLS problem. The original problem posed by Adams in [1] can be written as follows,

min h D ( ω ) - H ( ω ; h ) 2 subject to | D ( ω ) - H ( ω ; h ) | τ ω [ 0 , ω p b ] [ ω s b , π ] min h D ( ω ) - H ( ω ; h ) 2 subject to | D ( ω ) - H ( ω ; h ) | τ ω [ 0 , ω p b ] [ ω s b , π ]
(2)

where 0<ωpb<ωsb<π0<ωpb<ωsb<π. From a traditional standpoint this formulation feels familiar. It assigns fixed frequencies to the transition band edges as a number of filter design techniques do. As it turns out, however, one might not want to do this in CLS design.

An alternate formulation to Equation 2 could implicitly introduce a transition frequency ωtbωtb (where ωpb<ωtb<ωsbωpb<ωtb<ωsb); the user only specifies ωtbωtb. Consider

min h D ( ω ) - H ( ω ; h ) 2 ω [ 0 , π ] subject to | D ( ω ) - H ( ω ; h ) | τ ω [ 0 , ω p b ] [ ω s b , π ] min h D ( ω ) - H ( ω ; h ) 2 ω [ 0 , π ] subject to | D ( ω ) - H ( ω ; h ) | τ ω [ 0 , ω p b ] [ ω s b , π ]
(3)

The algorithm at each iteration generates an induced transition band in order to satisfy the constraints in Equation 3. Therefore {ωpb,ωsb}{ωpb,ωsb} vary at each iteration.

Figure 1: Two formulations for Constrained Least Squares problems.
Two graphs showing two formulations for Constrained Least Squares Problems.

It is critical to point out the differences between Equation 2 and Equation 3. Figure 1.a explains Adams' CLS formulation, where the desired filter response is only specified at the fixed pass and stop bands. At any iteration, Adams' method attempts to minimize the least squares error (ε2ε2) at both bands while trying to satisfy the constraint ττ. Note that one could think of the constraint requirements in terms of the Chebishev error εε by writing Equation 2 as follows,

min h D ( ω ) - H ( ω ; h ) 2 subject to D ( ω ) - H ( ω ; h ) τ ω [ 0 , ω p b ] [ ω s b , π ] min h D ( ω ) - H ( ω ; h ) 2 subject to D ( ω ) - H ( ω ; h ) τ ω [ 0 , ω p b ] [ ω s b , π ]
(4)

In contrast, Figure 1.b illustrates our proposed problem Equation 3. The idea is to minimize the least squared error ε2ε2 across all frequencies while ensuring that constraints are met in an intelligent manner. At this point one can think of the interval (ωpb,ωsb)(ωpb,ωsb) as an induced transition band, useful for the purposes of constraining the filter. Section 2 presents the actual algorithms that solve Equation 3, including the process of finding {ωpb,ωsb}{ωpb,ωsb}.

It is important to note an interesting behavior of transition bands and extrema points in l2l2 and ll filters. Figure 2 shows l2l2 and ll length-15 linear phase filters (designed using Matlab's firls and firpm functions); the transition band was specified at {ωpb=0.4/π,ωsb=0.5/π}{ωpb=0.4/π,ωsb=0.5/π}. The dotted l2l2 filter illustrates an important behavior of least squares filters: typically the maximum error of an l2l2 filter is located at the transition band. The solid ll filter shows why minimax filters are important: despite their larger error across most of the bands, the filter shows the same maximum error at all extrema points, including the transition band edge frequencies. In a CLS problem then, typically an algorithm will attempt to reduce iteratively the maximum error (usually located around the transition band) of a series of least squares filters.

Figure 2: Comparison of l2l2 and ll filters.
A graph consisting of two waveforms one is identified by a solid line and labeled L_∞. The other is identified by a dotted line and labeled L_2. The x-axis is labeled ω/π and the y-axis is labeled |H(ω)|. Both waveforms follow the same general path starting on the left high on the y-axis and dropping drastically to 0 on the y axis and bouncing along the x axis to the end of the graph.

Another important fact results from the relationship between the transition band width and the resulting error amplitude in ll filters. Figure 3 shows two ll designs; the transition bands were set at {0.4/π,0.5/π}{0.4/π,0.5/π} for the solid line design, and at {0.4/π,0.6/π}{0.4/π,0.6/π} for the dotted line one. One can see that by widening the transition band a decrease in error ripple amplitude is induced.

Figure 3: Effects of transition bands in ll filters.
This graph contains two waveforms. One wave is identified by a thin solid line and the other is identified by a dotted line. The x-axis is labeled ω/π and the y-axis is labeled |H(ω)|. Both wave forms follow the same general path. They start high on the y-axis and proceed to the right. The waves drop drastically to the x-axis and bounce along the x-axis until the end of the graph. The upper portion of the waves are bound vertically by two sets of horizontal line. Solid line has a greated amplitude and contained within these bounds are two dotted lines which mark the bounds of the dotted line. The dotted line has a smaller amplitude. After the drastic descent, the upper bound of the solid line wave is marked by a solid line. The dotted line wave's upper bound is marked by a horizontal dotted line. The amplitude of the solid line is greater than the dotted line.

These two results together illustrate the importance of the transition bandwidth for a CLS design. Clearly one can decrease maximum error tolerances by widening the transition band. Yet finding the perfect balance between a transition bandwidth and a given tolerance can prove a difficult task, as will be shown in Section 2. Hence the relevance of a CLS method that is not restricted by two types of specifications competing against each other. In principle, one should just determine how much error one can live with, and allow an algorithm to find the optimal transition band that meets such tolerance.

Two problem solutions

Section 1 introduced some important remarks regarding the behavior of extrema points and transition bands in l2l2 and ll filters. As one increases the constraints on an l2l2 filter, the result is a filter whose frequency response looks more and more like an ll filter.

(Reference) introduced the frequency-varying problem and an IRLS-based method to solve it. It was also mentioned that, while the method does not solve the intended problem (but a similar one), it could prove to be useful for the CLS problem. As it turns out, in CLS design one is merely interested in solving an unweighted, constrained least squares problem. In this work, we achieve this by solving a sequence of weighted, unconstrained least squares problems, where the sole role of the weights is to "constraint" the maximum error of the frequency response at each iteration. In other words, one would like to find weights ww such that

min h D ( ω ) - H ( ω ; h ) 2 subject to D ( ω ) - H ( ω ; h ) τ ω [ 0 , ω p b ] [ ω s b , π ] min h D ( ω ) - H ( ω ; h ) 2 subject to D ( ω ) - H ( ω ; h ) τ ω [ 0 , ω p b ] [ ω s b , π ]
(5)

is equivalent to

min h w ( ω ) · ( D ( ω ) - H ( ω ; h ) ) 2 min h w ( ω ) · ( D ( ω ) - H ( ω ; h ) ) 2
(6)

Hence one can revisit the frequency-varying design method and use it to solve the CLS problem. Assuming that one can reasonably approximate ll by using high values of pp, at each iteration the main idea is to use an lplp weighting function only at frequencies where the constraints are exceeded. A formal formulation of this statement is

w ϵ ( ω ) = | ϵ ( ω ) | p - 2 2 if | ϵ ( ω ) | > τ 1 otherwise w ϵ ( ω ) = | ϵ ( ω ) | p - 2 2 if | ϵ ( ω ) | > τ 1 otherwise
(7)

Assuming a suitable weighting function existed such that the specified tolerances are related to the frequency response constraints, the IRLS method would iterate and assign rather large weights to frequencies exceeding the constraints, while inactive frequencies get a weight of one. As the method iterates, frequencies with large errors move the response closer to the desired tolerance. Ideally, all the active constraint frequencies would eventually meet the constraints. Therefore the task becomes to find a suitable weighting function that penalizes large errors in order to have all the frequencies satisfying the constraints; once this condition is met, we have reached the desired solution.

Figure 4: CLS polynomial weighting function.
This image contains three graphs. The first graph represents Polynomial weighting (p=100). The second graph represents Linear-log view (p=100), and the third graph represents Linear-log view (p=500). The x-axis is labeled ε for all of these graphs, and the y-axis is labeled w(ε).

One proposed way to find adequate weights to meet constraints is given by a polynomial weighting function of the form

w ( ω ) = 1 + ϵ ( ω ) τ p - 2 2 w ( ω ) = 1 + ϵ ( ω ) τ p - 2 2
(8)

where ττ effectively serves as a threshold to determine whether a weight is dominated by either unity or the familiar lplp weighting term. Figure 4 illustrates the behavior of such a curve.

Figure 5: Original l2l2 guess for CLS algorithm.
This graph represents L_2 Initial Guess. The x-axis is labeled f and the y-axis is labeled H(f). There is one wave form present in this graph. The waveform starts at the coordinate (0,1) and then drops drastically to (0.25,0) and bounces along y=0 until (0,0.5) where the graph ends.
Figure 6: CLS design example using mild constraints.
This graph represents CLS Solution (τ=0.06). The x-axis is labeled f and the y-axis is labeled H(f). There is one wave form present in this graph. The waveform starts at the coordinate (0,1) and then drops drastically to (0.25,0) and bounces along y=0 until (0,0.5) where the graph ends.
Figure 7: CLS design example using tight constraints.
This graph represents CLS Solution (τ=0.03). The x-axis is labeled f and the y-axis is labeled H(f). There is one wave form present in this graph. The waveform starts at the coordinate (0,1) and then drops drastically to (0.25,0) and bounces along y=0 until (0,0.5) where the graph ends.
Figure 8: CLS design example without transition bands.
This image contains three graphs. The upper most graph is labeled L_2 Solution. The middle graph is labeled Intermediate Solution and the bottom graph is labeled CLS Solution. The x and y-axes of all graphs are labeled f and H(f) respectively. The waveforms of all of these graphs are similar. The waveform starts at the coordinate (0,1) and then drops drastically to (0.3,0) and bounces along y=0 until (0,0.5) where the graph ends. The only main difference between the waves is the amplitude of the wave, especially the peaks directly before and after the drastic fall.

In practice the method outlined above has proven robust particularly in connection with the specified transition band design. Consider the least squares design in Figure 5 (using a length-21 Type-I linear phase low-pass FIR filter with linear transition frequencies {0.2,0.25}{0.2,0.25}). This example illustrates the typical effect of CLS methods over l2l2 designs; the largest error (in an ll sense) can be located at the edges of the transition band. Figures Figure 6 and Figure 7 illustrate design examples using the proposed approach. Figure 6 shows an example of a mild constraint (τ=0.6τ=0.6), whereas Figure 7 illustrates an advantage of this method, associated to a hard constraint (τ=0.3τ=0.3). The method is trying iteratively to reduce the maximum error towards the constraint; however the specified constraint in Figure 7 is such that even at the point where an equiripple response is reached for the specified transition bands the constraint is not met. At this point the method converges to an optimal lplp solution that approximates equiripple as pp increases (the examples provided use p=50p=50).

A different behavior occurs when no transition bands are defined. Departing from an initial l2l2 guess (as shown in Figure 8.a) the proposed IRLS-based CLS algorithm begins weighting frequencies selectively in order to reduce the ll error towards the constraints ττ at each iteration. Eventually an equiripple behavior can be observed if the constraints are too harsh (as in Figure 8.b). The algorithm will keep weighting until all frequencies meet the constraints (as in Figure 8.c). The absence of a specified transition band presents some ambiguity in defining valid frequencies for weighting. One cannot (or rather should not) apply weights too close to the transition frequency specified as this would result in an effort by the algorithm to create a steep transition region (which as mentioned previously is counterintuitive to finding an equiripple solution). In a sense, this would mean having two opposite effects working at the same time and the algorithm cannot accommodate both, usually leading to numerical problems.

Figure 9: Definition of induced transition band.
This is a diagram of a waveform. The wave is a bold line that rises and falls with a dotted line marking the middle of the peaks and troughs. Two black lines mark the upper and lower extremes of this wave. There are three peaks in this diagram with the final peak marked 'Largest Ripple'. The middle section is labeled 'weighting corridor'. The space between the dotted line and solid upper and lower bounds are marked with brackets each labeled α.

In order to avoid these issues, an algorithm can be devised that selects a subset of the sampled frequencies for weighting purposes at each iteration. The idea is to identify the largest ripple per band at each iteration (the ripple associated with the largest error for a given band) and select the frequencies within that band with errors equal or smaller than such ripple error. In this way one avoids weighting frequencies around the transition frequency. This idea is illustrated in Figure 9.

Figure 10: CLS weights.
This image contains to graphs. The upper graph is labeled 'Weights at early iteration', and the lower graph is labeled 'Weight near convergence'. The x and y-axes on both graphs are labeled f and w(f) respectively. The wave forms of both graphs are similar each has peaks along the x-axis at .05, .1, .15, .2, .3,.35, .4 and .45. The amplitude of each of the graphs increase from .05 to .2 and then the amplitudes are mirrored from .3 to.45. The main difference is the height of the amplitude. The upper graph is slightly higher than the bottom graph and the rise of the wave is more gradual than the bottom one.

The previous example is fundamental since it illustrates the relevance of this method: since for a particular transition band the tightest constraint that one can get is given by the equiripple (or minimax) design (as shown in Section 1), a problem might arise when specifications are tighter than what the minimax design can meet. Adams found this problem (as reported in [1]); his method breaks under these conditions. The method proposed here overcomes an inadequate constraint and relaxes the transition band to meet the constraint.

It is worth noting that the polynomial weighting form works even when no transition bands are specified (this must become evident from Figure 8.c above). However, the user must be aware of some practical issues related to this approach. Figure 10 shows a typical CLS polynomial weighting function. Its "spiky" character becomes more dramatic as pp increases (the method still follows the homotopy and partial updating ideas from previous sections) as shown in Figure 10.b. It must be evident that the algorithm will assign heavy weights to frequencies with large errors, but at pp increases the difference in weighting exaggerates. At some point the user must make sure that proper sampling is done to ensure that frequencies with large weights (from a theoretical perspective) are being included in the problem, without compromising conputational efficiency (by means of massive oversampling, which can lead to ill-conditioning in numerical least squares methods). Also as pp increases, the range of frequencies with signifficantly large weights becomes narrower, thus reducing the overall weighting effect and affecting convergence speed.

Figure 11: CLS envelope weighting function.
A diagram of a CLS envelope weighting function. The diagram three main components. There are two horizontal dotted lines in the middle of the wave. The upper line is labeled 1|τ and the lower line is labeled | τ. The peaks and troughs of the wave are marked by rectangle caps that are labeled Weighted Frequencies.

A second weighting form can be defined where envelopes are used. The envelope weighting function approach works by assigning a weight to all frequencies not meeting a constraint. The value of such weights are assigned as flat intervals as illustrated in Figure 11. Intervals are determined by the edge frequencies within neighborhoods around peak error frequencies for which constraints are not met. Clearly these neighborhoods could change at each iteration. The weight of the kk-th interval is still determined by our typical expression,

w k ( ω ) = | ϵ ( ω k + ) | p - 2 2 w k ( ω ) = | ϵ ( ω k + ) | p - 2 2
(9)

where ωk+ωk+ is the frequency with largest error within the kk-th interval.

Envelope weighting has been applied in practice with good results. It is particularly effective at reaching high values of pp without ill-conditioning, allowing for a true alternative to minimax design. Figure 12 shows an example using τ=0.4τ=0.4; the algorithm managed to find a solution for p=500p=500. By specifying transition bands and unachievable constraints one can produce an almost equiripple solution in an efficient manner, with the added flexibility that milder constraints will result in CLS designs.

Figure 12: CLS design example using envelope weights.
This images contains two graphs. The first graph is labeled CLS Envelope Solution and the second graph is labeled Envelope Weights. The x-axes are labeled for both graphs. The y-axis of the first graph is labeled H(f) and the y-axis is labeled w(f) for the second graph. The Wave form of the first graph starts at (0,1) and progresses horizontally till (0.2,1) where the wave drops drastically to (.25,0) and then continues horizontally to (0.5,0). The second graph consist of a series of short horizontal line segments.

Comparison with lplp problem

This chapter presented two problems with similar effects. On one hand, (Reference) illustrated the fact (see (Reference)) that as pp increases towards infinity, an lplp filter will approximate an ll one. On the other hand, (Reference) presented the constrained least squared problem, and introduced IRLS-based algorithms that produce filters that approximate equiripple behavior as the constraint specifications tighten.

A natural question arises: how do these methods compare with each other? In principle it should be possible to compare their performances, as long as the necessary assumptions about the problem to be solved are compatible in both methods. Figure 13 shows a comparison of these algorithms with the following specifications:

  • Both methods designed length-21 Type-I lowpass linear phase digital filters with fixed transition bands defined by f={0.2,0.24}f={0.2,0.24} (in normalized linear frequency).
  • The lplp experiment used the following values of pp:
    p={2,2.2,2.5,3,4,5,7,10,15,20,30,50,70,100,170,400}p={2,2.2,2.5,3,4,5,7,10,15,20,30,50,70,100,170,400}
    (10)
  • The CLS experiment used the polynomial weighting method with fixed transition bands and a value of p=60p=60. The error tolerances were
    τ={.06,.077,.078,.8,.084,.088,.093,.1,.11,.12,.13,.14,.15,.16,.17,.18}τ={.06,.077,.078,.8,.084,.088,.093,.1,.11,.12,.13,.14,.15,.16,.17,.18}
    (11)

Some conclusions can be derived from Figure 13. Even though at the extremes of the curves they both seem to meet, the CLS curve lies just below the lplp curve for most values of pp and ττ. These two facts should be expected: on one hand, in principle the CLS algorithm gives an l2l2 filter if the constraints are so mild that they are not active for any frequency after the first iteration (hence the two curves should match around p=2p=2). On the other hand, once the constraints become too harsh, the fixed transition band CLS method basically should design an equiripple filter, as only the active constraint frequencies are lplp-weighted (this effects is more noticeable with higher values of pp). Therefore for tight constraints the CLS filter should approximate an ll filter.

The reason why the CLS curve lies under the lplp curve is because for a given error tolerance (which could be interpreted as for a given minimax error

 
εε) the CLS method finds the optimal l2l2 filter. An lplp filter is optimal in an lplp sense; it is not meant to be optimal in either the l2l2 or ll senses. Hence for a given ττ it cannot beat the CLS filter in an l2l2 sense (it can only match it, which happens around p=2p=2 or p=p=).

It is important to note that both curves are not drastically different. While the CLS curve represents optimality in an l2-ll2-l sense, not all the problems mentioned in this work can be solved using CLS filters (for example, the magnitude IIR problem presented in (Reference)). Also, one of the objectives of this work is to motivate the use of lplp norms for filter design problems, and the proposed CLS implementations (which absolutely depends on IRLS-based lplp formulations) are good examples of the flexibility and value of lplp IRLS methods discussed in this work.

Figure 13: Comparison between CLS and lplp problems.
This graph is a representation of the Comparison of CLS versus L_P errors. The x-axis is labeled ε_2 and the y-axis is labeled ε_∞. The graph contains two different curves. One curve represent L_p and is identified by a dotted blue line with points indicated by green stars. The other curve represents CLS curve is identified by a red dashed line with points identified by green squares. The L_p curve sits higher than the CLS curve.

References

  1. Adams, John W. (1991, April). FIR Digital Filters with Least-Squares Stopbands Subject to Peak-Gain Constraints. IEEE Trans. on Circuits and Systems, 39(4), 376-388.
  2. Adams, John W. and Sullivan, James L. (1998, Febr.). Peak-Constrained Least-Squares Optimization. IEEE Trans. on Signal Processing, 46(2), 306-321.
  3. Selesnick, Ivan W. and Lang, Markus and Burrus, Charles S. (1996, August). Constrained Least Square Design of FIR Filters without Specified Transition Bands. IEEE Transactions on Signal Processing, 44(8), 1879-1892.

Content actions

Download module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks