If one poses the l_p approximation problem in solving an overdetermined set of
equations (case 2 [14]), it arises from defining the equation error norm as


$$||e||_p = \left[ \sum_n |e(n)|^p \right]^{1/p} \qquad (16)$$

and finding the x that minimizes this p-norm of the equation error.
It has been shown that this is equivalent to solving a weighted least squared error problem
for the appropriate weights.
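To see where those weights come from, note that each term of the p-norm sum can be written as a weighted square (this short derivation is added here for clarity; it simply matches terms between the two sums):

$$\sum_n |e(n)|^p \;=\; \sum_n |e(n)|^{p-2}\,|e(n)|^2 \;=\; \sum_n w(n)^2\,|e(n)|^2, \qquad \text{with } w(n) = |e(n)|^{(p-2)/2}.$$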


$$||e||_p = \left[ \sum_n w(n)^2\, |e(n)|^2 \right]^{1/2} \qquad (17)$$

If one takes Equation 16 and factors the term being summed in the following manner, comparison with Equation 17 suggests the iteratively reweighted least squares algorithm which is the subject of these notes.


$$||e||_p = \left[ \sum_n |e(n)|^{(p-2)}\, |e(n)|^2 \right]^{1/p} \qquad (18)$$

To find the minimum l_p approximate solution,
we propose the iteratively reweighted least squares (IRLS) error algorithm,
which starts with unity weighting, W = I, solves for an
initial x with Equation 15, calculates a new error from Equation 13,
which is then used to set a new weighting matrix W with diagonal elements of
$$w(n) = |e(n)|^{(p-2)/2} \qquad (19)$$

to be used in the next iteration of Equation 15.
Using this, we find a new solution x and repeat until convergence
(if it happens!). To guarantee convergence, this process should be a contraction map which
converges to a fixed point that is the solution. A simple Matlab program
that implements this algorithm is:
Listing 1: Program 1. Basic Iterative Reweighted Least Squares Algorithm

% m-file IRLS0.m to find the optimal solution to Ax = b
% minimizing the L_p norm ||Ax - b||_p, using basic IRLS.
% csb 11/10/2012
function x = IRLS0(A,b,p,KK)
if nargin < 4, KK = 10; end;
x = pinv(A)*b;                   % Initial L_2 solution
E = [];
for k = 1:KK                     % Iterate
   e = A*x - b;                  % Error vector
   w = abs(e).^((p-2)/2);        % Error weights for IRLS
   W = diag(w/sum(w));           % Normalize weight matrix
   WA = W*A;                     % Apply weights
   x = (WA'*WA)\(WA'*W)*b;       % Weighted L_2 solution
   ee = norm(e,p); E = [E ee];   % Error at each iteration
end
plot(E)
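For readers without Matlab, a minimal NumPy sketch of Listing 1 follows; the function name irls0 is my own choice, and the error plot is omitted, but the weight and solution updates follow the listing.

```python
import numpy as np

def irls0(A, b, p, KK=10):
    """Basic IRLS for min ||A x - b||_p (NumPy sketch of Listing 1)."""
    x = np.linalg.pinv(A) @ b               # initial L_2 solution
    for _ in range(KK):                     # iterate
        e = A @ x - b                       # error vector
        w = np.abs(e) ** ((p - 2) / 2)      # error weights for IRLS
        W = np.diag(w / w.sum())            # normalized weight matrix
        WA = W @ A                          # apply weights
        x = np.linalg.solve(WA.T @ WA, WA.T @ W @ b)  # weighted L_2 solution
    return x
```

For p = 2 the weights are all equal, so each iteration simply reproduces the ordinary least squares solution.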

This core idea has been repeatedly proposed and developed in different
application areas over the past 50 or so years with a variety of successes [11], [37], [3]. Used
in this basic form, it reliably converges for 1.5 < p < 3. In 1970 [31], a
modification was made by only partially updating the solution each iteration using
$$x(k) = q\,\hat{x}(k) + (1-q)\,x(k-1) \qquad (20)$$

where $\hat{x}$ is the new weighted least squares solution of Equation 15
which is used to only partially update the previous value x(k-1), and k is the iteration index.
The first use of this partial update optimized the value of q on each
iteration to give more robust convergence, but it slowed the total algorithm
considerably.
A second improvement was made by using a specific update factor of
$$q = \frac{1}{p-1} \qquad (21)$$

to significantly improve the rate of asymptotic convergence. With this particular factor and for p > 2, the
algorithm becomes a form of Newton's method which has quadratic asymptotic convergence [30], [7]
but, unfortunately, the initial convergence became less robust.
A third modification applied homotopy [6], [45], [44], [47], [30], [24]
by starting with a value of p equal to 2 and increasing it each iteration
(or over the first few iterations) until it reaches the desired value or, in the case of p < 2, decreasing it.
This improved the initial performance of the algorithm and made a significant increase in both the
range of p that allowed convergence and in the speed of calculations. Some
of the history and details can be found applied to digital filter design in [3], [7].
A Matlab program that implements these ideas applied to our pseudoinverse
problem with more equations than unknowns is:
Listing 2: Program 2. IRLS Algorithm with Newton's Updating and Homotopy for M > N

% m-file IRLS1.m to find the optimal solution to Ax = b
% minimizing the L_p norm ||Ax - b||_p, using IRLS.
% Newton iterative update of solution, x, for M > N.
% For 2 < p < infinity, use homotopy parameter K = 1.01 to 2
% For 0 < p < 2, use K = approx 0.7 - 0.9
% csb 10/20/2012
function x = IRLS1(A,b,p,K,KK)
if nargin < 5, KK = 10; end;
if nargin < 4, K = 1.5; end;
if nargin < 3, p = 10; end;
pk = 2;                          % Initial homotopy value
x = pinv(A)*b;                   % Initial L_2 solution
E = [];
for k = 1:KK                     % Iterate
   if p >= 2, pk = min([p, K*pk]);   % Homotopy change of p
   else pk = max([p, K*pk]); end
   e = A*x - b;                  % Error vector
   w = abs(e).^((pk-2)/2);       % Error weights for IRLS
   W = diag(w/sum(w));           % Normalize weight matrix
   WA = W*A;                     % Apply weights
   x1 = (WA'*WA)\(WA'*W)*b;      % Weighted L_2 solution
   q = 1/(pk-1);                 % Newton's parameter
   if p > 2, x = q*x1 + (1-q)*x; nn = p;   % Partial update for p > 2
   else x = x1; nn = 2; end      % No partial update for p < 2
   ee = norm(e,nn); E = [E ee];  % Error at each iteration
end
plot(E)
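This program also translates to NumPy directly; the sketch below (the name irls1 is my own, and the plot is again omitted) keeps the homotopy schedule on pk and the Newton partial update q = 1/(pk - 1) from the listing.

```python
import numpy as np

def irls1(A, b, p, K=1.5, KK=10):
    """IRLS with homotopy on p and Newton partial update (NumPy sketch of Listing 2)."""
    pk = 2.0                                    # initial homotopy value
    x = np.linalg.pinv(A) @ b                   # initial L_2 solution
    for _ in range(KK):                         # iterate
        pk = min(p, K * pk) if p >= 2 else max(p, K * pk)  # homotopy change of p
        e = A @ x - b                           # error vector
        w = np.abs(e) ** ((pk - 2) / 2)         # error weights for IRLS
        W = np.diag(w / w.sum())                # normalized weight matrix
        WA = W @ A                              # apply weights
        x1 = np.linalg.solve(WA.T @ WA, WA.T @ W @ b)  # weighted L_2 solution
        if p > 2:
            q = 1 / (pk - 1)                    # Newton's parameter
            x = q * x1 + (1 - q) * x            # partial update for p > 2
        else:
            x = x1                              # no partial update for p < 2
    return x
```

With p = 4 and K = 1.5, for example, pk takes the values 2, 3, 4, 4, ... over the iterations, so the early, better-conditioned subproblems guide the solution toward the harder p = 4 problem.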

This can be modified to use different p's in different bands of equations or to
use weighting only when the error exceeds a certain threshold to achieve a
constrained LS approximation [3], [7], [55].
Here, it is presented as applied to the overdetermined system (Cases 2a and 2b [14]) but can also be applied to other cases. A particularly important application of this section is the design of digital filters [7].