Nonlinear LS:
min_{x∈Rⁿ} ½∥r(x)∥₂², where
r(x) = [r₁(x); r₂(x); …; rₘ(x)], rᵢ(x): Rⁿ → R, i = 1, …, m
m outputs, n inputs
Special case: linear least squares:
r(x) = [r₁(x); r₂(x); …; rₘ(x)] = [b₁ − a₁ᵀx; b₂ − a₂ᵀx; …; bₘ − aₘᵀx] = [bᵢ − aᵢᵀx]_{i=1,…,m} = b − Ax
where A = [a₁ᵀ; a₂ᵀ; …; aₘᵀ] is m×n (rows aᵢᵀ).
m>n: overdetermined
m<n: underdetermined
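A minimal NumPy sketch of the overdetermined case (the random A, b, and sizes are placeholders for illustration); np.linalg.lstsq computes the least-squares solution, i.e., the MATLAB-style A\b used later:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 3                      # overdetermined: more residuals than unknowns
A = rng.standard_normal((m, n))   # rows are a_i^T
b = rng.standard_normal(m)

# least-squares solution of min (1/2)||Ax - b||^2  (MATLAB's A\b)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x                     # residual vector [b_i - a_i^T x]
print(x, 0.5 * r @ r)             # minimizer and optimal objective value
```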
More generally, nonlinear model:
rᵢ(x) = bᵢ − fᵢ(x), fᵢ: Rⁿ → R
Example: position estimation → “sensor localization”
nonlinear LS form: to obtain a position estimate, solve
min_{x∈R²} ∑_{i=1}^m (∥x − bᵢ∥₂ − sᵢ)²
given m beacons bᵢ ∈ R², i = 1, …, m, and measured distances sᵢ
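As a runnable sketch of this objective (the beacon positions bᵢ and measurements sᵢ below are invented):

```python
import numpy as np

# hypothetical beacon positions b_i in R^2 and measured distances s_i
beacons = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])   # m = 3
s = np.array([2.9, 2.3, 2.1])                               # noisy range measurements

def objective(x):
    """sum_i (||x - b_i||_2 - s_i)^2 for a candidate position x in R^2."""
    d = np.linalg.norm(x - beacons, axis=1)   # distances to each beacon
    return np.sum((d - s) ** 2)

print(objective(np.array([1.5, 1.5])))        # evaluate at a trial position
```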
Nonlinear Least-Squares
min_{x∈Rⁿ} ½∥r(x)∥² = ½ ∑_{i=1}^m rᵢ(x)²
r(x) = [r₁(x); …; rₘ(x)], rᵢ(x): Rⁿ → R
Sensor localization example:
r(x) = [∥x − b₁∥₂ − f₁; ∥x − b₂∥₂ − f₂; …; ∥x − bₘ∥₂ − fₘ]
residual rᵢ(x) = ∥x − bᵢ∥₂ − fᵢ, with fᵢ the measured distance to beacon bᵢ
Special case:
rᵢ(x) = aᵢᵀx − bᵢ
r(x) = Ax − b
½∥r(x)∥² = ½∥Ax − b∥²: linear least squares
Gauss-Newton
start with some guess x⁽⁰⁾ for the solution
for k = 0, 1, 2, 3, …
  linearize the residuals rᵢ at the current iterate x⁽ᵏ⁾
  r̄⁽ᵏ⁾(x) ← linearization of r at x⁽ᵏ⁾
  x⁽ᵏ⁺¹⁾ = argmin_{x∈Rⁿ} ½∥r̄⁽ᵏ⁾(x)∥²
Linearization of a generic f: Rⁿ → R at a point x ∈ Rⁿ:
f(x + d) = f(x) + ∇f(x)ᵀd + o(∥d∥)
or equivalently, with z = x + d (i.e., d = z − x):
f(z) = f(x) + ∇f(x)ᵀ(z − x) + o(∥z − x∥)
(little “oh”) g ∈ o(ε) if lim_{ε→0} g(ε)/ε = 0
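A quick numerical check (the function f below is just an invented example): the remainder f(x + d) − f(x) − ∇f(x)ᵀd, divided by ∥d∥, should go to 0 as d shrinks.

```python
import numpy as np

f = lambda x: np.sin(x[0]) + x[0] * x[1] ** 2          # example f: R^2 -> R
grad = lambda x: np.array([np.cos(x[0]) + x[1] ** 2,   # its gradient
                           2 * x[0] * x[1]])

x = np.array([1.0, 2.0])
d0 = np.array([0.3, -0.2])
for t in [1.0, 0.1, 0.01, 0.001]:
    d = t * d0
    err = f(x + d) - f(x) - grad(x) @ d                      # the o(||d||) remainder
    print(np.linalg.norm(d), abs(err) / np.linalg.norm(d))   # ratio tends to 0
```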
Sensor Localization Example
r(x) = [∥x − b₁∥₂ − f₁; ∥x − b₂∥₂ − f₂; …; ∥x − bₘ∥₂ − fₘ]
linearization of the vector-valued function r: Rⁿ → Rᵐ at x⁽ᵏ⁾:
r(x) = [r₁(x⁽ᵏ⁾) + ∇r₁(x⁽ᵏ⁾)ᵀ(x − x⁽ᵏ⁾) + o(∥x − x⁽ᵏ⁾∥); …; rₘ(x⁽ᵏ⁾) + ∇rₘ(x⁽ᵏ⁾)ᵀ(x − x⁽ᵏ⁾) + o(∥x − x⁽ᵏ⁾∥)]
     = [r₁(x⁽ᵏ⁾); …; rₘ(x⁽ᵏ⁾)] + [∇r₁(x⁽ᵏ⁾)ᵀ; …; ∇rₘ(x⁽ᵏ⁾)ᵀ](x − x⁽ᵏ⁾) + o(∥x − x⁽ᵏ⁾∥)
     = r(x⁽ᵏ⁾) + J(x⁽ᵏ⁾)(x − x⁽ᵏ⁾) + o(∥x − x⁽ᵏ⁾∥)
where J(x⁽ᵏ⁾) is the Jacobian of r at x⁽ᵏ⁾; dropping the o term gives the linearization r̄⁽ᵏ⁾(x) = r(x⁽ᵏ⁾) + J(x⁽ᵏ⁾)(x − x⁽ᵏ⁾)
Jacobian matrix of a function r: Rⁿ → Rᵐ:
J(x) = [∇r₁(x)ᵀ; …; ∇rₘ(x)ᵀ]
is an m×n matrix (row i is the gradient of rᵢ, transposed)
“Gradient” of r: Rⁿ → Rᵐ:
∇r(x) = J(x)ᵀ
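For the sensor-localization residuals rᵢ(x) = ∥x − bᵢ∥₂ − fᵢ, the gradient is ∇rᵢ(x) = (x − bᵢ)/∥x − bᵢ∥₂, so row i of J(x) is that unit vector transposed. A NumPy sketch, reusing the hypothetical beacons and measurements (here denoted fᵢ):

```python
import numpy as np

beacons = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])  # hypothetical b_i
f = np.array([2.9, 2.3, 2.1])                             # hypothetical measurements f_i

def residual(x):
    """r_i(x) = ||x - b_i||_2 - f_i, stacked into a vector in R^m."""
    return np.linalg.norm(x - beacons, axis=1) - f

def jacobian(x):
    """J(x) is m x n; row i is grad r_i(x)^T = (x - b_i)^T / ||x - b_i||_2."""
    diff = x - beacons                                     # m x n array of x - b_i
    return diff / np.linalg.norm(diff, axis=1, keepdims=True)

x = np.array([1.5, 1.5])
print(jacobian(x).shape)   # (m, n) = (3, 2)
```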
Goal: generate iterates x⁽⁰⁾, x⁽¹⁾, x⁽²⁾, … → x⋆ (a minimizer)
r̄⁽ᵏ⁾(x) = r(x⁽ᵏ⁾) + J(x⁽ᵏ⁾)(x − x⁽ᵏ⁾): linearization at x⁽ᵏ⁾
x⁽ᵏ⁺¹⁾ = argmin_x ½∥r(x⁽ᵏ⁾) + J(x⁽ᵏ⁾)(x − x⁽ᵏ⁾)∥²
       = argmin_x ½∥J(x⁽ᵏ⁾)x − (J(x⁽ᵏ⁾)x⁽ᵏ⁾ − r(x⁽ᵏ⁾))∥²
       = argmin_x ½∥Aₖx − bₖ∥²
where Aₖ = J(x⁽ᵏ⁾) and bₖ = J(x⁽ᵏ⁾)x⁽ᵏ⁾ − r(x⁽ᵏ⁾),
i.e., x⁽ᵏ⁺¹⁾ = Aₖ\bₖ (the least-squares solution of Aₖx = bₖ, MATLAB backslash notation)
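In NumPy, the backslash solve Aₖ\bₖ corresponds to np.linalg.lstsq. A minimal sketch of one Gauss-Newton step, assuming residual and jacobian callables like those sketched above:

```python
import numpy as np

def gauss_newton_step(x_k, residual, jacobian):
    """One update x^(k) -> x^(k+1): solve the linearized least-squares problem."""
    A_k = jacobian(x_k)                                   # A_k = J(x^(k))
    b_k = A_k @ x_k - residual(x_k)                       # b_k = J(x^(k)) x^(k) - r(x^(k))
    x_next, *_ = np.linalg.lstsq(A_k, b_k, rcond=None)    # A_k \ b_k
    return x_next
```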
pseudo-code for Gauss-Newton
x⁽⁰⁾ ∈ Rⁿ given; ε > 0 given
for k = 0, 1, 2, 3, …
  x⁽ᵏ⁺¹⁾ = Aₖ\bₖ
  where Aₖ = J(x⁽ᵏ⁾), bₖ = Aₖx⁽ᵏ⁾ − r(x⁽ᵏ⁾)
  if ∥x⁽ᵏ⁺¹⁾ − x⁽ᵏ⁾∥ < ε, exit
return x⁽ᵏ⁺¹⁾
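Putting the pseudo-code together, a minimal runnable sketch (the tolerance, iteration cap, and the small sensor-localization instance are assumptions for illustration):

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, eps=1e-8, max_iter=100):
    """Gauss-Newton: repeatedly solve the linearized least-squares subproblem."""
    x = x0
    for _ in range(max_iter):
        A_k = jacobian(x)                                  # A_k = J(x^(k))
        b_k = A_k @ x - residual(x)                        # b_k = A_k x^(k) - r(x^(k))
        x_new, *_ = np.linalg.lstsq(A_k, b_k, rcond=None)  # x^(k+1) = A_k \ b_k
        if np.linalg.norm(x_new - x) < eps:                # stopping test
            return x_new
        x = x_new
    return x

# hypothetical sensor-localization instance: beacons b_i and measurements f_i
beacons = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
f = np.array([2.9, 2.3, 2.1])
residual = lambda x: np.linalg.norm(x - beacons, axis=1) - f
jacobian = lambda x: (x - beacons) / np.linalg.norm(x - beacons, axis=1, keepdims=True)

print(gauss_newton(residual, jacobian, x0=np.array([1.0, 1.0])))  # position estimate
```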
Least squares: x is optimal for min ½∥Ax − b∥² iff Aᵀ(Ax − b) = 0, i.e., Aᵀr = 0.
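This optimality condition can be checked numerically at the lstsq solution (the random A and b are placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(A.T @ (A @ x - b)))   # ~1e-15: optimality condition A^T r = 0
```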