Optimality for Convex Optimization

minxRnf(x) subject to xCRn\underset{x\in \R^n}{\min} f(x) \text{ subject to } x\in C\subseteq \R^n
  • f:RnRf:\R^n\to \R continuous differentiable and convex

  • CRnC\subseteq \R^n convex

Unconstrained (CRnC\subseteq \R^n)

xarg minxRn0f(x;xx),xRn=f(x)T(xx)f(x)=0\begin{aligned}x^* \in \underset{x\in \R^n}{\argmin} &\Leftrightarrow \begin{aligned}0&\le f’(x^*;x-x^*), \forall x\in\R^n \\ &=\nabla f(x^*)^T(x-x^*)\end{aligned} \\ &\Leftrightarrow \nabla f(x^*)=0\end{aligned}

Constrained (CRnC\subseteq \R^n)

xarg minxCf(x)0f(x;xx),xC=f(x)T(xx),xCx^* \in \underset{x\in C}{\argmin}f(x) \Leftrightarrow \begin{aligned}0 &\le f’(x^*;x-x^*),\forall x\in C\\&=\nabla f(x^*)^T(x-x^*),\forall x\in C\end{aligned}

directional derivative is non-negative in all feasible directions

f(x)NR+2(x)f(x)T(xx)0,xC-\nabla f(x^*)\in N_{\R_+^2}(x^*)\Leftrightarrow -\nabla f(x^*)^T(x-x^*)\le 0,\forall x\in C

Normal Cone of a set CRnC\subseteq \R^n of a point xCx\in C

NC(x)={gRngT(zx)0,zC}N_C(x)=\{g\in \R^n \mid g^T(z-x)\le 0, \forall z\in C\}

Example. C={xRnAx=b}C= \{x\in \R^n \mid Ax=b \}

NC(x)={gRngT(zx)0,zC}={gRngTd0,dnull(A)}={gRngTd=0,dnull(A)}=null(A)=range(AT)\begin{aligned}N_C(x)&=\left\{g\in \R^n \mid g^T(z-x)\le 0, \forall z\in C\right\}\\&=\left\{g\in\R^n\mid g^Td \le0,\forall d\in null(A)\right\}\\&=\left\{g\in \R^n \mid g^Td=0,\forall d\in null(A)\right\}\\&=null(A)^{\perp}\\&=range(A^T)\end{aligned}

if zCz\in C, and xCx\in C

Ax=bAx=b

Az=bAz=b

A(zx)=AzAx=bb=0A(z-x)=Az-Ax=b-b=0

range(A)null(AT)=Rmrange(A) \oplus null(A^T)=\R^m

range(AT)null(A)=Rnrange(A^T)\oplus null(A)=\R^n

Linearly Constrained Optimization

minxRnf(x) subject to Ax=bxC\underset{x\in\R^n}{\min}f(x)\text{ subject to } Ax=b \Leftrightarrow x\in C

xx^* is a minimizer only if

{f(x)=ATy for some yRmAx=b\left \{ \begin{aligned}&\nabla f(x^*)=A^Ty\text{ for some } y\in\R^m \\ &Ax^*=b\end{aligned}\right., where AA is a m×nm\times n matrix.

New Lagrange of Normal Cones

f(x)NC(x),xCf(x)range(AT),Ax=by s.t. f(x)=ATy\begin{aligned}&-\nabla f(x^*)\in N_C(x^*), x^*\in C \\ \Leftrightarrow & -\nabla f(x^*)\in range(A^T),Ax^*=b \\ \Leftrightarrow & \exist y^* \text{ s.t. } -\nabla f(x^*)=A^Ty\end{aligned}

Last updated