Convex Functions

  • definition

  • relationship to convex sets

  • operations preserve convexity

  • global optimality (necessary \equiv sufficient)

Simplex

Δn={xRnxi1,x0}\Delta_n = \{x\in \R^n \mid \sum x_i\le1, x\ge0\}

“Probability” Simplex

Δˉn={xRnxj=1,x0}\bar{\Delta}_n=\{x\in\R^n\mid\sum x_j=1, x\ge0\}

Conv(S)={i=1kλixiλΔˉk,xiS,kN}\text{Conv}(S)= \{\sum_{i=1}^k\lambda_ix_i\mid\lambda \in\bar{\Delta}_k,x_i\in S,k\in \N \}, where N\N denotes non-negative

Δ2=H(01),0H(10),0H(11),1\Delta_2=H^-_{\begin{pmatrix}0\\-1\end{pmatrix},0}\cap H^-_{\begin{pmatrix}-1\\0\end{pmatrix},0}\cap H^-_{\begin{pmatrix}1\\1\end{pmatrix},1}

Caratheodory’s Theorem (Chp6, Beck)

Hyperplane: Ha,β={xRnaTx=β}H_{a,\beta}= \{x\in \R^n\mid a^Tx=\beta \}

Halfspace: Ha,β1H^{-1}_{a,\beta}

Ha,β={xRnaTxβ},(a0)H^-_{a,\beta}=\{x\in\R^n\mid a^Tx\le\beta\}, (a\ne 0)

Definition. A function f:CRf:C\to \R, CRnC\subseteq \R^n (convex set), is convex if

f(λx+(1λ)y)λf(x)+(1λ)f(y)f(\lambda x+(1-\lambda)y)\le\lambda f(x)+(1-\lambda)f(y)

for any x,yCx,y\in C, and λ[0,1]\lambda\in [0,1]

→ Difference between strict convex and convex?

Definition. ff is concave if f-f is convex

Consequence: A function ff is concave and convex if and only if ff is affine

Example: Norm :RnR+\lVert \cdot \rVert: \R^n\to\R_+

All norms satisfy the

  1. Δ-inequality\Delta\text{-inequality}

    x+yx+y\lVert x+y\rVert\le \lVert x\rVert+\lVert y\rVert
  2. αx=αx\lVert \alpha x\rVert=\lvert \alpha\rvert \cdot\lVert x\rVert

Take any x,yRnx,y\in \R^n and any λ[0,1]\lambda \in [0,1]

Z:=λx+(1λ)yZ:=\lambda x+(1-\lambda)y

Z=λx+(1λ)yλx+(1λ)y=λx+1λy=λx+(1λ)y \begin{aligned}\lVert Z\rVert &= \lVert \lambda x+(1-\lambda)y\rVert \\ &\le \lVert \lambda x\rVert+\lVert (1-\lambda)y\rVert \\ &=\lvert \lambda\rvert \cdot\lVert x\rVert+\lvert 1-\lambda\lvert\cdot \lVert y\rVert \\ &=\lambda\lVert x\rVert+(1-\lambda)\lVert y\rVert\end{aligned}

Example: Affine functions

f(x)=aTx+βf(x)=a^Tx+\beta

Ex: f(x)=log(x),f:RRf(x)=-log(x), f:\R\to\R

Ex: f(x)=exf:RRf(x)=e^x f:\R\to\R

Operations that preserve the convexity of functions

  1. Non-negative multiples

    αf(x)\alpha f(x) is convex if f(x)f(x) is convex and α0\alpha \ge 0

  2. Sums

    f1+f2++fmf_1+f_2+\ldots+f_m is convex if f1,,fmf_1,\ldots,f_m are convex

  3. Composition of a convex function with affine function is convex, i.e.,

    if ff convex then f(Ax+b)f(Ax+b) convex for any matrix AA and vector bb

    Proof:

    Take any point x,yRnx,y\in \R^n and λ[0,1]\lambda \in [0,1]:

    f(A(λx+(1λ)y)+b)=f(λ(Ax+b)+(1λ)(Ay+b))λf(Ax+b)+(1λ)f(Ay+b)f(A(\lambda x+(1-\lambda)y)+b)=f(\lambda(Ax+b)+(1-\lambda)(Ay+b))\le \lambda f(Ax+b)+(1-\lambda)f(Ay+b)

    Example f(x1,x2,x3)=ex1x2+x3+e2x2+x1f(x_1,x_2,x_3)=e^{x_1-x_2+x_3}+e^{2x_2}+x_1

    Example For any matrix AA and vector bb, LS objective: 12Axb22\frac{1}{2}\lVert Ax-b\rVert_2^2

    12Axb22=12(aiTxbi)2=12f(aiTxbi)\frac{1}{2}\lVert Ax-b\rVert_2^2=\frac{1}{2}\sum(a_i^Tx-b_i)^2=\frac{1}{2}\sum f(a_i^Tx-b_i), where f()=()2f(\cdot)=(\cdot)^2

GLOBAL Optimality

(P) minxRnf(x)\underset{x\in\R^n}{\min}f(x) subject to xCx\in C

where f:RnRf:\R^n\to\R convex and CRnC\subseteq \R^n convex set

If xx^* is a local minimizer of (P), i.e., f(x)f(x),xBεCf(x^*)\le f(x), \forall x\in \mathbb{B}_{\varepsilon}\cap C for all ε>0\varepsilon \gt 0 small enough, then xx^* is a global minimizer of (P), i.e., f(x)f(x),xCf(x^*)\le f(x), \forall x\in C.

Proof Suppose xCx^* \in C is a local minimizer of (P), but not a global minimizer.

yC\exists y\in C s.t. f(y)<f(x)f(y)\lt f(x^*)

By convexity of CC, λx+(1λ)yC\lambda x^* +(1-\lambda)y\in C for any λ[0,1]\lambda \in [0,1] and by convexity of ff;

f(λx+(1λ)y)λf(x)+(1λ)f(y)<λf(x)+(1λ)f(x)=f(x)\begin{aligned} f(\lambda x^*+(1-\lambda)y)&\le \lambda f(x^*)+(1-\lambda)f(y) \\ &\lt \lambda f(x^*)+(1-\lambda)f(x^*) \\ &=f(x^*)\end{aligned}

Last updated