优化作业1

1

对于无约束优化问题,若函数 f(x)f(x) 在点 xˉ\bar x 可微,如果存在方向 dd 使得f(x)Td<0\nabla f(x)^Td <0,证明 dd 是点 xˉ\bar x 处的一个下降方向.

Taylor expand:

f(xˉ+ad)=f(xˉ)+af(xˉ)Td+o(ad)=f(xˉ)+a[f(xˉ)Td+o(ad)/a]\begin{aligned} f(\bar x +ad) &= f(\bar x)+ a \nabla f(\bar x)^Td+ o(\|ad\|)\\ &=f(\bar x)+ a[\nabla f(\bar x )^T d + o(\|ad\|)/a] \end{aligned}

when a0a\to 0 : o(ad)/a0\quad o(\|ad\|)/a \to 0
f(xˉ)Td<0\because \nabla f(\bar x )^Td < 0, therefore when a(0,σ)a\in (0,\sigma ) is small enough

f(xˉ)Td+o(ad)/a<0\nabla f(\bar x )^T d + o(\|ad\|)/a <0

therefore f(xˉ+ad)<f(x)f(\bar x + ad) < f(x) \quad \square

2

对正定二次函数 f(x)=12xTAx+bTxf(x)=\frac{1}{2}x^TAx+b^T x 在点 xkx_k 求出沿下降方向 dkd_k 作精确搜索的步长 aka_k

f(xk+adk)=12xkTAxk+a2dkTAddk+adkTAxk+axkTAdk+bTxk+abTdkf(a)=2adkTAdk+dkTAxk+xkTADk+bTdk\begin{aligned} f(x_k+ad_k) &=\frac{1}{2}x_k^TAx_k + a^2 d_k^TAdd_k + ad_k^TAx_k + ax_k^TAd_k +b^Tx_k + ab^Td_k\\ f'(a) &= 2a d_k^TAd_k +d_k^TAx_k +x_k^TAD_k+b^Td_k\\ \end{aligned}

let f(a)=0f'(a)=0 :

a=dkTAxk+xkTAdk+bTdk2dkTAdka=\frac{d_k^TAx_k+x_k^TAd_k+b^Td_k}{-2d_k^TAd_k}

3

3.1

{2k}\{2^{-k}\}线性收敛
when k,  2k0k\to \infty, \; 2^{-k}\to 0

limkxx+1xxkx=limk2k12k=12\lim_{k\to \infty} \frac{\|x_{x+1}-x^*\|}{\|x_k-x^*\|}=\lim_{k\to \infty} \frac{\|2^{-k-1\|}}{\|2^{-k}\|}=\frac{1}{2}

3.2

{kk}\{k^{-k}\}超线性收敛
when k,  kk0k\to \infty, \; k^{-k}\to 0

limkxx+1xxkx=limkkk1kk=limk1k=0\lim_{k\to \infty} \frac{\|x_{x+1}-x^*\|}{\|x_k-x^*\|}=\lim_{k\to \infty} \frac{\|k^{-k-1\|}}{\|k^{-k}\|}=\lim_{k\to \infty} \frac{1}{k}=0

3.3

{a2k}(0<a<1)\{a^{2^k}\}\quad (0<a<1) 二阶收敛

limkxx+1xxkx=limka2k+1(a2k)2=1\lim_{k\to \infty} \frac{\|x_{x+1}-x^*\|}{\|x_k-x^*\|}=\lim_{k\to \infty} \frac{\|a^{2^{k+1}}\|}{\|(a^{2^k})^2\|}=1

4

(非精确线搜索步长的存在性)设f(xk+adk)f(x_k+ad_k)a>0a>0 时有下界,且 f(xk)Tdk<0\nabla f(x_k)^Td_k <0,则必存在 aka_k ,在点xk+akdkx_k + a_k d_k 处满足Wolfe 准则或者 Armijo- Goldstein 准则.

proof:
Wolfe: f(xk+adk)f(x_k+ad_k)a>0a>0 时有下界, 则 l(a)=f(xk)+ρaf(xk)Tdkl(a) = f(x_k)+ \rho a\nabla f(x_k)^Td_k 一定与之相交,记交点为a0a_0 :

f(xk+a0dk)=f(xk)+ρa0f(xk)Tdkf(x_k+a_0d_k) = f(x_k)+ \rho a_0\nabla f(x_k)^Td_k

由微分中值定理可知,a1(0,a0)\exists a_1 \in (0 , a_0) 使得

f(xk+a0dk)f(xk)=a0f(xk+a1dk)Tdkf(x_k + a_0 d_k) - f(x_k) = a_0 \nabla f(x_k + a_1 d_k)^T d_k

代入得

f(xk+a1dk)Tdk=ρf(xk)Tdk\nabla f(x_k + a_1 d_k)^T d_k= \rho \nabla f(x_k)^Td_k

1>σ>ρ>01>\sigma>\rho>0 :

ρf(xk)Tdk>σf(xk)Tdk\rho \nabla f(x_k)^Td_k > \sigma \nabla f(x_k)^Td_k

f(xk+a1dk)Tdk>σf(xk)Tdk\nabla f(x_k + a_1 d_k)^T d_k > \sigma\nabla f(x_k)^Td_k

A-G: 若 ρ\rho 与0足够近则第一个不等式一定满足, 考虑第二个不等式的情况。
若不然,对于 a>0\forall a >0 :

f(xk+adk)<f(xk)+(1ρ)af(xk)Tdkf(x_k + ad_k)< f(x_k)+(1-\rho)a\nabla f(x_k)^Td_k

when a,  RHSa\to \infty, \; RHS \to -\infty\quad
f(xk+adk)f(x_k+ad_k)a>0a>0 时有下界矛盾

5

利用最优性条件求下面函数的稳定点,并确定其类型:

5.1

f(x)=2x12+x222x1x2+2x13+x14f(x) = 2x_1^2 + x_2^2 -2x_1x_2 +2x_1^3 +x_1^4

f=(4x12x2+6x12+4x132x22x1)=0\nabla f = \begin{pmatrix} 4x_1-2x_2+6x_1^2+4x_1^3\\ 2x_2-2x_1 \end{pmatrix}=0

(0,0),(1,1),(0.5,0.5)(0,0), (-1,-1), (-0.5,-0.5)

2f=(4+12x1+12x12222x2)=0 \nabla^2 f = \begin{pmatrix} 4+12x_1+12x_1^2 &-2\\ -2 & 2x_2 \end{pmatrix}=0

2f=(4222)(4220)(1221) \nabla^2 f = \begin{pmatrix} 4 &-2\\ -2 & -2 \end{pmatrix} \qquad \begin{pmatrix} 4 &-2\\ -2 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 &-2\\ -2 & -1 \end{pmatrix}

其中 (0,0)(0,0) 半正定,为极小稳定点。其余不定.

5.2

f(x)=2x133x126x1x2(x1x21)f(x)= 2x_1^3 -3x_1^2 -6x_1x_2(x_1-x_2-1)

f=(2x126x112x1x2+6x22+6x26x12+12x1x2+6x1)=0\nabla f = \begin{pmatrix} 2x_1^2-6x_1-12x_1x_2+6x_2^2+6x_2\\ -6x_1^2+12x_1x_2+6x_1 \end{pmatrix}=0

(0,0),(0,1),(1,0),(1,1)(0,0), (0,-1), (1,0), (-1,-1)

2f=6(2x1x2x1+1x2x1+12x1)=0 \nabla^2 f = 6\begin{pmatrix} 2x_1 &x_2-x_1+1\\ x_2-x_1+1 & 2x_1 \end{pmatrix}=0

162f=(0110)(0000)(2112)(2112) \frac{1}{6}\nabla^2 f = \begin{pmatrix} 0 &1\\ 1 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 &0\\ 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 2 &-1\\ -1 & 2 \end{pmatrix} \qquad \begin{pmatrix} -2 &1\\ 1 & -2 \end{pmatrix}

其中 (0,0),(0,1)(0,0), (0,-1) 不定;(1,0)(1,0) 正定,为极小值点;(1,1)(-1,-1) 负定,为极大指点。


优化作业1
https://blog.jacklit.com/2024/10/05/优化作业/
作者
Jack H
发布于
2024年10月5日
许可协议