优化作业2

1

(1)

g(x)=(2x148x28)G=(2008)g(x) = \begin{pmatrix} 2x_1 - 4 \\ 8x_2 - 8 \end{pmatrix}\qquad G = \begin{pmatrix} 2 &0\\ 0 &8 \end{pmatrix}

let g(x)=0:x1=2,x2=1g(x) = 0 :\quad x_1 = 2, x_2 =1

(2)

如果经过有限次迭代到达, 则必定存在一点 x(k)x^{(k)} , 使得在 x(k)x^{(k)} 处最速下降方向穿过 xx^* , 这要求负梯度方向为坐标轴方向。由于最速下降法梯度的正交性,即gkTgk1=0g_k^Tg_{k-1} = 0 ,所以要求上一步的方向也为坐标轴方向。以此类推,要求初始方向为坐标轴方向。但是初始点 (0,0)T(0,0)^T 处的梯度为 (4,8)T(-4,-8)^T ,不是坐标轴方向,所以不可能。

3

要求 x(1)=(2,?)x^{(1)} = (2,?)(?,1)(?,1) 即可,此时只需要一步就收敛。

2

x(0)x^{(0)} 出发,d=G1g=A1(Ax(0)+b)d = -G^{-1}g = -A^{-1}(Ax^{(0)}+b)

g(x(1))=b+A(x(1))=b+A(x(0)+d)=b+A(x(0)+A1(Ax(0)+b))=0\begin{aligned} g(x^{(1)}) &= b +A(x^{(1)})\\ & = b +A(x^{(0)}+d)\\ & = b +A(x^{(0)}+-A^{-1}(Ax^{(0)}+b))\\ &=0 \end{aligned}

所以一次迭代到达极小点。

3

α=gkTgkgkTAgkfkfk+1=12xkTAxk+bTxk12(xkαgk)TA(xkαgk)bT(xkαgk)=12α2gkTAgk+αxkTAgk+αbTgk=12α2gkTAgk+αgkTgk代入α:=(gkTgk)22gkTAgk\alpha = \frac{g_k^Tg_k}{g_k^TAg_k}\\ \begin{aligned} f_k - f_{k+1} & = \frac{1}{2}x_k^TAx_k +b^T x_k - \frac{1}{2}(x_k-\alpha g_k)^TA(x_k-\alpha g_k) - b^T (x_k-\alpha g_k)\\ &=-\frac{1}{2}\alpha^2g_k^T A g_k + \alpha x_k^T A g_k +\alpha b^T g_k\\ &=-\frac{1}{2}\alpha^2g_k^T A g_k +\alpha g_k^Tg_k\\ \text{代入}\alpha: \quad &= \frac{(g_k^Tg_k)^2}{2g_k^TAg_k} \end{aligned}

4

Bk+1=Bk+βuuT+γvvTBk+1sk=yku=yk,  β=1ykTsk,  v=Bksk,  γ=1skTBskBk+1sk=[Bk+1ykTskykykT+1skTBskBksk(Bksk)T]sk=Bksk+ykBksk=ykB_{k+1} = B_k +\beta uu^T +\gamma vv^T\\ B_{k+1}s_k = y_k\\ u =y_k, \; \beta = \frac{1}{y_k^Ts_k},\;v =B_k s_k ,\; \gamma = \frac{-1}{s_k^TBs_k}\\ \begin{aligned} B_{k+1}s_k &= [B_k +\frac{1}{y_k^Ts_k} y_ky_k^T +\frac{-1}{s_k^TBs_k} B_k s_k(B_k s_k)^T]s_k\\ & = B_ks_k + y_k - B_ks_k\\ &=y_k \end{aligned}

Shermann-Morrison-Woodbury 公式:

(A+βuuT)1=A1βA1uuTA11+βuTA1u(A+\beta uu^T)^{-1} = A^{-1} - \frac{\beta A^{-1} u u^T A^{-1}}{1+\beta u^TA^{-1}u}

Hk=Bk,Hk+0.5=Bk+0.51=(Bk+βuuT)1=Bk1βBk1uuTBk11+βuTBk1u,Hk+1=Bk+11=(Bk+0.5+γvvT)1=Bk+0.51γBk+0.51vvTBk+0.511+γvTBk+0.51v,=Bk1βBk1uuTBk11+βuTBk1uγ(Bk1βBk1uuTBk11+βuTBk1u)vvT(Bk1βBk1uuTBk11+βuTBk1u)1+γvT(Bk1βBk1uuTBk11+βuTBk1u)v,=HkβHkuuTHk1+βuTHkuγ(HkβHkuuTHk1+βuTHku)vvT(HkβHkuuTHk1+βuTHku)1+γvT(HkβHkuuTHk1+βuTHku)v\begin{aligned} H_k &= B_k, \\[8pt] H_{k+0.5} &= B_{k+0.5}^{-1} = (B_k +\beta uu^T)^{-1} \\[8pt] &= B_k^{-1} - \frac{\beta B_k^{-1} u u^T B_k^{-1}}{1 + \beta u^T B_k^{-1} u}, \\[8pt] H_{k+1} &= B_{k+1}^{-1} \\[8pt] &= \left( B_{k+0.5} + \gamma vv^T \right)^{-1} \\[8pt] &= B_{k+0.5}^{-1} - \frac{\gamma B_{k+0.5}^{-1} v v^T B_{k+0.5}^{-1}}{1 + \gamma v^T B_{k+0.5}^{-1} v}, \\[8pt] &= B_k^{-1} - \frac{\beta B_k^{-1} u u^T B_k^{-1}}{1 + \beta u^T B_k^{-1} u} \\[8pt] &\quad - \frac{\gamma \left( B_k^{-1} - \frac{\beta B_k^{-1} u u^T B_k^{-1}}{1 + \beta u^T B_k^{-1} u} \right) v v^T \left( B_k^{-1} - \frac{\beta B_k^{-1} u u^T B_k^{-1}}{1 + \beta u^T B_k^{-1} u} \right)}{1 + \gamma v^T \left( B_k^{-1} - \frac{\beta B_k^{-1} u u^T B_k^{-1}}{1 + \beta u^T B_k^{-1} u} \right) v}, \\[8pt] &= H_k - \frac{\beta H_k u u^T H_k}{1 + \beta u^T H_k u} - \frac{\gamma \left( H_k - \frac{\beta H_k u u^T H_k}{1 + \beta u^T H_k u} \right) v v^T \left( H_k - \frac{\beta H_k u u^T H_k}{1 + \beta u^T H_k u} \right)}{1 + \gamma v^T \left( H_k - \frac{\beta H_k u u^T H_k}{1 + \beta u^T H_k u} \right) v}\\ \end{aligned}

5

(1)

由于DFP算法的方向为 AA 共轭 diTAdj=0d_i^TAd_j =0 并且 si=αidis_i = \alpha_i d_i ,所以:

siAsj=αiαjdiTAdj=0s_iAs_j = \alpha_i\alpha_j d_i^TAd_j =0

(2)

原命题等价于:

Hk+1Asi=si    Hk+1A(xi+1xi)=si    Hk+1A(xi+1+bxib)=si    Hk+1(gi+1gi)=si    Hk+1yi=si\begin{aligned} &H_{k+1} A s_i = s_i\\ \iff & H_{k+1} A(x_{i+1}-x_i) = s_i\\ \iff & H_{k+1} A(x_{i+1}+b - x_i -b) = s_i\\ \iff &H_{k+1} (g_{i+1} - g_i) = s_i\\ \iff &H_{k+1} y_i = s_i \end{aligned}

i=ki = k, 则 Hi+1yi=siH_{i+1} y_i = s_i 显然成立。考虑 k=i+1k=i+1 :

Hi+2yi=(Hi+1+ΔHi+1)yi=si+(si+1si+1Tsi+1Tyi+1Hi+1yi+1yi+1THi+1yi+1THi+1yi+1)yi=si+(si+1si+1Tyisi+1Tyi+1Hi+1yi+1yi+1THi+1yiyi+1THi+1yi+1)=si+yi+1THi+1yi+1si+1si+1TyiHi+1yi+1yi+1Tsisi+1Tyi+1si+1Tyi+1yi+1THi+1yi+1=si\begin{aligned} H_{i+2} y_i & = (H_{i+1} + \Delta H_{i+1})y_i\\ & = s_i + \left(\frac{s_{i+1}s_{i+1}^T}{s_{i+1}^Ty_{i+1}} - \frac{H_{i+1}y_{i+1}y_{i+1}^TH_{i+1}}{y_{i+1}^TH_{i+1}y_{i+1}}\right)y_i\\ & = s_i +\left(\frac{s_{i+1}s_{i+1}^T y_i}{s_{i+1}^Ty_{i+1}} - \frac{H_{i+1}y_{i+1}y_{i+1}^TH_{i+1}y_i}{y_{i+1}^TH_{i+1}y_{i+1}}\right)\\ & = s_i +\frac{y_{i+1}^TH_{i+1}y_{i+1}s_{i+1}s_{i+1}^T y_i - H_{i+1}y_{i+1}y_{i+1}^Ts_is_{i+1}^Ty_{i+1}}{s_{i+1}^Ty_{i+1}y_{i+1}^TH_{i+1}y_{i+1}}\\ &=s_i \end{aligned}


优化作业2
https://blog.jacklit.com/2024/11/09/优化作业2/
作者
Jack H
发布于
2024年11月9日
许可协议