Machine Learning Homework

1

5.11

Δwhj=ηEwhj=η(Ey^jy^jβj)βjwhj=η[(y^j)(1y^j)(yjy^j)]bh=ηgjbh\begin{aligned} \Delta w_{hj} &= -\eta\frac{\partial E}{\partial w_{hj}} \\ &= -\eta (\frac{\partial E}{\partial \hat y_j} \frac{\partial \hat y_j}{\partial \beta_j})\frac{\partial \beta_j}{\partial w_{hj}}\\ &= -\eta [(-\hat y_j )(1-\hat y_j)(y_j - \hat y_j) ]b_h\\ &= \eta g_j b_h \end{aligned}

5.12

Δθj=ηEθj=ηEy^jy^jθj=η(y^j)(1y^j)(yjy^j)=ηgj\begin{aligned} \Delta\theta_j & =-\eta \frac{\partial E}{\partial \theta_{j}}\\ &= -\eta \frac{\partial E}{\partial \hat y_j}\frac{\partial \hat y_j}{\partial \theta_j}\\ &= -\eta (\hat y_j )(1-\hat y_j)(y_j - \hat y_j)\\ &= -\eta g_j \end{aligned}

5.13

Δvih=ηEvih=ηj=1lEy^jy^jβjβjbhbhαhαhvih=ηj=1lgjwhjbhαhαhvih=ηj=1lgjwhjbh(1bh)xi=ηehxi\begin{aligned} \Delta v_{ih} &= -\eta \frac{\partial E}{\partial v_{ih}}\\ &= -\eta\sum_{j=1}^l\frac{\partial E}{\partial \hat y_j} \frac{\partial \hat y_j}{\partial \beta_j}\frac{\partial \beta_j}{\partial b_h}\frac{\partial b_h}{\partial \alpha_h}\frac{\partial \alpha_h}{\partial v_{ih}}\\ &= \eta\sum_{j=1}^l g_j w_{hj}\frac{\partial b_h}{\partial \alpha_h}\frac{\partial \alpha_h}{\partial v_{ih}}\\ &= \eta \sum_{j=1}^l g_j w_{hj} \cdot b_h(1-b_h)x_i\\ &=\eta e_h x_i \end{aligned}

5.14

Δγh=ηEγh=ηj=1lEy^jy^jβjβjbhbhγh=ηj=1lgjwhj[bh(1bh)]=ηeh\begin{aligned} \Delta \gamma_{h} &= -\eta \frac{\partial E}{\partial \gamma_{h}}\\ &= -\eta\sum_{j=1}^l\frac{\partial E}{\partial \hat y_j} \frac{\partial \hat y_j}{\partial \beta_j}\frac{\partial \beta_j}{\partial b_h}\frac{\partial b_h}{\partial \gamma_{h}}\\ &= \eta \sum_{j=1}^l g_j w_{hj} \cdot [-b_h(1-b_h)]\\ &=-\eta e_h \end{aligned}

update:
paramater=paramater+Δparamaterparamater = paramater + \Delta paramater

卷积层将图像数据根据pixel卷积,将数据混合提取特征。pooling层能降低计算量,使用maxpooling提取数据大的特征。


Machine Learning Homework
https://blog.jacklit.com/2024/10/23/机器学习作业5/
作者
Jack H
发布于
2024年10月23日
许可协议