(26) BackPropagation in Our Self-Developed Pangu Framework

Our self-developed Pangu framework performs the same Back Propagation gradient-descent derivative computation. Forward Propagation uses the Sigmoid activation function, f(x) = 1 / (1 + e^(-x)), so Back Propagation differentiates the Sigmoid; the derivative is df(x)/dx = f(x) * (1 - f(x)), as shown in the figure:




Figure 1-42 Gradient derivation
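To make the derivative formula concrete, here is a minimal sketch (not part of the framework code) that compares the analytic Sigmoid derivative f(x) * (1 - f(x)) with a numerical finite-difference estimate:

import math

def sigmoid(x):
    # The Forward Propagation activation: f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # The analytic derivative used in Back Propagation: f(x) * (1 - f(x))
    fx = sigmoid(x)
    return fx * (1.0 - fx)

x = 0.5
eps = 1e-6
numerical = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
print(sigmoid_derivative(x))  # approximately 0.2350037
print(numerical)              # nearly identical, confirming df(x)/dx = f(x) * (1 - f(x))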

Loop over every weight and update the weights in order from front to back. For example, to update the weight from the first input node of the input layer to node N[1][1] in the first hidden layer (node index 4, i.e. "The weight from 1 at layers[0] to 4 at layers[1]"), the gradient is computed as derivative = weight_to_node_error * (weight_to_node_value * (1 - weight_to_node_value)) * weight_from_node_value. The other neuron nodes are differentiated in the same way. When updating a weight, the derivative is multiplied by the learning_rate parameter and the weight is adjusted via weights[j].set_value(weights[j].get_value() - derivative * learning_rate). This is the key to making the next pass more accurate, because the results of the computation above are applied to adjust the Weights of the Neuron Network.
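A minimal, self-contained sketch of this single-weight update; the numeric values below are made up for illustration and are not taken from the run later in this section:

learning_rate = 0.1

# Hypothetical values gathered for one weight during Back Propagation
weight_value = 0.47748          # current weight from layers[0] node 1 to layers[1] node 4
weight_from_node_value = 1.0    # output of the "from" node (here an input feature)
weight_to_node_value = 0.73     # Sigmoid output of the "to" node
weight_to_node_error = 0.15     # error (minor_error) assigned to the "to" node

# Gradient: error * Sigmoid derivative of the "to" node * value of the "from" node
derivative = weight_to_node_error * (weight_to_node_value * (1 - weight_to_node_value)) * weight_from_node_value

# Gradient descent step
weight_value = weight_value - derivative * learning_rate
print(weight_value)  # about 0.47452, slightly smaller than before since the gradient is positive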

 

Here we have four training records: we run all of them through the network once and then update the weights. With 100 million records we might instead update the weights after every 100,000 records, batch after batch within one epoch; that is how the whole Back Propagation process is carried out. If you understand the Back Propagation in this section, you can also understand deep learning algorithms such as RNNs and CNNs: they wrap this Back Propagation process and build framework algorithms on top of the Back Propagation engine. A CNN simply adds a step in front of the Back Propagation described here, a preprocessing step that turns image data into matrices of numbers (0/1 or 0..255); its core engine is the same as our Back Propagation algorithm. The same holds for recommender systems, RNNs, and so on; those algorithms are wrapped around this engine. That is why implementing the Back Propagation algorithm is so demanding, and why it is the most important foundation.
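As a rough sketch of the batching idea described above (the batch size, the update_fn callback, and the sample data are illustrative assumptions, not part of the Pangu framework):

def run_one_epoch(instances, update_fn, batch_size=100000):
    # Walk through the dataset batch by batch within a single epoch; with our 4 records
    # there is only one batch, while with 100 million records and batch_size = 100000
    # the weights would be updated 1000 times per epoch.
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        update_fn(batch)  # e.g. a wrapper around Forward Propagation + Back Propagation

# Toy usage with a 4-record XOR-style dataset (illustrative only)
data = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
run_one_epoch(data, lambda batch: print(len(batch), "records updated together"))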


The BackPropagation.py code of the Create_AI_Framework_In5Classes(Day3) version is as follows:

# -*- coding: utf-8 -*-
from service.ForwardPropagation import ForwardPropagation
# The most important functionality of the Deep Learning Framework: Back Propagation
# Step 1: starting from the error at the output, traverse the chain formed by the Neuron Network from right to left (the Input Layer excluded);
# Step 2: while traversing, compute how much of the error each Neuron is responsible for;
# Step 3: compute the Derivative;
# Step 4: adjust the Weights through Gradient Descent to reduce the prediction error.
class BackPropagation:
    def applyBackPragation(instances, nodes, weights, learning_rate):
        num_of_features = len(instances[0]) - 1  # number of input Features; the last column of an instance is the Real Result
        # Loop over the whole Training Dataset to complete one Epoch and record the Error each node is responsible for
        for i in range(len(instances)):
            # Forward Propagation: start from the Input Layer, pass through the Hidden Layers, and obtain the Output
            nodes = ForwardPropagation.applyForwardPropagation(nodes, weights, instances[i])
            predicted_value = nodes[len(nodes) - 1].get_value()  # final result of this Forward Propagation
            actual_value = instances[i][num_of_features]  # Real Value of the current instance
            minor_error = predicted_value - actual_value  # error between the predicted value and the real value
            nodes[len(nodes) - 1].set_minor_error(minor_error)  # store the error in the output node of the Output Layer

            # The output node's error is already computed, so we start at len(nodes) - 2; the Input Layer does not participate,
            # so the second argument of range is num_of_features. The traversal therefore starts from the last Hidden Layer.
            for j in range(len(nodes) - 2, num_of_features, -1):
                target_index = nodes[j].get_index()  # start from the last Neuron of the last Hidden Layer and move forward
                sum_minor_error = 0  # responsibility of the current Neuron for the error
                # Loop over all Weights to find the ones whose starting point is target_index
                for k in range(len(weights)):
                    # If this Weight starts at the Neuron target_index, it is (directly) responsible for the result
                    if weights[k].get_from_index() == target_index:
                        affecting_theta = weights[k].get_value()  # value of the current Weight
                        affected_minor_error = 1  # initial value of the current Neuron's influence on the result
                        target_minor_error_index = weights[k].get_to_index()  # the Neuron in the next Layer affected by this Weight
                        for m in range(len(nodes)):
                            if nodes[m].get_index() == target_minor_error_index:
                                affected_minor_error = nodes[m].get_minor_error()
                        # Error value that the Neuron driving this Weight is responsible for
                        updated_minor_error = affecting_theta * affected_minor_error
                        # Accumulate all the errors this Neuron causes in the next Layer
                        sum_minor_error = sum_minor_error + updated_minor_error
                # Save the total Loss influence of the current Neuron on all Neurons of the next Layer
                nodes[j].set_minor_error(sum_minor_error)

            # Then update the Weights
            for j in range(len(weights)):
                weight_from_node_value = 0
                weight_to_node_value = 0
                weight_to_node_error = 0
                for k in range(len(nodes)):
                    if nodes[k].get_index() == weights[j].get_from_index():
                        weight_from_node_value = nodes[k].get_value()
                    if nodes[k].get_index() == weights[j].get_to_index():
                        weight_to_node_value = nodes[k].get_value()
                        weight_to_node_error = nodes[k].get_minor_error()
                # Forward Propagation used the Sigmoid Activation, so we differentiate the Sigmoid here;
                # Sigmoid in Forward Propagation: target_neuron_output = 1 / (1 + math.exp(-target_neuron_input))
                derivative = weight_to_node_error * (weight_to_node_value * (1 - weight_to_node_value)) * weight_from_node_value
                # Update the Weight; applying the computation above to the Neuron Network's Weights is the key to a more accurate next pass
                weights[j].set_value(weights[j].get_value() - derivative * learning_rate)

        return nodes, weights
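For orientation, here is a rough sketch of how the entry script drives this method; the epoch count, learning rate, and the way instances, nodes, and weights are prepared are illustrative assumptions, not a verbatim copy of Neuron_Network_Entry.py:

# Illustrative training loop (not the actual Neuron_Network_Entry.py).
# instances, nodes and weights are assumed to have been created beforehand
# by the framework's data and network-construction code.
epochs = 10000
learning_rate = 0.1
for epoch in range(epochs):
    # One epoch: every instance is forward-propagated and the weights are adjusted
    nodes, weights = BackPropagation.applyBackPragation(instances, nodes, weights, learning_rate)

# After training, run Forward Propagation once more per instance to inspect the predictions
for instance in instances:
    nodes = ForwardPropagation.applyForwardPropagation(nodes, weights, instance)
    print("Prediction:", nodes[len(nodes) - 1].get_value(), "while real value is:", instance[-1])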

Run Neuron_Network_Entry.py in Spyder. The output is as follows:

+1	V1	V2
Hidden layer creation: 1
N[1][1]	N[1][2]	N[1][3]	N[1][4]	N[1][5]	N[1][6]	N[1][7]	N[1][8]
Hidden layer creation: 2
N[2][1]	N[2][2]	N[2][3]	N[2][4]
Hidden layer creation: 3
N[3][1]	N[3][2]
Output layer: Output
The weight from 1 at layers[0] to 4 at layers[1] : 0.47748147399057483
The weight from 1 at layers[0] to 5 at layers[1] : -0.02110707479923124
The weight from 1 at layers[0] to 6 at layers[1] : 0.7733341095569151
The weight from 1 at layers[0] to 7 at layers[1] : 0.5078644670528534
The weight from 1 at layers[0] to 8 at layers[1] : 0.38637130020873656
The weight from 1 at layers[0] to 9 at layers[1] : -0.02973261521013215
The weight from 1 at layers[0] to 10 at layers[1] : 0.7863564215821441
The weight from 1 at layers[0] to 11 at layers[1] : 0.584129337660958
The weight from 2 at layers[0] to 4 at layers[1] : 0.9662745148343004
The weight from 2 at layers[0] to 5 at layers[1] : -1.0144235052237622
The weight from 2 at layers[0] to 6 at layers[1] : -0.35999827192892087
The weight from 2 at layers[0] to 7 at layers[1] : 0.9452891790847016
The weight from 2 at layers[0] to 8 at layers[1] : 0.8694648449853173
The weight from 2 at layers[0] to 9 at layers[1] : -0.9507722030992092
The weight from 2 at layers[0] to 10 at layers[1] : 0.8597852393070331
The weight from 2 at layers[0] to 11 at layers[1] : 0.36650281313095845
The weight from 4 at layers[1] to 13 at layers[2] : 0.2712530656158141
The weight from 4 at layers[1] to 14 at layers[2] : 0.5925551419309834
The weight from 4 at layers[1] to 15 at layers[2] : 0.5579557294706121
The weight from 4 at layers[1] to 16 at layers[2] : -0.8978389396196884
The weight from 5 at layers[1] to 13 at layers[2] : 0.7277740116885274
The weight from 5 at layers[1] to 14 at layers[2] : 0.15578162972785603
The weight from 5 at layers[1] to 15 at layers[2] : -0.22357710192008196
The weight from 5 at layers[1] to 16 at layers[2] : 0.3453610415981725
The weight from 6 at layers[1] to 13 at layers[2] : 0.550351356582435
The weight from 6 at layers[1] to 14 at layers[2] : 0.5060748969250854
The weight from 6 at layers[1] to 15 at layers[2] : 0.04721947834762541
The weight from 6 at layers[1] to 16 at layers[2] : -0.6677890147939624
The weight from 7 at layers[1] to 13 at layers[2] : -0.05961305347426327
The weight from 7 at layers[1] to 14 at layers[2] : -0.16481629338107584
The weight from 7 at layers[1] to 15 at layers[2] : 0.8813206653228318
The weight from 7 at layers[1] to 16 at layers[2] : -0.8983196364466726
The weight from 8 at layers[1] to 13 at layers[2] : -0.6736643251519103
The weight from 8 at layers[1] to 14 at layers[2] : -0.6541251761441318
The weight from 8 at layers[1] to 15 at layers[2] : -0.20541741096197408
The weight from 8 at layers[1] to 16 at layers[2] : -0.37321077993018303
The weight from 9 at layers[1] to 13 at layers[2] : -0.7878361744196614
The weight from 9 at layers[1] to 14 at layers[2] : 0.33140821180812097
The weight from 9 at layers[1] to 15 at layers[2] : 0.2563864547388408
The weight from 9 at layers[1] to 16 at layers[2] : -1.012102683292297
The weight from 10 at layers[1] to 13 at layers[2] : -0.5448559798520363
The weight from 10 at layers[1] to 14 at layers[2] : -0.11627365788742017
The weight from 10 at layers[1] to 15 at layers[2] : -0.7739561742471903
The weight from 10 at layers[1] to 16 at layers[2] : -0.7129691921855293
The weight from 11 at layers[1] to 13 at layers[2] : -0.4221276548861337
The weight from 11 at layers[1] to 14 at layers[2] : 0.7386904678796575
The weight from 11 at layers[1] to 15 at layers[2] : 0.1229494814415244
The weight from 11 at layers[1] to 16 at layers[2] : -0.38677878819130174
The weight from 13 at layers[2] to 18 at layers[3] : 0.17354990244958124
The weight from 13 at layers[2] to 19 at layers[3] : -0.9227375886179872
The weight from 14 at layers[2] to 18 at layers[3] : 0.30595351638798696
The weight from 14 at layers[2] to 19 at layers[3] : 0.5872177123875555
The weight from 15 at layers[2] to 18 at layers[3] : -0.7038694786418751
The weight from 15 at layers[2] to 19 at layers[3] : -0.770278231422793
The weight from 16 at layers[2] to 18 at layers[3] : 0.7888640549115773
The weight from 16 at layers[2] to 19 at layers[3] : 0.8623348624881537
The weight from 18 at layers[3] to 20 at layers[4] : -0.2648177640124265
The weight from 19 at layers[3] to 20 at layers[4] : -0.798885807955121
Congratulations! Back Propagation is completed!!!
Prediction: 0.3866415296079101 while real value is: 0
Prediction: 0.38814326765954466 while real value is: 1
Prediction: 0.38695158367757343 while real value is: 1
Prediction: 0.3879446913610613 while real value is: 0

The figure shows a schematic of AlphaGo's deep learning. AlphaGo takes data derived from human expert experience as input and performs supervised learning, which is exactly the same idea as our Back Propagation. AlphaGo's core consists of two steps: the first is deep learning, the second is reinforcement learning. At the level of ideas, with the demo in this chapter we have already completed about 80% of the deep-learning workload of AlphaGo.

             


             

Figure 1-43 AlphaGo deep learning

