Deep Learning Notes: Things to watch out for when using dropout and Batch Normalization in PyTorch, and when using dropout and BN in TensorFlow
Things to watch out for when using dropout and BN in PyTorch
Dropout in PyTorch:

Dropout is applied during training and switched off during testing/evaluation.

In PyTorch this is controlled with net.eval(), which puts the whole network into evaluation mode: dropout layers stop dropping units, and BN parameters are frozen so the stored statistics are used instead of the batch statistics. In principle, net.eval() should be called before running on any validation or test set.

net.train() switches the network back into training mode, so dropout is applied again and BN statistics are updated during the forward pass (by itself it does not enable or disable gradient computation; that is what torch.no_grad() controls).
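As a minimal sketch of this switch (the network, the data, and names like SimpleNet are hypothetical illustrations, not from the original post):

import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    """Small fully connected net with dropout, only to illustrate train()/eval()."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(20, 64)
        self.drop = nn.Dropout(p=0.5)   # drops units only in training mode
        self.fc2 = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.drop(x)                # becomes a no-op after net.eval()
        return self.fc2(x)

net = SimpleNet()
x = torch.randn(8, 20)                  # fake batch

net.train()                             # training mode: dropout is applied
out = net(x)

net.eval()                              # evaluation mode: dropout disabled
with torch.no_grad():                   # separately skip gradient tracking for validation
    val_out = net(x)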
Batch Normalization in PyTorch:
net.eval() fixes the parameters of the whole network, and in particular fixes BN's running statistics, moving_mean and moving_var. If this is unclear, see the code below:
if self.do_bn:
    bn = nn.BatchNorm1d(10, momentum=0.5)
    setattr(self, 'bn%i' % i, bn)   # IMPORTANT: set the layer as an attribute of the Module
    self.bns.append(bn)

for epoch in range(EPOCH):
    print('Epoch: ', epoch)
    for net, l in zip(nets, losses):
        net.eval()      # set eval mode to fix moving_mean and moving_var
        pred, layer_input, pre_act = net(test_x)
        net.train()     # free moving_mean and moving_var again for training
    plot_histogram(*layer_inputs, *pre_acts)    # plot distributions of layer inputs and pre-activations
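To see the effect of eval() on the running statistics directly, here is a small self-contained sketch (the tensors and the single BatchNorm1d layer are made up for illustration). In PyTorch the running statistic is updated as running = (1 - momentum) * running + momentum * batch_stat, so momentum=0.5 as used above weights the current batch heavily (the default is 0.1); note that the momentum argument of tf.layers.batch_normalization in the next section uses the opposite convention, acting as a decay factor on the moving average, which is why it is lowered there from the default 0.99.

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(10, momentum=0.5)          # same kind of layer as in the snippet above
x = torch.randn(32, 10) * 3 + 5                # fake batch with non-zero mean and variance

bn.train()                                     # training mode: running stats are updated
_ = bn(x)
print(bn.running_mean[:3], bn.running_var[:3]) # have moved toward the batch statistics

frozen_mean = bn.running_mean.clone()
bn.eval()                                      # eval mode: moving_mean / moving_var are frozen
_ = bn(torch.randn(32, 10) * 10)               # another forward pass with very different data
print(torch.equal(bn.running_mean, frozen_mean))  # True: the running mean did not change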
Things to watch out for when using dropout and BN in TensorFlow
In tf.layers, both dropout and BN take a training argument that tells the layer whether it is in training or test mode: in test mode, dropout does nothing (no units are dropped), and BN keeps its parameters fixed, using the stored moving statistics instead of the batch statistics.
tf_is_training = tf.placeholder(tf.bool, None)   # to control dropout when training and testing

# dropout net
d1 = tf.layers.dense(tf_x, N_HIDDEN, tf.nn.relu)
d1 = tf.layers.dropout(d1, rate=0.5, training=tf_is_training)   # drop out 50% of inputs
d2 = tf.layers.dense(d1, N_HIDDEN, tf.nn.relu)
d2 = tf.layers.dropout(d2, rate=0.5, training=tf_is_training)   # drop out 50% of inputs
d_out = tf.layers.dense(d2, 1)

for t in range(500):
    sess.run([o_train, d_train], {tf_x: x, tf_y: y, tf_is_training: True})   # train, set is_training=True

    if t % 10 == 0:
        # plotting
        plt.cla()
        o_loss_, d_loss_, o_out_, d_out_ = sess.run(
            [o_loss, d_loss, o_out, d_out],
            {tf_x: test_x, tf_y: test_y, tf_is_training: False}   # test, set is_training=False
        )

# building a layer with batch normalization
def add_layer(self, x, out_size, ac=None):
    x = tf.layers.dense(x, out_size, kernel_initializer=self.w_init, bias_initializer=B_INIT)
    self.pre_activation.append(x)
    # the momentum plays an important role: the default 0.99 is too high in this case!
    if self.is_bn:
        x = tf.layers.batch_normalization(x, momentum=0.4, training=tf_is_train)   # when using BN
    out = x if ac is None else ac(x)
    return out

When BN's training argument is True, it only means that BN behaves in training mode (normalizing with batch statistics); it does not mean that BN will update moving_mean and moving_var on its own, because that update is a separate op created alongside the forward pass. You must make sure the ops that update moving_mean and moving_var are run together with the train op; those ops are collected in tf.GraphKeys.UPDATE_OPS:
# !! IMPORTANT !! the moving_mean and moving_variance need to be updated,
# pass the update_ops with control_dependencies to the train_op
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    self.train = tf.train.AdamOptimizer(LR).minimize(self.loss)

Summary

In short: in PyTorch, switch between net.train() and net.eval() so that dropout and BN behave correctly for training versus evaluation; in TensorFlow, feed the training flag correctly to dropout/BN and attach tf.GraphKeys.UPDATE_OPS to the train op so that BN's moving_mean and moving_var are actually updated.