Deep Learning Notes: Things to watch out for when using dropout and Batch Normalization in PyTorch, and when using dropout and BN in TensorFlow
Things to watch out for when using dropout and BN in PyTorch
Dropout in PyTorch:

Dropout is applied during training and switched off during testing/evaluation.

In PyTorch this is controlled with net.eval(), which puts the whole network into evaluation mode: dropout layers stop dropping units, and BN parameters are frozen so the stored statistics are used instead of the batch statistics. In principle, net.eval() should be called before running on any validation or test set.

net.train() switches the network back into training mode, so dropout is applied again and BN statistics are updated during the forward pass (by itself it does not enable or disable gradient computation; that is what torch.no_grad() controls).
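As a minimal sketch of this switch (the network, the data, and names like SimpleNet are hypothetical illustrations, not from the original post):

import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    """Small fully connected net with dropout, only to illustrate train()/eval()."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(20, 64)
        self.drop = nn.Dropout(p=0.5)   # drops units only in training mode
        self.fc2 = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.drop(x)                # becomes a no-op after net.eval()
        return self.fc2(x)

net = SimpleNet()
x = torch.randn(8, 20)                  # fake batch

net.train()                             # training mode: dropout is applied
out = net(x)

net.eval()                              # evaluation mode: dropout disabled
with torch.no_grad():                   # separately skip gradient tracking for validation
    val_out = net(x)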
Batch Normalization in PyTorch:
net.eval() fixes the parameters of the whole network, and in particular fixes BN's running statistics, moving_mean and moving_var. If this is unclear, see the code below:
if self.do_bn:
    bn = nn.BatchNorm1d(10, momentum=0.5)
    setattr(self, 'bn%i' % i, bn)   # IMPORTANT: set the layer as an attribute of the Module
    self.bns.append(bn)

for epoch in range(EPOCH):
    print('Epoch: ', epoch)
    for net, l in zip(nets, losses):
        net.eval()      # set eval mode to fix moving_mean and moving_var
        pred, layer_input, pre_act = net(test_x)
        net.train()     # free moving_mean and moving_var again for training
    plot_histogram(*layer_inputs, *pre_acts)    # plot distributions of layer inputs and pre-activations
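To see the effect of eval() on the running statistics directly, here is a small self-contained sketch (the tensors and the single BatchNorm1d layer are made up for illustration). In PyTorch the running statistic is updated as running = (1 - momentum) * running + momentum * batch_stat, so momentum=0.5 as used above weights the current batch heavily (the default is 0.1); note that the momentum argument of tf.layers.batch_normalization in the next section uses the opposite convention, acting as a decay factor on the moving average, which is why it is lowered there from the default 0.99.

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(10, momentum=0.5)          # same kind of layer as in the snippet above
x = torch.randn(32, 10) * 3 + 5                # fake batch with non-zero mean and variance

bn.train()                                     # training mode: running stats are updated
_ = bn(x)
print(bn.running_mean[:3], bn.running_var[:3]) # have moved toward the batch statistics

frozen_mean = bn.running_mean.clone()
bn.eval()                                      # eval mode: moving_mean / moving_var are frozen
_ = bn(torch.randn(32, 10) * 10)               # another forward pass with very different data
print(torch.equal(bn.running_mean, frozen_mean))  # True: the running mean did not change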
Things to watch out for when using dropout and BN in TensorFlow
In tf.layers, both dropout and BN take a training argument that tells the layer whether it is in training or test mode: in test mode, dropout does nothing (no units are dropped), and BN keeps its parameters fixed, using the stored moving statistics instead of the batch statistics.
tf_is_training = tf.placeholder(tf.bool, None)   # to control dropout when training and testing

# dropout net
d1 = tf.layers.dense(tf_x, N_HIDDEN, tf.nn.relu)
d1 = tf.layers.dropout(d1, rate=0.5, training=tf_is_training)   # drop out 50% of inputs
d2 = tf.layers.dense(d1, N_HIDDEN, tf.nn.relu)
d2 = tf.layers.dropout(d2, rate=0.5, training=tf_is_training)   # drop out 50% of inputs
d_out = tf.layers.dense(d2, 1)

for t in range(500):
    sess.run([o_train, d_train], {tf_x: x, tf_y: y, tf_is_training: True})   # train, set is_training=True

    if t % 10 == 0:
        # plotting
        plt.cla()
        o_loss_, d_loss_, o_out_, d_out_ = sess.run(
            [o_loss, d_loss, o_out, d_out],
            {tf_x: test_x, tf_y: test_y, tf_is_training: False}   # test, set is_training=False
        )

# building a layer with batch normalization
def add_layer(self, x, out_size, ac=None):
    x = tf.layers.dense(x, out_size, kernel_initializer=self.w_init, bias_initializer=B_INIT)
    self.pre_activation.append(x)
    # the momentum plays an important role: the default 0.99 is too high in this case!
    if self.is_bn:
        x = tf.layers.batch_normalization(x, momentum=0.4, training=tf_is_train)   # when using BN
    out = x if ac is None else ac(x)
    return out

When BN's training argument is True, it only means that BN behaves in training mode (normalizing with batch statistics); it does not mean that BN will update moving_mean and moving_var on its own, because that update is a separate op created alongside the forward pass. You must make sure the ops that update moving_mean and moving_var are run together with the train op; those ops are collected in tf.GraphKeys.UPDATE_OPS:
# !! IMPORTANT !! the moving_mean and moving_variance need to be updated,
# pass the update_ops with control_dependencies to the train_op
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    self.train = tf.train.AdamOptimizer(LR).minimize(self.loss)

Summary

In short: in PyTorch, switch between net.train() and net.eval() so that dropout and BN behave correctly for training versus evaluation; in TensorFlow, feed the training flag correctly to dropout/BN and attach tf.GraphKeys.UPDATE_OPS to the train op so that BN's moving_mean and moving_var are actually updated.