Setting up learning rate decay in TensorFlow
I got into deep learning through Keras. The first demo I worked with was a Keras implementation of YOLOv3; the code was easy to follow (well, not that easy, honestly; it took me quite a while to fully understand it the first time).
After that I built a license plate recognition project, using tiny-YOLO to detect the plate location. The training set was about 40k images and training took a whole day. At the time I assumed that was just how long training takes, and I didn't know to watch GPU utilization during training, so I didn't pay attention. As the project accumulated more images, training time kept growing until I couldn't stand it anymore. An article finally pointed me to the GPU utilization problem, and I realized I should train with TensorFlow's native APIs, for example tf.data.Dataset, so the input pipeline stops starving the GPU (a sketch of the idea follows below).
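For illustration, here is a minimal sketch of what a tf.data input pipeline looks like. This is my own snippet, not the project's actual Parser: the feature name `'image'`, the `parse_fn` helper, and the file pattern are made up for the example. The point is that decoding and preprocessing run on CPU threads and the next batch is prefetched while the GPU is busy training, which is what keeps GPU utilization high.

```python
import tensorflow as tf

def parse_fn(serialized):
    # Hypothetical parser: the real project also decodes boxes/labels.
    features = tf.parse_single_example(
        serialized, {'image': tf.FixedLenFeature([], tf.string)})
    image = tf.image.decode_jpeg(features['image'], channels=3)
    image = tf.image.resize_images(image, [416, 416])
    return image

filenames = tf.gfile.Glob("../VOC/train/voc_train*.tfrecords")
dataset = tf.data.TFRecordDataset(filenames)
dataset = dataset.shuffle(1000)                        # shuffle buffer, like SHUFFLE_SIZE
dataset = dataset.map(parse_fn, num_parallel_calls=4)  # decode on CPU threads
dataset = dataset.batch(8)
dataset = dataset.prefetch(1)                          # prepare next batch while GPU trains
images = dataset.make_one_shot_iterator().get_next()
```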
That search led me to this TensorFlow-native YOLO project. While training with it I noticed it had no learning rate decay, and after a while the total loss stopped going down, so I added decay myself. I wanted to write it up; this is my first post as a beginner, haha, so please don't flame me for the content being too simple.
YunYang1994/tensorflow-yolov3 (github.com). Note that the author seems to have changed train.py since I wrote this.
It originally looked like this:
```python
import tensorflow as tf
from core import utils, yolov3
from core.dataset import dataset, Parser

sess = tf.Session()

IMAGE_H, IMAGE_W = 416, 416
BATCH_SIZE = 8
EPOCHS = 2000*1000
LR = 0.0001
SHUFFLE_SIZE = 1000
CLASSES = utils.read_coco_names('./data/voc.names')
ANCHORS = utils.get_anchors('./data/voc_anchors.txt')
NUM_CLASSES = len(CLASSES)

train_tfrecord = "../VOC/train/voc_train*.tfrecords"
test_tfrecord = "../VOC/test/voc_test*.tfrecords"

parser = Parser(IMAGE_H, IMAGE_W, ANCHORS, NUM_CLASSES)
trainset = dataset(parser, train_tfrecord, BATCH_SIZE, shuffle=SHUFFLE_SIZE)
testset = dataset(parser, test_tfrecord, BATCH_SIZE, shuffle=None)

is_training = tf.placeholder(tf.bool)
example = tf.cond(is_training, lambda: trainset.get_next(),
                               lambda: testset.get_next())
images, *y_true = example

model = yolov3.yolov3(NUM_CLASSES, ANCHORS)
with tf.variable_scope('yolov3'):
    y_pred = model.forward(images, is_training=is_training)
    loss = model.compute_loss(y_pred, y_true)

optimizer = tf.train.AdamOptimizer(LR)
saver = tf.train.Saver(max_to_keep=2)

tf.summary.scalar("loss/coord_loss", loss[1])
tf.summary.scalar("loss/sizes_loss", loss[2])
tf.summary.scalar("loss/confs_loss", loss[3])
tf.summary.scalar("loss/class_loss", loss[4])

write_op = tf.summary.merge_all()
writer_train = tf.summary.FileWriter("./data/train")
writer_test = tf.summary.FileWriter("./data/test")

update_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="yolov3/yolo-v3")
with tf.control_dependencies(update_var):
    train_op = optimizer.minimize(loss[0], var_list=update_var)  # only update yolo layer

sess.run(tf.global_variables_initializer())
pretrained_weights = tf.global_variables(scope="yolov3/darknet-53")
load_op = utils.load_weights(var_list=pretrained_weights,
                             weights_file="./darknet53.conv.74")
sess.run(load_op)

for epoch in range(EPOCHS):
    run_items = sess.run([train_op, write_op] + loss, feed_dict={is_training: True})
    writer_train.add_summary(run_items[1], global_step=epoch)
    writer_train.flush()  # flushes the event file to disk

    if (epoch+1) % 1000 == 0:
        saver.save(sess, save_path="./checkpoint/yolov3.ckpt", global_step=epoch)

    run_items = sess.run([write_op] + loss, feed_dict={is_training: False})
    writer_test.add_summary(run_items[0], global_step=epoch)
    writer_test.flush()  # flushes the event file to disk

    print("EPOCH:%7d\tloss_xy:%7.4f\tloss_wh:%7.4f\tloss_conf:%7.4f\tloss_class:%7.4f"
          % (epoch, run_items[2], run_items[3], run_items[4], run_items[5]))
```

Then I noticed there was no learning rate decay, so I went looking for how to implement it.
Here is the change. Replace

```python
optimizer = tf.train.AdamOptimizer(LR)
```

with

```python
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(LR, global_step, 100, 0.93, staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate)
```

and pass the step counter to `minimize` so it actually advances:

```python
train_op = optimizer.minimize(loss[0], var_list=update_var, global_step=global_step)
```

Here `learning_rate` is the decayed learning rate tensor and `LR` is the initial learning rate. `global_step` is the second positional argument of `tf.train.exponential_decay`, before `decay_steps`; 100 means the rate decays every 100 steps, and 0.93 is the decay rate, so with `staircase=True` the effective rate is `LR * 0.93 ** (global_step // 100)`. Passing `global_step=global_step` to `minimize` is essential: without it the counter never increments and the learning rate stays at its initial value forever.
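As a quick sanity check (my own snippet, not part of train.py), you can evaluate the schedule at a few step values and watch the staircase behaviour:

```python
import tensorflow as tf

LR = 0.0001
step = tf.placeholder(tf.int64)
lr = tf.train.exponential_decay(LR, step, 100, 0.93, staircase=True)

with tf.Session() as sess:
    for s in [0, 99, 100, 199, 200, 1000]:
        # staircase=True  =>  lr = LR * 0.93 ** (s // 100)
        print(s, sess.run(lr, feed_dict={step: s}))
# steps 0 and 99 give 1e-4; 100-199 give 9.3e-5; 1000 gives about 4.84e-5
```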
最后加個(gè)打印
```python
tf.summary.scalar('learning_rate', learning_rate)
```

and then you can happily get on with training.
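To confirm the schedule is actually being applied, run `tensorboard --logdir ./data` (the directory the FileWriters above write to) and watch the learning_rate scalar step down every 100 global steps.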
Summary

Adding exponential learning rate decay was a small change, but it let the total loss keep dropping after it had plateaued under a fixed learning rate, and the learning_rate curve in TensorBoard makes it easy to verify the schedule is working.