

Caffe Official Tutorial Translation (7): Fine-tuning for Style Recognition

Published 2025/3/21 by 豆豆 (生活随笔)

Preface

I've recently been working through the official Caffe tutorials again, translating the official documentation as I go. I've added some notes of my own, marked in italics, along with problems I ran into and results I got along the way. Feedback and corrections are welcome!
Original tutorial: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/02-fine-tuning.ipynb

Fine-tuning a Pretrained Network for Style Recognition

In this example, we explore a technique that is very common in real-world applications: taking a pretrained model and fine-tuning its parameters on a custom dataset.
The advantage of this approach is that, since the pretrained network was already trained on a very large dataset, its intermediate layers capture most of the "semantics" of general visual appearance. Treat "semantics" as a black box here: just think of it as a very powerful, general-purpose visual feature. As a result, a relatively small dataset is enough to achieve good results on the target task.
First, we need to prepare the data. This involves the following steps: (1) fetch the ImageNet ilsvrc pretrained model using the provided shell script; (2) download a subset of the full Flickr style dataset; (3) compile the downloaded Flickr dataset into a format Caffe can use.

# Set the caffe path
caffe_root = '/home/xhb/caffe/caffe/'  # this file should be run from {caffe_root}/examples (otherwise change this line)

# Import caffe
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe

# Note: I ran this on a laptop with CPU only, so I switched to CPU mode.
# caffe.set_device(0)
# caffe.set_mode_gpu()
caffe.set_mode_cpu()

import numpy as np
from pylab import *
%matplotlib inline
import tempfile

# Helper function for deprocessing preprocessed images, e.g., for display.
def deprocess_net_image(image):
    image = image.copy()              # don't modify destructively
    image = image[::-1]               # BGR -> RGB
    image = image.transpose(1, 2, 0)  # CHW -> HWC
    image += [123, 117, 104]          # (approximately) undo mean subtraction

    # clamp values in [0, 255]
    image[image < 0], image[image > 255] = 0, 255

    # round and cast from float32 to uint8
    image = np.round(image)
    image = np.require(image, dtype=np.uint8)
    return image
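deprocess_net_image above is the inverse of Caffe's usual input transform (RGB→BGR, mean subtraction, HWC→CHW). As a rough sketch, and assuming the same approximate per-channel means used above (not Caffe's exact mean image), the forward preprocessing and this deprocessing round-trip exactly:

```python
import numpy as np

MEAN_BGR = np.array([104, 117, 123], dtype=np.float32)  # approximate ImageNet channel means

def preprocess(rgb_u8):
    """HWC uint8 RGB -> CHW float32 BGR with per-channel mean subtracted (sketch)."""
    img = rgb_u8.astype(np.float32)
    img = img[:, :, ::-1]             # RGB -> BGR
    img -= MEAN_BGR                   # subtract per-channel mean
    return img.transpose(2, 0, 1)     # HWC -> CHW

def deprocess(chw_bgr):
    """Inverse of preprocess, mirroring deprocess_net_image above."""
    img = chw_bgr[::-1]               # BGR -> RGB (flip the channel axis)
    img = img.transpose(1, 2, 0)      # CHW -> HWC
    img = img + [123, 117, 104]       # undo mean subtraction (now in RGB order)
    img = np.clip(np.round(img), 0, 255)
    return img.astype(np.uint8)

rgb = np.random.randint(0, 256, (227, 227, 3), dtype=np.uint8)
assert np.array_equal(deprocess(preprocess(rgb)), rgb)  # exact round-trip
```

Note the means are added back in RGB order ([123, 117, 104]) because the channel axis has already been flipped back from BGR.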

1. Preparing the Dataset

Download the data needed for this example.

  • get_ilsvrc_aux.sh: downloads the ImageNet auxiliary data (mean file, label files, etc.).
  • download_model_binary.py: downloads the pretrained model.
  • finetune_flickr_style/assemble_data.py: downloads the training and testing data for style recognition, hereafter simply the "style dataset".

我們?cè)谙旅娴木毩?xí)中,會(huì)從整個(gè)數(shù)據(jù)集中下載一個(gè)較小的子數(shù)據(jù)集:8萬張圖片中只下載2000張,從20個(gè)風(fēng)格類別中只下載5種類別。(如果想要下載完整數(shù)據(jù)集,修改下面對(duì)應(yīng)的代碼成full_dataset=True即可。)

# Download just a small subset of the data for this exercise.
# (2000 of 80K images, 5 of 20 labels.)
# To download the entire dataset, set `full_dataset = True`.
full_dataset = False
if full_dataset:
    NUM_STYLE_IMAGES = NUM_STYLE_LABELS = -1
else:
    NUM_STYLE_IMAGES = 2000
    NUM_STYLE_LABELS = 5

# This downloads the ilsvrc auxiliary data (mean file, etc),
# and a subset of 2000 images for the style recognition task.
# Note: I commented out the block below because I had already run these
# scripts from the command line and downloaded the required files.
'''
import os
os.chdir(caffe_root)  # run scripts from caffe root
!data/ilsvrc12/get_ilsvrc_aux.sh
!scripts/download_model_binary.py models/bvlc_reference_caffenet
!python examples/finetune_flickr_style/assemble_data.py \
    --workers=-1 --seed=1701 \
    --images=$NUM_STYLE_IMAGES --label=$NUM_STYLE_LABELS
# back to examples
os.chdir('examples')
'''

Define weights, pointing to the ImageNet-pretrained weights downloaded earlier; make sure this file exists.

import os
weights = os.path.join(caffe_root, 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')
assert os.path.exists(weights)

Load the 1000 ImageNet labels from ilsvrc12/synset_words.txt, and the 5 style labels from finetune_flickr_style/style_names.txt.

# Load ImageNet labels to imagenet_labels
imagenet_label_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
imagenet_labels = list(np.loadtxt(imagenet_label_file, str, delimiter='\t'))
assert len(imagenet_labels) == 1000
print 'Loaded ImageNet labels:\n', '\n'.join(imagenet_labels[:10] + ['...'])

# Load style labels to style_labels
style_label_file = caffe_root + 'examples/finetune_flickr_style/style_names.txt'
style_labels = list(np.loadtxt(style_label_file, str, delimiter='\n'))
if NUM_STYLE_LABELS > 0:
    style_labels = style_labels[:NUM_STYLE_LABELS]
print '\nLoaded style labels:\n', ', '.join(style_labels)

Loaded ImageNet labels:
n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
n01491361 tiger shark, Galeocerdo cuvieri
n01494475 hammerhead, hammerhead shark
n01496331 electric ray, crampfish, numbfish, torpedo
n01498041 stingray
n01514668 cock
n01514859 hen
n01518878 ostrich, Struthio camelus
...

Loaded style labels:
Detailed, Pastel, Melancholy, Noir, HDR

2. Defining and Running the Network

We start by defining caffenet(), a function that initializes the CaffeNet architecture (a minor variant of AlexNet), taking arguments that specify the data and the number of output classes.

from caffe import layers as L
from caffe import params as P

weight_param = dict(lr_mult=1, decay_mult=1)
bias_param   = dict(lr_mult=2, decay_mult=0)
learned_param = [weight_param, bias_param]

frozen_param = [dict(lr_mult=0)] * 2

# Convolution layer followed by a ReLU
def conv_relu(bottom, ks, nout, stride=1, pad=0, group=1,
              param=learned_param,
              weight_filler=dict(type='gaussian', std=0.01),
              bias_filler=dict(type='constant', value=0.1)):
    conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
                         num_output=nout, pad=pad, group=group,
                         param=param, weight_filler=weight_filler,
                         bias_filler=bias_filler)
    return conv, L.ReLU(conv, in_place=True)

# Fully connected layer followed by a ReLU
def fc_relu(bottom, nout, param=learned_param,
            weight_filler=dict(type='gaussian', std=0.005),
            bias_filler=dict(type='constant', value=0.1)):
    fc = L.InnerProduct(bottom, num_output=nout, param=param,
                        weight_filler=weight_filler,
                        bias_filler=bias_filler)
    return fc, L.ReLU(fc, in_place=True)

# Max pooling
def max_pool(bottom, ks, stride=1):
    return L.Pooling(bottom, pool=P.Pooling.MAX, kernel_size=ks, stride=stride)

# The CaffeNet network
def caffenet(data, label=None, train=True, num_classes=1000,
             classifier_name='fc8', learn_all=False):
    """Returns a NetSpec specifying CaffeNet, following the original proto text
       specification (./models/bvlc_reference_caffenet/train_val.prototxt)."""
    n = caffe.NetSpec()
    # Build the network layer by layer
    n.data = data
    param = learned_param if learn_all else frozen_param
    n.conv1, n.relu1 = conv_relu(n.data, 11, 96, stride=4, param=param)
    n.pool1 = max_pool(n.relu1, 3, stride=2)
    n.norm1 = L.LRN(n.pool1, local_size=5, alpha=1e-4, beta=0.75)
    n.conv2, n.relu2 = conv_relu(n.norm1, 5, 256, pad=2, group=2, param=param)
    n.pool2 = max_pool(n.relu2, 3, stride=2)
    n.norm2 = L.LRN(n.pool2, local_size=5, alpha=1e-4, beta=0.75)
    n.conv3, n.relu3 = conv_relu(n.norm2, 3, 384, pad=1, param=param)
    n.conv4, n.relu4 = conv_relu(n.relu3, 3, 384, pad=1, group=2, param=param)
    n.conv5, n.relu5 = conv_relu(n.relu4, 3, 256, pad=1, group=2, param=param)
    n.pool5 = max_pool(n.relu5, 3, stride=2)
    n.fc6, n.relu6 = fc_relu(n.pool5, 4096, param=param)
    # Dropout is used at training time only, to help prevent overfitting
    if train:
        n.drop6 = fc7input = L.Dropout(n.relu6, in_place=True)
    else:
        fc7input = n.relu6
    n.fc7, n.relu7 = fc_relu(fc7input, 4096, param=param)
    # Dropout is used at training time only, to help prevent overfitting
    if train:
        n.drop7 = fc8input = L.Dropout(n.relu7, in_place=True)
    else:
        fc8input = n.relu7
    # always learn fc8 (param=learned_param)
    fc8 = L.InnerProduct(fc8input, num_output=num_classes, param=learned_param)
    # give fc8 the name specified by argument `classifier_name`
    n.__setattr__(classifier_name, fc8)
    # At test time (not train), attach a softmax to fc8 to output probabilities
    if not train:
        n.probs = L.Softmax(fc8)
    # If a label is given, add loss and accuracy layers
    if label is not None:
        n.label = label
        n.loss = L.SoftmaxWithLoss(fc8, n.label)
        n.acc = L.Accuracy(fc8, n.label)
    # write the net to a temporary file and return its filename
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(str(n.to_proto()))
        return f.name

Now, let's create a CaffeNet that takes unlabeled "dummy data" as input. That way, we can set its input externally and see what ImageNet classes it predicts.

dummy_data = L.DummyData(shape=dict(dim=[1, 3, 227, 227]))
imagenet_net_filename = caffenet(data=dummy_data, train=False)
imagenet_net = caffe.Net(imagenet_net_filename, weights, caffe.TEST)

Define a style_net function which calls the caffenet function above on data from the Flickr style dataset.
The new network also has the CaffeNet architecture, with differences in the input and output:

  • the input is the Flickr style data we downloaded, read in through an ImageData layer
  • the output is a distribution over 20 style classes rather than the original 1000 ImageNet classes
  • the classification layer is renamed from fc8 to fc8_flickr, to tell Caffe not to load the original fc8 layer from the ImageNet-pretrained model
def style_net(train=True, learn_all=False, subset=None):
    if subset is None:
        subset = 'train' if train else 'test'
    source = caffe_root + 'data/flickr_style/%s.txt' % subset
    transform_param = dict(mirror=train, crop_size=227,
        mean_file=caffe_root + 'data/ilsvrc12/imagenet_mean.binaryproto')
    style_data, style_label = L.ImageData(
        transform_param=transform_param, source=source,
        batch_size=50, new_height=256, new_width=256, ntop=2)
    return caffenet(data=style_data, label=style_label, train=train,
                    num_classes=NUM_STYLE_LABELS,
                    classifier_name='fc8_flickr',
                    learn_all=learn_all)

Use the style_net function to initialize untrained_style_net, a network with the CaffeNet architecture, input images from the style dataset, and weights from the ImageNet-pretrained model.
Call forward on untrained_style_net to get a batch of style training data.

untrained_style_net = caffe.Net(style_net(train=False, subset='train'),
                                weights, caffe.TEST)
untrained_style_net.forward()
style_data_batch = untrained_style_net.blobs['data'].data.copy()
style_label_batch = np.array(untrained_style_net.blobs['label'].data, dtype=np.int32)

Pick one of the 50 images in the style-net batch (the network defined with style_net() above; hereafter simply "the style net"). We can choose any image; here we take the 8th image in the batch. Display it, run it through imagenet_net, the ImageNet-pretrained network, and show the top 5 predictions among the 1000 ImageNet classes.
Below, we chose an image of a beach; since "sandbar" and "seashore" are both among the 1000 ImageNet classes, the network's predictions are fairly reasonable. For other images, however, the predictions are not as good, sometimes because the network fails to detect the objects in the image, and sometimes because not every image contains objects from the 1000 ImageNet classes. Change the batch_index variable (default 8) to any value from 0 to 49 (a batch has only 50 samples) to see predictions for other images. (To move beyond this batch of 50 images, run the cell above again to load a new batch into style_net.)

def disp_preds(net, image, labels, k=5, name='ImageNet'):
    input_blob = net.blobs['data']
    net.blobs['data'].data[0, ...] = image
    probs = net.forward(start='conv1')['probs'][0]
    top_k = (-probs).argsort()[:k]
    print 'top %d predicted %s labels =' % (k, name)
    print '\n'.join('\t(%d) %5.2f%% %s' % (i+1, 100*probs[p], labels[p])
                    for i, p in enumerate(top_k))

def disp_imagenet_preds(net, image):
    disp_preds(net, image, imagenet_labels, name='ImageNet')

def disp_style_preds(net, image):
    disp_preds(net, image, style_labels, name='style')

batch_index = 8
image = style_data_batch[batch_index]
plt.imshow(deprocess_net_image(image))
print 'actual label =', style_labels[style_label_batch[batch_index]]

actual label = Melancholy

disp_imagenet_preds(imagenet_net, image)

top 5 predicted ImageNet labels =
	(1) 69.89% n09421951 sandbar, sand bar
	(2) 21.75% n09428293 seashore, coast, seacoast, sea-coast
	(3)  3.22% n02894605 breakwater, groin, groyne, mole, bulwark, seawall, jetty
	(4)  1.89% n04592741 wing
	(5)  1.23% n09332890 lakeside, lakeshore

disp_style_preds(untrained_style_net, image)

top 5 predicted style labels =
	(1) 20.00% Detailed
	(2) 20.00% Pastel
	(3) 20.00% Melancholy
	(4) 20.00% Noir
	(5) 20.00% HDR

Since the two models use the same pretrained weights from conv1 through fc7, we can also verify that the activations in fc7, immediately before the classification layer, match those of the ImageNet-pretrained model.

diff = untrained_style_net.blobs['fc7'].data[0] - imagenet_net.blobs['fc7'].data[0]
error = (diff ** 2).sum()
assert error < 1e-8

Delete untrained_style_net to save memory. (Keep imagenet_net around; we'll use it again later.)

del untrained_style_net

3. Training the Style Classifier

Now we define a function solver to create a Caffe solver, which we can use to train the network. In this function we set initial values for the various parameters that control training, display, and snapshotting. See the comments for what each parameter does. You can also try tweaking some of these parameters yourself and see whether you get better results!

from caffe.proto import caffe_pb2

def solver(train_net_path, test_net_path=None, base_lr=0.001):
    s = caffe_pb2.SolverParameter()

    # Specify locations of the train and (maybe) test networks.
    s.train_net = train_net_path

    # Test-related parameters: test after every 1000 training iterations,
    # running 100 batches each time we test.
    if test_net_path is not None:
        s.test_net.append(test_net_path)
        s.test_interval = 1000  # Test after every 1000 training iterations.
        s.test_iter.append(100) # Test on 100 batches each time we test.

    # The number of iterations over which to average the gradient.
    # Effectively boosts the training batch size by the given factor, without
    # affecting memory utilization.
    s.iter_size = 1

    # Maximum number of training iterations
    s.max_iter = 100000     # # of times to update the net (training iterations)

    # Solve using the stochastic gradient descent (SGD) algorithm.
    # Other choices include 'Adam' and 'RMSProp'.
    s.type = 'SGD'

    # Set the initial learning rate for SGD.
    s.base_lr = base_lr

    # Set `lr_policy` to define how the learning rate changes during training.
    # Here, we 'step' the learning rate by multiplying it by a factor `gamma`
    # every `stepsize` iterations.
    s.lr_policy = 'step'
    s.gamma = 0.1
    s.stepsize = 20000

    # Set other SGD hyperparameters. Setting a non-zero `momentum` takes a
    # weighted average of the current gradient and previous gradients to make
    # learning more stable. L2 weight decay regularizes learning, to help prevent
    # the model from overfitting.
    s.momentum = 0.9
    s.weight_decay = 5e-4

    # Display the current training loss and accuracy every 1000 iterations.
    s.display = 1000

    # Snapshots are files used to store networks we've trained. Here, we'll
    # snapshot every 10K iterations -- ten times during training.
    s.snapshot = 10000
    s.snapshot_prefix = caffe_root + 'models/finetune_flickr_style/finetune_flickr_style'

    # Train on the GPU. Using the CPU to train large networks is very slow.
    # Note: the original uses GPU mode here; I switched to CPU on my laptop.
    # s.solver_mode = caffe_pb2.SolverParameter.GPU
    s.solver_mode = caffe_pb2.SolverParameter.CPU

    # Write the solver to a temporary file and return its filename.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(str(s))
        return f.name
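Under the 'step' policy configured above, the learning rate used at iteration i is base_lr * gamma ** (i // stepsize). A quick sketch of the schedule with the values set above (base_lr=0.001, gamma=0.1, stepsize=20000):

```python
def step_lr(i, base_lr=0.001, gamma=0.1, stepsize=20000):
    """Learning rate at iteration i under Caffe's 'step' lr_policy."""
    return base_lr * gamma ** (i // stepsize)

# The rate drops by a factor of gamma every `stepsize` iterations:
for i in (0, 19999, 20000, 40000):
    print(i, step_lr(i))
```

So with max_iter = 100000 the rate is cut by 10x five times over the course of training.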

Now we call the solver defined above to train the style network's classification layer.
Alternatively, you can invoke the solver from the command line to train the network:

build/tools/caffe train \
    -solver models/finetune_flickr_style/solver.prototxt \
    -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
    -gpu 0

Note: if you are not using GPU mode, simply omit -gpu 0.
In this example, however, we train the network from Python.
First, define a run_solvers function that loops over a list of solvers, stepping each one an iteration at a time while recording the loss and accuracy at every iteration. Finally, it saves the learned weights to a file.

def run_solvers(niter, solvers, disp_interval=10):
    """Run solvers for niter iterations,
       returning the loss and accuracy recorded each iteration.
       `solvers` is a list of (name, solver) tuples."""
    blobs = ('loss', 'acc')
    loss, acc = ({name: np.zeros(niter) for name, _ in solvers}
                 for _ in blobs)
    for it in range(niter):
        for name, s in solvers:
            s.step(1)  # run a single SGD step in Caffe
            loss[name][it], acc[name][it] = (s.net.blobs[b].data.copy()
                                             for b in blobs)
        if it % disp_interval == 0 or it + 1 == niter:
            loss_disp = '; '.join('%s: loss=%.3f, acc=%2d%%' %
                                  (n, loss[n][it], np.round(100*acc[n][it]))
                                  for n, _ in solvers)
            print '%3d) %s' % (it, loss_disp)
    # Save the learned weights from both nets.
    weight_dir = tempfile.mkdtemp()
    weights = {}
    for name, s in solvers:
        filename = 'weights.%s.caffemodel' % name
        weights[name] = os.path.join(weight_dir, filename)
        s.net.save(weights[name])
    return loss, acc, weights

Next, run the solvers we created to train the style recognition network. We create two: one (style_solver) initialized with the ImageNet-pretrained weights, and another (scratch_style_solver) starting from randomly initialized weights.
During training, we'll see that the network initialized with the ImageNet-pretrained weights trains faster and reaches higher accuracy than the randomly initialized one.

niter = 200  # number of iterations to train

# Reset style_solver as before.
style_solver_filename = solver(style_net(train=True))
style_solver = caffe.get_solver(style_solver_filename)
style_solver.net.copy_from(weights)

# For reference, we also create a solver that isn't initialized from
# the pretrained ImageNet weights.
scratch_style_solver_filename = solver(style_net(train=True))
scratch_style_solver = caffe.get_solver(scratch_style_solver_filename)

print 'Running solvers for %d iterations...' % niter
solvers = [('pretrained', style_solver),
           ('scratch', scratch_style_solver)]
loss, acc, weights = run_solvers(niter, solvers)
print 'Done.'

train_loss, scratch_train_loss = loss['pretrained'], loss['scratch']
train_acc, scratch_train_acc = acc['pretrained'], acc['scratch']
style_weights, scratch_style_weights = weights['pretrained'], weights['scratch']

# Delete solvers to save memory.
del style_solver, scratch_style_solver, solvers

Running solvers for 200 iterations...
  0) pretrained: loss=1.609, acc= 0%; scratch: loss=1.609, acc= 0%
 10) pretrained: loss=1.371, acc=46%; scratch: loss=1.625, acc=14%
 20) pretrained: loss=1.082, acc=58%; scratch: loss=1.641, acc=12%
 30) pretrained: loss=0.994, acc=58%; scratch: loss=1.612, acc=22%
 40) pretrained: loss=0.893, acc=58%; scratch: loss=1.593, acc=24%
 50) pretrained: loss=1.240, acc=52%; scratch: loss=1.611, acc=30%
 60) pretrained: loss=1.096, acc=54%; scratch: loss=1.621, acc=16%
 70) pretrained: loss=0.989, acc=50%; scratch: loss=1.591, acc=28%
 80) pretrained: loss=0.962, acc=68%; scratch: loss=1.593, acc=34%
 90) pretrained: loss=1.172, acc=56%; scratch: loss=1.606, acc=24%
100) pretrained: loss=0.849, acc=64%; scratch: loss=1.587, acc=30%
110) pretrained: loss=1.005, acc=52%; scratch: loss=1.587, acc=30%
120) pretrained: loss=0.870, acc=64%; scratch: loss=1.595, acc=24%
130) pretrained: loss=0.970, acc=62%; scratch: loss=1.590, acc=28%
140) pretrained: loss=0.908, acc=58%; scratch: loss=1.603, acc=18%
150) pretrained: loss=0.608, acc=76%; scratch: loss=1.614, acc=20%
160) pretrained: loss=0.816, acc=70%; scratch: loss=1.598, acc=26%
170) pretrained: loss=1.281, acc=52%; scratch: loss=1.622, acc=16%
180) pretrained: loss=0.870, acc=72%; scratch: loss=1.630, acc=12%
190) pretrained: loss=0.909, acc=66%; scratch: loss=1.609, acc=20%
199) pretrained: loss=1.086, acc=62%; scratch: loss=1.616, acc=18%
Done.

Compare the training loss and accuracy of the two networks. The loss of the ImageNet-pretrained network drops very quickly, while the randomly initialized network trains much more slowly.

plot(np.vstack([train_loss, scratch_train_loss]).T)
xlabel('Iteration #')
ylabel('Loss')

plot(np.vstack([train_acc, scratch_train_acc]).T)
xlabel('Iteration #')
ylabel('Accuracy')

Now look at the test accuracy after 200 iterations. With only 5 classes to predict, chance accuracy is about 20%. We certainly expect both networks to beat the 20% chance level, and we further expect the ImageNet-pretrained network to do far better than the randomly initialized one. Let's see!

def eval_style_net(weights, test_iters=10):
    test_net = caffe.Net(style_net(train=False), weights, caffe.TEST)
    accuracy = 0
    for it in xrange(test_iters):
        accuracy += test_net.forward()['acc']
    accuracy /= test_iters
    return test_net, accuracy

test_net, accuracy = eval_style_net(style_weights)
print 'Accuracy, trained from ImageNet initialization: %3.1f%%' % (100*accuracy, )
scratch_test_net, scratch_accuracy = eval_style_net(scratch_style_weights)
print 'Accuracy, trained from random initialization: %3.1f%%' % (100*scratch_accuracy, )

Accuracy, trained from ImageNet initialization: 51.4%
Accuracy, trained from random initialization: 23.6%

4. End-to-End Fine-Tuning of the Style Network

Finally, we train both networks again, picking up from the weights just learned. The only difference this time is that we train "end to end": all layers of the network are trained, starting from the raw pixels fed into conv1. We pass learn_all=True to the style_net function defined earlier, which gives every parameter in the network a nonzero lr_mult. By default learn_all=False, so all the pretrained layers (conv1 through fc7) are frozen (lr_mult = 0) and only the classification layer fc8_flickr is trained.
Note that both networks begin training at roughly the accuracy achieved at the end of the previous run. To be more scientific, we also run the same number of additional training steps with the non-end-to-end setup, to verify that the improvement isn't simply due to training for twice as long.
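The effect of learn_all can be sketched numerically: a parameter with lr_mult=0 gets a zero effective learning rate, so its weights never move, while a parameter with lr_mult=1 updates normally. A minimal sketch of this per-parameter scaling (plain SGD on a hypothetical scalar weight, not Caffe's actual solver code):

```python
def sgd_update(w, grad, base_lr, lr_mult):
    """One SGD step; the effective learning rate is base_lr * lr_mult."""
    return w - base_lr * lr_mult * grad

# Hypothetical scalar weight and gradient; exact binary fractions for clarity.
w_frozen  = sgd_update(0.5, grad=2.0, base_lr=0.25, lr_mult=0)  # lr_mult=0: frozen layer
w_learned = sgd_update(0.5, grad=2.0, base_lr=0.25, lr_mult=1)  # lr_mult=1: learned layer
print(w_frozen)   # 0.5  (unchanged)
print(w_learned)  # 0.0
```

This is why, with the default learn_all=False, only fc8_flickr (whose param is learned_param) actually changes during training.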

end_to_end_net = style_net(train=True, learn_all=True)

# Set base_lr to 1e-3, the same as last time when learning only the classifier.
# You may want to play around with different values of this or other
# optimization parameters when fine-tuning. For example, if learning diverges
# (e.g., the loss gets very large or goes to infinity/NaN), you should try
# decreasing base_lr (e.g., to 1e-4, then 1e-5, etc., until you find a value
# for which learning does not diverge).
base_lr = 0.001

style_solver_filename = solver(end_to_end_net, base_lr=base_lr)
style_solver = caffe.get_solver(style_solver_filename)
style_solver.net.copy_from(style_weights)

scratch_style_solver_filename = solver(end_to_end_net, base_lr=base_lr)
scratch_style_solver = caffe.get_solver(scratch_style_solver_filename)
scratch_style_solver.net.copy_from(scratch_style_weights)

print 'Running solvers for %d iterations...' % niter
solvers = [('pretrained, end-to-end', style_solver),
           ('scratch, end-to-end', scratch_style_solver)]
_, _, finetuned_weights = run_solvers(niter, solvers)
print 'Done.'

style_weights_ft = finetuned_weights['pretrained, end-to-end']
scratch_style_weights_ft = finetuned_weights['scratch, end-to-end']

# Delete solvers to save memory.
del style_solver, scratch_style_solver, solvers

Running solvers for 200 iterations...
  0) pretrained, end-to-end: loss=0.851, acc=68%; scratch, end-to-end: loss=1.584, acc=28%
 10) pretrained, end-to-end: loss=1.312, acc=56%; scratch, end-to-end: loss=1.637, acc=14%
 20) pretrained, end-to-end: loss=0.802, acc=70%; scratch, end-to-end: loss=1.627, acc=16%
 30) pretrained, end-to-end: loss=0.786, acc=66%; scratch, end-to-end: loss=1.595, acc=22%
 40) pretrained, end-to-end: loss=0.748, acc=74%; scratch, end-to-end: loss=1.575, acc=24%
 50) pretrained, end-to-end: loss=0.818, acc=72%; scratch, end-to-end: loss=1.595, acc=34%
 60) pretrained, end-to-end: loss=0.773, acc=68%; scratch, end-to-end: loss=1.560, acc=26%
 70) pretrained, end-to-end: loss=0.617, acc=84%; scratch, end-to-end: loss=1.540, acc=28%
 80) pretrained, end-to-end: loss=0.561, acc=76%; scratch, end-to-end: loss=1.494, acc=46%
 90) pretrained, end-to-end: loss=0.824, acc=62%; scratch, end-to-end: loss=1.521, acc=30%
100) pretrained, end-to-end: loss=0.624, acc=80%; scratch, end-to-end: loss=1.482, acc=30%
110) pretrained, end-to-end: loss=0.586, acc=76%; scratch, end-to-end: loss=1.566, acc=32%
120) pretrained, end-to-end: loss=0.633, acc=72%; scratch, end-to-end: loss=1.547, acc=26%
130) pretrained, end-to-end: loss=0.547, acc=82%; scratch, end-to-end: loss=1.458, acc=28%
140) pretrained, end-to-end: loss=0.431, acc=80%; scratch, end-to-end: loss=1.469, acc=28%
150) pretrained, end-to-end: loss=0.514, acc=78%; scratch, end-to-end: loss=1.508, acc=32%
160) pretrained, end-to-end: loss=0.475, acc=82%; scratch, end-to-end: loss=1.440, acc=28%
170) pretrained, end-to-end: loss=0.490, acc=78%; scratch, end-to-end: loss=1.554, acc=40%
180) pretrained, end-to-end: loss=0.449, acc=80%; scratch, end-to-end: loss=1.470, acc=32%
190) pretrained, end-to-end: loss=0.367, acc=84%; scratch, end-to-end: loss=1.463, acc=34%
199) pretrained, end-to-end: loss=0.492, acc=82%; scratch, end-to-end: loss=1.364, acc=52%
Done.

Let's now test the end-to-end fine-tuned models. Since every layer participated in training, we expect better results than before, when only the classification layer was trained.

test_net, accuracy = eval_style_net(style_weights_ft)
print 'Accuracy, finetuned from ImageNet initialization: %3.1f%%' % (100*accuracy, )
scratch_test_net, scratch_accuracy = eval_style_net(scratch_style_weights_ft)
print 'Accuracy, finetuned from random initialization: %3.1f%%' % (100*scratch_accuracy, )

Accuracy, finetuned from ImageNet initialization: 54.4%
Accuracy, finetuned from random initialization: 40.2%

First, look at the input image again and its prediction under the end-to-end model.

plt.imshow(deprocess_net_image(image))
disp_style_preds(test_net, image)

top 5 predicted style labels =
	(1) 87.82% Melancholy
	(2)  6.10% Pastel
	(3)  5.66% HDR
	(4)  0.41% Detailed
	(5)  0.01% Noir

Whew! The prediction is much better than before. But note that this image came from the training set, so the network saw its label during training.
Next, take an image from the test set and see how the end-to-end model does on it.

batch_index = 1
image = test_net.blobs['data'].data[batch_index]
plt.imshow(deprocess_net_image(image))
print 'actual label =', style_labels[int(test_net.blobs['label'].data[batch_index])]

actual label = Pastel

disp_style_preds(test_net, image)

top 5 predicted style labels =
	(1) 99.48% Pastel
	(2)  0.47% Detailed
	(3)  0.05% HDR
	(4)  0.00% Melancholy
	(5)  0.00% Noir

We can also look at the scratch network's predictions for this image. It also gets the right answer, though with lower confidence than the network with pretrained weights.

disp_style_preds(scratch_test_net, image)

top 5 predicted style labels =
	(1) 46.02% Pastel
	(2) 23.50% Melancholy
	(3) 16.43% Detailed
	(4) 11.64% HDR
	(5)  2.40% Noir

Of course, we can also check the ImageNet model's predictions for it:

disp_imagenet_preds(imagenet_net, image)

top 5 predicted ImageNet labels =
	(1) 34.90% n07579787 plate
	(2) 21.63% n04263257 soup bowl
	(3) 17.75% n07875152 potpie
	(4)  5.72% n07711569 mashed potato
	(5)  5.27% n07584110 consomme
