Caffe Study Notes 4: Visualizing Image Features
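This article is an original work. It may not be reproduced without my consent or used for commercial purposes. I reserve the right of final interpretation regarding use of this blog.
Follow my blogs at http://blog.csdn.net/hit2015spring and http://www.cnblogs.com/xujianqing/
This article mainly follows http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
It is essentially a translated summary of that notebook and a continuation of Study Notes 2: Notes 2 covered how to extract features, while this one covers how to visualize them.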
1. Preparation
First install the dependencies:
    pip install cython
    pip install h5py
    pip install ipython
    pip install leveldb
    pip install matplotlib
    pip install networkx
    pip install nose
    pip install numpy
    pip install pandas
    pip install protobuf
    pip install python-gflags
    pip install scikit-image
    pip install scikit-learn
    pip install scipy
All of the following steps are executed in IPython. In IPython, ! runs a shell command and $ substitutes a Python variable into that shell command. Together these two symbols let shell commands and IPython interact.
IPython can run in a terminal, but for convenience we use the IPython notebook. There are plenty of introductions to it online, so I will not repeat them here; you will need to set up IPython notebook on your machine.
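2. Getting started
2.1 Initialize and load Caffe
In a terminal, run
    ipython notebook
and create a new notebook.
Enter the code in a notebook cell; Shift+Enter runs the cell. Run each of the code segments below once you have finished typing it.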
    # set up Python environment: numpy for numerical routines,
    # and matplotlib for plotting
    import numpy as np
    import matplotlib.pyplot as plt
    # display plots in this notebook
    %matplotlib inline
This imports numpy and matplotlib.pyplot (used for plotting) under the aliases np and plt.
    # set display defaults
    plt.rcParams['figure.figsize'] = (10, 10)        # large images
    plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels
    plt.rcParams['image.cmap'] = 'gray'              # use grayscale output rather than
                                                     # a (potentially misleading) color heatmap
This sets some display defaults: a large figure size, nearest-neighbor interpolation, and a grayscale colormap.
    # The caffe module needs to be on the Python path;
    # we'll add it here explicitly.
    import sys
    # this file should be run from {caffe_root}/examples (otherwise change this line)
    caffe_root = '/home/wangshuo/caffe/'
    sys.path.insert(0, caffe_root + 'python')

    import caffe
    # If you get "No module named _caffe", either you have not built pycaffe
    # or you have the wrong path.
This loads Caffe by adding its Python directory to the current Python path. Without it, Python searches only its default paths and the import fails with an error saying that the caffe module does not exist.
    import os
    if os.path.isfile(caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'):
        print('CaffeNet found.')
    else:
        print('Downloading pre-trained CaffeNet model...')
        !../scripts/download_model_binary.py ../models/bvlc_reference_caffenet
This checks for the pre-trained model, which was already downloaded in Study Notes 1. If you have not downloaded it yet, you can do so from the command line:
    ./examples/imagenet/get_caffe_reference_imagenet_model.sh
Note that the model is about 233 MB and a download over the network may end up incomplete, so check the file size afterwards. This is a pitfall worth watching out for!
2.2 Load the network model
Set CPU mode and load the network model:
    caffe.set_mode_cpu()

    model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
    model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'

    net = caffe.Net(model_def,      # defines the structure of the model
                    model_weights,  # contains the trained weights
                    caffe.TEST)     # use test mode (e.g., don't perform dropout)
model_def: the file that defines the structure of the model
model_weights: the trained weights
    # load the mean ImageNet image (as distributed with Caffe) for subtraction
    mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
    mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean BGR pixel values
    print('mean-subtracted values: {}'.format(list(zip('BGR', mu))))

    # create transformer for the input called 'data'
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

    transformer.set_transpose('data', (2,0,1))     # move image channels to outermost dimension
    transformer.set_mean('data', mu)               # subtract the mean pixel value in each channel
    transformer.set_raw_scale('data', 255)         # rescale from [0, 1] to [0, 255]
    transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR
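This block sets up the input preprocessing, which is done with caffe.io.Transformer. The preprocessing step is independent of the rest of Caffe, so any other preprocessing code could be used in its place.
By default, Caffe works with images in BGR format with pixel values in [0, 255]. Preprocessing subtracts the mean pixel value, and the channel dimension is moved to the first (outermost) dimension.
matplotlib, on the other hand, loads images with pixel values in [0, 1], in RGB format, with the channel as the innermost dimension, which is why the conversions above are needed.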
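To make the four conversions concrete, here is a minimal NumPy sketch of what they do to a single image. It is an illustration only, not pycaffe's implementation: the function name preprocess_sketch and the mean values are hypothetical, resizing is left out, and the order of operations reflects my understanding of Transformer.preprocess.

```python
import numpy as np

def preprocess_sketch(image_rgb, mean_bgr):
    """Mimic the four Transformer settings on an (H, W, 3) RGB image in [0, 1]."""
    chw = image_rgb.transpose(2, 0, 1)   # set_transpose: (H, W, C) -> (C, H, W)
    bgr = chw[[2, 1, 0], :, :]           # set_channel_swap: RGB -> BGR
    scaled = bgr * 255.0                 # set_raw_scale: [0, 1] -> [0, 255]
    # set_mean: subtract one mean value per channel
    return scaled - mean_bgr[:, np.newaxis, np.newaxis]

# A tiny 2x2 test image with R = 1.0, G = 0.5, B = 0.0 everywhere
image = np.zeros((2, 2, 3))
image[..., 0] = 1.0
image[..., 1] = 0.5

# Hypothetical per-channel means in BGR order (not the real ImageNet values)
mean_bgr = np.array([10.0, 20.0, 30.0])

out = preprocess_sketch(image, mean_bgr)
print(out.shape)     # channels are now the outermost dimension
print(out[:, 0, 0])  # B, G, R values of the top-left pixel after preprocessing
```

With the real transformer the same image would additionally be resized to the 227 x 227 input size of the net.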
2.3 CPU classification
    # set the size of the input (we can skip this if we're happy
    # with the default; we can also change it later, e.g., for different batch sizes)
    net.blobs['data'].reshape(50,        # batch size
                              3,         # 3-channel (BGR) images
                              227, 227)  # image size is 227x227
Load an image:
    image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
    transformed_image = transformer.preprocess('data', image)
    plt.imshow(image)
After this cell runs, an image should be displayed. In my case it was not; see Problem 1 for details.
Now classify it:
    # copy the image data into the memory allocated for the net
    net.blobs['data'].data[...] = transformed_image

    ### perform classification
    output = net.forward()

    output_prob = output['prob'][0]  # the output probability vector for the first image in the batch

    # the class with the highest probability
    print('predicted class is: {}'.format(output_prob.argmax()))
The predicted class is number 281.
A class index alone is not very informative, so we load the ImageNet class labels:
    # load ImageNet labels
    labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
    if not os.path.exists(labels_file):
        !../data/ilsvrc12/get_ilsvrc_aux.sh

    labels = np.loadtxt(labels_file, str, delimiter='\t')

    print('output label: {}'.format(labels[output_prob.argmax()]))
Output: output label: n02123045 tabby, tabby cat
The prediction is correct. Now look at the probabilities assigned to the other top classes:
    # sort top five predictions from softmax output
    top_inds = output_prob.argsort()[::-1][:5]  # reverse sort and take five largest items

    print('probabilities and labels:')
    list(zip(output_prob[top_inds], labels[top_inds]))
Output:
probabilities and labels:
[(0.31243637, 'n02123045 tabby, tabby cat'),
(0.2379719, 'n02123159 tiger cat'),
(0.12387239, 'n02124075 Egyptian cat'),
(0.10075711, 'n02119022 red fox, Vulpes vulpes'),
(0.070957087, 'n02127052 lynx, catamount')]
Even the lower-confidence predictions are sensible labels.
2.4 Switching to GPU mode
Time the classification on the CPU so that it can be compared with GPU mode:
    %timeit net.forward()
Then select GPU mode:
    caffe.set_device(0)  # if we have multiple GPUs, pick the first one
    caffe.set_mode_gpu()
    net.forward()        # run once before timing to set up memory
    %timeit net.forward()
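3. Examining intermediate outputs
For each layer, what we mainly look at is the shape of the activations. It typically has the form (batch size, number of features, height, width): the first entry is the batch size, the second is the number of feature channels, and the third and fourth are the height and width of each feature map.
    # for each layer, show the output shape
    for layer_name, blob in net.blobs.items():
        print(layer_name + '\t' + str(blob.data.shape))
This prints the output shape of every layer.
Output: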
    data    (50, 3, 227, 227)
    conv1   (50, 96, 55, 55)
    pool1   (50, 96, 27, 27)
    norm1   (50, 96, 27, 27)
    conv2   (50, 256, 27, 27)
    pool2   (50, 256, 13, 13)
    norm2   (50, 256, 13, 13)
    conv3   (50, 384, 13, 13)
    conv4   (50, 384, 13, 13)
    conv5   (50, 256, 13, 13)
    pool5   (50, 256, 6, 6)
    fc6     (50, 4096)
    fc7     (50, 4096)
    fc8     (50, 1000)
    prob    (50, 1000)
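The spatial sizes in this listing (227, 55, 27, 13, 6) follow from the usual output-size arithmetic for convolution and pooling layers. The sketch below reproduces them. The kernel sizes, strides and padding are the values I recall from the CaffeNet deploy.prototxt, so treat them as assumptions and check them against your own copy; the helper names conv_out and pool_out are made up for this example.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (the division rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer (the division rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1

# Hyperparameters below are assumed from the CaffeNet deploy.prototxt
conv1 = conv_out(227, kernel=11, stride=4)
pool1 = pool_out(conv1, kernel=3, stride=2)
conv2 = conv_out(pool1, kernel=5, pad=2)
pool2 = pool_out(conv2, kernel=3, stride=2)
conv5 = conv_out(pool2, kernel=3, pad=1)   # conv3, conv4 and conv5 all keep the size
pool5 = pool_out(conv5, kernel=3, stride=2)

print(conv1, pool1, conv2, pool2, conv5, pool5)  # 55 27 27 13 13 6
```

The norm layers do not change the shape, which is why norm1 and norm2 repeat the shape of the pooling layer before them.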
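Next, look at the shapes of the parameters. They are exposed through a separate structure, net.params, and are indexed by type: index 0 holds the weights and index 1 holds the biases. For a convolutional layer the weights have four dimensions (output channels, input channels, filter height, filter width), and the biases have one dimension (output channels).
    for layer_name, param in net.params.items():
        print(layer_name + '\t' + str(param[0].data.shape) + ' ' + str(param[1].data.shape))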
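As a rough illustration of these shapes, the sketch below builds the weight and bias shapes for two convolutional layers and counts their values. The function names are hypothetical, and the layer settings (96 filters of 11 x 11 on 3 input channels for conv1; 256 filters of 5 x 5 with group=2 on 96 input channels for conv2) are what I recall from the CaffeNet deploy.prototxt, so verify them against the output of the loop above. With grouped convolution each filter only sees in_channels / group input channels, which is why the second weight dimension can be smaller than the previous layer's channel count.

```python
def conv_param_shapes(num_output, in_channels, kernel, group=1):
    """Return (weight_shape, bias_shape) for a square-kernel convolution."""
    weights = (num_output, in_channels // group, kernel, kernel)
    biases = (num_output,)
    return weights, biases

def num_values(shape):
    """Number of scalars in an array of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

# Layer settings assumed from the CaffeNet deploy.prototxt
conv1_w, conv1_b = conv_param_shapes(96, 3, 11)
conv2_w, conv2_b = conv_param_shapes(256, 96, 5, group=2)

print(conv1_w, conv1_b, num_values(conv1_w) + num_values(conv1_b))
print(conv2_w, conv2_b, num_values(conv2_w) + num_values(conv2_b))
```

Fully connected layers are the exception to the four-dimensional rule: their weights are two-dimensional, (outputs, inputs).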
To visualize these features, define a helper function:
    def vis_square(data):
        """Take an array of shape (n, height, width) or (n, height, width, 3)
           and visualize each (height, width) thing in a grid of size approx. sqrt(n) by sqrt(n)"""

        # normalize data for display
        data = (data - data.min()) / (data.max() - data.min())

        # force the number of filters to be square
        n = int(np.ceil(np.sqrt(data.shape[0])))
        padding = (((0, n ** 2 - data.shape[0]),
                   (0, 1), (0, 1))                 # add some space between filters
                   + ((0, 0),) * (data.ndim - 3))  # don't pad the last dimension (if there is one)
        data = np.pad(data, padding, mode='constant', constant_values=1)  # pad with ones (white)

        # tile the filters into an image
        data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
        data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

        plt.imshow(data); plt.axis('off')
        plt.show()  # render the figure after it has been drawn
Show the filters of the first convolutional layer:
    # the parameters are a list of [weights, biases]
    filters = net.params['conv1'][0].data
    vis_square(filters.transpose(0, 2, 3, 1))
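The pad, reshape and transpose steps inside vis_square are easier to follow on a tiny array. The sketch below is a stripped-down variant for single-channel images with no normalization and no plotting. The name tile_square and the five 2 x 2 "filters" are made up for illustration.

```python
import numpy as np

def tile_square(data, pad_value=1.0):
    """Tile n images of shape (h, w) into a near-square grid; returns a 2-D array."""
    n = int(np.ceil(np.sqrt(data.shape[0])))
    # add empty tiles so there are n * n of them, plus a 1-pixel border per tile
    padding = ((0, n ** 2 - data.shape[0]), (0, 1), (0, 1))
    data = np.pad(data, padding, mode='constant', constant_values=pad_value)
    h, w = data.shape[1:]
    # (n*n, h, w) -> (tile_row, tile_col, h, w) -> (tile_row, h, tile_col, w)
    grid = data.reshape(n, n, h, w).transpose(0, 2, 1, 3)
    return grid.reshape(n * h, n * w)

filters = np.zeros((5, 2, 2))  # five hypothetical all-zero 2x2 "filters"
grid = tile_square(filters)
print(grid.shape)  # (9, 9): a 3x3 grid of tiles, each 3x3 after padding
print(grid)
```

In the printed grid the zeros are the original filters and the ones are the white padding: the borders between tiles and the four empty tiles that fill up the 3 x 3 grid.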
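Problems encountered
Problem 1: after running plt.imshow(image) the image should be displayed, but nothing appeared. The fix is to import pylab along with numpy:
    import pylab as ipt
and to call
    ipt.show()
after the plt.imshow call so that the figure is actually rendered (pylab's show() is the same function as plt.show()). So the fix for Problem 1 is:
    image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
    transformed_image = transformer.preprocess('data', image)
    plt.imshow(image)
    ipt.show()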