CNN-Based Gender and Age Recognition, with a Demo Implementation
I. Background

This post walks through the 2015 paper "Age and Gender Classification using Convolutional Neural Networks". Where is the paper's novelty? Partly in how little prior work exists: searching online turns up mostly SVM-based approaches to gender prediction, and this is one of the very few papers applying a CNN to the problem.

Gender classification is naturally a binary problem, but what about age? Is age prediction a regression problem? The paper instead partitions age into several brackets and treats each bracket as a class, so age prediction becomes a multi-class classification problem as well.

Back to the main topic: below I explain the network structure of the 2015 paper "Age and Gender Classification using Convolutional Neural Networks". The paper introduces no new algorithm; it is essentially careful tuning of layer counts, kernel sizes, and so on, so readers already familiar with AlexNet may find it unsurprising. The authors provide the source code and training data: trained models can be downloaded from the Caffe Model Zoo (https://github.com/BVLC/caffe/wiki/Model-Zoo) or the project page (http://www.openu.ac.il/home/hassner/projects/cnn_agegender/). Note that the paper targets difficult, unconstrained settings with low-quality, often blurry images, so applying the pretrained models directly to your own project may give low accuracy. Later I will cover fine-tuning the paper's models to adapt them to your own application and improve recognition accuracy.
II. Algorithm Implementation

The paper's project page provides the network definition files, so I will walk through the structure alongside them.

1. Network structure

The network contains three convolutional layers and two fully connected layers (followed by the output layer). This is a comparatively shallow CNN, which helps avoid overfitting. Age recognition uses only eight age brackets, making it an 8-class model; gender recognition is naturally a binary classification problem.

Images are processed directly as 3-channel color images: every image is resized to 256×256 and then cropped to 227×227 (random crops during training; the four corners plus the center at validation/test time). The network input is therefore a 227×227 3-channel color image, essentially the same as AlexNet.
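Because age is treated as classification rather than regression, decoding the network output is just an argmax over bracket labels. A minimal sketch (the bracket strings match the label lists used in the demo code later in this post; the probability vector is made up for illustration):

```python
import numpy as np

# The paper's eight age brackets: each bracket is one class.
AGE_BINS = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)',
            '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
GENDERS = ['Male', 'Female']

def decode(probs, labels):
    """Map a softmax probability vector to its most likely label."""
    probs = np.asarray(probs)
    assert probs.shape[0] == len(labels)
    return labels[int(probs.argmax())]

# Example: a hypothetical softmax output peaking at the fifth bin
p = [0.01, 0.02, 0.05, 0.10, 0.60, 0.12, 0.06, 0.04]
print(decode(p, AGE_BINS))   # -> (25, 32)
```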
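The resize-then-crop pipeline above can be sketched in a few lines of numpy (array slicing only; in practice the resize to 256×256 and face alignment happen upstream, and caffe's own oversampling helpers do the equivalent):

```python
import numpy as np

def center_crop(img, size=227):
    # img: H x W x C array already resized to 256 x 256
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def five_crops(img, size=227):
    """Four corner crops plus a center crop, as used at test time."""
    h, w = img.shape[:2]
    return [img[:size, :size],          # top-left
            img[:size, w - size:],      # top-right
            img[h - size:, :size],      # bottom-left
            img[h - size:, w - size:],  # bottom-right
            center_crop(img, size)]     # center

def random_crop(img, size=227, rng=np.random):
    """Random crop, as used during training."""
    h, w = img.shape[:2]
    top = rng.randint(0, h - size + 1)
    left = rng.randint(0, w - size + 1)
    return img[top:top + size, left:left + size]

img = np.zeros((256, 256, 3))
assert all(c.shape == (227, 227, 3) for c in five_crops(img))
```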
Network model:
(1) Layer 1: 96 convolution kernels, each with 3×7×7 parameters, i.e. a 7×7 kernel applied across the 3 input channels. The activation function is ReLU; pooling is overlapping max pooling with a 3×3 window and stride 2, followed by a local response normalization (LRN) layer. For what LRN is, see "ImageNet Classification with Deep Convolutional Neural Networks"; it can improve the network's generalization ability.

LRN comes in two flavors: across-channel normalization, which normalizes corresponding pixel positions across feature maps, and within-channel (2D) normalization over a local spatial neighborhood of each map. In practice LRN is optional and adds little accuracy. A stronger modern alternative is Batch Normalization — to my mind one of the best algorithms of 2015, improving both training speed and accuracy. Process: convolving the 227×227 input with 7×7 kernels yields 96 feature maps. The author does not state the convolution stride, but we can infer it: the paper's network imitates the architecture of "ImageNet Classification with Deep Convolutional Neural Networks" (even the input size matches), so we can read it off AlexNet's first layer.
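The across-channel variant can be sketched in numpy. This is an illustrative implementation, not the paper's code; the hyperparameters match the `lrn_param` block in the prototxt below, and the `k` offset and alpha/n scaling follow Caffe's convention (the AlexNet paper's constants differ slightly):

```python
import numpy as np

def lrn_across_channels(x, n=5, k=1.0, alpha=1e-4, beta=0.75):
    """Across-channel local response normalization.
    x: (C, H, W) feature maps. Each activation is divided by a term
    built from the squared activations of up to n neighboring channels."""
    c = x.shape[0]
    out = np.empty_like(x, dtype=float)
    half = n // 2
    sq = x.astype(float) ** 2
    for i in range(c):
        lo, hi = max(0, i - half), min(c, i + half + 1)
        denom = (k + (alpha / n) * sq[lo:hi].sum(axis=0)) ** beta
        out[i] = x[i] / denom
    return out

x = np.random.randn(96, 28, 28)   # e.g. the 96 maps after pool1
y = lrn_across_channels(x)
print(y.shape)   # (96, 28, 28)
```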
We can infer that the paper uses a convolution stride of 4 and no padding, so the output after convolution is (227 − 7)/4 + 1 = 56. Overlapping 3×3 max pooling with stride 2 then reduces this to 28×28 (Caffe rounds the boundary up). Here is the paper's first-layer definition:
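The arithmetic above can be checked with the standard output-size formulas (a quick sketch; the ceil rounding in the pooling formula is Caffe's behavior, which is why 56 shrinks to 28 rather than 27):

```python
import math

def conv_out(size, kernel, stride, pad=0):
    # Standard convolution output-size formula (floor rounding)
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    # Caffe rounds pooling output sizes up (ceil)
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

c1 = conv_out(227, kernel=7, stride=4)   # (227 - 7) / 4 + 1 = 56
p1 = pool_out(c1, kernel=3, stride=2)    # ceil((56 - 3) / 2) + 1 = 28
print(c1, p1)   # 56 28
```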
```prototxt
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm1"
  type: LRN
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
```

(2) Layer 2:

The input to the second layer is the 96 single-channel 28×28 feature maps from layer 1 (the three input channels were already combined by the first convolution). This layer uses 256 filters of size 5×5 with stride 1, again following the AlexNet design, and the same pooling parameters as above.
```prototxt
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "norm1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers {
  name: "relu2"
  type: RELU
  bottom: "conv2"
  top: "conv2"
}
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layers {
  name: "norm2"
  type: LRN
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
```

(3) Layer 3: 384 filters with 3×3 kernels.
```prototxt
layers {
  name: "conv3"
  type: CONVOLUTION
  bottom: "norm2"
  top: "conv3"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers {
  name: "relu3"
  type: RELU
  bottom: "conv3"
  top: "conv3"
}
layers {
  name: "pool5"
  type: POOLING
  bottom: "conv3"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
```
(4) Layer 4: the first fully connected layer, with 512 neurons.
```prototxt
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 512
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers {
  name: "relu6"
  type: RELU
  bottom: "fc6"
  top: "fc6"
}
layers {
  name: "drop6"
  type: DROPOUT
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
```

(5) Layer 5: the second fully connected layer, also with 512 neurons.
```prototxt
layers {
  name: "fc7"
  type: INNER_PRODUCT
  bottom: "fc6"
  top: "fc7"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 512
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layers {
  name: "relu7"
  type: RELU
  bottom: "fc7"
  top: "fc7"
}
layers {
  name: "drop7"
  type: DROPOUT
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
```
(6) Layer 6: the output layer. For gender this is binary classification, so it has 2 output neurons:
```prototxt
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  blobs_lr: 10
  blobs_lr: 20
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
```
Architecturally the paper offers nothing new; it imitates the AlexNet design.
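As a cross-check on the whole pipeline, the layer output sizes and parameter counts implied by the prototxt above can be traced with the standard size formulas. This is a sketch: pooling uses Caffe's ceil rounding, and the fc6 input dimension printed at the end is inferred from this arithmetic, not quoted from the paper — the actual deploy file is authoritative.

```python
import math

def conv_out(size, k, s, p=0):
    return (size + 2 * p - k) // s + 1

def pool_out(size, k, s):
    # Caffe rounds pooling output sizes up (ceil)
    return int(math.ceil((size - k) / float(s))) + 1

# (name, kernel, stride, pad, out_channels) for the three conv blocks above
layers = [('conv1', 7, 4, 0, 96), ('conv2', 5, 1, 2, 256), ('conv3', 3, 1, 1, 384)]

size, channels = 227, 3
for name, k, s, p, out_c in layers:
    size = conv_out(size, k, s, p)
    params = out_c * channels * k * k + out_c   # weights + biases
    print('%s: %d maps of %dx%d, %d params' % (name, out_c, size, size, params))
    size = pool_out(size, 3, 2)                 # each block ends in 3x3/2 max pooling
    channels = out_c

print('inferred fc6 input size: %d' % (channels * size * size))
```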
2. Network training
(1) Initialization: weights are drawn from a zero-mean Gaussian with standard deviation 0.01.

(2) Regularization and augmentation: dropout with ratio 0.5 limits overfitting. Data augmentation takes the 256×256 input and randomly crops 227×227 patches, with the crops centered on the face.

(3) Optimization: stochastic gradient descent with mini-batch size 50 and learning rate 0.001, lowered to 0.0001 after 10,000 iterations.

(4) Prediction: a 256×256 input is cropped into five 227×227 patches — four anchored at the corners of the image and one centered on the face — and the predictions on the five crops are averaged.

三、Age and Gender Classification Using Convolutional Neural Networks - Demo
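The training recipe above maps directly onto a Caffe solver definition. A sketch, not the authors' actual file: the net path is a placeholder, and `gamma`/`stepsize` are chosen to reproduce the 0.001 → 0.0001 drop after 10,000 iterations (the mini-batch size of 50 lives in the data layer of the net definition, not in the solver):

```prototxt
net: "train_val.prototxt"   # placeholder path to the train/test net definition
base_lr: 0.001              # initial learning rate
lr_policy: "step"           # drop the rate in steps
gamma: 0.1                  # multiply lr by 0.1 at each step
stepsize: 10000             # ...every 10000 iterations
solver_mode: GPU
```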
Download cnn_age_gender_models_and_data.0.0.2.zip, then start the notebook:

```shell
cd caffe-master/python/cnn_age_gender_models_and_data.0.0.2
ipython notebook
```

Next, point the notebook at your Caffe installation:

```python
caffe_root = '../../../caffe-master/'  # path to the caffe directory
```

Problem 1
Loading the age network:

```python
# Loading the age network
age_net_pretrained = './age_net.caffemodel'
age_net_model_file = './deploy_age.prototxt'
age_net = caffe.Classifier(age_net_model_file, age_net_pretrained,
                           mean=mean, channel_swap=(2, 1, 0),
                           raw_scale=255, image_dims=(256, 256))
```

raises:

```
File "/home/XXX/caffe-master/python/caffe/io.py", line 255, in set_mean
    raise ValueError('Mean shape incompatible with input shape.')
ValueError: Mean shape incompatible with input shape.
```

Solution: open `./caffe-master/python/caffe/io.py` (e.g. `gedit ./caffe-master/python/caffe/io.py`) and replace

```python
if ms != self.inputs[in_][1:]:
    raise ValueError('Mean shape incompatible with input shape.')
```

with

```python
if ms != self.inputs[in_][1:]:
    print(self.inputs[in_])
    in_shape = self.inputs[in_][1:]
    m_min, m_max = mean.min(), mean.max()
    normal_mean = (mean - m_min) / (m_max - m_min)
    mean = resize_image(normal_mean.transpose((1, 2, 0)),
                        in_shape[1:]).transpose((2, 0, 1)) * (m_max - m_min) + m_min
    # raise ValueError('Mean shape incompatible with input shape.')
```

which rescales the mean image to the network's input size instead of raising.

Problem 2
```python
feat = age_net.blobs['conv1'].data[4, :49]
vis_square(feat, padval=1)
```

raises:

```
IndexError: index 4 is out of bounds for axis 0 with size 1
```

Solution: the blob's batch dimension here is 1, so index image 0 instead:

```python
feat = age_net.blobs['conv1'].data[0, :49]
vis_square(feat, padval=1)
```

Code example
```python
# coding=utf-8
import os
import sys
import time
import shutil
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import cv2

# Set the path where the Caffe source code lives
caffe_root = '../../../caffe-master/'
sys.path.insert(0, caffe_root + 'python')
import caffe

plt.rcParams['figure.figsize'] = (10, 10)
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Load the mean file
mean_filename = './imagenet_mean.binaryproto'
proto_data = open(mean_filename, "rb").read()
a = caffe.io.caffe_pb2.BlobProto.FromString(proto_data)
mean = caffe.io.blobproto_to_array(a)[0]

# Create the gender network and load its trained weights
gender_net_pretrained = './mytask_train_iter_8000.caffemodel'
gender_net_model_file = './deploy.prototxt'
gender_net = caffe.Classifier(gender_net_model_file, gender_net_pretrained,
                              mean=mean,
                              channel_swap=(2, 1, 0),  # RGB <-> BGR
                              raw_scale=255,           # scale pixels back to 0-255
                              image_dims=(256, 256))   # input image size

# Create the age network and load its trained weights
age_net_pretrained = './age_net.caffemodel'
age_net_model_file = './deploy_age.prototxt'
age_net = caffe.Classifier(age_net_model_file, age_net_pretrained,
                           mean=mean, channel_swap=(2, 1, 0),
                           raw_scale=255, image_dims=(256, 256))

# Class labels
age_list = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)',
            '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
gender_list = ['Male', 'Female']

# Read and plot the input image
example_image = './example_image.jpg'
input_image = caffe.io.load_image(example_image)
_ = plt.imshow(input_image)

# Predict age and gender
prediction_age = age_net.predict([input_image])
print 'predicted age:', age_list[prediction_age[0].argmax()]
prediction_gender = gender_net.predict([input_image])
print 'predicted gender:', gender_list[prediction_gender[0].argmax()]

# Each layer stores its parameters as two blobs:
# v[0] holds the weights, v[1] the biases
for k, v in gender_net.params.items():
    print (k, v[0].data.shape, v[1].data.shape)

# Prediction and feature visualization
def showimage(im):
    if im.ndim == 3:
        im = im[:, :, ::-1]
    plt.set_cmap('jet')
    plt.imshow(im)

def vis_square(data, padsize=1, padval=0):
    """Tile an array of maps into a grid; padval adjusts the padding brightness."""
    data -= data.min()
    data /= data.max()
    # force the number of filters to be square
    n = int(np.ceil(np.sqrt(data.shape[0])))
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) \
              + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))
    # tile the filters into an image
    data = data.reshape((n, n) + data.shape[1:]).transpose(
        (0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])
    showimage(data)

# Reload the age network without the mean image - just for better visualizations
age_net = caffe.Classifier(age_net_model_file, age_net_pretrained,
                           channel_swap=(2, 1, 0), raw_scale=255,
                           image_dims=(256, 256))
prediction = age_net.predict([input_image])
_ = plt.imshow(input_image)

# conv1 and conv2 filter visualizations
filters = gender_net.params['conv1'][0].data[:49]
vis_square(filters.transpose(0, 2, 3, 1))
filters = gender_net.params['conv2'][0].data[:49]
vis_square(filters.transpose(0, 2, 3, 1))

# conv1 feature maps (rectified responses of the filters above)
feat = gender_net.blobs['conv1'].data[0, :49]
vis_square(feat, padval=1)

# Show the first 4 feature maps of every layer
for k, v in gender_net.blobs.items():
    print (k, v.data.shape)
    feat = gender_net.blobs[k].data[0, 0:4]
    vis_square(feat, padval=1)

# Show the original image with its predicted gender
str_gender = gender_list[prediction_gender[0].argmax()]
print str_gender
plt.imshow(input_image)
plt.title(str_gender)
plt.show()
```

References
1. "Age and Gender Classification using Convolutional Neural Networks"
2. "ImageNet Classification with Deep Convolutional Neural Networks"
3. Caffe fine-tuning to predict and classify lung nodules
4. https://github.com/BVLC/caffe/wiki/Model-Zoo
5. Deep Learning (14): CNN-based gender and age recognition
6. http://stackoverflow.com/questions/30808735/error-when-using-classify-in-caffe
7. Age and Gender Classification Using Convolutional Neural Networks - Demo.ipynb
8. Caffe prediction and feature visualization via the Python interface