
Learning and Using Caffe on Windows (Part 1): Training a Model on Your Own Data (GoogleNet)

Published: 2025/3/21


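Preface

I previously worked on a project that classified images by their content using OpenCV's SURF feature extraction together with SVM and BoW. It was slow and not very accurate: even after various optimizations the accuracy only reached 70 to 80 percent, so I had long wanted to try Caffe.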

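I. System Environment

1. Windows 7, 64-bit
I had previously used Caffe on Linux (Ubuntu 16.04, 64-bit) and had also tried it on 32-bit Windows 7. It compiled there, but various small problems appeared during training, so I switched to a 64-bit system, where training ran without problems.
2. Anaconda3
If possible, install Anaconda3 3.4, so that you do not have to downgrade Python to 3.5 afterwards.
3. Caffe (CPU)
I use the CPU build of Caffe.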

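II. Collecting and Processing the Data

1. Collect the data
The images were downloaded from the ZOL wallpaper site, which offers wallpapers already sorted into categories and lets you download a whole series at once. After downloading, create one folder per image type. I collected four types, sorted them by hand into their folders, and ended up with roughly 150 images per type.

For example, I put all the anime character images into one folder.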

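2. Rename the files
The downloaded files have chaotic names, so they are renamed to match their folder, which makes them easier to use for training later. The following Python script renames every file in a folder; run it once for each type's folder.
rename.ipynb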

import os

def rename():
    path = "E:/caffe/4/"  # folder to process
    ex = 4  # file-name prefix, same as the folder name
    filelist = os.listdir(path)  # every entry in the folder
    count = 0
    for file in filelist:  # iterate over all entries, including subfolders
        Olddir = os.path.join(path, file)  # original path
        if os.path.isdir(Olddir):  # skip subfolders
            continue
        filetype = ".jpg"  # os.path.splitext(file)[1] would give the real extension
        p = str(count).zfill(3)  # three-digit counter, e.g. 007
        Newdir = os.path.join(path, str(ex) + p + filetype)  # new path
        os.rename(Olddir, Newdir)  # rename the file
        count += 1

rename()

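This gives uniform, incrementing file names whose prefix is the name of the current folder. When the file-name lists for training are generated later, each file is labeled according to this prefix.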
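As a small illustration of the naming scheme, the sketch below builds a name the same way the script does: the folder number, then a three-digit zero-padded counter, then .jpg. The helper name new_name is made up for this example and does not appear in the original script.

```python
def new_name(prefix, count, ext=".jpg"):
    """Build a file name such as 4007.jpg: class prefix + 3-digit counter."""
    return str(prefix) + str(count).zfill(3) + ext


# The first character of each name is the folder number, which is what the
# list-generation step later uses to derive the label.
names = [new_name(4, i) for i in range(3)]
print(names)  # ['4000.jpg', '4001.jpg', '4002.jpg']
```

Because the label is later read from the first character only, this scheme assumes single-digit folder names.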

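3. Make all images the same size
The downloaded images come in many different sizes, so another Python script resizes the images in each folder to the same pixel dimensions. This script makes every image 384 pixels wide and 256 pixels high.
resize.ipynb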

from PIL import Image
import glob, os

w, h = 384, 256  # target resolution

def timage():
    for files in glob.glob('E:/caffe/5/*.jpg'):  # source images
        filepath, filename = os.path.split(files)
        filterame, exts = os.path.splitext(filename)
        opfile = r'E:/caffe/data/5/'  # output folder
        if not os.path.isdir(opfile):
            os.makedirs(opfile)  # also creates missing parent folders
        im = Image.open(files)
        im_ss = im.resize((int(w), int(h)))
        try:
            im_ss.save(opfile + filterame + '.jpg')
        except Exception:
            # Saving failed: report the file and remove any partial output.
            print(filterame)
            if os.path.exists(opfile + filterame + '.jpg'):
                os.remove(opfile + filterame + '.jpg')

if __name__ == '__main__':
    timage()

4. My sorted positive samples and test samples can be downloaded here: https://download.csdn.net/download/matt45m/11044661

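III. Preparing for Training

1. Create the data folders
(1) Under caffe-windows/data, create a folder for your own data, here named classify, and inside classify create two folders, train and test.

(2) Put the images to train on into the train folder. Here 120 images of each class were chosen for training, and the remaining images go into the test folder.
(3) The test folder holds the images used for testing.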
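The split above was done by hand. As a rough sketch of how it could be scripted, the function below takes the file names of one class and returns the first 120 (in sorted order) for training and the rest for testing. Taking the first 120 rather than a random sample is an assumption of this sketch, and split_names is a hypothetical helper name.

```python
def split_names(names, n_train=120):
    """Return (train, test): the first n_train sorted names, then the rest."""
    ordered = sorted(names)
    return ordered[:n_train], ordered[n_train:]


# Hypothetical class with 150 images named 4000.jpg ... 4149.jpg.
names = ["4" + str(i).zfill(3) + ".jpg" for i in range(150)]
train, test = split_names(names)
print(len(train), len(test))  # 120 30
```

The two lists could then be moved into train and test with shutil.move.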

2. Generate the file-name lists for the dataset
(1) Write Python code that lists the files in the train and test folders and labels them.
getFileNameList.ipynb

import os

if __name__ == "__main__":
    data_dir = 'E:/LIB/caffe-windows/data/classify/test/'  # folder to scan
    fid = open("E:/LIB/caffe-windows/data/classify/test.txt", "w")  # output list file
    files = os.listdir(data_dir)
    index = 0
    for ii, file in enumerate(files, 1):
        # One line per image: file name, then label (first digit of the name minus 2).
        fid.write("{0}{1} {2}\n".format("", file, int(file[0]) - 2))
        index = index + 1
        if index % 100 == 0:
            print("{0} images processed!".format(index))
    print("All images processed!")
    fid.close()

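After running it once for test and once for train (with the two paths changed accordingly), the classify folder contains two files, train.txt and test.txt.

(2) In train.txt and test.txt, each line holds a file name followed by a number, which is the class label.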
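The labeling rule can be shown in isolation. The sketch below produces the same "name label" lines as the script, for a few made-up file names. Since the script subtracts 2 from the first digit, the class folders are presumably numbered 2 to 5, which gives labels 0 to 3; that numbering is inferred from the scripts rather than stated in the text. Labels have to start at 0 because the final layer has num_output: 4 and the loss layer expects labels in the range 0 to 3.

```python
def list_lines(file_names, offset=2):
    """Return 'name label' lines; label = first digit of the name - offset."""
    return ["{0} {1}".format(name, int(name[0]) - offset) for name in file_names]


for line in list_lines(["2000.jpg", "3001.jpg", "4002.jpg", "5136.jpg"]):
    print(line)
# 2000.jpg 0
# 3001.jpg 1
# 4002.jpg 2
# 5136.jpg 3
```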

IV. Converting the Data

In the caffe-windows/data/classify folder, write a script that converts the image data to LevelDB format.
data_convention.bat

E:/LIB/caffe-windows/build/tools/Release/convert_imageset.exe --shuffle --resize_height=256 --resize_width=256 --backend=leveldb E:/LIB/caffe-windows/data/classify/train/ E:/LIB/caffe-windows/data/classify/train.txt E:/LIB/caffe-windows/data/classify/train_leveldb
E:/LIB/caffe-windows/build/tools/Release/convert_imageset.exe --shuffle --resize_height=256 --resize_width=256 --backend=leveldb E:/LIB/caffe-windows/data/classify/test/ E:/LIB/caffe-windows/data/classify/test.txt E:/LIB/caffe-windows/data/classify/test_leveldb
pause

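Here resize_height and resize_width resize the original images to the given size; 256 is used because the chosen network (the ImageNet reference network) requires it. shuffle randomly shuffles the data, and backend selects the database format the data is converted to, here LevelDB.
A console window showing the processed files indicates that the conversion succeeded.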
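For readers who prefer to drive the tool from Python rather than a .bat file, the sketch below assembles the same two command lines as argument lists. It only uses the flags and paths that already appear in the script above; the helper name convert_cmd is made up for this example, and nothing is executed unless the commented subprocess call is enabled.

```python
def convert_cmd(exe, image_dir, list_file, db_dir, size=256, backend="leveldb"):
    """Return the argument list for one convert_imageset call."""
    return [
        exe,
        "--shuffle",
        "--resize_height={0}".format(size),
        "--resize_width={0}".format(size),
        "--backend={0}".format(backend),
        image_dir,   # root folder; file names from the list are appended to it
        list_file,   # text file with "name label" lines
        db_dir,      # output database folder
    ]


root = "E:/LIB/caffe-windows/"
exe = root + "build/tools/Release/convert_imageset.exe"
for split in ("train", "test"):
    cmd = convert_cmd(
        exe,
        root + "data/classify/" + split + "/",
        root + "data/classify/" + split + ".txt",
        root + "data/classify/" + split + "_leveldb",
    )
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Keeping the trailing slash on the image folder matters, because the tool joins the folder and the file name by plain concatenation.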

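Note: Caffe can store its data in two formats, LMDB and LevelDB. Both are embedded key/value database libraries. LMDB uses about 1.1 times the memory of LevelDB but is 10% to 15% faster, and more importantly it allows several models being trained to read the same dataset at the same time. For that reason LMDB later replaced LevelDB as Caffe's default dataset format. The steps above nevertheless use LevelDB.
2. After the script runs, two folders, test_leveldb and train_leveldb, are created under caffe-windows/data/classify.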

V. Generating the Mean Files

In the caffe-windows/data/classify folder, write the following script and double-click it to generate the mean files.
data_mean.bat

E:/LIB/caffe-windows/build/tools/Release/compute_image_mean.exe E:/LIB/caffe-windows/data/classify/train_leveldb --backend=leveldb E:/LIB/caffe-windows/data/classify/train_mean.binaryproto
E:/LIB/caffe-windows/build/tools/Release/compute_image_mean.exe E:/LIB/caffe-windows/data/classify/test_leveldb --backend=leveldb E:/LIB/caffe-windows/data/classify/test_mean.binaryproto
pause

The backend argument must match the format used in the conversion step above. When the script finishes without errors, two mean files, train_mean.binaryproto and test_mean.binaryproto, have been generated in caffe-windows/data/classify.
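To make clear what a mean file contains: the tool averages all images in the database pixel by pixel, and the data layer later subtracts that mean image from every input. The toy sketch below does the same averaging in plain Python on two tiny single-channel "images" given as nested lists. It is an illustration of the idea only; it does not read a database or write a .binaryproto file, and mean_image is a made-up helper name.

```python
def mean_image(images):
    """Per-pixel mean of equally sized images given as nested lists."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[y][x] for img in images) / n for x in range(width)]
        for y in range(height)
    ]


# Two tiny 2x2 single-channel "images".
a = [[0, 10], [20, 30]]
b = [[10, 30], [20, 50]]
print(mean_image([a, b]))  # [[5.0, 20.0], [20.0, 40.0]]
```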

VI. Training

1. Copy deploy.prototxt, solver.prototxt and train_val.prototxt from caffe-windows/models/bvlc_reference_caffenet into caffe-windows/data/classify.

2. Modify solver.prototxt

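# There are 480 training images and batch_size = 60, so 480 / 60 = 8 iterations cover the training set once.
# test_interval should therefore be at least 8, so that testing only happens after a full pass over the training data.
# To train for 100 epochs, max_iter (the maximum number of iterations) is 800.
# There are 100 test images and batch_size = 25, so 100 / 25 = 4.
# test_iter should therefore be at least 4: it takes 4 iterations to run through the test set once.
# With lr_policy "step" the learning rate drops gradually as the iterations go on. There are 800 iterations in total
# and the rate should change 5 times, so stepsize = 800 / 5 = 160: the rate is lowered every 160 iterations.
net: "data/classify/train_val.prototxt"  # network definition for training and testing
test_iter: 4          # iterations needed for one full test pass
test_interval: 8      # test every 8 training iterations
base_lr: 0.001        # base learning rate
lr_policy: "step"     # learning-rate schedule
gamma: 0.1            # factor applied to the learning rate at each step
stepsize: 160         # iterations between learning-rate drops (if too small, the rate becomes tiny too early and training does not converge fully)
display: 20           # print progress every 20 iterations
max_iter: 800         # maximum number of iterations
momentum: 0.9         # momentum
weight_decay: 0.0005  # weight decay
snapshot: 5000        # iterations between model snapshots
snapshot_prefix: "data/classify/caffenet_train"  # prefix for saved models
solver_mode: CPU      # GPU or CPU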
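The arithmetic in the comments above can be written down as a short calculation, which makes it easy to redo for a different dataset size. The sketch below derives test_interval, test_iter, max_iter and stepsize from the image counts and batch sizes, and also evaluates the "step" schedule, which sets the rate to base_lr * gamma ^ floor(iter / stepsize). The function names are made up for this example.

```python
def solver_numbers(n_train, train_batch, n_test, test_batch, epochs, lr_drops):
    """Derive the iteration counts used in solver.prototxt."""
    iters_per_epoch = n_train // train_batch   # 480 // 60 = 8
    max_iter = iters_per_epoch * epochs        # 8 * 100 = 800
    return {
        "test_interval": iters_per_epoch,
        "test_iter": n_test // test_batch,     # 100 // 25 = 4
        "max_iter": max_iter,
        "stepsize": max_iter // lr_drops,      # 800 // 5 = 160
    }


def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=160):
    """Learning rate at a given iteration under lr_policy "step"."""
    return base_lr * gamma ** (iteration // stepsize)


print(solver_numbers(480, 60, 100, 25, epochs=100, lr_drops=5))
for it in (0, 160, 320):
    print(it, step_lr(it))
```

With these settings the rate is 0.001 for the first 160 iterations and is divided by 10 at each later multiple of 160.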

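3. Modify train_val.prototxt
In train_val.prototxt, change the source paths, batch_size, backend and mean_file. The right batch_size depends on your machine; on a more powerful machine it can be set larger, which improves accuracy somewhat.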

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"   # output: image data
  top: "label"  # output: labels
  include {
    phase: TRAIN  # training phase
  }
  transform_param {
    mirror: true    # enable random mirroring
    crop_size: 227  # crop size of the input image
    mean_file: "data/classify/train_mean.binaryproto"  # mean file of the training set
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "data/classify/train_leveldb"  # training database
    batch_size: 60    # size of each training batch
    backend: LEVELDB  # database format
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST  # test phase
  }
  transform_param {
    mirror: false   # no mirroring during testing
    crop_size: 227  # crop size of the test image
    mean_file: "data/classify/test_mean.binaryproto"  # mean file of the test set
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: false
#  }
  data_param {
    source: "data/classify/test_leveldb"  # test database
    batch_size: 25    # size of each test batch
    backend: LEVELDB  # database format
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4  # number of classes being trained
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
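It can help to check how the 227 x 227 crop shrinks on its way through the network defined above. The sketch below applies the usual output-size formulas (Caffe rounds down for convolution and up for pooling) to the kernel, stride and pad values from the file. The LRN, ReLU and Dropout layers do not change the size, so they are left out. The helper names are made up for this example.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output width/height of a pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / stride)) + 1


s = 227                                  # crop_size
s = conv_out(s, kernel=11, stride=4)     # conv1 -> 55
s = pool_out(s, kernel=3, stride=2)      # pool1 -> 27
s = conv_out(s, kernel=5, pad=2)         # conv2 -> 27
s = pool_out(s, kernel=3, stride=2)      # pool2 -> 13
s = conv_out(s, kernel=3, pad=1)         # conv3 -> 13
s = conv_out(s, kernel=3, pad=1)         # conv4 -> 13
s = conv_out(s, kernel=3, pad=1)         # conv5 -> 13
s = pool_out(s, kernel=3, stride=2)      # pool5 -> 6
print(s, 256 * s * s)                    # 6 9216 (number of inputs to fc6)
```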

4. Write the training script
trainSc.bat

cd ../../
"E:/LIB/caffe-windows/build/tools/Release/caffe.exe" train --solver=data/classify/solver.prototxt
pause

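Double-click the script to run it.

When it finishes, two new trained-model files appear in the classify folder.

That completes training. What remains is how to test and use the model.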
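VII. Testing the Model
1. Edit caffe-windows/data/classify/deploy.prototxt. Four classes were trained, so the number of outputs of the last layer (num_output of fc8) must be changed to 4. Check that you are editing the right layer; do not change the num_output values of the earlier layers.

2. Write a script, data_test.bat, that runs classification.exe. If it reports an error, locate classification.exe by hand, change the path in the script to where it actually is, and run the script again.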

E:\LIB\caffe-windows\build\examples\cpp_classification\Release\classification.exe ..\..\data\classify\deploy.prototxt ..\..\data\classify\caffenet_train_iter_800.caffemodel ..\..\data\classify\test_mean.binaryproto ..\..\data\classify\labels.txt ..\..\data\classify\test\5136.jpg
pause
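The command expects a labels.txt file, which the steps so far do not create. The classification example reads it as one class name per line, where the first line belongs to label 0, the second to label 1, and so on. The sketch below builds those lines from a mapping of folder number to class name. The class names are placeholders, and the folder numbers 2 to 5 are inferred from the labeling script, so both are assumptions of this sketch; labels_lines is a made-up helper name.

```python
def labels_lines(class_names, offset=2):
    """Return labels.txt lines ordered by label index (folder prefix - offset)."""
    ordered = sorted(class_names.items())          # sort by folder prefix
    for index, (prefix, _) in enumerate(ordered):
        if prefix - offset != index:               # labels must be 0, 1, 2, ...
            raise ValueError("folder prefixes must be consecutive")
    return [name for _, name in ordered]


# Placeholder names: which class lives in which folder is an assumption here.
classes = {2: "class_a", 3: "class_b", 4: "class_c", 5: "class_d"}
lines = labels_lines(classes)
print("\n".join(lines))
# To write the file: open("labels.txt", "w").write("\n".join(lines) + "\n")
```

The number of lines has to equal num_output of the last layer in deploy.prototxt, which is 4 here.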

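3. Results. Some images with similar features are still not classified well; improving this requires changing the relevant configuration and retraining.
Three test images were run (the screenshots of the images and their outputs are not preserved here). The result for the third image was a misclassification.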

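Postscript:

1. The above covers only training a model. As for how to compile Caffe on Windows 7, I do not have time to write that up right now; if you have questions, send me a private message and we can work through them together.
2. Readers unfamiliar with Python can implement the Python scripts in C++ instead; in C++, using the Boost library makes the file operations relatively simple.
3. When I have time, I will write about fine-tuning with Caffe and about loading a Caffe-trained model from OpenCV.
4. If you are interested in discussing and learning together, you can join the group: 487350510.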
