
Learning and Using Caffe on Windows (Part 2): Fine-Tuning a Model You Trained Yourself

Published: 2025/3/21, by 豆豆

Preface

As we saw in the earlier posts, the training set was small: only 120 images per class, trained for 100 iterations. On test images whose features are not obvious, the resulting model fell short of the accuracy we wanted, as in the image below:


The test result was:


That result is unreliable; probabilities like these cannot be used in a real project. Deep learning frameworks generally want at least 10,000 samples, yet sometimes we simply cannot collect that many. The answer is to fine-tune a mature model that someone else has already trained: by borrowing its learned parameters, training converges faster and performs better than training from scratch on a small dataset. The one requirement is that the network architecture must be the same.
For image classification, the Caffe team trained a model on ImageNet for over 300,000 iterations. It classifies images into 1,000 categories and was one of the best classification models available at the time, so we will use Caffe's official model for transfer learning here.
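Conceptually, fine-tuning works because Caffe copies pretrained weights into every layer whose name matches a layer in the checkpoint, and freshly initializes the layers whose names do not match (which is why the final layer gets renamed in this tutorial). Here is a minimal pure-Python sketch of that name-matching rule; the layer names and placeholder "weights" are purely illustrative, not real Caffe objects:

```python
def transfer_weights(pretrained, target):
    """Copy pretrained params into target for every layer whose name
    matches; layers absent from the checkpoint keep their fresh init."""
    copied, kept = [], []
    for name in target:
        if name in pretrained:
            target[name] = pretrained[name]  # borrow learned weights
            copied.append(name)
        else:
            kept.append(name)  # e.g. a renamed final layer: learned from scratch
    return copied, kept

# Pretrained CaffeNet has "fc8"; our net renames it to "fc8-re",
# so everything up to fc7 is copied and fc8-re starts fresh.
pretrained = {"conv1": "w_conv1", "fc7": "w_fc7", "fc8": "w_fc8"}
target = {"conv1": "init", "fc7": "init", "fc8-re": "init"}
copied, kept = transfer_weights(pretrained, target)
print(copied)  # ['conv1', 'fc7']
print(kept)    # ['fc8-re']
```

This is also why the tutorial stresses that the network must be the same: a layer with a matching name but a different shape cannot receive the pretrained blob.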

I. Data Preparation

Create a new folder under the data directory (the paths below assume it is named fine_tuning), move the converted data and the mean files from last time into it, then download Caffe's official image-classification model into the same folder from: http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel.
The folder now contains the files shown below:

None of these files can be missing. If you do not have them yet, refer back to my earlier posts on how to generate each one.
2.數(shù)據(jù)處理

II. Modifying the Configuration Files

Three configuration files need to be modified. I will paste mine below; the paths are the ones on my machine, so treat them as a reference only. For an explanation of the parameters, see the previous post.
1. Modify the solver.prototxt file

net: "E:/LIB/caffe-windows/data/fine_tuning/train_val.prototxt"
test_iter: 4
test_interval: 8
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 160
display: 20
max_iter: 800
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "E:/LIB/caffe-windows/data/fine_tuning/train"
solver_mode: CPU
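With lr_policy "step", Caffe computes the learning rate as base_lr × gamma^floor(iter / stepsize), so with the values above it drops by a factor of 10 every 160 iterations. A quick sketch to check the schedule:

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=160):
    """Learning rate under Caffe's "step" policy:
    base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

# Over max_iter: 800 the rate decays from 1e-3 down to 1e-7.
for it in (0, 159, 160, 479, 799):
    print(it, step_lr(it))
```

A low starting rate with rapid decay is typical for fine-tuning: the borrowed weights are already close to good, so large updates would only destroy them.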

2. The train_val.prototxt file

name: "CaffeNet"
layer {
  name: "data"  type: "Data"  top: "data"  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/fine_tuning/train_mean.binaryproto"  # path to the mean file
  }
  # mean pixel / channel-wise mean instead of mean image
  # transform_param {
  #   crop_size: 227
  #   mean_value: 104
  #   mean_value: 117
  #   mean_value: 123
  #   mirror: true
  # }
  data_param {
    source: "data/fine_tuning/train_leveldb"  # path to the converted data
    batch_size: 60
    backend: LEVELDB  # data backend
  }
}
layer {
  name: "data"  type: "Data"  top: "data"  top: "label"
  include { phase: TEST }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "data/fine_tuning/test_mean.binaryproto"  # path to the mean file
  }
  # mean pixel / channel-wise mean instead of mean image
  # transform_param {
  #   crop_size: 227
  #   mean_value: 104
  #   mean_value: 117
  #   mean_value: 123
  #   mirror: false
  # }
  data_param {
    source: "data/fine_tuning/test_leveldb"  # path to the converted data
    batch_size: 25
    backend: LEVELDB
  }
}
layer {
  name: "conv1"  type: "Convolution"  bottom: "data"  top: "conv1"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  convolution_param {
    num_output: 96  kernel_size: 11  stride: 4
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 0 }
  }
}
layer { name: "relu1"  type: "ReLU"  bottom: "conv1"  top: "conv1" }
layer {
  name: "pool1"  type: "Pooling"  bottom: "conv1"  top: "pool1"
  pooling_param { pool: MAX  kernel_size: 3  stride: 2 }
}
layer {
  name: "norm1"  type: "LRN"  bottom: "pool1"  top: "norm1"
  lrn_param { local_size: 5  alpha: 0.0001  beta: 0.75 }
}
layer {
  name: "conv2"  type: "Convolution"  bottom: "norm1"  top: "conv2"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  convolution_param {
    num_output: 256  pad: 2  kernel_size: 5  group: 2
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 1 }
  }
}
layer { name: "relu2"  type: "ReLU"  bottom: "conv2"  top: "conv2" }
layer {
  name: "pool2"  type: "Pooling"  bottom: "conv2"  top: "pool2"
  pooling_param { pool: MAX  kernel_size: 3  stride: 2 }
}
layer {
  name: "norm2"  type: "LRN"  bottom: "pool2"  top: "norm2"
  lrn_param { local_size: 5  alpha: 0.0001  beta: 0.75 }
}
layer {
  name: "conv3"  type: "Convolution"  bottom: "norm2"  top: "conv3"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  convolution_param {
    num_output: 384  pad: 1  kernel_size: 3
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 0 }
  }
}
layer { name: "relu3"  type: "ReLU"  bottom: "conv3"  top: "conv3" }
layer {
  name: "conv4"  type: "Convolution"  bottom: "conv3"  top: "conv4"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  convolution_param {
    num_output: 384  pad: 1  kernel_size: 3  group: 2
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 1 }
  }
}
layer { name: "relu4"  type: "ReLU"  bottom: "conv4"  top: "conv4" }
layer {
  name: "conv5"  type: "Convolution"  bottom: "conv4"  top: "conv5"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  convolution_param {
    num_output: 256  pad: 1  kernel_size: 3  group: 2
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 1 }
  }
}
layer { name: "relu5"  type: "ReLU"  bottom: "conv5"  top: "conv5" }
layer {
  name: "pool5"  type: "Pooling"  bottom: "conv5"  top: "pool5"
  pooling_param { pool: MAX  kernel_size: 3  stride: 2 }
}
layer {
  name: "fc6"  type: "InnerProduct"  bottom: "pool5"  top: "fc6"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian"  std: 0.005 }
    bias_filler { type: "constant"  value: 1 }
  }
}
layer { name: "relu6"  type: "ReLU"  bottom: "fc6"  top: "fc6" }
layer {
  name: "drop6"  type: "Dropout"  bottom: "fc6"  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"  type: "InnerProduct"  bottom: "fc6"  top: "fc7"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian"  std: 0.005 }
    bias_filler { type: "constant"  value: 1 }
  }
}
layer { name: "relu7"  type: "ReLU"  bottom: "fc7"  top: "fc7" }
layer {
  name: "drop7"  type: "Dropout"  bottom: "fc7"  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-re"  # renamed layer: the place to change
  type: "InnerProduct"  bottom: "fc7"  top: "fc8"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  inner_product_param {
    num_output: 4  # number of classes being trained
    weight_filler { type: "gaussian"  std: 0.01 }
    bias_filler { type: "constant"  value: 0 }
  }
}
layer {
  name: "accuracy"  type: "Accuracy"  bottom: "fc8"  bottom: "label"  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"  type: "SoftmaxWithLoss"  bottom: "fc8"  bottom: "label"  top: "loss"
}
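The crop_size: 227 in the data layers matters because CaffeNet's kernel sizes, strides, and pads are chosen so that the feature map entering fc6 is exactly 6×6×256; feed a different input size and the pretrained fc6 weights no longer fit. A small sketch that traces the spatial size through the layers above, using the standard formula floor((in + 2·pad − kernel) / stride) + 1:

```python
def out_size(in_size, kernel, stride=1, pad=0):
    """Spatial output size of a conv/pool layer."""
    return (in_size + 2 * pad - kernel) // stride + 1

size = 227                      # crop_size from the data layer
size = out_size(size, 11, 4)    # conv1 -> 55
size = out_size(size, 3, 2)     # pool1 -> 27
size = out_size(size, 5, 1, 2)  # conv2 -> 27
size = out_size(size, 3, 2)     # pool2 -> 13
size = out_size(size, 3, 1, 1)  # conv3/conv4/conv5 all keep 13
size = out_size(size, 3, 2)     # pool5 -> 6
fc6_inputs = 256 * size * size  # conv5 produces 256 channels
print(size, fc6_inputs)         # 6 9216
```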

3. The deploy.prototxt file

name: "CaffeNet"
layer {
  name: "data"  type: "Input"  top: "data"
  input_param {
    shape: {
      dim: 10   # batch size (batch of images)
      dim: 3    # image channels (color images are 3-channel RGB)
      dim: 227  # image height
      dim: 227  # image width
    }
  }
}
layer {
  name: "conv1"        # layer name
  type: "Convolution"  # layer type
  bottom: "data"       # layer input (the data layer above)
  top: "conv1"         # layer output (this layer's convolution result)
  convolution_param {  # convolution parameters
    num_output: 96     # number of filters (i.e. number of kernels)
    kernel_size: 11    # kernel size
    stride: 4          # step the window slides between convolutions
  }
}
layer { name: "relu1"  type: "ReLU"  bottom: "conv1"  top: "conv1" }
layer {
  name: "pool1"  type: "Pooling"  bottom: "conv1"  top: "pool1"
  pooling_param {
    pool: MAX  # use max pooling
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"  type: "LRN"  bottom: "pool1"  top: "norm1"
  lrn_param { local_size: 5  alpha: 0.0001  beta: 0.75 }
}
layer {
  name: "conv2"  type: "Convolution"  bottom: "norm1"  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2         # pad 2 rows and 2 columns at the border
    kernel_size: 5
    group: 2       # grouped convolution
  }
}
layer {
  name: "relu2"
  type: "ReLU"  # activation function
  bottom: "conv2"  top: "conv2"
}
layer {
  name: "pool2"  type: "Pooling"  bottom: "conv2"  top: "pool2"
  pooling_param { pool: MAX  kernel_size: 3  stride: 2 }
}
layer {
  name: "norm2"
  type: "LRN"  # lateral inhibition
  bottom: "pool2"  top: "norm2"
  lrn_param {  # the three main LRN parameters
    local_size: 5  alpha: 0.0001  beta: 0.75
  }
}
layer {
  name: "conv3"  type: "Convolution"  bottom: "norm2"  top: "conv3"
  convolution_param { num_output: 384  pad: 1  kernel_size: 3 }
}
layer { name: "relu3"  type: "ReLU"  bottom: "conv3"  top: "conv3" }
layer {
  name: "conv4"  type: "Convolution"  bottom: "conv3"  top: "conv4"
  convolution_param { num_output: 384  pad: 1  kernel_size: 3  group: 2 }
}
layer { name: "relu4"  type: "ReLU"  bottom: "conv4"  top: "conv4" }
layer {
  name: "conv5"  type: "Convolution"  bottom: "conv4"  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1  # pixels added around the image (in both height and width)
    kernel_size: 3
    group: 2
  }
}
layer { name: "relu5"  type: "ReLU"  bottom: "conv5"  top: "conv5" }
layer {
  name: "pool5"  type: "Pooling"  bottom: "conv5"  top: "pool5"
  pooling_param { pool: MAX  kernel_size: 3  stride: 2 }
}
layer {
  name: "fc6"  type: "InnerProduct"  bottom: "pool5"  top: "fc6"
  inner_product_param { num_output: 4096 }
}
layer { name: "relu6"  type: "ReLU"  bottom: "fc6"  top: "fc6" }
layer {
  name: "drop6"  type: "Dropout"  bottom: "fc6"  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5  # fraction of activations dropped by dropout
  }
}
layer {
  name: "fc7"  type: "InnerProduct"  bottom: "fc6"  top: "fc7"
  inner_product_param {
    num_output: 4096  # number of outputs
  }
}
layer {
  name: "relu7"
  type: "ReLU"  # ReLU activation
  bottom: "fc7"  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"  # dropout zeroes part of the activations so they sit out the computation
  bottom: "fc7"  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-re"        # renamed layer: the place to change
  type: "InnerProduct"  # inner product (fully connected layer)
  bottom: "fc7"  top: "fc8"
  inner_product_param {
    num_output: 4  # number of output classes (change to match yours)
  }
}
layer {
  name: "prob"
  type: "Softmax"  # Softmax classification layer
  bottom: "fc8"  top: "prob"
}
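The final "prob" layer applies a softmax to fc8's four raw scores, turning them into the class probabilities that the classifier prints. A minimal sketch of that conversion (the score values are made up for illustration):

```python
import math

def softmax(scores):
    """Convert raw fc8 scores into probabilities that sum to 1.
    Subtracting the max first keeps exp() numerically stable."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical fc8 outputs for the 4 classes: cat, dog, flower, cartoon
probs = softmax([3.1, 0.2, -1.0, 0.5])
print([round(p, 4) for p in probs])
```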

III. Start Training

1. Write the script train.bat

cd ../../
E:/LIB/caffe-windows/build/tools/Release/caffe.exe train --solver=data/fine_tuning/solver.prototxt --weights=data/fine_tuning/bvlc_reference_caffenet.caffemodel
pause

2. Run the script

完成之后會(huì)多出兩個(gè)文件,如下,代表訓(xùn)練成功。

IV. Test the Model

1. Create a labels.txt file with the following content:

0 cat
1 dog
2 flower
3 cartoon
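classification.exe pairs each output index with the corresponding line of labels.txt, so the order here must match the numeric labels used when the training data was converted. A small sketch of that lookup (the probabilities are hypothetical):

```python
# Each line of labels.txt is "<index> <name>"; the classifier maps
# output neuron i to the i-th line of this file.
labels_txt = """0 cat
1 dog
2 flower
3 cartoon"""

labels = [line.split(maxsplit=1)[1] for line in labels_txt.splitlines()]

# Hypothetical probabilities from the 4-class net: report the arg-max.
probs = [0.05, 0.87, 0.03, 0.05]
best = max(range(len(probs)), key=probs.__getitem__)
print(labels[best], probs[best])  # dog 0.87
```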

2. Create a data_test.bat file

E:\LIB\caffe-windows\build\examples\cpp_classification\Release\classification.exe ^
  ..\..\data\fine_tuning\deploy.prototxt ^
  ..\..\data\fine_tuning\train_iter_800.caffemodel ^
  ..\..\data\fine_tuning\test_mean.binaryproto ^
  ..\..\data\fine_tuning\labels.txt ^
  ..\..\data\fine_tuning\test\5127.jpg
pause

Save the file and run it. Testing the same image from the beginning of this post, we can see the result has improved markedly.

后記

1. This completes the image-classification training; next comes how to use the trained model in a project.
2. If you would like to discuss and study together, join QQ group 487350510.
