Steps for Training Caffe on Your Own Dataset

Published: 2023/12/4

Recall the steps of the ImageNet example: http://caffe.berkeleyvision.org/gathered/examples/imagenet.html

Brewing ImageNet

This guide is meant to get you ready to train your own model on your own data. If you just want an ImageNet-trained network, then note that since training takes a lot of energy and we hate global warming, we provide the CaffeNet model trained as described below in the model zoo.

Data Preparation

The guide specifies all paths and assumes all commands are executed from the root caffe directory.

By “ImageNet” we here mean the ILSVRC12 challenge, but you can easily train on the whole of ImageNet as well, just with more disk space, and a little longer training time.

We assume that you already have downloaded the ImageNet training data and validation data, and they are stored on your disk like:

/path/to/imagenet/train/n01440764/n01440764_10026.JPEG
/path/to/imagenet/val/ILSVRC2012_val_00000001.JPEG

You will first need to prepare some auxiliary data for training. This data can be downloaded by:

./data/ilsvrc12/get_ilsvrc_aux.sh

The training and validation input are described in train.txt and val.txt as text listing all the files and their labels. Note that we use a different indexing for labels than the ILSVRC devkit: we sort the synset names in their ASCII order, and then label them from 0 to 999. See synset_words.txt for the synset/name mapping.

You may want to resize the images to 256x256 in advance. By default, we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightweight mincepie package. If you prefer things to be simpler, you can also use shell commands, something like:

for name in /path/to/imagenet/val/*.JPEG; do
    convert -resize 256x256\! "$name" "$name"
done

Take a look at examples/imagenet/create_imagenet.sh. Set the paths to the train and val dirs as needed, and set “RESIZE=true” to resize all images to 256x256 if you haven’t resized the images in advance. Now simply create the leveldbs with examples/imagenet/create_imagenet.sh. Note that examples/imagenet/ilsvrc12_train_leveldb and examples/imagenet/ilsvrc12_val_leveldb should not exist before this execution. They will be created by the script. GLOG_logtostderr=1 simply dumps more information for you to inspect, and you can safely ignore it.

Compute Image Mean

The model requires us to subtract the image mean from each image, so we have to compute the mean. tools/compute_image_mean.cpp implements that. It is also a good example to familiarize yourself with how to manipulate the multiple components, such as protocol buffers, leveldbs, and logging, if you are not familiar with them. Anyway, the mean computation can be carried out as:

./examples/imagenet/make_imagenet_mean.sh

which will make data/ilsvrc12/imagenet_mean.binaryproto.

Model Definition

We are going to describe a reference implementation for the approach first proposed by Krizhevsky, Sutskever, and Hinton in their NIPS 2012 paper.

The network definition (models/bvlc_reference_caffenet/train_val.prototxt) follows the one in Krizhevsky et al. Note that if you deviated from file paths suggested in this guide, you’ll need to adjust the relevant paths in the .prototxt files.

If you look carefully at models/bvlc_reference_caffenet/train_val.prototxt, you will notice several include sections specifying either phase: TRAIN or phase: TEST. These sections allow us to define two closely related networks in one file: the network used for training and the network used for testing. These two networks are almost identical, sharing all layers except for those marked with include { phase: TRAIN } or include { phase: TEST }. In this case, only the input layers and one output layer are different.

Input layer differences: The training network’s data input layer draws its data from examples/imagenet/ilsvrc12_train_leveldb and randomly mirrors the input image. The testing network’s data layer takes data from examples/imagenet/ilsvrc12_val_leveldb and does not perform random mirroring.

Output layer differences: Both networks output the softmax_loss layer, which in training is used to compute the loss function and to initialize the backpropagation, while in validation this loss is simply reported. The testing network also has a second output layer, accuracy, which is used to report the accuracy on the test set. In the process of training, the test network will occasionally be instantiated and tested on the test set, producing lines like Test score #0: xxx and Test score #1: xxx. In this case score 0 is the accuracy (which will start around 1/1000 = 0.001 for an untrained network) and score 1 is the loss (which will start around 7 for an untrained network).
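The two starting values quoted above can be checked with a little arithmetic. The sketch below assumes an untrained network predicts roughly uniformly over the 1000 classes, so the softmax loss is the negative log of 1/1000; the variable names are illustrative only.

```python
import math

num_classes = 1000  # ILSVRC12 labels run from 0 to 999

# A uniform guess assigns probability 1/1000 to the correct class.
chance_accuracy = 1.0 / num_classes

# Softmax (cross-entropy) loss of a uniform prediction: -ln(1/1000) = ln(1000).
initial_loss = -math.log(chance_accuracy)

print(f"chance accuracy: {chance_accuracy:.3f}")  # 0.001
print(f"initial loss:    {initial_loss:.2f}")     # 6.91
```

A loss that starts far from this value usually points to a labeling or data problem rather than to the network itself.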

We will also lay out a protocol buffer for running the solver. Let’s make a few plans:

We will run in batches of 256, and run a total of 450,000 iterations (about 90 epochs).
For every 1,000 iterations, we test the learned net on the validation data.
We set the initial learning rate to 0.01, and decrease it every 100,000 iterations (about 20 epochs).
Information will be displayed every 20 iterations.
The network will be trained with momentum 0.9 and a weight decay of 0.0005.
For every 10,000 iterations, we will take a snapshot of the current status.
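The epoch counts in the plan above follow from the batch size and the size of the training set. As a rough check, assuming the 1,281,167 images of the ILSVRC12 training set (the helper name is made up for this sketch):

```python
ILSVRC12_TRAIN_IMAGES = 1_281_167  # size of the ILSVRC12 training set


def iterations_to_epochs(iterations, batch_size, num_images=ILSVRC12_TRAIN_IMAGES):
    """One iteration consumes one batch; one epoch is one pass over all images."""
    return iterations * batch_size / num_images


total_epochs = iterations_to_epochs(450_000, 256)    # whole training run
lr_step_epochs = iterations_to_epochs(100_000, 256)  # interval between learning-rate drops

print(f"{total_epochs:.1f} epochs in total, learning rate drops every {lr_step_epochs:.1f} epochs")
```

When training on your own, much smaller dataset, the same function shows how far the iteration counts in the solver need to be scaled down.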

Sound good? This is implemented in models/bvlc_reference_caffenet/solver.prototxt.

Training ImageNet

Ready? Let’s train.

./build/tools/caffe train --solver=models/bvlc_reference_caffenet/solver.prototxt

Sit back and enjoy!

Dataset preparation:

ImageNet consists of variable-resolution images, while our system requires a constant input dimensionality. Therefore, we down-sampled the images to a fixed resolution of 256 × 256. Given a rectangular image, we first rescaled the image such that the shorter side was of length 256, and then cropped out the central 256×256 patch from the resulting image. We did not pre-process the images in any other way, except for subtracting the mean activity over the training set from each pixel.
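The rescale-then-center-crop step in the quoted passage comes down to a small geometry calculation. The following is a minimal sketch of that calculation only; the function name and the rounding choice are assumptions, not taken from the paper or from Caffe.

```python
def resize_and_center_crop_box(width, height, side=256):
    """Return the rescaled size and the central side x side crop box.

    The image is scaled so that its shorter side equals `side`, then the
    box (left, top, right, bottom) selects the central square.
    """
    scale = side / min(width, height)
    # max() guards against rounding the shorter side down to side - 1.
    new_w = max(side, round(width * scale))
    new_h = max(side, round(height * scale))
    left = (new_w - side) // 2
    top = (new_h - side) // 2
    return (new_w, new_h), (left, top, left + side, top + side)


size, box = resize_and_center_crop_box(500, 375)
print(size, box)  # (341, 256) (42, 0, 298, 256)
```

With Pillow, for instance, the two results could be passed to `Image.resize(size)` and then `Image.crop(box)`.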

Reference: http://blog.csdn.net/u010417185/article/details/52651761

Cropping in data augmentation:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}

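The data layer definitions above use mirroring and crop_size, and also specify a mean_file.

Cropping with crop_size picks up both the central region of interest and features near the corners, while mirror produces flipped copies; together they help compensate for a small dataset.

The difference in how crop_size behaves in the training layer and the test layer deserves a closer look:

First, note that mean_file and crop_size are largely independent. mean_file is computed from the training images, and crop_size crops the training images; both operate on the original training images. If the original training images are 800*800 and crop_size gives 600*600 crops, then both the mean file and the crops are derived from the 800*800 image set.

The paper crops 224x224 regions from 256x256 images. If the images are larger than 256, the crop size should be increased as well. Multi-scale training does advocate applying the same crop size to input images of different sizes, but the largest size used there is only 512, so the gap stays moderate.

In Caffe, if crop_size is defined, images larger than crop_size are cropped randomly during training, while at test time only the central region is taken (see /caffe/src/caffe/data_transformer.cpp):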

// We only do random crop when we do training.
if (phase_ == TRAIN) {
  h_off = Rand(datum_height - crop_size + 1);
  w_off = Rand(datum_width - crop_size + 1);
} else {
  h_off = (datum_height - crop_size) / 2;
  w_off = (datum_width - crop_size) / 2;
}

The code shows that an input image larger than crop_size gets cropped. In TRAIN phase the crop position is random, whereas in TEST phase only the central region of the image is cropped.
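The same offset logic can be mirrored in Python to experiment with it outside Caffe. This is only a sketch: it assumes Rand(n) returns an integer in [0, n), and `crop_offsets` is a name invented here, not a pycaffe function.

```python
import random


def crop_offsets(height, width, crop_size, train, rng=random):
    """Top-left corner of the crop: random when training, centered otherwise."""
    if train:
        # randrange(n) draws from [0, n), matching the assumed behavior of Rand(n).
        h_off = rng.randrange(height - crop_size + 1)
        w_off = rng.randrange(width - crop_size + 1)
    else:
        h_off = (height - crop_size) // 2
        w_off = (width - crop_size) // 2
    return h_off, w_off


print(crop_offsets(256, 256, 224, train=False))  # (16, 16)
print(crop_offsets(256, 256, 224, train=True))   # varies from call to call
```

For a 256x256 image and a 224 crop, the training offsets fall anywhere in 0..32, while the test offset is always 16.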

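Below is a program I found online that crops the images manually:

For details, see: http://blog.csdn.NET/u011762313/article/details/48343799

We can crop the image by hand and feed the crops to pycaffe, which can improve the recognition rate (in "classifying with a caffemodel in pycaffe", replace the classification step with the following):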

# Accumulated class probability distribution
# (img, net, transformer and CLASS_NUM come from the earlier classification steps)
predicts = np.zeros((1, CLASS_NUM))
# Image dimensions (height, width)
img_shape = np.array(img.shape)
# Crop size (height, width)
crop_dims = np.array((32, 96))
# All images used here have a fixed height of 32 and a variable width of at least 96
# Crops start at column 0 and end at column w_range
w_range = img_shape[1] - crop_dims[1]
# Step of 96 / 4 = 24; integer division so that range() accepts it in Python 3
step = int(crop_dims[1]) // 4
# Crop once from left to right, then once from right to left
for k in list(range(0, w_range + 1, step)) + list(range(w_range, 1, -step)):
    # Crop the image
    crop_img = img[:, k:k + crop_dims[1], :]
    # Load the crop into the input blob with preprocessing
    net.blobs['data'].data[...] = transformer.preprocess('data', crop_img)
    # Forward pass, i.e. classification
    out = net.forward()
    # Accumulate the probability distribution of every pass
    predicts += out['prob']
# The class with the highest accumulated probability is the final result
predict = predicts.argmax()

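Caffe also provides an oversampling method (oversample), see /caffe/python/caffe/io.py. It crops the image center, the 4 corners, and the mirrored versions of these, 10 images in total.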
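To make the 10-crop idea concrete, here is a small NumPy sketch of it. It is written from the description above rather than copied from io.py, so the function name and the ordering of the crops are assumptions.

```python
import numpy as np


def ten_crops(image, crop_h, crop_w):
    """Return 10 crops of an H x W x C image: 4 corners, center, and their mirrors."""
    h, w = image.shape[:2]
    # Top-left corners of the four corner crops, followed by the center crop.
    starts = [(t, l) for t in (0, h - crop_h) for l in (0, w - crop_w)]
    starts.append(((h - crop_h) // 2, (w - crop_w) // 2))
    crops = [image[t:t + crop_h, l:l + crop_w] for t, l in starts]
    # Horizontal mirror of each of the five crops.
    crops += [c[:, ::-1] for c in crops]
    return np.stack(crops)


image = np.arange(8 * 8 * 3).reshape(8, 8, 3)
crops = ten_crops(image, 4, 4)
print(crops.shape)  # (10, 4, 4, 3)
```

Averaging the network's predictions over the ten crops plays the same role as the manual sliding-window accumulation shown earlier.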

After defining the network with pycaffe and training and testing it with pycaffe, you obtain a caffemodel file. The following uses that caffemodel for classification:

Import the relevant libraries

Configuration

Test in GPU mode

Preprocess the input data

# 'data' corresponds to the deploy file:
# input: "data"
# input_dim: 1
# input_dim: 3
# input_dim: 32
# input_dim: 96
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# Images read in Python are H×W×K and must be converted to K×H×W
transformer.set_transpose('data', (2, 0, 1))
# Python stores images in [0, 1], while Caffe stores them in [0, 255],
# so a conversion is needed
transformer.set_raw_scale('data', 255)
# Caffe uses BGR channel order, while the original image is RGB, so swap the channels
transformer.set_channel_swap('data', (2, 1, 0))
# Reshape the input blob to the proper shape (same as the deploy file)
net.blobs['data'].reshape(1, 3, 32, 96)
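What the three Transformer settings do to an image can be reproduced with plain NumPy, which is handy for checking shapes without loading a network. This is only an illustration: the function is hypothetical, and mean subtraction and resizing are left out.

```python
import numpy as np


def preprocess_like_transformer(img_hwc_rgb01):
    """Apply transpose, channel swap and raw scale to an H x W x 3 RGB image in [0, 1]."""
    chw = img_hwc_rgb01.transpose(2, 0, 1)  # H x W x K -> K x H x W
    bgr = chw[[2, 1, 0], :, :]              # RGB -> BGR along the channel axis
    return bgr * 255.0                      # [0, 1] -> [0, 255]


img = np.random.rand(32, 96, 3)
blob = preprocess_like_transformer(img)
print(blob.shape)  # (3, 32, 96)
```

The result has the 3 x 32 x 96 layout that the deploy file's input_dim lines describe for a single image.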

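Read the image

Classify

Most likely class

In AlexNet training, the training set uses a batch_size of 256 and the test set a batch_size of 50, which is not proportional to the sizes of the two sets.

On the question of input image size: http://caffecn.cn/?/question/74

It is worth reading caffe.proto, which defines the parameters of every layer type in detail; what you are looking for can be found in ConvolutionParameter.

See the convert_imageset call in examples/imagenet/create_imagenet.sh: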

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb

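When building a network, the first thing to ensure is that the data can flow through it, that is, the output shape of every layer is an integer rather than a fraction. Whether the resulting network performs well is a separate matter. As long as the data flows, the shape of your input images does not matter.

Suppose your image is a square with side 256. The output of a convolution layer then has side [ (256 - kernel_size) / stride ] + 1, and this value has to be an integer, otherwise it has no physical meaning. A feature map with side 7.7, for example, is meaningless. The same applies to pooling layers. The output shape of an FC layer is always an integer; its only requirement is that the FC layer's input has a fixed length throughout training.

If your images are not square, you can scale them to one uniform (non-square) size when building the leveldb / lmdb database, and then use a non-square kernel_size so that the convolution output is still an integer.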
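The output-size formula is easy to check before committing to an input size. The helper below is a sketch under stated assumptions: it adds an optional `pad` term (not part of the formula quoted above) and reports whether the division is exact instead of deciding how a remainder should be handled.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Return (output_size, fits_exactly) for one spatial dimension.

    output_size uses integer division; fits_exactly tells whether
    (input_size + 2 * pad - kernel_size) is a multiple of stride.
    """
    span = input_size + 2 * pad - kernel_size
    return span // stride + 1, span % stride == 0


print(conv_output_size(227, 11, stride=4))  # (55, True)
print(conv_output_size(256, 11, stride=4))  # (62, False)
```

For a non-square input, call the function once for the height and once for the width with the matching kernel dimension.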

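Other questions: http://blog.csdn.net/u010417185/article/details/52649178

1. Does computing the mean require images of a uniform size?

When computing the image mean, the images should first be brought to the same size; otherwise an error is reported.

Quoting part of the official wording:

Mean subtraction is a common data preprocessing step. The PCA chapter of the UFLDL tutorial describes two approaches for images. The first, which I call dimension_mean (my own naming), subtracts the mean within each dimension of the input data, and is the common practice. The second is called per_image_mean. The UFLDL tutorial says that when training on natural images it makes little sense to estimate a separate mean and variance for each pixel (here meaning each dimension), because images are statistically stationary: the statistics in one part of the image are the same as in any other part. The author finally advises that if you train your algorithm on non-natural images (such as MNIST, or single isolated objects on a white background), other kinds of normalization are worth considering, but when training on natural images, per_image_mean is a reasonable default.

The point of this passage is that the mean-subtraction method may change with the kind of images being trained on.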
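The two approaches differ only in which axes the mean is taken over. A minimal NumPy sketch, assuming a batch of equally sized images stored as an N x H x W x C array (the function names simply follow the terms used above):

```python
import numpy as np


def subtract_dimension_mean(images):
    """Subtract, at every pixel position and channel, the mean over the whole dataset."""
    return images - images.mean(axis=0, keepdims=True)


def subtract_per_image_mean(images):
    """Subtract from each image its own mean intensity."""
    return images - images.mean(axis=(1, 2, 3), keepdims=True)


images = np.random.rand(10, 8, 8, 3)
by_dimension = subtract_dimension_mean(images)
by_image = subtract_per_image_mean(images)
print(by_dimension.shape, by_image.shape)
```

The first variant needs every image to have the same shape, which is why the mean computation fails on images of mixed sizes; the second works image by image.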

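With that covered, here is the error that is reported when the image sizes are not uniform:

The run clearly fails with a "size_in_datum == data_size" check error.

Here is the cause I found:

While converting images into a LevelDB I ran into Check failed: data.size() == data_size. In the end it came down to not having read the source closely. The failing location is reported as F0714 20:31:14.899121 26565 convert_imageset.cpp:84], that is, line 84 of convert_imageset.cpp: CHECK_EQ(data.size(), data_size) << "Incorrect data field size " << data.size(); in other words, the two sizes do not match. Looking at the code: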

int data_size;
bool data_size_initialized = false;
for (int line_id = 0; line_id < lines.size(); ++line_id) {
  if (!ReadImageToDatum(root_folder + lines[line_id].first, lines[line_id].second, datum)) {
    continue;
  }
  if (!data_size_initialized) {
    data_size = datum.channels() * datum.height() * datum.width();
    data_size_initialized = true;
  } else {
    const string& data = datum.data();
    CHECK_EQ(data.size(), data_size) << "Incorrect data field size "
        << data.size();
  }
}

As the code shows, in the first loop iteration data_size_initialized is false, so execution enters if (!data_size_initialized), sets data_size to datum.channels() * datum.height() * datum.width(), and sets data_size_initialized to true. Every later iteration takes the else branch, so an image whose size differs from the first one triggers the error. There are two possible fixes: resize all images to the same size before converting them into the LevelDB database, or change the call to ReadImageToDatum(root_folder + lines[line_id].first,lines[line_id].second,width,height ,datum).
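The check described above can be run over a list of image shapes before starting the conversion, so that the offending file is found up front. A small sketch, in which the function name and the (channels, height, width) tuple layout are assumptions:

```python
def first_size_mismatch(shapes):
    """Return the index of the first shape whose element count differs from the first one.

    shapes is an iterable of (channels, height, width) tuples; None means all match.
    """
    data_size = None
    for index, (channels, height, width) in enumerate(shapes):
        size = channels * height * width
        if data_size is None:
            data_size = size      # first image fixes the expected size
        elif size != data_size:
            return index          # this image would trip the CHECK_EQ
    return None


print(first_size_mismatch([(3, 256, 256), (3, 256, 256), (3, 300, 256)]))  # 2
```

Like the C++ excerpt, this compares element counts only, so two different shapes with the same number of elements would pass.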

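Reference post: http://blog.csdn.NET/alan317/article/details/37772457

2. In practice the images fed to Caffe vary in size, and both enlarging and shrinking them can cause distortion. How should the data be handled in that case?

If the images vary in size, and enlarging or shrinking them too much would distort them badly and lose information, then the image sizes cannot simply be normalized directly.

Approach:

Pick an intermediate size, for example fix the image width at 600 and scale the height proportionally. After that, the images can be cut into slices, which brings all inputs to the same size.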
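One way to turn this into numbers is sketched below. The slicing scheme (non-overlapping 600-pixel slices from the top, plus one slice aligned to the bottom edge when a remainder is left) is an assumption made for illustration; the text above does not say how the slices should be placed.

```python
def scale_and_tile(width, height, target_width=600, tile=600):
    """Scale to target_width keeping the aspect ratio, then list vertical slice boxes.

    Returns the scaled size and a list of (left, top, right, bottom) boxes.
    """
    scale = target_width / width
    new_h = round(height * scale)
    tops = list(range(0, max(new_h - tile, 0) + 1, tile))
    # Cover the leftover strip with a slice aligned to the bottom edge.
    if new_h > tile and tops[-1] + tile < new_h:
        tops.append(new_h - tile)
    boxes = [(0, top, target_width, min(top + tile, new_h)) for top in tops]
    return (target_width, new_h), boxes


size_scaled, boxes = scale_and_tile(1200, 3000)
print(size_scaled)  # (600, 1500)
print(boxes)        # [(0, 0, 600, 600), (0, 600, 600, 1200), (0, 900, 600, 1500)]
```

Images that end up shorter than one slice yield a single, smaller box and would still need padding or a different crop size.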

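3. What does crop_size do?

It crops the image. If the original image is 800*800 and we only need to run detection on 600*600 images, crop_size can be used to cut out the region. In TRAIN phase the crop position is random; in the other phases only the central region of the image is taken. For details see http://blog.csdn.net/u010417185/article/details/52651761

4. Determining the test_iter value in the solver configuration file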

# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10
# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 100 training iterations.
test_interval: 100
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 4000
snapshot_format: HDF5
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU

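When setting up this configuration, I was unsure how test_iter should be computed: from the batch size and the whole image collection (test set plus training set), or from one particular set. After carefully reading the comments and the example, I concluded that it is computed from the batch size and the test set. If the batch size is 100, the training set has 6000 images and the test set has 1000 images, then test_iter is 1000/100 = 10, regardless of the number of training images.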

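Overall procedure: see http://blog.csdn.net/alexqiweek/article/details/51281240

1. Data preparation

Create a directory myself under caffe/data, and inside myself create two more directories, train and val.

Note: the images must be in .jpeg format.

train holds the training data; it contains two subdirectories, bird (70 images) and cat (70 images).

val holds the data used for testing; 20 images each of bird and cat.

In a terminal, change to caffe/data/myself and use the data above to generate train.txt, val.txt and test.txt.

test.txt has the same content as val.txt, just without the trailing numeric labels.

Command to generate val.txt: find . -name "*.jpeg" | grep -v train | cut -d/ -f3 > val.txt

Command to generate train.txt: find . -name "*.jpeg" | grep train | cut -d/ -f3-4 > train.txt. Because the bird and cat images have to be told apart by a number appended to each line, two more commands are needed: sed -i '1,70s/.*/& 0/' train.txt and sed -i '71,140s/.*/& 1/' train.txt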
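The sed commands above rely on find listing all 70 bird images before the cat images, which is not guaranteed. As an alternative, the list can be generated from the directory names. This is a sketch, not part of the referenced post; the function name and the sorted ordering are choices made here.

```python
from pathlib import Path


def build_list_lines(split_dir, class_names, suffix=".jpeg"):
    """Return 'class/file.jpeg label' lines for the images under split_dir/<class>/.

    The label is the position of the class in class_names (bird -> 0, cat -> 1).
    """
    lines = []
    for label, name in enumerate(class_names):
        for path in sorted((Path(split_dir) / name).glob(f"*{suffix}")):
            lines.append(f"{name}/{path.name} {label}")
    return lines
```

For the layout described above, `build_list_lines("train", ["bird", "cat"])` run from caffe/data/myself gives the lines for train.txt, which can then be written out with `Path("train.txt").write_text("\n".join(lines) + "\n")`.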

2. Create the database

Create a directory myself under caffe/examples, and copy create_imagenet.sh from caffe/examples/imagenet into it.

The relevant parts of create_imagenet.sh are:

Line 5, EXAMPLE, specifies where the generated database files are stored.

Line 6, DATA, specifies where the files needed to build the database come from.

Line 9, TRAIN_DATA_ROOT, gives the absolute path of the training data.

Line 10, VAL_DATA_ROOT, gives the absolute path of the test data. If TRAIN_DATA_ROOT or VAL_DATA_ROOT is wrong, you get a flood of "image not found" errors.

Lines 12 to 21 resize the images to a uniform size, 256x256.

Lines 45 and 55 specify the names of the generated database directories.

Running ./examples/myself/create_imagenet.sh from the Caffe root directory generates two database files in the directory given by EXAMPLE in create_imagenet.sh (here examples/myself).

3. Train the network [training CaffeNet may take longer than training LeNet; this experiment uses CaffeNet]

① Copy train_val.prototxt from models/bvlc_alexnet into examples/myself.

This file defines the structure of the network to be trained.

② Copy solver.prototxt from models/bvlc_alexnet into examples/myself.

This file holds the configuration and settings needed for training.

Line 1 specifies the relative path of the file that defines the network structure.

③ Copy make_imagenet_mean.sh from examples/imagenet into examples/myself. It computes the image mean; the source it uses is /tools/compute_image_mean.cpp.

④ Copy train_caffenet.sh from examples/imagenet into examples/myself.

This file is a script containing the command that trains the network.

From the Caffe root directory, run ./examples/myself/train_caffenet.sh to start training.

4. Test the network with the test data

Use the command ./build/tools/caffe.bin test --model=examples/myself/train_val.prototxt --weights=examples/myself/caffenet_model/caffenet_train_iter_16000.caffemodel to test the network. train_val.prototxt is the network definition; caffenet_train_iter_16000.caffemodel is the model produced during training.

[Problem encountered]

[Solution]

Copy the three files shown in the original screenshot into /caffe/examples/myself, so that they sit in the same directory as the CaffeNet network definition .prototxt file and the .caffemodel and .solverstate files generated during training.

5. Test the network with a single image and display the extracted features.

Write a script based on the "Classification: Instant Recognition with Caffe" example.

In /caffe/examples/myself/, run that script with python ./xxxxx.py.

[Error 1]

[Solution]

Edit the .prototxt file that defines the CaffeNet training network so that the relative paths in it become absolute paths.

[Error 2]

[Solution 2]

This problem cannot be solved as things stand, because the network used for testing is the same one that was used for training. Consider using a different network to check whether the trained model is accurate, by modifying the xxxxx.py file mentioned above.
