Recap of the ImageNet steps: http://caffe.berkeleyvision.org/gathered/examples/imagenet.html
Brewing ImageNet
This guide is meant to get you ready to train your own model on your own data. If you just want an ImageNet-trained network, then note that since training takes a lot of energy and we hate global warming, we provide the CaffeNet model trained as described below in the model zoo.
Data Preparation
The guide specifies all paths and assumes all commands are executed from the root caffe directory.
By “ImageNet” we here mean the ILSVRC12 challenge, but you can easily train on the whole of ImageNet as well, just with more disk space, and a little longer training time.
We assume that you already have downloaded the ImageNet training data and validation data, and they are stored on your disk like:
/path/to/imagenet/train/n01440764/n01440764_10026.JPEG
/path/to/imagenet/val/ILSVRC2012_val_00000001.JPEG
You will first need to prepare some auxiliary data for training. This data can be downloaded by:
./data/ilsvrc12/get_ilsvrc_aux.sh
The training and validation input are described in train.txt and val.txt as text listing all the files and their labels. Note that we use a different indexing for labels than the ILSVRC devkit: we sort the synset names in their ASCII order, and then label them from 0 to 999. See synset_words.txt for the synset/name mapping.
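The label indexing described above can be sketched as follows. This is only an illustration: `build_label_map` is a hypothetical helper, and the three synset IDs stand in for the full list of 1000.

```python
def build_label_map(synsets):
    """Map each synset ID to its position in sorted order."""
    # Python's default string sort compares code points, which matches
    # ASCII order for synset IDs such as "n01440764".
    return {name: idx for idx, name in enumerate(sorted(synsets))}


label_map = build_label_map(["n01530575", "n01440764", "n01443537"])
print(label_map["n01440764"])  # the smallest ID receives label 0
```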
You may want to resize the images to 256x256 in advance. By default, we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightweight mincepie package. If you prefer things to be simpler, you can also use shell commands, something like:
for name in /path/to/imagenet/val/*.JPEG; do
    convert -resize 256x256\! "$name" "$name"
done
Take a look at examples/imagenet/create_imagenet.sh. Set the paths to the train and val dirs as needed, and set “RESIZE=true” to resize all images to 256x256 if you haven’t resized the images in advance. Now simply create the leveldbs with examples/imagenet/create_imagenet.sh. Note that examples/imagenet/ilsvrc12_train_leveldb and examples/imagenet/ilsvrc12_val_leveldb should not exist before this execution. They will be created by the script. GLOG_logtostderr=1 simply dumps more information for you to inspect, and you can safely ignore it.
Compute Image Mean
The model requires us to subtract the image mean from each image, so we have to compute the mean. tools/compute_image_mean.cpp implements that - it is also a good example to familiarize yourself with how to manipulate the multiple components, such as protocol buffers, leveldbs, and logging, if you are not familiar with them. Anyway, the mean computation can be carried out as:
./examples/imagenet/make_imagenet_mean.sh
which will make data/ilsvrc12/imagenet_mean.binaryproto.
Model Definition
We are going to describe a reference implementation for the approach first proposed by Krizhevsky, Sutskever, and Hinton in their NIPS 2012 paper.
The network definition (models/bvlc_reference_caffenet/train_val.prototxt) follows the one in Krizhevsky et al. Note that if you deviated from file paths suggested in this guide, you’ll need to adjust the relevant paths in the .prototxt files.
If you look carefully at models/bvlc_reference_caffenet/train_val.prototxt, you will notice several include sections specifying either phase: TRAIN or phase: TEST. These sections allow us to define two closely related networks in one file: the network used for training and the network used for testing. These two networks are almost identical, sharing all layers except for those marked with include { phase: TRAIN } or include { phase: TEST }. In this case, only the input layers and one output layer are different.
Input layer differences: The training network’s data input layer draws its data from examples/imagenet/ilsvrc12_train_leveldb and randomly mirrors the input image. The testing network’s data layer takes data from examples/imagenet/ilsvrc12_val_leveldb and does not perform random mirroring.
Output layer differences: Both networks output the softmax_loss layer, which in training is used to compute the loss function and to initialize the backpropagation, while in validation this loss is simply reported. The testing network also has a second output layer, accuracy, which is used to report the accuracy on the test set. In the process of training, the test network will occasionally be instantiated and tested on the test set, producing lines like Test score #0: xxx and Test score #1: xxx. In this case score 0 is the accuracy (which will start around 1/1000 = 0.001 for an untrained network) and score 1 is the loss (which will start around 7 for an untrained network).
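The starting values quoted above follow from a uniform guess over 1000 classes. A small sketch of the arithmetic, assuming the untrained network assigns roughly equal probability to every class:

```python
import math

NUM_CLASSES = 1000

# With uniform predictions, the chance of picking the right class is 1/N.
initial_accuracy = 1.0 / NUM_CLASSES
# The softmax (cross-entropy) loss of a uniform prediction is -ln(1/N) = ln(N).
initial_loss = math.log(NUM_CLASSES)

print(initial_accuracy)
print(round(initial_loss, 2))
```

The loss comes out just under 7, which is why "around 7" is the expected reading at iteration 0.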
We will also lay out a protocol buffer for running the solver. Let’s make a few plans:
We will run in batches of 256, and run a total of 450,000 iterations (about 90 epochs). For every 1,000 iterations, we test the learned net on the validation data. We set the initial learning rate to 0.01, and decrease it every 100,000 iterations (about 20 epochs). Information will be displayed every 20 iterations. The network will be trained with momentum 0.9 and a weight decay of 0.0005. For every 10,000 iterations, we will take a snapshot of the current status.
Sound good? This is implemented in models/bvlc_reference_caffenet/solver.prototxt.
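The epoch counts in the plan can be checked with a little arithmetic. This sketch assumes the ILSVRC12 training set holds 1,281,167 images, a figure that is not stated in the text above; `epochs` is a hypothetical helper.

```python
TRAIN_IMAGES = 1281167  # assumed ILSVRC12 training-set size
BATCH_SIZE = 256        # batch size from the plan above


def epochs(iterations, batch_size=BATCH_SIZE, dataset_size=TRAIN_IMAGES):
    """Number of passes over the dataset made in the given number of iterations."""
    return iterations * batch_size / dataset_size


print(round(epochs(450000)))  # total training length, about 90 epochs
print(round(epochs(100000)))  # interval between learning-rate drops, about 20 epochs
```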
Training ImageNet
Ready? Let’s train.
./build/tools/caffe train --solver=models/bvlc_reference_caffenet/solver.prototxt
Sit back and enjoy!
Dataset preparation:
ImageNet consists of variable-resolution images, while our system requires a constant input dimensionality. Therefore, we down-sampled the images to a fixed resolution of 256×256. Given a rectangular image, we first rescaled the image such that the shorter side was of length 256, and then cropped out the central 256×256 patch from the resulting image. We did not pre-process the images in any other way, except for subtracting the mean activity over the training set from each pixel.
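The rescale-then-center-crop geometry in that passage can be sketched like this. The function name and the rounding choice are assumptions made for illustration; the code only computes sizes and crop coordinates and does not touch pixel data.

```python
def rescale_and_center_crop_box(width, height, target=256):
    """Return the rescaled (width, height) and the central crop box
    as (left, top, right, bottom), with the shorter side scaled to target."""
    shorter = min(width, height)
    new_w = max(target, round(width * target / shorter))
    new_h = max(target, round(height * target / shorter))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


size, box = rescale_and_center_crop_box(500, 375)
print(size)  # (341, 256)
print(box)   # (42, 0, 298, 256)
```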
See http://blog.csdn.net/u010417185/article/details/52651761
Cropping in data augmentation:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 600
    mean_file: "examples/images/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/images/val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
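The data layer definitions above use mirroring and crop_size, and also specify a mean_file.
Cropping with crop_size lets the network see both the central region of interest and the features near the edges and corners; mirror produces flipped copies, which helps compensate for a small dataset.
The difference in how crop_size behaves in the training layer and the test layer deserves particular attention:
First, mean_file and crop_size have little to do with each other. mean_file is built from the training images, and crop_size crops the training images; both operate on the original training images. If the original training images are 800*800 and the crop is 600*600, then both the mean file and the crop are taken from the 800*800 image set.
The paper crops 224x224 regions from 256x256 images. If the images are larger than 256, the crop size should grow as well. Multi-scale training does advocate applying a single crop size to input images of different sizes, but there the largest input is only 512, so the gap stays moderate.
In Caffe, if crop_size is set, images larger than crop_size are cropped at a random position during training, while at test time only the central region is taken (see /caffe/src/caffe/data_transformer.cpp):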
// We only do random crop when we do training.
if (phase_ == TRAIN) {
  h_off = Rand(datum_height - crop_size + 1);
  w_off = Rand(datum_width - crop_size + 1);
} else {
  h_off = (datum_height - crop_size) / 2;
  w_off = (datum_width - crop_size) / 2;
}
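The same offset logic can be sketched in Python to make the two modes easy to experiment with. `crop_offsets` is a hypothetical helper, and the sketch assumes that `Rand(n)` in the excerpt yields an integer in [0, n).

```python
import random


def crop_offsets(height, width, crop_size, train, rng=random):
    """Return (h_off, w_off): random in training, centered otherwise."""
    if train:
        # randrange(n) returns an integer in [0, n)
        h_off = rng.randrange(height - crop_size + 1)
        w_off = rng.randrange(width - crop_size + 1)
    else:
        h_off = (height - crop_size) // 2
        w_off = (width - crop_size) // 2
    return h_off, w_off


print(crop_offsets(256, 256, 224, train=False))  # (16, 16)
```

In training mode the offsets for a 224 crop from a 256 image fall anywhere between 0 and 32 inclusive, so each epoch sees a slightly different window of the same image.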
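Below is a program I found online for cropping images by hand.
For the details, see: http://blog.csdn.NET/u011762313/article/details/48343799
We can crop the image manually and feed the crops to pycaffe, which can improve recognition accuracy (in the walkthrough on classifying with a caffemodel in pycaffe, replace the classification step with the following):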
import numpy as np

# CLASS_NUM, img, net and transformer are assumed to be defined as in the
# classification walkthrough.
pridects = np.zeros((1, CLASS_NUM))

img_shape = np.array(img.shape)
crop_dims = np.array((32, 96))

# Slide a crop of width crop_dims[1] across the image, left to right and then
# right to left, in steps of a quarter of the crop width.
w_range = img_shape[1] - crop_dims[1]
step = crop_dims[1] // 4
for k in list(range(0, w_range + 1, step)) + list(range(w_range, 1, -step)):
    crop_img = img[:, k:k + crop_dims[1], :]
    net.blobs['data'].data[...] = transformer.preprocess('data', crop_img)
    out = net.forward()
    # Accumulate the class probabilities over all crops
    pridects += out['prob']

pridect = pridects.argmax()
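Caffe also provides an oversampling helper (oversample; see /caffe/python/caffe/io.py). It crops the center and the 4 corners of the image, plus their mirrored versions, for 10 crops in total.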
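To make the count of 10 concrete, here is a sketch of the crop geometry only. It is not the code in io.py; `oversample_boxes` is a hypothetical helper that returns coordinates rather than pixel data.

```python
def oversample_boxes(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) boxes for the 4 corners and the center."""
    tops = (0, height - crop_h)
    lefts = (0, width - crop_w)
    boxes = [(t, l, t + crop_h, l + crop_w) for t in tops for l in lefts]
    center_top = (height - crop_h) // 2
    center_left = (width - crop_w) // 2
    boxes.append((center_top, center_left,
                  center_top + crop_h, center_left + crop_w))
    return boxes


boxes = oversample_boxes(256, 256, 227, 227)
# Each box is used twice: once as-is and once mirrored horizontally.
print(len(boxes) * 2)  # 10
```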
After defining the network with pycaffe and training and testing it with pycaffe, we obtain a caffemodel file. The following uses that caffemodel for classification:
import caffe

# caffemodel file
MODEL_FILE = 'model/_iter_10000.caffemodel'
# deploy file; see /caffe/models/bvlc_alexnet/deploy.prototxt
DEPLOY_FILE = 'deploy.prototxt'
# folder holding the test images
TEST_ROOT = 'datas/'

caffe.set_mode_gpu()
net = caffe.Net(DEPLOY_FILE, MODEL_FILE, caffe.TEST)

# 'data' corresponds to the input declared in the deploy file:
# input: "data"
# input_dim: 1
# input_dim: 3
# input_dim: 32
# input_dim: 96
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# Images read in Python are H x W x K and must be converted to K x H x W
transformer.set_transpose('data', (2, 0, 1))
# caffe.io.load_image returns pixel values in [0, 1], whereas Caffe works
# with [0, 255], so a rescale is needed
transformer.set_raw_scale('data', 255)
# Caffe uses BGR channel order while the loaded image is RGB, so swap channels
transformer.set_channel_swap('data', (2, 1, 0))
# Reshape the input blob to the expected shape (same as in the deploy file)
net.blobs['data'].reshape(1, 3, 32, 96)

# See /caffe/python/caffe/io.py
img = caffe.io.load_image('temp.jpg')
# The loaded image is H x W x K and needs converting

# Feed in the data, with preprocessing
net.blobs['data'].data[...] = transformer.preprocess('data', img)
# Forward pass, i.e. classification
out = net.forward()
# The output is the probability distribution over the possible classes
pridects = out['prob']
# 'prob' above comes from the deploy file:
# layer {
#   name: "prob"
#   type: "Softmax"
#   bottom: "ip2"
#   top: "prob"
# }

pridect = pridects.argmax()
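Note: if the images are large, reduce batch_size accordingly; otherwise the GPU may run out of memory and report an error.
In AlexNet training, the training-set batch_size is 256 and the test-set batch_size is 50; these are not proportional to the sizes of the two sets.
On input image size: http://caffecn.cn/?/question/74
It is worth reading caffe.proto, which defines the parameters of every layer type in detail; what you are looking for is in ConvolutionParameter.
See the convert_imageset invocation in examples/imagenet/create_imagenet.sh: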
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb
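When building a network, the first requirement is that data can flow through it, meaning the output shape of every layer is an integer rather than a fraction. Whether the resulting network performs well is a separate matter. As long as the data flows, the shape of the input image does not matter.
If your image is a square with side 256, the output side of a convolution layer is (256 - kernel_size) / stride + 1. This value has to be an integer, otherwise it has no physical meaning; a feature map with side 7.7, for example, is meaningless. The same holds for pooling layers. The output shape of an FC layer is always an integer; its only requirement is that its input has a fixed length throughout training.
If your image is not square, you can scale all images to one common (non-square) size when building the leveldb / lmdb database, and then use a non-square kernel_size so that the convolution output is still an integer.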
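A quick way to test candidate kernel and stride settings against the formula above is a small helper like the one below. `conv_output_size` is a hypothetical name, and padding is left out because the formula in the text does not include it.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Return (input_size - kernel_size) / stride + 1 when it is a whole
    number, otherwise None."""
    span = input_size - kernel_size
    if span < 0 or span % stride != 0:
        return None
    return span // stride + 1


print(conv_output_size(227, 11, 4))  # 55
print(conv_output_size(256, 11, 4))  # None, because 245 is not divisible by 4
```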
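Other questions: http://blog.csdn.net/u010417185/article/details/52649178
1. Does computing the mean require images of a uniform size?
Yes. The images should be brought to a uniform size before the mean is computed; otherwise an error is reported.
Here is a passage from the official material:
Mean subtraction is a common preprocessing step. The PCA chapter of the UFLDL tutorial describes two variants for images. The first, which I will call dimension_mean (my own name), subtracts the mean separately within each dimension of the input data; this is the usual approach. The second is per_image_mean. The UFLDL tutorial says that when training on natural images, it makes little sense to estimate a separate mean and variance for each pixel (that is, each dimension), because images are statistically stationary: the statistics of one part of the image are the same as those of any other part. The author's final advice is that if you train on non-natural images (such as MNIST, or single isolated objects on a white background), other kinds of normalization are worth considering, but when training on natural images, per_image_mean is a reasonable default.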
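The two variants can be contrasted on toy data. This is a sketch under simplifying assumptions: each "image" is a flat list of grayscale values, and both function names are made up for the illustration.

```python
def per_pixel_mean(images):
    """Mean of each pixel position across the dataset (one value per dimension)."""
    count = len(images)
    return [sum(img[i] for img in images) / count for i in range(len(images[0]))]


def subtract_per_image_mean(image):
    """Subtract the image's own mean intensity from every pixel."""
    mean = sum(image) / len(image)
    return [pixel - mean for pixel in image]


images = [[0.0, 2.0, 4.0], [2.0, 4.0, 6.0]]
print(per_pixel_mean(images))              # [1.0, 3.0, 5.0]
print(subtract_per_image_mean(images[0]))  # [-2.0, 0.0, 2.0]
```

Note that the per-pixel variant only makes sense when every image has the same number of pixels, which is the reason the sizes must be unified first.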
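The point of this passage is that the mean-subtraction method can change with the kind of images being trained on.
With that in mind, let's look at the error reported when image sizes are not uniform:
The error output clearly reports a failed "size_in_datum == data_size" check.
Here is the cause I found:
While converting images to levelDB, I ran into Check failed: data.size() == data_size. In the end the cause was not having read the source closely. The failing line is reported as F0714 20:31:14.899121 26565 convert_imageset.cpp:84], that is, line 84 of convert_imageset.cpp: CHECK_EQ(data.size(), data_size) << "Incorrect data field size " << data.size(); in other words, the two sizes do not match. Looking at the code: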
int data_size;
bool data_size_initialized = false;
for (int line_id = 0; line_id < lines.size(); ++line_id) {
  if (!ReadImageToDatum(root_folder + lines[line_id].first, lines[line_id].second, datum)) {
    continue;
  }
  if (!data_size_initialized) {
    data_size = datum.channels() * datum.height() * datum.width();
    data_size_initialized = true;
  } else {
    const string& data = datum.data();
    CHECK_EQ(data.size(), data_size) << "Incorrect data field size "
        << data.size();
  }
The code shows that on the first pass through the loop data_size_initialized is false, so execution enters if (!data_size_initialized), sets data_size to datum.channels() * datum.height() * datum.width(), and sets data_size_initialized to true. Every later iteration runs the else branch, so adding an image of a different size triggers the error. There are two possible fixes: resize all images to the same size before converting them to the levelDB database, or change the ReadImageToDatum call to ReadImageToDatum(root_folder + lines[line_id].first, lines[line_id].second, width, height, datum).
Reference post: http://blog.csdn.NET/alan317/article/details/37772457
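Before running the conversion, the same consistency rule can be checked up front. The sketch below mirrors the logic of the excerpt in Python; `find_size_mismatches` is a hypothetical helper that takes (channels, height, width) tuples, which you would have to collect from your own images.

```python
def find_size_mismatches(shapes):
    """Return the indices of entries whose channels * height * width
    differs from that of the first entry."""
    mismatches = []
    expected = None
    for idx, (channels, height, width) in enumerate(shapes):
        size = channels * height * width
        if expected is None:
            expected = size  # the first image fixes the expected size
        elif size != expected:
            mismatches.append(idx)
    return mismatches


print(find_size_mismatches([(3, 256, 256), (3, 256, 256), (3, 300, 256)]))  # [2]
```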
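2. In practice the images fed to Caffe vary in size, and both enlarging and shrinking can distort them. How should the data be handled in that case?
If the images vary in size, and enlarging or shrinking them too far would badly distort them and lose information, then the image sizes cannot simply be normalized directly.
Approach:
Pick an intermediate size, for example fix the image width at 600 and scale the height in proportion to the width. After that, the images can be cut into slices, which brings them to a uniform size.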
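One possible reading of this approach is sketched below. The slicing strategy (vertical tiles of a fixed height, with the last tile shifted up so it stays inside the image) is an assumption, as are both function names; the text does not say how the slices should be taken.

```python
def scale_to_width(width, height, target_width=600):
    """Scale to a fixed width, keeping the aspect ratio."""
    return target_width, round(height * target_width / width)


def tile_tops(height, tile_height):
    """Top offsets of vertical tiles that together cover the full height."""
    if height <= tile_height:
        return [0]
    tops = list(range(0, height - tile_height, tile_height))
    tops.append(height - tile_height)  # last tile is aligned to the bottom edge
    return tops


w, h = scale_to_width(1200, 3000)
print((w, h))             # (600, 1500)
print(tile_tops(h, 600))  # [0, 600, 900]
```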
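3. What does crop_size do?
It crops the image. If the original image is 800*800 and we only need to run detection on 600*600 images, we can use crop_size to cut out the region. When the phase is TRAIN, the crop position is random. In other phases only the central region of the image is taken.
For details, see http://blog.csdn.net/u010417185/article/details/52651761
4. Determining the test_iter value in the solver configuration file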
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
test_iter: 100
test_interval: 100
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
lr_policy: "fixed"
display: 100
max_iter: 4000
snapshot: 4000
snapshot_format: HDF5
snapshot_prefix: "examples/cifar10/cifar10_quick"
solver_mode: CPU
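When writing this configuration, I was unsure how test_iter is computed: from the batch size and the whole image collection (test set plus training set), or from one of the sets alone. After reading the explanation and the examples carefully, I concluded that it is computed from the batch size and the test set. If the batch size is 100, the training set has 6000 images, and the test set has 1000 images, then test_iter is 1000/100 = 10, regardless of the number of training images.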
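The rule can be written down as a one-line calculation. `compute_test_iter` is a hypothetical helper; rounding up when the test set is not an exact multiple of the batch size is an assumption, and the second call uses the 50,000-image ILSVRC12 validation set, which is not mentioned in the paragraph above.

```python
import math


def compute_test_iter(num_test_images, test_batch_size):
    """Iterations needed to cover the whole test set once."""
    return math.ceil(num_test_images / test_batch_size)


print(compute_test_iter(1000, 100))  # 10
print(compute_test_iter(50000, 50))  # 1000
```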
Overall procedure: see http://blog.csdn.net/alexqiweek/article/details/51281240
1. Data preparation
Under caffe/data, create a directory myself, and inside myself create two more directories, train and val.
Note: the images must be in .jpeg format.
train holds the training data. It contains two subdirectories, bird (70 images) and cat (70 images).
val holds the data used for testing: 20 images each of bird and cat.
In a terminal, change to caffe/data/myself and use the data above to generate train.txt, val.txt, and test.txt.
test.txt has the same content as val.txt, only without the numeric label at the end of each line.
Command to generate val.txt: find . -name "*.jpeg" | grep -v train | cut -d/ -f3 > val.txt
Command to generate train.txt: find . -name "*.jpeg" | grep train | cut -d/ -f3-4 > train.txt. Because the bird and cat images have to be told apart by a different number appended to each line, two more commands are needed: sed -i '1,70s/.*/& 0/' train.txt and sed -i '71,141s/.*/& 1/' train.txt
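The find/sed pipeline depends on the line order of the listing, so an alternative is to assign labels by directory name. The sketch below is one way to do that in Python; `list_labeled_images` is a hypothetical helper, and the label order (bird = 0, cat = 1) follows the sed commands above.

```python
import os


def list_labeled_images(train_root, class_names, ext=".jpeg"):
    """Return 'class/filename label' lines; labels follow the order of class_names."""
    lines = []
    for label, cls in enumerate(class_names):
        cls_dir = os.path.join(train_root, cls)
        for fname in sorted(os.listdir(cls_dir)):
            if fname.lower().endswith(ext):
                lines.append("%s/%s %d" % (cls, fname, label))
    return lines


# Run from caffe/data/myself; does nothing if the train directory is absent.
if os.path.isdir("train"):
    with open("train.txt", "w") as f:
        f.write("\n".join(list_labeled_images("train", ["bird", "cat"])) + "\n")
```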
2. Create the databases
Under caffe/examples, create a directory myself, and copy create_imagenet.sh from caffe/examples/imagenet into it.
The relevant lines of create_imagenet.sh are:
Line 5, EXAMPLE, sets where the generated database files are stored.
Line 6, DATA, sets where the files needed to build the databases come from.
Line 9, TRAIN_DATA_ROOT, gives the absolute path of the training data.
Line 10, VAL_DATA_ROOT, gives the absolute path of the test data. If TRAIN_DATA_ROOT or VAL_DATA_ROOT is wrong, you will get a flood of "image not found" errors.
Lines 12 to 21 resize the images to a uniform size, 256x256.
Lines 45 and 55 set the names of the generated database folders.
Running ./examples/myself/create_imagenet.sh from the caffe root directory generates two database files in the directory given by EXAMPLE in create_imagenet.sh (examples/myself in this case).
3. Train the network [training with CaffeNet may take longer than with LeNet; this experiment uses CaffeNet]
① Copy train_val.prototxt from models/bvlc_alexnet to examples/myself.
This file defines the structure of the network to be trained.
② Copy solver.prototxt from models/bvlc_alexnet to examples/myself.
This file holds the configuration and settings needed to train the network.
Line 1 gives the relative path of the file that defines the network structure.
③ Copy make_imagenet_mean.sh from examples/imagenet to examples/myself. It computes the image mean; the source it uses is /tools/compute_image_mean.cpp.
④ Copy train_caffenet.sh from examples/imagenet to examples/myself.
This is a script containing the command that trains the network.
From the caffe root directory, run ./examples/myself/train_caffenet.sh to start training.
4. Test the network on the test data
Run ./build/tools/caffe.bin test --model=examples/myself/train_val.prototxt --weights=examples/myself/caffenet_model/caffenet_train_iter_16000.caffemodel to test the network. train_val.prototxt is the network definition; caffenet_train_iter_16000.caffemodel is the model produced during training.
[Problem encountered]
[Solution]
Copy the three files shown in the original screenshot to /caffe/examples/myself, so that they sit in the same directory as the CaffeNet network definition .prototxt file and the .caffemodel and .solverstate files produced during training.
5. Test the network on a single image and display the extracted features.
Write a script based on the "Classification: Instant Recognition with Caffe" example.
In /caffe/examples/myself/, run that script with python ./xxxxx.py.
[Error 1]
[Solution]
Edit the .prototxt file that defines the CaffeNet training network so that the relative paths become absolute paths.
[Error 2]
[Solution 2]
This problem cannot be solved as things stand, because the network used for testing is the same network that was used for training. Consider using a different network to check whether the trained model is accurate, by modifying the xxxxxxx.py file mentioned earlier.