
Caffe Official Tutorial Translation (6): Learning LeNet

Published: 2025/3/21


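Preface

I recently decided to work through the official tutorials again to relearn Caffe, and translated the official documentation along the way. I have added some notes of my own, all marked in italics, and in places I have also included problems I ran into or my own run results. Comments and corrections are welcome; flames are not.
Original official tutorial: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb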

Solving in Python with LeNet

In this example we will explore Caffe's Python interface, focusing on the Solver interface.

1. Setup

Set up the Python environment: we use pylab to import numpy and to plot.

    from pylab import *
    %matplotlib inline

Import caffe, adding its path to sys.path. Make sure pycaffe has been built beforehand.

    import sys
    caffe_root = '/home/xhb/caffe/caffe/'  # Caffe root directory; set this for your own machine
    sys.path.insert(0, caffe_root + 'python')
    import caffe

We start with the data and network model from the provided LeNet example (you need to download the data and create the database yourself, as shown below).

    # run scripts from caffe root
    import os
    os.chdir(caffe_root)
    # Download data
    !data/mnist/get_mnist.sh
    # Prepare data
    !examples/mnist/create_mnist.sh
    # back to examples
    os.chdir('examples')

    Downloading...
    Creating lmdb...
    I0301 12:48:30.756855 995 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
    I0301 12:48:30.757007 995 convert_mnist_data.cpp:88] A total of 60000 items.
    I0301 12:48:30.757015 995 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
    I0301 12:48:35.242076 995 convert_mnist_data.cpp:108] Processed 60000 files.
    I0301 12:48:35.257020 996 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
    I0301 12:48:35.257267 996 convert_mnist_data.cpp:88] A total of 10000 items.
    I0301 12:48:35.257280 996 convert_mnist_data.cpp:89] Rows: 28 Cols: 28
    I0301 12:48:35.941156 996 convert_mnist_data.cpp:108] Processed 10000 files.
    Done.

2. Creating the net

Now let's write a variant of LeNet, the classic 1989 convnet architecture.
We need two additional files:
- The net prototxt, which defines the architecture and points to the train and test data.
- The solver prototxt, which defines the hyperparameters and other learning settings.
We start by creating the net. We write the net in Python in a succinct and natural way and serialize it to Caffe's protobuf model format.
This net reads its data from the pregenerated LMDB database, but reading directly from ndarrays is also possible using MemoryDataLayer.

    from caffe import layers as L, params as P

    def lenet(lmdb, batch_size):
        # our version of LeNet: a series of linear and simple nonlinear transformations
        n = caffe.NetSpec()

        n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                                 transform_param=dict(scale=1./255), ntop=2)

        n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
        n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
        n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
        n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
        n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
        n.relu1 = L.ReLU(n.fc1, in_place=True)
        n.score = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
        n.loss = L.SoftmaxWithLoss(n.score, n.label)

        return n.to_proto()

    with open('mnist/lenet_auto_train.prototxt', 'w') as f:
        f.write(str(lenet('mnist/mnist_train_lmdb', 64)))

    with open('mnist/lenet_auto_test.prototxt', 'w') as f:
        f.write(str(lenet('mnist/mnist_test_lmdb', 100)))

Using Google's protobuf library, the net has been written to disk in a more verbose but human-readable serialization format. You can read, write, and modify this description directly. Let's take a look at the train net.

    !cat mnist/lenet_auto_train.prototxt

    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      transform_param {
        scale: 0.00392156885937
      }
      data_param {
        source: "mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      convolution_param {
        num_output: 20
        kernel_size: 5
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      convolution_param {
        num_output: 50
        kernel_size: 5
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "fc1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "fc1"
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "fc1"
      top: "fc1"
    }
    layer {
      name: "score"
      type: "InnerProduct"
      bottom: "fc1"
      top: "score"
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "score"
      bottom: "label"
      top: "loss"
    }

Now let's look at the learning parameters (hyperparameters), which are also stored in a prototxt file (already provided in the Caffe source). We use SGD with momentum, weight decay, and a specific learning rate schedule.

    # Note: I edited lenet_auto_solver.prototxt because I was not working from caffe_root, so the relative paths could not be used.
    # If a path in this file is wrong, the later code dies immediately; if it will not run, check the paths defined in this file.
    !cat mnist/lenet_auto_solver.prototxt

    # The train/test net protocol buffer definition
    # train_net: "mnist/lenet_auto_train.prototxt"
    train_net: "/home/xhb/caffe/caffe/examples/mnist/lenet_auto_train.prototxt"
    # test_net: "mnist/lenet_auto_test.prototxt"
    test_net: "/home/xhb/caffe/caffe/examples/mnist/lenet_auto_test.prototxt"
    # test_iter specifies how many forward passes the test should carry out.
    # In the case of MNIST, we have test batch size 100 and 100 test iterations,
    # covering the full 10,000 testing images.
    test_iter: 100
    # Carry out testing every 500 training iterations.
    test_interval: 500
    # The base learning rate, momentum and the weight decay of the network.
    base_lr: 0.01
    momentum: 0.9
    weight_decay: 0.0005
    # The learning rate policy
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    # Display every 100 iterations
    display: 100
    # The maximum number of iterations
    max_iter: 10000
    # snapshot intermediate results
    snapshot: 5000
    snapshot_prefix: "/home/xhb/caffe/caffe/examples/mnist/lenet"
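To make the learning rate schedule concrete, here is a minimal sketch of the "inv" policy, which Caffe documents as base_lr * (1 + gamma * iter) ^ (-power). The function name `inv_learning_rate` is my own and not part of Caffe; the numbers are copied from the solver file above. The last lines check the test coverage arithmetic from the comments in that file.

```python
def inv_learning_rate(base_lr, gamma, power, iteration):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ** (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

# Values copied from lenet_auto_solver.prototxt
base_lr, gamma, power = 0.01, 0.0001, 0.75

# The rate starts at base_lr and decays smoothly as training proceeds
for it in (0, 500, 5000, 10000):
    print(it, inv_learning_rate(base_lr, gamma, power, it))

# Test coverage: number of test forward passes times the test batch size
test_iter, test_batch_size = 100, 100
images_per_test = test_iter * test_batch_size
print(images_per_test)  # 10000, the full MNIST test set
```

With these settings the rate at iteration 10000 is about 0.6 times the base rate, so the decay over this run is gentle.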

3. Loading and checking the solver

We pick a device and load the solver. We use SGD (with momentum) for optimization, but other methods, such as Adagrad and Nesterov's accelerated gradient, are also available.

    # Note: I ran this on a laptop, so I used CPU mode instead of GPU mode
    # caffe.set_device(0)
    # caffe.set_mode_gpu()
    caffe.set_mode_cpu()

    ### load the solver and create train and test nets
    # solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
    solver = caffe.SGDSolver('mnist/lenet_auto_solver.prototxt')

To get an idea of the architecture of the net, we can check the dimensions of the intermediate features (blobs) and of the parameters.

    # each output is (batch size, feature dim, spatial dim)
    [(k, v.data.shape) for k, v in solver.net.blobs.items()]

    [('data', (64, 1, 28, 28)),
     ('label', (64,)),
     ('conv1', (64, 20, 24, 24)),
     ('pool1', (64, 20, 12, 12)),
     ('conv2', (64, 50, 8, 8)),
     ('pool2', (64, 50, 4, 4)),
     ('fc1', (64, 500)),
     ('score', (64, 10)),
     ('loss', ())]

    # just print the weight sizes (we'll omit the biases)
    [(k, v[0].data.shape) for k, v in solver.net.params.items()]

    [('conv1', (20, 1, 5, 5)),
     ('conv2', (50, 20, 5, 5)),
     ('fc1', (500, 800)),
     ('score', (10, 500))]
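The spatial sizes listed above can be checked by hand. The following is a small sketch of the usual output-size arithmetic for convolution and pooling layers; the helper names `conv_out` and `pool_out` are mine, and the rounding (down for convolution, up for pooling) reflects my understanding of Caffe's defaults. For this net every division is exact, so the rounding does not matter here.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    """Spatial output size of a pooling layer (rounded up)."""
    return int(math.ceil((size + 2 * pad - kernel) / stride)) + 1

conv1 = conv_out(28, kernel=5)            # 28x28 MNIST input, 5x5 kernel
pool1 = pool_out(conv1, kernel=2, stride=2)
conv2 = conv_out(pool1, kernel=5)
pool2 = pool_out(conv2, kernel=2, stride=2)

# fc1 sees the flattened pool2 blob: 50 channels of pool2 x pool2 values
fc1_inputs = 50 * pool2 * pool2

print(conv1, pool1, conv2, pool2, fc1_inputs)  # 24 12 8 4 800
```

The final value, 800, matches the second dimension of the fc1 weight shape (500, 800).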

Before running anything, let's check that the whole net has been loaded as we expect. We run a forward pass on the train and test nets and confirm that they contain our data.

    solver.net.forward()  # train net
    solver.test_nets[0].forward()  # test net (there can be more than one, so test_nets is a list)

    {'loss': array(2.3477354049682617, dtype=float32)}

Note: my result differs slightly from the one in the official tutorial, which shows: {'loss': array(2.365971088409424, dtype=float32)}
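Both loss values are close to ln 10 ≈ 2.303, which is the softmax loss of a 10-class classifier that assigns equal probability to every class, roughly what an untrained net does. The sketch below only computes that reference value. The small gap between the two runs is plausibly due to different random weight initialization, but that is an assumption and was not checked here.

```python
import math

num_classes = 10

# A classifier with no information gives each class probability 1/10
uniform_prob = 1.0 / num_classes

# Softmax (cross-entropy) loss for the correct class under that prediction
chance_loss = -math.log(uniform_prob)

print(round(chance_loss, 4))  # 2.3026
```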

    # use a little trick to tile the first eight images
    imshow(solver.net.blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
    axis('off')
    print 'train labels:', solver.net.blobs['label'].data[:8]

    train labels: [ 5. 0. 4. 1. 9. 2. 1. 3.]

    imshow(solver.test_nets[0].blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray')
    axis('off')
    print 'test labels:', solver.test_nets[0].blobs['label'].data[:8]

    test labels: [ 7. 2. 1. 0. 4. 1. 4. 9.]

4. Stepping the solver

Both the train and test nets load their data and labels correctly.
- Let's take one step of SGD and see what happens.

    solver.step(1)

    # imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4,5,5,5).transpose(0,2,1,3).reshape(4*5, 5*5), cmap='gray')
    # axis('off')
    imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4, 5, 5, 5).transpose(0, 2, 1, 3).reshape(4*5, 5*5), cmap='gray'); axis('off')

    (-0.5, 24.5, 19.5, -0.5)
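For intuition about what one solver step does, here is a plain-Python sketch of the textbook SGD update with momentum and L2 weight decay, using the hyperparameters from the solver file as defaults. This is not Caffe's code: the function `sgd_momentum_step` and the toy quadratic objective are hypothetical, and weights are simple lists of floats rather than blobs.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One update: v <- momentum * v - lr * (g + weight_decay * w); w <- w + v."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w      # L2 weight decay adds a pull towards zero
        v = momentum * v - lr * g     # blend previous update with the new gradient
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity

# Toy problem (hypothetical): minimise 0.5 * w^2, whose gradient is w itself
weights, velocity = [1.0, -2.0], [0.0, 0.0]
for _ in range(3):
    grads = list(weights)
    weights, velocity = sgd_momentum_step(weights, grads, velocity)

print(weights)  # both entries have moved towards 0
```

The velocity term plays the role of the accumulated update; a quantity of that kind is what the image above displays for the first-layer filters.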

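5. Writing a training loop

Something is happening. Let's run the net for a while and keep track of a few things as it goes. Note that this process is the same as training through the compiled caffe binary. In particular:
- Logging is still printed to the terminal as usual.
- Snapshots (saved copies of the intermediate model) are taken at the interval defined in the solver prototxt, here every 5000 iterations.
- The net is tested at the specified interval, here every 500 iterations.
Since we control the loop in Python, we are free to compute additional things as we go, as shown below.
We can do other things as well, for example:
- Write a custom stopping criterion
- Change the solving process by updating the net inside the loop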

    %%time
    niter = 200
    test_interval = 25
    # losses will also be stored in the log
    train_loss = zeros(niter)
    test_acc = zeros(int(np.ceil(niter / test_interval)))
    output = zeros((niter, 8, 10))

    # the main solver loop
    for it in range(niter):
        solver.step(1)  # SGD by Caffe

        # store the train loss
        train_loss[it] = solver.net.blobs['loss'].data

        # store the output on the first test batch
        # (start the forward pass at conv1 to avoid loading new data)
        solver.test_nets[0].forward(start='conv1')
        output[it] = solver.test_nets[0].blobs['score'].data[:8]

        # run a full test every so often
        # (Caffe can also do this for us and write to a log, but we show here
        # how to do it directly in Python, where more complicated things are easier.)
        if it % test_interval == 0:
            print 'Iteration', it, 'testing...'
            correct = 0
            for test_it in range(100):
                solver.test_nets[0].forward()
                correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                               == solver.test_nets[0].blobs['label'].data)
            test_acc[it // test_interval] = correct / 1e4

    Iteration 0 testing...
    Iteration 25 testing...
    Iteration 50 testing...
    Iteration 75 testing...
    Iteration 100 testing...
    Iteration 125 testing...
    Iteration 150 testing...
    Iteration 175 testing...
    CPU times: user 1min 21s, sys: 68 ms, total: 1min 21s
    Wall time: 1min 20s
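A custom stopping criterion, one of the ideas mentioned above, could look like the following sketch. It is independent of Caffe: `train_until_converged` and `make_fake_step` are hypothetical names, and the fake step function stands in for calling `solver.step(1)` and then reading the loss blob. The rule used here (stop when the mean loss over the last `window` steps improves by less than `tol` over the window before it) is just one simple choice.

```python
def train_until_converged(step_fn, max_iter=200, window=10, tol=1e-3):
    """Call step_fn() repeatedly; stop early once the windowed mean loss plateaus."""
    losses = []
    for _ in range(max_iter):
        losses.append(step_fn())
        if len(losses) >= 2 * window:
            recent = sum(losses[-window:]) / window
            previous = sum(losses[-2 * window:-window]) / window
            if previous - recent < tol:
                break  # improvement too small: stop training
    return losses

def make_fake_step(start=2.3, decay=0.9):
    """Hypothetical stand-in for one solver step: returns a steadily shrinking loss."""
    state = {'loss': start}
    def step():
        state['loss'] *= decay
        return state['loss']
    return step

losses = train_until_converged(make_fake_step())
print(len(losses), losses[-1])
```

In a real loop the body of `step` would run the solver and return the current training loss; everything else stays the same.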

Next, let's plot the train loss and the test accuracy.

    _, ax1 = subplots()
    ax2 = ax1.twinx()
    ax1.plot(arange(niter), train_loss)
    ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
    ax1.set_xlabel('iteration')
    ax1.set_ylabel('train loss')
    ax2.set_ylabel('test accuracy')
    ax2.set_title('Test Accuracy: {:.2f}'.format(test_acc[-1]))

    Text(0.5,1,u'Test Accuracy: 0.94')

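The loss drops quickly and soon converges (apart from local stochastic fluctuations), while the accuracy rises correspondingly. Hooray!
- Since we saved the results on the first test batch, we can also watch how the predictions evolve. We plot time on the x axis and each possible label on the y axis, with brightness indicating confidence.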

    for i in range(8):
        figure(figsize=(2, 2))
        imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
        figure(figsize=(10, 2))
        imshow(output[150:200, i].T, interpolation='nearest', cmap='gray')
        xlabel('iteration')
        ylabel('label')

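At the start we could hardly predict any of the handwritten digits correctly, and by the end we classify each of them correctly. If you have been following along, you will see that the last digit is the most difficult: a slanted "9" that is easily mistaken for a "4".
- Note that these are the raw output scores of the net rather than the probability vectors computed by softmax. The latter, shown below, make it easier to see the confidence of the net.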

    for i in range(8):
        figure(figsize=(2, 2))
        imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
        figure(figsize=(10, 2))
        imshow(exp(output[150:200, i].T) / exp(output[150:200, i].T).sum(0), interpolation='nearest', cmap='gray')
        xlabel('iteration')
        ylabel('label')
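The expression exp(scores) / sum(exp(scores)) used above is the softmax. As a small standalone sketch, the version below computes the same quantity for a single score vector in plain Python, and subtracts the maximum score first, a common trick that leaves the result unchanged but avoids overflow for large scores. The function name and the example scores are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)                       # subtracting the max avoids overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [1.0, 2.0, 5.0]                  # hypothetical raw net outputs
probs = softmax(raw_scores)

print(probs)
print(sum(probs))
```

The largest raw score still gets the largest probability, but the values are now directly comparable across digits as confidences.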

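6. Experimenting with architecture and optimization

Now that we have defined, trained, and tested LeNet, there are several possible next steps:
- Define new architectures and compare them with the current one
- Tune the optimization by setting base_lr, or simply train for longer
- Switch the solver type, for example from SGD to AdaDelta or Adam
Feel free to explore these directions by editing the all-in-one example below. Comments marked "EDIT HERE" indicate suggested places to make changes.
By default it defines a simple linear classifier as a baseline.
If your changes do not work out, try the following suggestions:
1. Switch the nonlinearity from ReLU to ELU, or to a basic saturating nonlinearity such as Sigmoid
2. Stack more fully connected and nonlinear layers
3. Search over the learning rate a factor of 10 at a time (for example 0.1 and 0.001)
4. Switch the solver type to Adam (adaptive solvers like this are generally less sensitive to hyperparameters, but there are no guarantees...)
5. Train for longer by setting niter higher (to 500 or 1000, for instance) to see the differences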
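For reference when choosing between the nonlinearities in suggestion 1, here is a minimal sketch of the three functions as they are usually defined, written for single floats in plain Python. These are the textbook formulas, not Caffe's layer implementations, and `alpha=1.0` for ELU is an assumed default.

```python
import math

def relu(x):
    """ReLU: zero for negative inputs, identity for positive ones."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, smooth saturation to -alpha for negative ones."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def sigmoid(x):
    """Sigmoid: squashes any input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), elu(x), sigmoid(x))
```

ReLU and ELU agree for positive inputs; they differ only in how they treat negative ones, while Sigmoid saturates on both sides.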

    examples_path = '/home/xhb/caffe/caffe/examples/'
    train_net_path = examples_path + 'mnist/custom_auto_train.prototxt'
    test_net_path = examples_path + 'mnist/custom_auto_test.prototxt'
    solver_config_path = examples_path + 'mnist/custom_auto_solver.prototxt'

    ### define net
    def custom_net(lmdb, batch_size):
        # define your own net!
        n = caffe.NetSpec()

        # keep this data layer for all networks
        n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                                 transform_param=dict(scale=1./255), ntop=2)

        # EDIT HERE to try different networks
        # this single layer defines a simple linear classifier
        # (in particular this defines a multiway logistic regression)
        n.score = L.InnerProduct(n.data, num_output=10, weight_filler=dict(type='xavier'))

        # EDIT HERE this is the LeNet variant we have already tried
        # n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
        # n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
        # n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
        # n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
        # n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
        # EDIT HERE consider L.ELU or L.Sigmoid for the nonlinearity
        # n.relu1 = L.ReLU(n.fc1, in_place=True)
        # n.score = L.InnerProduct(n.fc1, num_output=10, weight_filler=dict(type='xavier'))

        # keep this loss layer for all networks
        n.loss = L.SoftmaxWithLoss(n.score, n.label)

        return n.to_proto()

    with open(train_net_path, 'w') as f:
        f.write(str(custom_net('mnist/mnist_train_lmdb', 64)))
    with open(test_net_path, 'w') as f:
        f.write(str(custom_net('mnist/mnist_test_lmdb', 100)))

    ### define solver
    from caffe.proto import caffe_pb2
    s = caffe_pb2.SolverParameter()

    # Set a seed for reproducible experiments:
    # this controls for randomization in training.
    s.random_seed = 0xCAFFE

    # Specify locations of the train and (maybe) test networks.
    s.train_net = train_net_path
    s.test_net.append(test_net_path)
    s.test_interval = 500  # Test after every 500 training iterations.
    s.test_iter.append(100)  # Test on 100 batches each time we test.

    s.max_iter = 10000  # no. of times to update the net (training iterations)

    # EDIT HERE to try different solvers
    # solver types include "SGD", "Adam", and "Nesterov" among others.
    s.type = "SGD"

    # Set the initial learning rate for SGD.
    s.base_lr = 0.01  # EDIT HERE to try different learning rates
    # Set momentum to accelerate learning by
    # taking weighted average of current and previous updates.
    s.momentum = 0.9
    # Set weight decay to regularize and prevent overfitting
    s.weight_decay = 5e-4

    # Set `lr_policy` to define how the learning rate changes during training.
    # This is the same policy as our default LeNet.
    s.lr_policy = 'inv'
    s.gamma = 0.0001
    s.power = 0.75
    # EDIT HERE to try the fixed rate (and compare with adaptive solvers)
    # `fixed` is the simplest policy that keeps the learning rate constant.
    # s.lr_policy = 'fixed'

    # Display the current training loss and accuracy every 1000 iterations.
    s.display = 1000

    # Snapshots are files used to store networks we've trained.
    # We'll snapshot every 5K iterations -- twice during training.
    s.snapshot = 5000
    s.snapshot_prefix = 'mnist/custom_net'

    # Train on the GPU
    s.solver_mode = caffe_pb2.SolverParameter.GPU

    # Write the solver to a temporary file and return its filename.
    with open(solver_config_path, 'w') as f:
        f.write(str(s))

    ### load the solver and create train and test nets
    solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
    solver = caffe.get_solver(solver_config_path)

    ### solve
    niter = 250  # EDIT HERE increase to train for longer
    test_interval = niter / 10
    # losses will also be stored in the log
    train_loss = zeros(niter)
    test_acc = zeros(int(np.ceil(niter / test_interval)))

    # the main solver loop
    for it in range(niter):
        solver.step(1)  # SGD by Caffe

        # store the train loss
        train_loss[it] = solver.net.blobs['loss'].data

        # run a full test every so often
        # (Caffe can also do this for us and write to a log, but we show here
        # how to do it directly in Python, where more complicated things are easier.)
        if it % test_interval == 0:
            print 'Iteration', it, 'testing...'
            correct = 0
            for test_it in range(100):
                solver.test_nets[0].forward()
                correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                               == solver.test_nets[0].blobs['label'].data)
            test_acc[it // test_interval] = correct / 1e4

    _, ax1 = subplots()
    ax2 = ax1.twinx()
    ax1.plot(arange(niter), train_loss)
    ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
    ax1.set_xlabel('iteration')
    ax1.set_ylabel('train loss')
    ax2.set_ylabel('test accuracy')
    ax2.set_title('Custom Test Accuracy: {:.2f}'.format(test_acc[-1]))

    Iteration 0 testing...
    Iteration 25 testing...
    Iteration 50 testing...
    Iteration 75 testing...
    Iteration 100 testing...
    Iteration 125 testing...
    Iteration 150 testing...
    Iteration 175 testing...
    Iteration 200 testing...
    Iteration 225 testing...
    Text(0.5,1,u'Custom Test Accuracy: 0.88')
