DeepLearning tutorial (4) CNN Convolutional Neural Networks: Introduction to the Principles + Detailed Code Walkthrough


FROM: http://blog.csdn.net/u012162613/article/details/43225445



@author:wepon

@blog:http://blog.csdn.net/u012162613/article/details/43225445


This post covers the convolutional neural network (CNN), and in particular gives a detailed reading of its implementation. The code is written in Python/Theano and comes from the tutorial Convolutional Neural Networks (LeNet). Both the fully commented code and the original code are on my GitHub and can be downloaded.



1. A brief introduction to how CNNs (convolutional neural networks) work

Explaining convolutional neural networks properly would take a long article, and there are already many good blog posts online, so this article does not repeat them. If you already know CNNs, you can read on: the focus here is a detailed walkthrough of the CNN implementation code. If you have not studied CNNs yet, I recommend Zhou Xiaoyi's post Deep Learning(深度学习)学习笔记整理系列之(七), as well as the UFLDL sections on convolutional feature extraction and pooling.


The defining features of a CNN are sparse connectivity (local receptive fields) and weight sharing, as shown in the two figures below: the left one illustrates sparse connectivity, the right one weight sharing. Together they reduce the number of parameters to train and therefore the computational cost.


[Figure: sparse connectivity (left) and weight sharing (right)]
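To get a feel for how much these two ideas save, here is a rough back-of-the-envelope comparison. The image size, receptive-field size and number of feature maps below are made-up illustrative numbers, not anything from LeNet5:

# Rough parameter-count comparison (illustrative numbers only).
image_pixels = 1000 * 1000          # a 1000x1000 input image
hidden_units = 1000 * 1000          # one hidden unit per pixel, for simplicity

dense = image_pixels * hidden_units    # fully connected: 10^12 weights
local = hidden_units * (10 * 10)       # sparse 10x10 receptive fields: 10^8 weights
shared = 100 * (10 * 10)               # weight sharing with 100 feature maps: 10^4 weights

print dense, local, shared             # 1000000000000 100000000 10000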


As for the structure of a CNN, the classic LeNet5 is used as the example:




This figure turns up everywhere: whenever CNNs are discussed, LeNet5 gets mentioned. It comes from the paper Gradient-Based Learning Applied to Document Recognition. The paper is long; the LeNet5 architecture is described starting around page 7, and that part is well worth reading.


Briefly, reading the LeNet5 figure from left to right: first comes input, the input layer, i.e. the input image. From the input layer to C1 is a convolution layer (the convolution operation); from C1 to S2 is a subsampling layer (the pooling operation). The figure below illustrates how convolution and subsampling work in detail:



Then S2 to C3 is another convolution, and C3 to S4 another subsampling: convolution and subsampling come in pairs, with subsampling generally following convolution. S4 to C5 is fully connected, which makes it equivalent to the hidden layer of an MLP (if you are not familiar with MLPs, see 《DeepLearning tutorial(3)MLP多層感知機原理簡介+代碼詳解》). C5 to F6 is again fully connected, i.e. another MLP-style hidden layer. Finally, from F6 to the output is simply a classifier, and that layer is called the classification layer.
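As a quick arithmetic check on the figure: the layer sizes themselves come from the LeNet5 paper (32x32 input, 5x5 kernels, 2x2 subsampling); the small helper below is only my own illustration of how the feature-map sizes follow from them.

def conv_pool_size(n, filter_size=5, pool=2):
    # side length after a 'valid' convolution followed by 2x2 subsampling
    return (n - filter_size + 1) // pool

print conv_pool_size(32)                  # 14  (C1 is 28x28, S2 is 14x14)
print conv_pool_size(conv_pool_size(32))  # 5   (C3 is 10x10, S4 is 5x5)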


OK, that is roughly the basic structure of a CNN: it is built from input, convolution layers, subsampling layers, fully-connected layers, a classification layer, and output. In practice, the number of convolution and subsampling layers and the choice of classifier depend on the specific application or problem. Once the structure is fixed, how are the connection weights between layers learned? Usually with forward propagation (FP) plus backpropagation (BP); see the links above for details, and the minimal sketch below for the pattern used later in this article's code.
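The FP+BP training used later in this article is plain stochastic gradient descent expressed with Theano's symbolic gradients. Here is a minimal, self-contained sketch of that pattern on a toy one-parameter cost; it is not part of the tutorial code, only an illustration of the gradient-and-update idiom:

# Toy example of the T.grad + SGD-update pattern (Theano must be installed).
import numpy
import theano
import theano.tensor as T

x = T.dscalar('x')
w = theano.shared(numpy.float64(3.0), name='w')     # a single toy parameter
cost = (w * x - 1.0) ** 2                            # a toy cost
grad = T.grad(cost, w)                               # "backpropagation": symbolic gradient
train = theano.function([x], cost, updates=[(w, w - 0.1 * grad)])  # one SGD step per call
print train(2.0)                                     # cost before the update; w moves toward 0.5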



2. A detailed walkthrough of the CNN code (Python + Theano)


The code comes from the deep learning tutorial Convolutional Neural Networks (LeNet). It implements a simplified LeNet5, specifically:

  • the location-specific gain and bias parameters are not implemented
  • max-pooling is used instead of average pooling (see the small example after this list)
  • the classifier is softmax, whereas LeNet5 uses an RBF layer
  • the second convolution layer of LeNet5 is not fully connected, whereas this program implements a fully connected one
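The difference between the two pooling variants is easiest to see on a single 2x2 block (a toy example, unrelated to the tutorial code):

import numpy

block = numpy.array([[1., 3.],
                     [2., 4.]])
print block.max()     # max-pooling output for this block: 4.0
print block.mean()    # average-pooling output for this block: 2.5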

In addition, the code merges the convolution layer and the subsampling layer into a single class, LeNetConvPoolLayer (a convolution + pooling layer), which makes sense because they always appear in pairs. One thing to note: the code feeds the convolution output directly into the subsampling layer, without first adding the bias b and passing the result through a sigmoid. In other words, the bx and the sigmoid mapping that follow fx in the usual diagram are dropped, and Cx is obtained directly from fx.


Finally, the first convolution layer in the code uses 20 kernels and the second uses 50, rather than the 6 and 16 shown in the LeNet5 figure above.


With that background, let's look at the code:


(1) Import the necessary modules

import cPickle
import gzip
import os
import sys
import time

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import downsample
from theano.tensor.nnet import conv
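A practical note, not part of the original code: the last two imports are from an old Theano. In later Theano releases these modules were deprecated or moved, so if they fail to import, the usual replacements are roughly the following (check against your installed version):

# In newer Theano versions the equivalents are approximately:
from theano.tensor.signal.pool import pool_2d   # replaces downsample.max_pool_2d
from theano.tensor.nnet import conv2d            # replaces conv.conv2d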

(2) Define the basic "building blocks" of the CNN

The basic building blocks of a CNN are the convolution + pooling layer, the hidden layer, and the classifier, defined as follows:

    • Define LeNetConvPoolLayer (convolution + pooling layer)

See the comments in the code:
  • """?
  • 卷積+下采樣合成一個(gè)層LeNetConvPoolLayer?
  • rng:隨機(jī)數(shù)生成器,用于初始化W?
  • input:4維的向量,theano.tensor.dtensor4?
  • filter_shape:(number?of?filters,?num?input?feature?maps,filter?height,?filter?width)?
  • image_shape:(batch?size,?num?input?feature?maps,image?height,?image?width)?
  • poolsize:?(#rows,?#cols)?
  • """??
  • class?LeNetConvPoolLayer(object):??
  • ????def?__init__(self,?rng,?input,?filter_shape,?image_shape,?poolsize=(2,?2)):??
  • ????
  • #assert?condition,condition為T(mén)rue,則繼續(xù)往下執(zhí)行,condition為False,中斷程序??
  • #image_shape[1]和filter_shape[1]都是num?input?feature?maps,它們必須是一樣的。??
  • ????????assert?image_shape[1]?==?filter_shape[1]??
  • ????????self.input?=?input??
  • ??
  • #每個(gè)隱層神經(jīng)元(即像素)與上一層的連接數(shù)為num?input?feature?maps?*?filter?height?*?filter?width。??
  • #可以用numpy.prod(filter_shape[1:])來(lái)求得??
  • ????????fan_in?=?numpy.prod(filter_shape[1:])??
  • ??
  • #lower?layer上每個(gè)神經(jīng)元獲得的梯度來(lái)自于:"num?output?feature?maps?*?filter?height?*?filter?width"?/pooling?size??
  • ????????fan_out?=?(filter_shape[0]?*?numpy.prod(filter_shape[2:])?/??
  • ???????????????????numpy.prod(poolsize))??
  • ?????????????????????
  • #以上求得fan_in、fan_out?,將它們代入公式,以此來(lái)隨機(jī)初始化W,W就是線性卷積核??
  • ????????W_bound?=?numpy.sqrt(6.?/?(fan_in?+?fan_out))??
  • ????????self.W?=?theano.shared(??
  • ????????????numpy.asarray(??
  • ????????????????rng.uniform(low=-W_bound,?high=W_bound,?size=filter_shape),??
  • ????????????????dtype=theano.config.floatX??
  • ????????????),??
  • ????????????borrow=True??
  • ????????)??
  • ??
  • #?the?bias?is?a?1D?tensor?--?one?bias?per?output?feature?map??
  • #偏置b是一維向量,每個(gè)輸出圖的特征圖都對(duì)應(yīng)一個(gè)偏置,??
  • #而輸出的特征圖的個(gè)數(shù)由filter個(gè)數(shù)決定,因此用filter_shape[0]即number?of?filters來(lái)初始化??
  • ????????b_values?=?numpy.zeros((filter_shape[0],),?dtype=theano.config.floatX)??
  • ????????self.b?=?theano.shared(value=b_values,?borrow=True)??
  • ??
  • #將輸入圖像與filter卷積,conv.conv2d函數(shù)??
  • #卷積完沒(méi)有加b再通過(guò)sigmoid,這里是一處簡(jiǎn)化。??
  • ????????conv_out?=?conv.conv2d(??
  • ????????????input=input,??
  • ????????????filters=self.W,??
  • ????????????filter_shape=filter_shape,??
  • ????????????image_shape=image_shape??
  • ????????)??
  • ??
  • #maxpooling,最大子采樣過(guò)程??
  • ????????pooled_out?=?downsample.max_pool_2d(??
  • ????????????input=conv_out,??
  • ????????????ds=poolsize,??
  • ????????????ignore_border=True??
  • ????????)??
  • ??
  • #加偏置,再通過(guò)tanh映射,得到卷積+子采樣層的最終輸出??
  • #因?yàn)閎是一維向量,這里用維度轉(zhuǎn)換函數(shù)dimshuffle將其reshape。比如b是(10,),??
  • #則b.dimshuffle('x',?0,?'x',?'x'))將其reshape為(1,10,1,1)??
  • ????????self.output?=?T.tanh(pooled_out?+?self.b.dimshuffle('x',?0,?'x',?'x'))??
  • #卷積+采樣層的參數(shù)??
  • ????????self.params?=?[self.W,?self.b]??
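To make the initialization concrete, plugging in the first convolution layer used later in this article (filter_shape=(20, 1, 5, 5), poolsize=(2, 2)) gives:

# Worked example for filter_shape=(20, 1, 5, 5) and poolsize=(2, 2):
fan_in = 1 * 5 * 5                            # 25
fan_out = 20 * 5 * 5 / (2 * 2)                # 125
W_bound = (6.0 / (fan_in + fan_out)) ** 0.5   # sqrt(6/150) ~= 0.2
print fan_in, fan_out, W_bound                # 25 125 0.2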


    • Define the hidden layer HiddenLayer

This is the same HiddenLayer as in the previous article 《DeepLearning tutorial(3)MLP多層感知機原理簡介+代碼詳解》, reused directly:
  • """?
  • 注釋:?
  • 這是定義隱藏層的類(lèi),首先明確:隱藏層的輸入即input,輸出即隱藏層的神經(jīng)元個(gè)數(shù)。輸入層與隱藏層是全連接的。?
  • 假設(shè)輸入是n_in維的向量(也可以說(shuō)時(shí)n_in個(gè)神經(jīng)元),隱藏層有n_out個(gè)神經(jīng)元,則因?yàn)槭侨B接,?
  • 一共有n_in*n_out個(gè)權(quán)重,故W大小時(shí)(n_in,n_out),n_in行n_out列,每一列對(duì)應(yīng)隱藏層的每一個(gè)神經(jīng)元的連接權(quán)重。?
  • b是偏置,隱藏層有n_out個(gè)神經(jīng)元,故b時(shí)n_out維向量。?
  • rng即隨機(jī)數(shù)生成器,numpy.random.RandomState,用于初始化W。?
  • input訓(xùn)練模型所用到的所有輸入,并不是MLP的輸入層,MLP的輸入層的神經(jīng)元個(gè)數(shù)時(shí)n_in,而這里的參數(shù)input大小是(n_example,n_in),每一行一個(gè)樣本,即每一行作為MLP的輸入層。?
  • activation:激活函數(shù),這里定義為函數(shù)tanh?
  • """??
  • class?HiddenLayer(object):??
  • ????def?__init__(self,?rng,?input,?n_in,?n_out,?W=None,?b=None,??
  • ?????????????????activation=T.tanh):??
  • ?????????self.input?=?input???#類(lèi)HiddenLayer的input即所傳遞進(jìn)來(lái)的input??
  • ??
  • ?????????"""?
  • ?????????注釋:?
  • ?????????代碼要兼容GPU,則必須使用?dtype=theano.config.floatX,并且定義為theano.shared?
  • ?????????另外,W的初始化有個(gè)規(guī)則:如果使用tanh函數(shù),則在-sqrt(6./(n_in+n_hidden))到sqrt(6./(n_in+n_hidden))之間均勻?
  • ?????????抽取數(shù)值來(lái)初始化W,若時(shí)sigmoid函數(shù),則以上再乘4倍。?
  • ?????????"""??
  • ?????????#如果W未初始化,則根據(jù)上述方法初始化。??
  • ?????????#加入這個(gè)判斷的原因是:有時(shí)候我們可以用訓(xùn)練好的參數(shù)來(lái)初始化W,見(jiàn)我的上一篇文章。??
  • ?????????if?W?is?None:??
  • ????????????W_values?=?numpy.asarray(??
  • ????????????????rng.uniform(??
  • ????????????????????low=-numpy.sqrt(6.?/?(n_in?+?n_out)),??
  • ????????????????????high=numpy.sqrt(6.?/?(n_in?+?n_out)),??
  • ????????????????????size=(n_in,?n_out)??
  • ????????????????),??
  • ????????????????dtype=theano.config.floatX??
  • ????????????)??
  • ????????????if?activation?==?theano.tensor.nnet.sigmoid:??
  • ????????????????W_values?*=?4??
  • ????????????W?=?theano.shared(value=W_values,?name='W',?borrow=True)??
  • ??
  • ?????????if?b?is?None:??
  • ????????????b_values?=?numpy.zeros((n_out,),?dtype=theano.config.floatX)??
  • ????????????b?=?theano.shared(value=b_values,?name='b',?borrow=True)??
  • ??
  • ?????????#用上面定義的W、b來(lái)初始化類(lèi)HiddenLayer的W、b??
  • ?????????self.W?=?W??
  • ?????????self.b?=?b??
  • ??
  • ????????#隱含層的輸出??
  • ?????????lin_output?=?T.dot(input,?self.W)?+?self.b??
  • ?????????self.output?=?(??
  • ????????????lin_output?if?activation?is?None??
  • ????????????else?activation(lin_output)??
  • ?????????)??
  • ??
  • ????????#隱含層的參數(shù)??
  • ?????????self.params?=?[self.W,?self.b]??


    • Define the classifier (Softmax regression)

Softmax is used. This is the same LogisticRegression as in 《DeepLearning tutorial(1)Softmax回歸原理簡介+代碼詳解》, reused directly:
  • """?
  • 定義分類(lèi)層LogisticRegression,也即Softmax回歸?
  • 在deeplearning?tutorial中,直接將LogisticRegression視為Softmax,?
  • 而我們所認(rèn)識(shí)的二類(lèi)別的邏輯回歸就是當(dāng)n_out=2時(shí)的LogisticRegression?
  • """??
  • #參數(shù)說(shuō)明:??
  • #input,大小就是(n_example,n_in),其中n_example是一個(gè)batch的大小,??
  • #因?yàn)槲覀冇?xùn)練時(shí)用的是Minibatch?SGD,因此input這樣定義??
  • #n_in,即上一層(隱含層)的輸出??
  • #n_out,輸出的類(lèi)別數(shù)???
  • class?LogisticRegression(object):??
  • ????def?__init__(self,?input,?n_in,?n_out):??
  • ??
  • #W大小是n_in行n_out列,b為n_out維向量。即:每個(gè)輸出對(duì)應(yīng)W的一列以及b的一個(gè)元素。????
  • ????????self.W?=?theano.shared(??
  • ????????????value=numpy.zeros(??
  • ????????????????(n_in,?n_out),??
  • ????????????????dtype=theano.config.floatX??
  • ????????????),??
  • ????????????name='W',??
  • ????????????borrow=True??
  • ????????)??
  • ??
  • ????????self.b?=?theano.shared(??
  • ????????????value=numpy.zeros(??
  • ????????????????(n_out,),??
  • ????????????????dtype=theano.config.floatX??
  • ????????????),??
  • ????????????name='b',??
  • ????????????borrow=True??
  • ????????)??
  • ??
  • #input是(n_example,n_in),W是(n_in,n_out),點(diǎn)乘得到(n_example,n_out),加上偏置b,??
  • #再作為T(mén).nnet.softmax的輸入,得到p_y_given_x??
  • #故p_y_given_x每一行代表每一個(gè)樣本被估計(jì)為各類(lèi)別的概率??????
  • #PS:b是n_out維向量,與(n_example,n_out)矩陣相加,內(nèi)部其實(shí)是先復(fù)制n_example個(gè)b,??
  • #然后(n_example,n_out)矩陣的每一行都加b??
  • ????????self.p_y_given_x?=?T.nnet.softmax(T.dot(input,?self.W)?+?self.b)??
  • ??
  • #argmax返回最大值下標(biāo),因?yàn)楸纠龜?shù)據(jù)集是MNIST,下標(biāo)剛好就是類(lèi)別。axis=1表示按行操作。??
  • ????????self.y_pred?=?T.argmax(self.p_y_given_x,?axis=1)??
  • ??
  • #params,LogisticRegression的參數(shù)???????
  • ????????self.params?=?[self.W,?self.b]??
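One caveat: evaluate_lenet5 below calls layer3.negative_log_likelihood(y) and layer3.errors(y), but those two methods are not shown in the snippet above. They belong to the same tutorial's LogisticRegression class; if you are assembling the program from this article alone, add them inside the class. The standard tutorial definitions look like this (reproduced here for convenience; double-check against the original file):

    # These two methods go inside class LogisticRegression:

    def negative_log_likelihood(self, y):
        # mean negative log-probability of the correct class over the minibatch
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        # fraction of minibatch examples that are misclassified (zero-one error)
        return T.mean(T.neq(self.y_pred, y))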



At this point all the basic "building blocks" of the CNN are in place. They are now assembled into LeNet5 (the simplified version described above): LeNet5 = input + LeNetConvPoolLayer_1 + LeNetConvPoolLayer_2 + HiddenLayer + LogisticRegression + output.

The network is then applied to the MNIST dataset and trained with the BP algorithm to obtain the optimal parameters.
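For reference, tracing one 500-image MNIST minibatch through this assembly gives the following tensor shapes; these follow directly from the layer sizes used in the code below (batch_size=500, nkerns=[20, 50]):

# input                : (500, 1, 28, 28)
# LeNetConvPoolLayer_1 : conv 5x5 -> (500, 20, 24, 24), pool 2x2 -> (500, 20, 12, 12)
# LeNetConvPoolLayer_2 : conv 5x5 -> (500, 50,  8,  8), pool 2x2 -> (500, 50,  4,  4)
# flatten              : (500, 50*4*4) = (500, 800)
# HiddenLayer          : (500, 500)
# LogisticRegression   : (500, 10)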



(3) Load the MNIST dataset (mnist.pkl.gz)

  • """?
  • 加載MNIST數(shù)據(jù)集load_data()?
  • """??
  • def?load_data(dataset):??
  • ????#?dataset是數(shù)據(jù)集的路徑,程序首先檢測(cè)該路徑下有沒(méi)有MNIST數(shù)據(jù)集,沒(méi)有的話就下載MNIST數(shù)據(jù)集??
  • ????#這一部分就不解釋了,與softmax回歸算法無(wú)關(guān)。??
  • ????data_dir,?data_file?=?os.path.split(dataset)??
  • ????if?data_dir?==?""?and?not?os.path.isfile(dataset):??
  • ????????#?Check?if?dataset?is?in?the?data?directory.??
  • ????????new_path?=?os.path.join(??
  • ????????????os.path.split(__file__)[0],??
  • ????????????"..",??
  • ????????????"data",??
  • ????????????dataset??
  • ????????)??
  • ????????if?os.path.isfile(new_path)?or?data_file?==?'mnist.pkl.gz':??
  • ????????????dataset?=?new_path??
  • ??
  • ????if?(not?os.path.isfile(dataset))?and?data_file?==?'mnist.pkl.gz':??
  • ????????import?urllib??
  • ????????origin?=?(??
  • ????????????'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'??
  • ????????)??
  • ????????print?'Downloading?data?from?%s'?%?origin??
  • ????????urllib.urlretrieve(origin,?dataset)??
  • ??
  • ????print?'...?loading?data'??
  • #以上是檢測(cè)并下載數(shù)據(jù)集mnist.pkl.gz,不是本文重點(diǎn)。下面才是load_data的開(kāi)始??
  • ??????
  • #從"mnist.pkl.gz"里加載train_set,?valid_set,?test_set,它們都是包括label的??
  • #主要用到python里的gzip.open()函數(shù),以及?cPickle.load()。??
  • #‘rb’表示以二進(jìn)制可讀的方式打開(kāi)文件??
  • ????f?=?gzip.open(dataset,?'rb')??
  • ????train_set,?valid_set,?test_set?=?cPickle.load(f)??
  • ????f.close()??
  • ?????
  • ??
  • #將數(shù)據(jù)設(shè)置成shared?variables,主要時(shí)為了GPU加速,只有shared?variables才能存到GPU?memory中??
  • #GPU里數(shù)據(jù)類(lèi)型只能是float。而data_y是類(lèi)別,所以最后又轉(zhuǎn)換為int返回??
  • ????def?shared_dataset(data_xy,?borrow=True):??
  • ????????data_x,?data_y?=?data_xy??
  • ????????shared_x?=?theano.shared(numpy.asarray(data_x,??
  • ???????????????????????????????????????????????dtype=theano.config.floatX),??
  • ?????????????????????????????????borrow=borrow)??
  • ????????shared_y?=?theano.shared(numpy.asarray(data_y,??
  • ???????????????????????????????????????????????dtype=theano.config.floatX),??
  • ?????????????????????????????????borrow=borrow)??
  • ????????return?shared_x,?T.cast(shared_y,?'int32')??
  • ??
  • ??
  • ????test_set_x,?test_set_y?=?shared_dataset(test_set)??
  • ????valid_set_x,?valid_set_y?=?shared_dataset(valid_set)??
  • ????train_set_x,?train_set_y?=?shared_dataset(train_set)??
  • ??
  • ????rval?=?[(train_set_x,?train_set_y),?(valid_set_x,?valid_set_y),??
  • ????????????(test_set_x,?test_set_y)]??
  • ????return?rval??
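A quick sanity check of the loader (assuming mnist.pkl.gz can be found or downloaded): the standard split in this pickle is 50,000 / 10,000 / 10,000 examples of 784 pixels each.

# Sanity check of load_data (run in a Python 2 / Theano environment):
datasets = load_data('mnist.pkl.gz')
train_set_x, train_set_y = datasets[0]
print train_set_x.get_value(borrow=True).shape   # expected: (50000, 784)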


(4) Implement LeNet5 and test it

  • """?
  • 實(shí)現(xiàn)LeNet5?
  • LeNet5有兩個(gè)卷積層,第一個(gè)卷積層有20個(gè)卷積核,第二個(gè)卷積層有50個(gè)卷積核?
  • """??
  • def?evaluate_lenet5(learning_rate=0.1,?n_epochs=200,??
  • ????????????????????dataset='mnist.pkl.gz',??
  • ????????????????????nkerns=[20,?50],?batch_size=500):??
  • ????"""??
  • ?learning_rate:學(xué)習(xí)速率,隨機(jī)梯度前的系數(shù)。?
  • ?n_epochs訓(xùn)練步數(shù),每一步都會(huì)遍歷所有batch,即所有樣本?
  • ?batch_size,這里設(shè)置為500,即每遍歷完500個(gè)樣本,才計(jì)算梯度并更新參數(shù)?
  • ?nkerns=[20,?50],每一個(gè)LeNetConvPoolLayer卷積核的個(gè)數(shù),第一個(gè)LeNetConvPoolLayer有?
  • ?20個(gè)卷積核,第二個(gè)有50個(gè)?
  • ????"""??
  • ??
  • ????rng?=?numpy.random.RandomState(23455)??
  • ??
  • ????#加載數(shù)據(jù)??
  • ????datasets?=?load_data(dataset)??
  • ????train_set_x,?train_set_y?=?datasets[0]??
  • ????valid_set_x,?valid_set_y?=?datasets[1]??
  • ????test_set_x,?test_set_y?=?datasets[2]??
  • ??
  • ????#?計(jì)算batch的個(gè)數(shù)??
  • ????n_train_batches?=?train_set_x.get_value(borrow=True).shape[0]??
  • ????n_valid_batches?=?valid_set_x.get_value(borrow=True).shape[0]??
  • ????n_test_batches?=?test_set_x.get_value(borrow=True).shape[0]??
  • ????n_train_batches?/=?batch_size??
  • ????n_valid_batches?/=?batch_size??
  • ????n_test_batches?/=?batch_size??
  • ??
  • ????#定義幾個(gè)變量,index表示batch下標(biāo),x表示輸入的訓(xùn)練數(shù)據(jù),y對(duì)應(yīng)其標(biāo)簽??
  • ????index?=?T.lscalar()????
  • ????x?=?T.matrix('x')?????
  • ????y?=?T.ivector('y')???
  • ??
  • ????######################??
  • ????#?BUILD?ACTUAL?MODEL?#??
  • ????######################??
  • ????print?'...?building?the?model'??
  • ??
  • ??
  • #我們加載進(jìn)來(lái)的batch大小的數(shù)據(jù)是(batch_size,?28?*?28),但是LeNetConvPoolLayer的輸入是四維的,所以要reshape??
  • ????layer0_input?=?x.reshape((batch_size,?1,?28,?28))??
  • ??
  • #?layer0即第一個(gè)LeNetConvPoolLayer層??
  • #輸入的單張圖片(28,28),經(jīng)過(guò)conv得到(28-5+1?,?28-5+1)?=?(24,?24),??
  • #經(jīng)過(guò)maxpooling得到(24/2,?24/2)?=?(12,?12)??
  • #因?yàn)槊總€(gè)batch有batch_size張圖,第一個(gè)LeNetConvPoolLayer層有nkerns[0]個(gè)卷積核,??
  • #故layer0輸出為(batch_size,?nkerns[0],?12,?12)??
  • ????layer0?=?LeNetConvPoolLayer(??
  • ????????rng,??
  • ????????input=layer0_input,??
  • ????????image_shape=(batch_size,?1,?28,?28),??
  • ????????filter_shape=(nkerns[0],?1,?5,?5),??
  • ????????poolsize=(2,?2)??
  • ????)??

    # layer1 is the second LeNetConvPoolLayer.
    # Its input is layer0's output: each feature map is (12, 12); the convolution gives
    # (12-5+1, 12-5+1) = (8, 8), and maxpooling then gives (8/2, 8/2) = (4, 4).
    # Each minibatch holds batch_size images (feature maps) and this layer has nkerns[1]
    # kernels, so layer1's output is (batch_size, nkerns[1], 4, 4).
    layer1 = LeNetConvPoolLayer(
        rng,
        input=layer0.output,
        image_shape=(batch_size, nkerns[0], 12, 12),  # the input is the nkerns[0] feature maps produced by layer0
        filter_shape=(nkerns[1], nkerns[0], 5, 5),
        poolsize=(2, 2)
    )

    # With the two LeNetConvPoolLayers (layer0 and layer1) defined, layer1 is followed by
    # layer2, a fully-connected layer equivalent to the hidden layer of an MLP, so it can be
    # built with the HiddenLayer class defined for the MLP. layer2's input is 2D,
    # (batch_size, num_pixels), so the feature maps that the different kernels produce for the
    # same image must be flattened into a single vector: layer1's output
    # (batch_size, nkerns[1], 4, 4) is flattened to (batch_size, nkerns[1]*4*4) = (500, 800)
    # and used as layer2's input. (500, 800) means 500 examples, one per row.
    # layer2's output is (batch_size, n_out) = (500, 500).
    layer2_input = layer1.output.flatten(2)
    layer2 = HiddenLayer(
        rng,
        input=layer2_input,
        n_in=nkerns[1] * 4 * 4,
        n_out=500,
        activation=T.tanh
    )

    # The last layer, layer3, is the classification layer, built with the LogisticRegression
    # class defined earlier. layer3's input is layer2's output (500, 500), and layer3's output
    # is (batch_size, n_out) = (500, 10).
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost function: the negative log-likelihood (NLL)
    cost = layer3.negative_log_likelihood(y)

    # test_model computes the test error. x and y are instantiated for the given index, then
    # layer3 is evaluated, which in turn evaluates layer2, layer1 and layer0, so test_model
    # really is the whole CNN. Its input is x and y (selected via index); its output is
    # layer3.errors(y), the error rate.
    test_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # validate_model: the validation model; the analysis is the same as for test_model.
    validate_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # Next comes train_model, which involves the optimization algorithm (SGD):
    # gradients must be computed and the parameters updated.
    # the full parameter set
    params = layer3.params + layer2.params + layer1.params + layer0.params

    # the gradient of the cost with respect to each parameter
    grads = T.grad(cost, params)

    # There are too many parameters to spell out each update rule by hand, so a list
    # comprehension generates the pairs (param_i, param_i - learning_rate * grad_i)
    # automatically.
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]

    # train_model: the analysis is the same as for test_model, except that train_model also
    # carries the updates rules, which test_model and validate_model do not have.
    train_model = theano.function(
        [index],
        cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )


    ###################
    # start training  #
    ###################
    print '... training'
    patience = 10000
    patience_increase = 2
    improvement_threshold = 0.995

    validation_frequency = min(n_train_batches, patience / 2)
    # Setting validation_frequency this way guarantees the validation set is evaluated at
    # least once per epoch.

    best_validation_loss = numpy.inf   # the best (lowest) loss seen on the validation set
    best_iter = 0                      # the iteration (counted in minibatches) at which the best
                                       # validation loss was reached; e.g. best_iter=10000 means the
                                       # best loss was reached after training on the 10000th minibatch
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    # The training loop. The while loop runs over epochs; one epoch visits every minibatch,
    # i.e. every image. The for loop runs over minibatches, training on one at a time by
    # calling train_model(minibatch_index), whose updates rules adjust the parameters.
    # The for loop also counts the number of minibatches trained so far (iter). Whenever iter
    # hits a multiple of validation_frequency, the model is evaluated on the validation set.
    # If the validation loss this_validation_loss is lower than the previous best
    # best_validation_loss, then best_validation_loss and best_iter are updated and the model
    # is also evaluated on the test set; and if this_validation_loss is lower than
    # best_validation_loss * improvement_threshold, patience is increased as well.
    # Training stops when the maximum number of epochs n_epochs is reached, or when
    # patience <= iter.
    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):

            iter = (epoch - 1) * n_train_batches + minibatch_index

            if iter % 100 == 0:
                print 'training @ iter = ', iter
            cost_ij = train_model(minibatch_index)
            # cost_ij is not used afterwards; train_model is called for its side effects and
            # just happens to return a value.

            if (iter + 1) % validation_frequency == 0:

                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                if this_validation_loss < best_validation_loss:

                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    test_losses = [
                        test_model(i)
                        for i in xrange(n_test_batches)
                    ]
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i, '
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))
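The original tutorial file ends with a standard entry point so that the script can be run directly; if you are copying the code from this article, append it as well:

if __name__ == '__main__':
    evaluate_lenet5()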


That concludes the article. Both the fully commented code and the original code are available on my GitHub for download.

If you find any mistakes, or anything that is unclear, feel free to leave a comment.

