
Caffe Convolution Layer Code Reading Notes

Published: 2025/3/21 57 豆豆


Reposted from: http://blog.csdn.net/tangwei2014/article/details/47730797

The idea behind the convolution implementation:

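Use im2col to convert the image into a matrix, which turns the convolution into a matrix multiplication.
Call GEMM to carry out the computation.
The two figures below are ones I found on Zhihu and am "borrowing" here; they are really good and help with understanding.
[Two figures from Zhihu]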

Parameter breakdown

Configuration parameters (taken from the configuration file):
kernel_h_ pad_h_ hole_h_ stride_h_
kernel_w_ pad_w_ hole_w_ stride_w_
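is_1x1_: true for a plain 1x1 convolution, in which case im2col is the identity and can be skipped. Note that in upstream Caffe the condition is kernel and stride equal to 1 with padding equal to 0, not all eight parameters equal to 1.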
Parameters related to the input (taken from bottom):
num_
channels_
height_
width_
Parameters related to the convolution kernel (the first two come from the configuration file):
num_output_
group_
this->blobs_[0].reset(new Blob<Dtype>(num_output_, channels_ / group_, kernel_h_, kernel_w_)); // weights
this->blobs_[1].reset(new Blob<Dtype>(1, 1, 1, num_output_)); // bias
this->param_propagate_down_
Parameters related to the output (computed):
const int kernel_h_eff = kernel_h_ + (kernel_h_ - 1) * (hole_h_ - 1);
const int kernel_w_eff = kernel_w_ + (kernel_w_ - 1) * (hole_w_ - 1);
height_out_ = (height_ + 2 * pad_h_ - kernel_h_eff) / stride_h_ + 1;
width_out_ = (width_ + 2 * pad_w_ - kernel_w_eff) / stride_w_ + 1;
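The output-size formulas are easy to check by hand with a small helper. The following is a minimal sketch, not Caffe code: the function names effective_kernel and conv_out_size are hypothetical, and the division is plain integer (floor) division as in the formulas above.

```cpp
#include <cassert>

// Hypothetical helper: extent covered by a kernel whose taps are spaced
// `hole` pixels apart (hole == 1 is an ordinary dense kernel).
inline int effective_kernel(int kernel, int hole) {
  assert(kernel >= 1 && hole >= 1);
  return kernel + (kernel - 1) * (hole - 1);
}

// Hypothetical helper: output size along one spatial dimension, using the
// same integer division as height_out_ / width_out_ above.
inline int conv_out_size(int in, int kernel, int pad, int stride, int hole) {
  assert(stride >= 1 && pad >= 0);
  const int k_eff = effective_kernel(kernel, hole);
  assert(in + 2 * pad >= k_eff);  // the kernel must fit in the padded input
  return (in + 2 * pad - k_eff) / stride + 1;
}
```

For example, a 3-tap kernel with hole 2 covers 5 pixels, so a 5-pixel input with pad 1 and stride 1 gives (5 + 2 - 5) / 1 + 1 = 3 outputs.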
Parameters related to the matrix computation (computed):
M_ = num_output_ / group_;
K_ = channels_ * kernel_h_ * kernel_w_ / group_;
N_ = height_out_ * width_out_;
col_buffer_.Reshape(1, channels_*kernel_h_*kernel_w_, height_out_, width_out_); // used only when is_1x1_ is false
bias_multiplier_.Reshape(1, 1, 1, N_); // filled with ones

Input size: (num_, channels_, height_, width_)
Output size: (num_, num_output_, height_out_, width_out_)
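To make the matrix dimensions concrete, here is a small sketch that computes M_, K_ and N_ from the layer parameters. The struct GemmDims and the functions gemm_dims and col_buffer_count are hypothetical names introduced for illustration; they are not part of Caffe.

```cpp
#include <cassert>

// Hypothetical container for the per-group GEMM dimensions.
struct GemmDims {
  int M;  // rows of the per-group weight matrix: num_output / group
  int K;  // shared dimension: channels * kernel_h * kernel_w / group
  int N;  // columns of the column matrix: height_out * width_out
};

inline GemmDims gemm_dims(int num_output, int channels, int group,
                          int kernel_h, int kernel_w,
                          int height_out, int width_out) {
  assert(group >= 1);
  // Both channel counts must split evenly across the groups.
  assert(num_output % group == 0 && channels % group == 0);
  GemmDims d;
  d.M = num_output / group;
  d.K = channels * kernel_h * kernel_w / group;
  d.N = height_out * width_out;
  return d;
}

// Number of elements in the column buffer for one image:
// (channels * kernel_h * kernel_w) * (height_out * width_out).
inline int col_buffer_count(const GemmDims& d, int group) {
  return d.K * group * d.N;
}
```

With num_output = 256, channels = 96, group = 2, a 5x5 kernel and a 27x27 output, this gives M = 128, K = 1200 and N = 729, so each group multiplies a 128 x 1200 matrix by a 1200 x 729 matrix.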

Key functions

Function 1:
im2col_cpu(bottom_data + bottom[i]->offset(n),
1, channels_, height_, width_,
kernel_h_, kernel_w_, pad_h_, pad_w_,
stride_h_, stride_w_, hole_h_, hole_w_,
col_buff);

Purpose: based on the configuration parameters, this function expands one (1, channels_, height_, width_) input feature map into a matrix of size (1, channels_*kernel_h_*kernel_w_, height_out_, width_out_).

How it is implemented:
Internally there are two sets of indices.
One set indexes the input image: c_im (channels), h_im (height), w_im (width).
The other set indexes the output col_buff: c (channels_col), h (height_col), w (width_col).

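The loop variables run over the dimensions of the output col_buff, and the corresponding position in the input image is computed from the output position (col2im works the same way as im2col, with the two sets of coordinates swapped). Here is the indexing code pulled together; read it alongside the source and it is easy to follow: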

```cpp
// Derived sizes, computed once per call
const int kernel_h_eff = kernel_h + (kernel_h - 1) * (hole_h - 1);
const int kernel_w_eff = kernel_w + (kernel_w - 1) * (hole_w - 1);
int height_col = (height + 2 * pad_h - kernel_h_eff) / stride_h + 1;
int width_col = (width + 2 * pad_w - kernel_w_eff) / stride_w + 1;
int channels_col = channels * kernel_h * kernel_w;

// For each row c of col_buff, c in [0, channels_col)
int w_offset = (c % kernel_w) * hole_w;
int h_offset = ((c / kernel_w) % kernel_h) * hole_h;
int c_im = c / kernel_w / kernel_h;

// For each output position (h, w) in [0, height_col) x [0, width_col)
const int h_im = h * stride_h + h_offset - pad_h;
const int w_im = w * stride_w + w_offset - pad_w;
```
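Putting those index expressions inside the three loops gives a complete im2col for a single image. The following is a sketch under stated assumptions rather than the Caffe source: the name im2col_sketch is hypothetical, it uses std::vector<float> instead of raw Dtype pointers, and positions that fall into the padding are written as 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-image im2col with "hole" support.
// data_im : channels x height x width, row-major
// returns : (channels*kernel_h*kernel_w) x height_col x width_col, row-major
std::vector<float> im2col_sketch(const std::vector<float>& data_im,
                                 int channels, int height, int width,
                                 int kernel_h, int kernel_w,
                                 int pad_h, int pad_w,
                                 int stride_h, int stride_w,
                                 int hole_h, int hole_w) {
  assert(static_cast<int>(data_im.size()) == channels * height * width);
  const int kernel_h_eff = kernel_h + (kernel_h - 1) * (hole_h - 1);
  const int kernel_w_eff = kernel_w + (kernel_w - 1) * (hole_w - 1);
  const int height_col = (height + 2 * pad_h - kernel_h_eff) / stride_h + 1;
  const int width_col = (width + 2 * pad_w - kernel_w_eff) / stride_w + 1;
  const int channels_col = channels * kernel_h * kernel_w;

  std::vector<float> data_col(channels_col * height_col * width_col);
  for (int c = 0; c < channels_col; ++c) {
    // Decode row c into (input channel, kernel row, kernel column).
    const int w_offset = (c % kernel_w) * hole_w;
    const int h_offset = ((c / kernel_w) % kernel_h) * hole_h;
    const int c_im = c / kernel_w / kernel_h;
    for (int h = 0; h < height_col; ++h) {
      for (int w = 0; w < width_col; ++w) {
        // Map the output position back to the input image.
        const int h_im = h * stride_h + h_offset - pad_h;
        const int w_im = w * stride_w + w_offset - pad_w;
        const bool inside =
            h_im >= 0 && h_im < height && w_im >= 0 && w_im < width;
        data_col[(c * height_col + h) * width_col + w] =
            inside ? data_im[(c_im * height + h_im) * width + w_im] : 0.0f;
      }
    }
  }
  return data_col;
}
```

For a single-channel 3x3 image holding 1 to 9, a 2x2 kernel with stride 1, no padding and hole 1 produces a 4 x 4 matrix in which each column is one flattened 2x2 patch and each row is one kernel tap across all patches.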

Function 2:

caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, M_, N_, K_,
(Dtype)1., weight + weight_offset * g, col_buff + col_offset * g,
(Dtype)0., top_data + top[i]->offset(n) + top_offset * g);

Purpose:
View the (num_output_/group_, channels_/group_, kernel_h_, kernel_w_) convolution kernel of one group as a (num_output_/group_, channels_*kernel_h_*kernel_w_/group_) matrix A, i.e. of size M_ x K_.

View the (1, channels_*kernel_h_*kernel_w_, height_out_, width_out_) col_buff as group_ matrices B, each of size (channels_*kernel_h_*kernel_w_/group_, height_out_*width_out_), i.e. K_ x N_.

Multiplying the two for each group and then adding the bias term gives the convolution result.
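The per-group product and the bias addition can be sketched as follows. This is an illustration, not the Caffe implementation: conv_gemm_sketch and matmul are hypothetical names, the offsets are assumed to be weight_offset = M*K, col_offset = K*N and top_offset = M*N (which is what the pointer arithmetic in the call above implies), and the bias is added with a plain loop, whereas Caffe, as far as I recall, performs it as a second GEMM against the all-ones bias_multiplier_.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major C = A * B with A of size M x K and B of size K x N
// (the alpha = 1, beta = 0 case of GEMM).
static void matmul(int M, int N, int K,
                   const float* A, const float* B, float* C) {
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) acc += A[m * K + k] * B[k * N + n];
      C[m * N + n] = acc;
    }
  }
}

// Hypothetical forward pass for one image, given its column buffer.
// weight   : group blocks, each M x K   (M = num_output / group)
// col_buff : group blocks, each K x N
// bias     : num_output values
// returns  : num_output x N
std::vector<float> conv_gemm_sketch(const std::vector<float>& weight,
                                    const std::vector<float>& col_buff,
                                    const std::vector<float>& bias,
                                    int num_output, int group, int K, int N) {
  assert(group >= 1 && num_output % group == 0);
  const int M = num_output / group;
  assert(weight.size() == static_cast<std::size_t>(num_output) * K);
  assert(col_buff.size() == static_cast<std::size_t>(group) * K * N);
  assert(bias.size() == static_cast<std::size_t>(num_output));

  // Assumed offsets: one M x K, K x N and M x N block per group.
  const int weight_offset = M * K;
  const int col_offset = K * N;
  const int top_offset = M * N;

  std::vector<float> top(static_cast<std::size_t>(num_output) * N);
  for (int g = 0; g < group; ++g) {
    matmul(M, N, K, weight.data() + weight_offset * g,
           col_buff.data() + col_offset * g, top.data() + top_offset * g);
  }
  // Add bias[o] to every spatial position of output channel o.
  for (int o = 0; o < num_output; ++o) {
    for (int n = 0; n < N; ++n) top[o * N + n] += bias[o];
  }
  return top;
}
```

Each group only ever sees its own slice of the weights and of the column buffer, which is why a grouped convolution costs group_ small multiplications instead of one large one.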

About caffe_cpu_gemm:
Internally it simply wraps cblas_sgemm (for Dtype = float).
void cblas_sgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA,
const enum CBLAS_TRANSPOSE TransB, const int M, const int N,
const int K, const float alpha, const float *A,
const int lda, const float *B, const int ldb,
const float beta, float *C, const int ldc)

The result is:
C = alpha*op( A )*op( B ) + beta*C

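const enum CBLAS_ORDER Order: the storage layout of the data. CBLAS functions store both one- and two-dimensional data in flat one-dimensional arrays, so the call has to say whether the layout is row-major or column-major. C arrays are row-major and Fortran arrays are column-major. Pass CblasRowMajor for row-major data and CblasColMajor for column-major data.
const int M: the number of rows of op(A) and of C
const int N: the number of columns of op(B) and of C
const int K: the number of columns of op(A) and the number of rows of op(B)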
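To see what M, N, K and the leading dimensions mean in the row-major case, here is a naive reference that computes the same formula. It is only a sketch for checking small examples: sgemm_ref is a hypothetical name, the transpose flags are plain bools rather than CBLAS_TRANSPOSE values, and only the row-major layout is handled.

```cpp
#include <cassert>

// Hypothetical row-major reference for
//   C = alpha * op(A) * op(B) + beta * C
// where op(X) is X or its transpose. op(A) is M x K, op(B) is K x N and
// C is M x N. lda, ldb and ldc are the row strides of the stored arrays.
void sgemm_ref(bool trans_a, bool trans_b, int M, int N, int K,
               float alpha, const float* A, int lda,
               const float* B, int ldb,
               float beta, float* C, int ldc) {
  assert(M >= 0 && N >= 0 && K >= 0);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        // Element (m, k) of op(A) and element (k, n) of op(B).
        const float a = trans_a ? A[k * lda + m] : A[m * lda + k];
        const float b = trans_b ? B[n * ldb + k] : B[k * ldb + n];
        acc += a * b;
      }
      // With beta == 0 the old contents of C are ignored entirely.
      C[m * ldc + n] = (beta == 0.0f) ? alpha * acc
                                      : alpha * acc + beta * C[m * ldc + n];
    }
  }
}
```

In the convolution call above, neither matrix is transposed, so the weight block is read as M_ x K_ with row stride K_, the column block as K_ x N_ with row stride N_, and the output as M_ x N_ with row stride N_.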
