An Introduction to Eltwise_layer
http://www.voidcn.com/blog/thy_2014/article/p-6117416.html
common_layer:
ArgMaxLayer class;
ConcatLayer class;
EltwiseLayer class;
FlattenLayer class;
InnerProductLayer class;
MVNLayer class;
SilenceLayer class;
SoftmaxLayer class, CuDNNSoftmaxLayer class;
SplitLayer class;
SliceLayer class.
Hmm, it seems the only one I really know is the fully-connected one!! Let's go through these one by one and see where each can be used.
1 ArgMaxLayer:
Compute the index of the K max values for each datum across all dimensions (C x H x W).
Intended for use after a classification layer to produce a prediction. If parameter out_max_val is set to true, output is a vector of pairs (max_ind, max_val) for each image.
NOTE: does not implement Backwards operation.
1.1 How it works:
After classification, i.e., after the fully-connected layer, compute the K largest values for each group of data.
It feels a bit like this: when we use CaffeNet for prediction, we usually output the 5 classes with the highest probability, and it feels like this layer is what does that job. (Just my impression, not confirmed!)
So it also needs no backward pass.
1.2 Member variables:
bool out_max_val_;
size_t top_k_;
From the setup code below you can see that when out_max_val_ is set to true, the output contains both the indices and the values; when it is false, only the indices are output.
top_k_ indicates how many of the largest values to find.
1.3 Setup functions:
template <typename Dtype>
void ArgMaxLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  out_max_val_ = this->layer_param_.argmax_param().out_max_val();
  top_k_ = this->layer_param_.argmax_param().top_k();
  CHECK_GE(top_k_, 1) << " top k must not be less than 1.";
  CHECK_LE(top_k_, bottom[0]->count() / bottom[0]->num())
      << "top_k must be less than or equal to the number of classes.";
}

template <typename Dtype>
void ArgMaxLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  if (out_max_val_) {
    // Produces max_ind and max_val
    (*top)[0]->Reshape(bottom[0]->num(), 2, top_k_, 1);
  } else {
    // Produces only max_ind
    (*top)[0]->Reshape(bottom[0]->num(), 1, top_k_, 1);
  }
}
These two functions need little explanation. It just seems that back when I first learned Caffe and wrote model configuration files for training, I never used this layer?!
1.4 Forward function:
template <typename Dtype>
void ArgMaxLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  const Dtype* bottom_data = bottom[0]->cpu_data();
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  int num = bottom[0]->num();
  int dim = bottom[0]->count() / bottom[0]->num();
  for (int i = 0; i < num; ++i) {
    // Pair each value with its index, then partially sort descending.
    std::vector<std::pair<Dtype, int> > bottom_data_vector;
    for (int j = 0; j < dim; ++j) {
      bottom_data_vector.push_back(
          std::make_pair(bottom_data[i * dim + j], j));
    }
    std::partial_sort(
        bottom_data_vector.begin(), bottom_data_vector.begin() + top_k_,
        bottom_data_vector.end(), std::greater<std::pair<Dtype, int> >());
    for (int j = 0; j < top_k_; ++j) {
      top_data[(*top)[0]->offset(i, 0, j)] = bottom_data_vector[j].second;
    }
    if (out_max_val_) {
      for (int j = 0; j < top_k_; ++j) {
        top_data[(*top)[0]->offset(i, 1, j)] = bottom_data_vector[j].first;
      }
    }
  }
}
I think what ArgMaxLayer does can be shown with a figure (not reproduced here): for each datum, the scores go in and the top_k_ (index, value) pairs come out.
The right side of that figure also spelled out the computation, so rereading the forward function above should make it easy to follow.
2 ConcatLayer:
Takes at least two Blobs and concatenates them along either the num or channel dimension, outputting the result.
2.1 How it works:
Forward: matrix concatenation.
Backward: matrix splitting.
Wondering where a layer like this would ever be used? Google's paper actually does use it: the Inception module of GoogLeNet concatenates the outputs of its parallel branches.
2.2 Member variables:
Blob<Dtype> col_bob_;
int count_;
int num_;
int channels_;
int height_;
int width_;
int concat_dim_;
Two of these variables are unfamiliar:
col_bob_: (it does not appear to be used anywhere in this layer's code)
concat_dim_: the dimension along which Blobs are concatenated; for example, when concat_dim_ is 1, Blobs are concatenated along the second dimension (channels).
The remaining variables are all familiar; note, though, that here they are all used to set the size of the top Blob.
2.3 Setup functions:
template <typename Dtype>
void ConcatLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  concat_dim_ = this->layer_param_.concat_param().concat_dim();
  CHECK_GE(concat_dim_, 0) << "concat_dim should be >= 0";
  CHECK_LE(concat_dim_, 1)
      << "For now concat_dim <=1, it can only concat num and channels";
}

template <typename Dtype>
void ConcatLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  // Initialize with the first blob.
  count_ = bottom[0]->count();
  num_ = bottom[0]->num();
  channels_ = bottom[0]->channels();
  height_ = bottom[0]->height();
  width_ = bottom[0]->width();
  for (int i = 1; i < bottom.size(); ++i) {
    count_ += bottom[i]->count();
    if (concat_dim_ == 0) {
      num_ += bottom[i]->num();
    } else if (concat_dim_ == 1) {
      channels_ += bottom[i]->channels();
    } else if (concat_dim_ == 2) {
      height_ += bottom[i]->height();
    } else if (concat_dim_ == 3) {
      width_ += bottom[i]->width();
    }
  }
  (*top)[0]->Reshape(num_, channels_, height_, width_);
  CHECK_EQ(count_, (*top)[0]->count());
}
In Reshape() here, note the for loop. If bottom holds K Blobs and the concatenation dimension is 1, then the channels_ dimension of the top Blob is naturally the sum of the channels of the K bottom Blobs.
2.4 Forward and backward functions:
Forward:
template <typename Dtype>
void ConcatLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  if (concat_dim_ == 0) {
    int offset_num = 0;
    for (int i = 0; i < bottom.size(); ++i) {
      const Dtype* bottom_data = bottom[i]->cpu_data();
      int num_elem = bottom[i]->count();
      caffe_copy(num_elem, bottom_data,
          top_data + (*top)[0]->offset(offset_num));
      offset_num += bottom[i]->num();
    }
  } else if (concat_dim_ == 1) {
    int offset_channel = 0;
    for (int i = 0; i < bottom.size(); ++i) {
      const Dtype* bottom_data = bottom[i]->cpu_data();
      int num_elem =
          bottom[i]->channels() * bottom[i]->height() * bottom[i]->width();
      for (int n = 0; n < num_; ++n) {
        caffe_copy(num_elem, bottom_data + bottom[i]->offset(n),
            top_data + (*top)[0]->offset(n, offset_channel));
      }
      offset_channel += bottom[i]->channels();
    }
  }  // concat_dim_ is guaranteed to be 0 or 1 by LayerSetUp.
}
In this implementation the concatenation dimension is effectively assumed to be either 0 or 1. Given that, Reshape() need not have handled dimensions 2 and 3 at all.
Beyond that, it is simply matrix concatenation.
Backward:
template <typename Dtype>
void ConcatLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {
  const Dtype* top_diff = top[0]->cpu_diff();
  if (concat_dim_ == 0) {
    int offset_num = 0;
    for (int i = 0; i < bottom->size(); ++i) {
      Blob<Dtype>* blob = (*bottom)[i];
      if (propagate_down[i]) {
        Dtype* bottom_diff = blob->mutable_cpu_diff();
        caffe_copy(blob->count(), top_diff + top[0]->offset(offset_num),
            bottom_diff);
      }
      offset_num += blob->num();
    }
  } else if (concat_dim_ == 1) {
    int offset_channel = 0;
    for (int i = 0; i < bottom->size(); ++i) {
      Blob<Dtype>* blob = (*bottom)[i];
      if (propagate_down[i]) {
        Dtype* bottom_diff = blob->mutable_cpu_diff();
        int num_elem = blob->channels() * blob->height() * blob->width();
        for (int n = 0; n < num_; ++n) {
          caffe_copy(num_elem, top_diff + top[0]->offset(n, offset_channel),
              bottom_diff + blob->offset(n));
        }
      }
      offset_channel += blob->channels();
    }
  }  // concat_dim_ is guaranteed to be 0 or 1 by LayerSetUp.
}
Likewise, the backward pass is just the matrix splitting problem: each bottom Blob's diff is copied back out of the corresponding slice of the top diff.
3 EltwiseLayer:
Compute elementwise operations, such as product and sum, along multiple input Blobs.
3.1 How it works:
Performs an elementwise operation across several input matrices; from the source you can see that three operations are provided: product, sum, and max.
After all the forward/backward walkthroughs above, this layer should be easy to understand; the three operations are simply handled one by one.
3.2 Member variables:
EltwiseParameter_EltwiseOp op_;
vector<Dtype> coeffs_;
Blob<int> max_idx_;
bool stable_prod_grad_;
Since the layer implements some operation across multiple Blobs, it naturally has to record which operation, hence the variable op_. But where is the EltwiseParameter_EltwiseOp type defined? It is the C++ enum that protoc generates from the EltwiseParameter.EltwiseOp enum in caffe.proto.
coeffs_: its size should match the number of bottom Blobs; in other words, the sum operation is a weighted sum:
y = sum_i coeffs_[i] * x_i
where y and the x_i are matrices (Blobs) and each coeffs_[i] is a scalar.
max_idx_: for the max operation, the backward pass has to be routed back to the right input, so we record which Blob each maximum came from. As we'll see below, the top Blob has the same dimensions as each bottom Blob, but there is only one top Blob while there are several bottom Blobs.
stable_prod_grad_: selects which of two methods is used in the backward pass of the product operation.
3.3 Setup functions:
template <typename Dtype>
void EltwiseLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  CHECK(this->layer_param().eltwise_param().coeff_size() == 0
      || this->layer_param().eltwise_param().coeff_size() == bottom.size())
      << "Eltwise Layer takes one coefficient per bottom blob.";
  CHECK(!(this->layer_param().eltwise_param().operation()
      == EltwiseParameter_EltwiseOp_PROD
      && this->layer_param().eltwise_param().coeff_size()))
      << "Eltwise layer only takes coefficients for summation.";
  op_ = this->layer_param_.eltwise_param().operation();
  // Blob-wise coefficients for the elementwise operation.
  coeffs_ = vector<Dtype>(bottom.size(), 1);
  if (this->layer_param().eltwise_param().coeff_size()) {
    for (int i = 0; i < bottom.size(); ++i) {
      coeffs_[i] = this->layer_param().eltwise_param().coeff(i);
    }
  }
  stable_prod_grad_ = this->layer_param_.eltwise_param().stable_prod_grad();
}

template <typename Dtype>
void EltwiseLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  const int num = bottom[0]->num();
  const int channels = bottom[0]->channels();
  const int height = bottom[0]->height();
  const int width = bottom[0]->width();
  for (int i = 1; i < bottom.size(); ++i) {
    CHECK_EQ(num, bottom[i]->num());
    CHECK_EQ(channels, bottom[i]->channels());
    CHECK_EQ(height, bottom[i]->height());
    CHECK_EQ(width, bottom[i]->width());
  }
  (*top)[0]->Reshape(num, channels, height, width);
  // If max operation, we will initialize the vector index part.
  if (this->layer_param_.eltwise_param().operation() ==
      EltwiseParameter_EltwiseOp_MAX && top->size() == 1) {
    max_idx_.Reshape(bottom[0]->num(), channels, height, width);
  }
}
Reshape() here shows that all input Blobs of this layer must have exactly the same dimensions.
3.4 Forward and backward functions:
Forward:
template <typename Dtype>
void EltwiseLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  int* mask = NULL;
  const Dtype* bottom_data_a = NULL;
  const Dtype* bottom_data_b = NULL;
  const int count = (*top)[0]->count();
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  switch (op_) {
  case EltwiseParameter_EltwiseOp_PROD:
    caffe_mul(count, bottom[0]->cpu_data(), bottom[1]->cpu_data(), top_data);
    for (int i = 2; i < bottom.size(); ++i) {
      caffe_mul(count, top_data, bottom[i]->cpu_data(), top_data);
    }
    break;
  case EltwiseParameter_EltwiseOp_SUM:
    caffe_set(count, Dtype(0), top_data);
    // TODO(shelhamer) does BLAS optimize to sum for coeff = 1?
    for (int i = 0; i < bottom.size(); ++i) {
      caffe_axpy(count, coeffs_[i], bottom[i]->cpu_data(), top_data);
    }
    break;
  case EltwiseParameter_EltwiseOp_MAX:
    // Initialize
    mask = max_idx_.mutable_cpu_data();
    caffe_set(count, -1, mask);
    caffe_set(count, Dtype(-FLT_MAX), top_data);
    // bottom 0 & 1
    bottom_data_a = bottom[0]->cpu_data();
    bottom_data_b = bottom[1]->cpu_data();
    for (int idx = 0; idx < count; ++idx) {
      if (bottom_data_a[idx] > bottom_data_b[idx]) {
        top_data[idx] = bottom_data_a[idx];  // maxval
        mask[idx] = 0;  // maxid
      } else {
        top_data[idx] = bottom_data_b[idx];  // maxval
        mask[idx] = 1;  // maxid
      }
    }
    // bottom 2++
    for (int blob_idx = 2; blob_idx < bottom.size(); ++blob_idx) {
      bottom_data_b = bottom[blob_idx]->cpu_data();
      for (int idx = 0; idx < count; ++idx) {
        if (bottom_data_b[idx] > top_data[idx]) {
          top_data[idx] = bottom_data_b[idx];  // maxval
          mask[idx] = blob_idx;  // maxid
        }
      }
    }
    break;
  default:
    LOG(FATAL) << "Unknown elementwise operation.";
  }
}
Read directly, this code is easy to follow.
Backward:
template <typename Dtype>
void EltwiseLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {
  const int* mask = NULL;
  const int count = top[0]->count();
  const Dtype* top_data = top[0]->cpu_data();
  const Dtype* top_diff = top[0]->cpu_diff();
  for (int i = 0; i < bottom->size(); ++i) {
    if (propagate_down[i]) {
      const Dtype* bottom_data = (*bottom)[i]->cpu_data();
      Dtype* bottom_diff = (*bottom)[i]->mutable_cpu_diff();
      switch (op_) {
      case EltwiseParameter_EltwiseOp_PROD:
        if (stable_prod_grad_) {
          bool initialized = false;
          for (int j = 0; j < bottom->size(); ++j) {
            if (i == j) { continue; }
            if (!initialized) {
              caffe_copy(count, (*bottom)[j]->cpu_data(), bottom_diff);
              initialized = true;
            } else {
              caffe_mul(count, (*bottom)[j]->cpu_data(), bottom_diff,
                  bottom_diff);
            }
          }
        } else {
          caffe_div(count, top_data, bottom_data, bottom_diff);
        }
        caffe_mul(count, bottom_diff, top_diff, bottom_diff);
        break;
      case EltwiseParameter_EltwiseOp_SUM:
        if (coeffs_[i] == Dtype(1)) {
          caffe_copy(count, top_diff, bottom_diff);
        } else {
          caffe_cpu_scale(count, coeffs_[i], top_diff, bottom_diff);
        }
        break;
      case EltwiseParameter_EltwiseOp_MAX:
        mask = max_idx_.cpu_data();
        for (int index = 0; index < count; ++index) {
          Dtype gradient = 0;
          if (mask[index] == i) {
            gradient += top_diff[index];
          }
          bottom_diff[index] = gradient;
        }
        break;
      default:
        LOG(FATAL) << "Unknown elementwise operation.";
      }
    }
  }
}
The backward passes for sum and max are both easy to understand.
The product backward looks a little odd at first. The basic principle: since y = prod_i x_i, the gradient with respect to one input is
dy/dx_j = prod_{i != j} x_i = y / x_j
So directly computing top_data / bottom_data and then multiplying by top_diff is easy to understand.
Yet the source provides two methods:
Method 1: compute mul(x_i) for i = 0...k-1 with i != j.
Method 2: compute top_data / bottom_data.
As long as the data contains few zeros the two give essentially the same result, so why both? The catch is exactly the zeros: if some x_j is 0 then y is 0 as well, and Method 2 computes 0/0 (NaN), while Method 1 never divides. That is why Method 1 is the one selected by stable_prod_grad_.
4 FlattenLayer: