An Introduction to Eltwise_layer
http://www.voidcn.com/blog/thy_2014/article/p-6117416.html
common_layer:
ArgMaxLayer class;
ConcatLayer class;
EltwiseLayer class;
FlattenLayer class;
InnerProductLayer class;
MVNLayer class;
SilenceLayer class;
SoftmaxLayer class, CuDNNSoftmaxLayer class;
SplitLayer class;
SliceLayer class.
Hmm, it seems the only one I really know is the fully-connected one!! Let's go through these one by one and see where each can be used.
1 ArgMaxLayer:
Compute the index of the K max values for each datum across all dimensions (C x H x W).
Intended for use after a classification layer to produce a prediction. If parameter out_max_val is set to true, output is a vector of pairs (max_ind, max_val) for each image.
NOTE: does not implement Backwards operation.
1.1 How it works:
After classification, i.e., after the fully-connected layer, compute the K largest values for each group of data.
It feels a bit like this: when we use CaffeNet for prediction, we usually output the 5 classes with the highest probability, and it feels like this layer is what does that job. (Just my impression, not confirmed!)
So it also needs no backward pass.
1.2 Member variables:
bool out_max_val_;
size_t top_k_;
From the setup code below you can see that when out_max_val_ is set to true, the output contains both the indices and the values; when it is false, only the indices are output.
top_k_ indicates how many of the largest values to find.
1.3 Setup functions:
template <typename Dtype>
void ArgMaxLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  out_max_val_ = this->layer_param_.argmax_param().out_max_val();
  top_k_ = this->layer_param_.argmax_param().top_k();
  CHECK_GE(top_k_, 1) << " top k must not be less than 1.";
  CHECK_LE(top_k_, bottom[0]->count() / bottom[0]->num())
      << "top_k must be less than or equal to the number of classes.";
}

template <typename Dtype>
void ArgMaxLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  if (out_max_val_) {
    // Produces max_ind and max_val
    (*top)[0]->Reshape(bottom[0]->num(), 2, top_k_, 1);
  } else {
    // Produces only max_ind
    (*top)[0]->Reshape(bottom[0]->num(), 1, top_k_, 1);
  }
}
These two functions need little explanation. It just seems that back when I first learned Caffe and wrote model configuration files for training, I never used this layer?!
1.4 Forward function:
template <typename Dtype>
void ArgMaxLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  const Dtype* bottom_data = bottom[0]->cpu_data();
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  int num = bottom[0]->num();
  int dim = bottom[0]->count() / bottom[0]->num();
  for (int i = 0; i < num; ++i) {
    // Pair each value with its index, then partially sort descending.
    std::vector<std::pair<Dtype, int> > bottom_data_vector;
    for (int j = 0; j < dim; ++j) {
      bottom_data_vector.push_back(
          std::make_pair(bottom_data[i * dim + j], j));
    }
    std::partial_sort(
        bottom_data_vector.begin(), bottom_data_vector.begin() + top_k_,
        bottom_data_vector.end(), std::greater<std::pair<Dtype, int> >());
    for (int j = 0; j < top_k_; ++j) {
      top_data[(*top)[0]->offset(i, 0, j)] = bottom_data_vector[j].second;
    }
    if (out_max_val_) {
      for (int j = 0; j < top_k_; ++j) {
        top_data[(*top)[0]->offset(i, 1, j)] = bottom_data_vector[j].first;
      }
    }
  }
}
I think what ArgMaxLayer does can be shown with a figure (not reproduced here): for each datum, the scores go in and the top_k_ (index, value) pairs come out.
The right side of that figure also spelled out the computation, so rereading the forward function above should make it easy to follow.
2 ConcatLayer:
Takes at least two Blobs and concatenates them along either the num or channel dimension, outputting the result.
2.1 How it works:
Forward: matrix concatenation.
Backward: matrix splitting.
Wondering where a layer like this would ever be used? Google's paper actually does use it: the Inception module of GoogLeNet concatenates the outputs of its parallel branches.
2.2 Member variables:
Blob<Dtype> col_bob_;
int count_;
int num_;
int channels_;
int height_;
int width_;
int concat_dim_;
Two of these variables are unfamiliar:
col_bob_: (it does not appear to be used anywhere in this layer's code)
concat_dim_: the dimension along which Blobs are concatenated; for example, when concat_dim_ is 1, Blobs are concatenated along the second dimension (channels).
The remaining variables are all familiar; note, though, that here they are all used to set the size of the top Blob.
2.3 Setup functions:
template <typename Dtype>
void ConcatLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  concat_dim_ = this->layer_param_.concat_param().concat_dim();
  CHECK_GE(concat_dim_, 0) << "concat_dim should be >= 0";
  CHECK_LE(concat_dim_, 1)
      << "For now concat_dim <=1, it can only concat num and channels";
}

template <typename Dtype>
void ConcatLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  // Initialize with the first blob.
  count_ = bottom[0]->count();
  num_ = bottom[0]->num();
  channels_ = bottom[0]->channels();
  height_ = bottom[0]->height();
  width_ = bottom[0]->width();
  for (int i = 1; i < bottom.size(); ++i) {
    count_ += bottom[i]->count();
    if (concat_dim_ == 0) {
      num_ += bottom[i]->num();
    } else if (concat_dim_ == 1) {
      channels_ += bottom[i]->channels();
    } else if (concat_dim_ == 2) {
      height_ += bottom[i]->height();
    } else if (concat_dim_ == 3) {
      width_ += bottom[i]->width();
    }
  }
  (*top)[0]->Reshape(num_, channels_, height_, width_);
  CHECK_EQ(count_, (*top)[0]->count());
}
In Reshape() here, note the for loop. If bottom holds K Blobs and the concatenation dimension is 1, then the channels_ dimension of the top Blob is naturally the sum of the channels of the K bottom Blobs.
2.4 Forward and backward functions:
Forward:
template <typename Dtype>
void ConcatLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  if (concat_dim_ == 0) {
    int offset_num = 0;
    for (int i = 0; i < bottom.size(); ++i) {
      const Dtype* bottom_data = bottom[i]->cpu_data();
      int num_elem = bottom[i]->count();
      caffe_copy(num_elem, bottom_data,
          top_data + (*top)[0]->offset(offset_num));
      offset_num += bottom[i]->num();
    }
  } else if (concat_dim_ == 1) {
    int offset_channel = 0;
    for (int i = 0; i < bottom.size(); ++i) {
      const Dtype* bottom_data = bottom[i]->cpu_data();
      int num_elem =
          bottom[i]->channels() * bottom[i]->height() * bottom[i]->width();
      for (int n = 0; n < num_; ++n) {
        caffe_copy(num_elem, bottom_data + bottom[i]->offset(n),
            top_data + (*top)[0]->offset(n, offset_channel));
      }
      offset_channel += bottom[i]->channels();
    }
  }  // concat_dim_ is guaranteed to be 0 or 1 by LayerSetUp.
}
In this implementation the concatenation dimension is effectively assumed to be either 0 or 1. Given that, Reshape() need not have handled dimensions 2 and 3 at all.
Beyond that, it is simply matrix concatenation.
Backward:
template <typename Dtype>
void ConcatLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {
  const Dtype* top_diff = top[0]->cpu_diff();
  if (concat_dim_ == 0) {
    int offset_num = 0;
    for (int i = 0; i < bottom->size(); ++i) {
      Blob<Dtype>* blob = (*bottom)[i];
      if (propagate_down[i]) {
        Dtype* bottom_diff = blob->mutable_cpu_diff();
        caffe_copy(blob->count(), top_diff + top[0]->offset(offset_num),
            bottom_diff);
      }
      offset_num += blob->num();
    }
  } else if (concat_dim_ == 1) {
    int offset_channel = 0;
    for (int i = 0; i < bottom->size(); ++i) {
      Blob<Dtype>* blob = (*bottom)[i];
      if (propagate_down[i]) {
        Dtype* bottom_diff = blob->mutable_cpu_diff();
        int num_elem = blob->channels() * blob->height() * blob->width();
        for (int n = 0; n < num_; ++n) {
          caffe_copy(num_elem, top_diff + top[0]->offset(n, offset_channel),
              bottom_diff + blob->offset(n));
        }
      }
      offset_channel += blob->channels();
    }
  }  // concat_dim_ is guaranteed to be 0 or 1 by LayerSetUp.
}
Likewise, the backward pass is just the matrix splitting problem: each bottom Blob's diff is copied back out of the corresponding slice of the top diff.
3 EltwiseLayer:
Compute elementwise operations, such as product and sum, along multiple input Blobs.
3.1 How it works:
Performs an elementwise operation across several input matrices; from the source you can see that three operations are provided: product, sum, and max.
After all the forward/backward walkthroughs above, this layer should be easy to understand; the three operations are simply handled one by one.
3.2 Member variables:
EltwiseParameter_EltwiseOp op_;
vector<Dtype> coeffs_;
Blob<int> max_idx_;
bool stable_prod_grad_;
Since the layer implements some operation across multiple Blobs, it naturally has to record which operation, hence the variable op_. But where is the EltwiseParameter_EltwiseOp type defined? It is the C++ enum that protoc generates from the EltwiseParameter.EltwiseOp enum in caffe.proto.
coeffs_: its size should match the number of bottom Blobs; in other words, the sum operation is a weighted sum:
y = sum_i coeffs_[i] * x_i
where y and the x_i are matrices (Blobs) and each coeffs_[i] is a scalar.
max_idx_: for the max operation, the backward pass has to be routed back to the right input, so we record which Blob each maximum came from. As we'll see below, the top Blob has the same dimensions as each bottom Blob, but there is only one top Blob while there are several bottom Blobs.
stable_prod_grad_: selects which of two methods is used in the backward pass of the product operation.
3.3 Setup functions:
template <typename Dtype>
void EltwiseLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  CHECK(this->layer_param().eltwise_param().coeff_size() == 0
      || this->layer_param().eltwise_param().coeff_size() == bottom.size())
      << "Eltwise Layer takes one coefficient per bottom blob.";
  CHECK(!(this->layer_param().eltwise_param().operation()
      == EltwiseParameter_EltwiseOp_PROD
      && this->layer_param().eltwise_param().coeff_size()))
      << "Eltwise layer only takes coefficients for summation.";
  op_ = this->layer_param_.eltwise_param().operation();
  // Blob-wise coefficients for the elementwise operation.
  coeffs_ = vector<Dtype>(bottom.size(), 1);
  if (this->layer_param().eltwise_param().coeff_size()) {
    for (int i = 0; i < bottom.size(); ++i) {
      coeffs_[i] = this->layer_param().eltwise_param().coeff(i);
    }
  }
  stable_prod_grad_ = this->layer_param_.eltwise_param().stable_prod_grad();
}

template <typename Dtype>
void EltwiseLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  const int num = bottom[0]->num();
  const int channels = bottom[0]->channels();
  const int height = bottom[0]->height();
  const int width = bottom[0]->width();
  for (int i = 1; i < bottom.size(); ++i) {
    CHECK_EQ(num, bottom[i]->num());
    CHECK_EQ(channels, bottom[i]->channels());
    CHECK_EQ(height, bottom[i]->height());
    CHECK_EQ(width, bottom[i]->width());
  }
  (*top)[0]->Reshape(num, channels, height, width);
  // If max operation, we will initialize the vector index part.
  if (this->layer_param_.eltwise_param().operation() ==
      EltwiseParameter_EltwiseOp_MAX && top->size() == 1) {
    max_idx_.Reshape(bottom[0]->num(), channels, height, width);
  }
}
Reshape() here shows that all input Blobs of this layer must have exactly the same dimensions.
3.4 Forward and backward functions:
Forward:
template <typename Dtype>
void EltwiseLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    vector<Blob<Dtype>*>* top) {
  int* mask = NULL;
  const Dtype* bottom_data_a = NULL;
  const Dtype* bottom_data_b = NULL;
  const int count = (*top)[0]->count();
  Dtype* top_data = (*top)[0]->mutable_cpu_data();
  switch (op_) {
  case EltwiseParameter_EltwiseOp_PROD:
    caffe_mul(count, bottom[0]->cpu_data(), bottom[1]->cpu_data(), top_data);
    for (int i = 2; i < bottom.size(); ++i) {
      caffe_mul(count, top_data, bottom[i]->cpu_data(), top_data);
    }
    break;
  case EltwiseParameter_EltwiseOp_SUM:
    caffe_set(count, Dtype(0), top_data);
    // TODO(shelhamer) does BLAS optimize to sum for coeff = 1?
    for (int i = 0; i < bottom.size(); ++i) {
      caffe_axpy(count, coeffs_[i], bottom[i]->cpu_data(), top_data);
    }
    break;
  case EltwiseParameter_EltwiseOp_MAX:
    // Initialize
    mask = max_idx_.mutable_cpu_data();
    caffe_set(count, -1, mask);
    caffe_set(count, Dtype(-FLT_MAX), top_data);
    // bottom 0 & 1
    bottom_data_a = bottom[0]->cpu_data();
    bottom_data_b = bottom[1]->cpu_data();
    for (int idx = 0; idx < count; ++idx) {
      if (bottom_data_a[idx] > bottom_data_b[idx]) {
        top_data[idx] = bottom_data_a[idx];  // maxval
        mask[idx] = 0;  // maxid
      } else {
        top_data[idx] = bottom_data_b[idx];  // maxval
        mask[idx] = 1;  // maxid
      }
    }
    // bottom 2++
    for (int blob_idx = 2; blob_idx < bottom.size(); ++blob_idx) {
      bottom_data_b = bottom[blob_idx]->cpu_data();
      for (int idx = 0; idx < count; ++idx) {
        if (bottom_data_b[idx] > top_data[idx]) {
          top_data[idx] = bottom_data_b[idx];  // maxval
          mask[idx] = blob_idx;  // maxid
        }
      }
    }
    break;
  default:
    LOG(FATAL) << "Unknown elementwise operation.";
  }
}
Read directly, this code is easy to follow.
Backward:
template <typename Dtype>
void EltwiseLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, vector<Blob<Dtype>*>* bottom) {
  const int* mask = NULL;
  const int count = top[0]->count();
  const Dtype* top_data = top[0]->cpu_data();
  const Dtype* top_diff = top[0]->cpu_diff();
  for (int i = 0; i < bottom->size(); ++i) {
    if (propagate_down[i]) {
      const Dtype* bottom_data = (*bottom)[i]->cpu_data();
      Dtype* bottom_diff = (*bottom)[i]->mutable_cpu_diff();
      switch (op_) {
      case EltwiseParameter_EltwiseOp_PROD:
        if (stable_prod_grad_) {
          bool initialized = false;
          for (int j = 0; j < bottom->size(); ++j) {
            if (i == j) { continue; }
            if (!initialized) {
              caffe_copy(count, (*bottom)[j]->cpu_data(), bottom_diff);
              initialized = true;
            } else {
              caffe_mul(count, (*bottom)[j]->cpu_data(), bottom_diff,
                  bottom_diff);
            }
          }
        } else {
          caffe_div(count, top_data, bottom_data, bottom_diff);
        }
        caffe_mul(count, bottom_diff, top_diff, bottom_diff);
        break;
      case EltwiseParameter_EltwiseOp_SUM:
        if (coeffs_[i] == Dtype(1)) {
          caffe_copy(count, top_diff, bottom_diff);
        } else {
          caffe_cpu_scale(count, coeffs_[i], top_diff, bottom_diff);
        }
        break;
      case EltwiseParameter_EltwiseOp_MAX:
        mask = max_idx_.cpu_data();
        for (int index = 0; index < count; ++index) {
          Dtype gradient = 0;
          if (mask[index] == i) {
            gradient += top_diff[index];
          }
          bottom_diff[index] = gradient;
        }
        break;
      default:
        LOG(FATAL) << "Unknown elementwise operation.";
      }
    }
  }
}
The backward passes for sum and max are both easy to understand.
The product backward looks a little odd at first. The basic principle: since y = prod_i x_i, the gradient with respect to one input is
dy/dx_j = prod_{i != j} x_i = y / x_j
So directly computing top_data / bottom_data and then multiplying by top_diff is easy to understand.
Yet the source provides two methods:
Method 1: compute mul(x_i) for i = 0...k-1 with i != j.
Method 2: compute top_data / bottom_data.
As long as the data contains few zeros the two give essentially the same result, so why both? The catch is exactly the zeros: if some x_j is 0 then y is 0 as well, and Method 2 computes 0/0 (NaN), while Method 1 never divides. That is why Method 1 is the one selected by stable_prod_grad_.
4 FlattenLayer: