CNN Neural Network Tricks: Study Summary
TRICKS IN DEEP LEARNING
In this doc, only brief explanations are given; recorded during daily study.
Last update: 2018.4.7
############################################################################
1. Variable initialization
----- Initializing a variable
import tensorflow as tf
var = tf.Variable(tf.random_normal([2, 3], stddev=0.2, mean=0.0))
Common random-value generators:
tf.random_normal()
tf.truncated_normal()
tf.random_uniform()
tf.random_gamma()
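A minimal sketch of how these generators are typically used when building layers; the shapes and names below are illustrative, not from the original notes.
# weights: truncated normal keeps values within 2 std-devs, a common default
weights = tf.Variable(tf.truncated_normal([784, 256], stddev=0.1))
# biases: constant initialization is also common
biases = tf.Variable(tf.zeros([256]))
# uniform initialization, e.g. for an embedding table
embedding = tf.Variable(tf.random_uniform([1000, 64], -1.0, 1.0))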
############################################################################
2. Loss functions
--A-- Cross entropy H(p, q) measures the distance between two probability distributions; commonly used for classification problems.
-- y_ denotes the ground-truth labels
cross_entropy = -tf.reduce_mean(
    y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
Because cross entropy is usually combined with softmax regression, TensorFlow wraps the two in a single function:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
which gives the cross entropy after the softmax has been applied.
--B-- MSE (mean squared error), commonly used for regression problems
-- y_ denotes the ground-truth values
mse = tf.reduce_mean(tf.square(y_ - y))
------ Basic ops commonly used in custom loss functions
tf.reduce_sum(); tf.where() (formerly tf.select()); tf.greater()
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))
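As an example of a custom loss built from these ops, here is a hedged sketch of an asymmetric regression loss; the cost constants and the over-/under-prediction split are illustrative, not from the original notes (y is the prediction, y_ the ground truth, as above).
loss_more, loss_less = 10.0, 1.0          # illustrative asymmetric costs
custom_loss = tf.reduce_sum(
    tf.where(tf.greater(y, y_),
             (y - y_) * loss_more,        # cost when the model over-predicts
             (y_ - y) * loss_less))       # cost when the model under-predicts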
############################################################################
3. weights_with_L2_loss
def weights_with_loss(shape, wl=None):
    """
    Create weights with an L2 loss term and add that term to the "loss" collection.
    The total loss can later be computed with
        loss = tf.add_n(tf.get_collection("loss"), name='total_loss')
    weights_with_loss is generally not used for the first or the last layer; it is
    most common on fully-connected layers.
    :param shape: weights_shape
    :param wl: weights_loss_ratio (coefficient of the L2 penalty)
    :return: weights
    """
    w = tf.Variable(tf.truncated_normal(shape=shape, stddev=0.01, dtype=tf.float32))
    if wl is not None:
        weights_loss = tf.multiply(tf.nn.l2_loss(w), wl, name='weights_loss')
        tf.add_to_collection("loss", weights_loss)
    return w
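A minimal usage sketch; the layer sizes, the 0.004 ratio and the placeholder are illustrative assumptions.
flat = tf.placeholder(tf.float32, [None, 4096])              # flattened features (illustrative)
w_fc1 = weights_with_loss([4096, 384], wl=0.004)             # L2-regularized weights
b_fc1 = tf.Variable(tf.constant(0.1, shape=[384]))
fc1 = tf.nn.relu(tf.matmul(flat, w_fc1) + b_fc1)
# add the data loss (e.g. cross entropy) to the same collection, then sum everything:
# tf.add_to_collection("loss", cross_entropy)
total_loss = tf.add_n(tf.get_collection("loss"), name='total_loss')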
############################################################################
4. batch_normalization
    def batch_normalization(self, input, decay=0.9, eps=1e-5):
        """
        Batch Normalization
        Results in:
            * Reduced need for dropout
            * Less dependence on initial values (e.g. weights, biases)
            * Accelerated convergence
            * Ability to use a larger learning rate
        Usage: apply to the output of (i.e. after) conv layers
        Args: output of a convolution or fully-connected layer
        Returns: normalized batch
        """
        shape = input.get_shape().as_list()
        n_out = shape[-1]
        beta = tf.Variable(tf.zeros([n_out]))
        gamma = tf.Variable(tf.ones([n_out]))
        if len(shape) == 2:
            # fully-connected output: moments over the batch dimension only
            batch_mean, batch_var = tf.nn.moments(input, [0])
        else:
            # conv output: moments over batch, height and width
            batch_mean, batch_var = tf.nn.moments(input, [0, 1, 2])
        ema = tf.train.ExponentialMovingAverage(decay=decay)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # batch statistics during training, moving averages at inference
        mean, var = tf.cond(self.train_phase, mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        return tf.nn.batch_normalization(input, mean, var, beta, gamma, eps)
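A usage sketch, assuming the surrounding class exposes a boolean train_phase placeholder (the names below are illustrative).
train_phase = tf.placeholder(tf.bool, name='train_phase')   # typically stored as self.train_phase
# Illustrative wiring: BN sits between the convolution and the non-linearity, e.g.
#   conv = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='SAME')
#   act  = tf.nn.relu(self.batch_normalization(conv))
# Feed train_phase=True for training steps (batch statistics + EMA update) and
# train_phase=False at inference (the stored moving averages are used instead).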
############################################################################
5. LRN
def LRN(x, R, alpha, beta, name=None, bias=1.0):
    """
    LRN (local response normalization), applied to the output of (i.e. after) conv layers
    :param x: input_tensor
    :param R: depth_radius
    :param alpha: alpha in the math formula
    :param beta: beta in the math formula
    :param name: op name
    :param bias: additive constant in the math formula
    :return: normalized tensor
    """
    return tf.nn.local_response_normalization(x, depth_radius=R, alpha=alpha,
                                              beta=beta, bias=bias, name=name)
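A usage sketch with hyperparameters in the range popularized by AlexNet and the TensorFlow CIFAR-10 example; treat the exact values and the pool1 tensor as illustrative assumptions.
# pool1 is assumed to be the output of a conv + max-pool stage
norm1 = LRN(pool1, R=4, alpha=0.001 / 9.0, beta=0.75, bias=1.0, name='norm1')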
############################################################################
6. Gradient descent
---- gradient descent & backpropagation
Gradient descent: mainly used to optimize the value of a single parameter (the gradient is just the first derivative).
Backpropagation: provides an efficient way to apply gradient descent to all parameters.
Notes:
(1) Gradient descent does not guarantee a global optimum.
(2) The loss function is the sum of losses over all training data, so full-batch gradient descent is very slow to compute.
Spectrum of update strategies (a training-loop sketch follows below):
full-batch gradient descent  <-->  mini-batch training (the compromise: compute the loss over one batch per step; this is how optimizers such as Adam are typically run)  <-->  SGD (one example per step)
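A minimal mini-batch training loop, sketched under assumed names (x, y_, loss, X_train, y_train are whatever the surrounding model defines; Adam is used here only as an example optimizer).
batch_size = 128
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        # take the next slice of training data as the current batch
        start = (step * batch_size) % len(X_train)
        xs, ys = X_train[start:start + batch_size], y_train[start:start + batch_size]
        sess.run(train_step, feed_dict={x: xs, y_: ys})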
############################################################################
7. Learning rate
----- learning_rate determines how far the parameters move at each update
----- decayed_learning_rate:
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
    0.1, global_step, 100, 0.96, staircase=True)
.....
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
After every 100 steps the learning rate is multiplied by 0.96 (because staircase=True).
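For reference, the schedule produced by tf.train.exponential_decay is equivalent to this small helper (integer division because staircase=True):
def decayed_lr(step, initial_lr=0.1, decay_steps=100, decay_rate=0.96):
    # decayed_lr = initial_lr * decay_rate ** (step / decay_steps); staircase truncates the exponent
    return initial_lr * decay_rate ** (step // decay_steps)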
############################################################################
8. Fully-connected layer
---- classic fully-connected layer:
tf.nn.relu(tf.matmul(x, w) + biases)
---- fully-connected layers are usually combined with dropout to prevent overfitting (see the sketch below)
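A hedged sketch of a fully-connected layer followed by dropout; the sizes and the keep_prob placeholder are illustrative assumptions.
x = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)                      # e.g. 0.5 for training, 1.0 for inference
w = tf.Variable(tf.truncated_normal([1024, 256], stddev=0.1))
biases = tf.Variable(tf.constant(0.1, shape=[256]))
fc = tf.nn.relu(tf.matmul(x, w) + biases)
fc_drop = tf.nn.dropout(fc, keep_prob)                      # randomly zeroes activations, rescales the rest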
############################################################################
9. PCA (color augmentation)
import numpy as np
from matplotlib.pyplot import imshow

def RGB_PCA(images):
    pixels = images.reshape(-1, images.shape[-1])
    # sample 1,000,000 random pixels; np.random.randint keeps indices in range
    # (the original np.random.random_integers is deprecated and its upper bound is inclusive)
    idx = np.random.randint(0, pixels.shape[0], 1000000)
    pixels = [pixels[i] for i in idx]
    pixels = np.array(pixels, dtype=np.uint8).T
    m = np.mean(pixels) / 256.
    C = np.cov(pixels) / (256. * 256.)
    l, v = np.linalg.eig(C)          # eigenvalues / eigenvectors of the RGB covariance
    return l, v, m

def RGB_variations(image, eig_val, eig_vec):
    a = np.random.randn(3)           # random weight for each principal component
    v = np.array([a[0] * eig_val[0], a[1] * eig_val[1], a[2] * eig_val[2]])
    variation = np.dot(eig_vec, v)
    return image + variation

l, v, m = RGB_PCA(img)
img = RGB_variations(img, l, v)
imshow(img)