CNN Neural Network Tricks: Study Summary
TRICKS IN DEEP LEARNING
In this doc, only brief explanations are given; recorded during daily study.
Last update: 2018.4.7
############################################################################
1. Variable initialization
----- Initializing a variable
import tensorflow as tf
var = tf.Variable(tf.random_normal([2, 3], stddev=0.2, mean=0.0))
Common random-value generators:
tf.random_normal()
tf.truncated_normal()
tf.random_uniform()
tf.random_gamma()
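A minimal sketch of how these generators are typically used when building layers; the shapes and names below are illustrative, not from the original notes.
# weights: truncated normal keeps values within 2 std-devs, a common default
weights = tf.Variable(tf.truncated_normal([784, 256], stddev=0.1))
# biases: constant initialization is also common
biases = tf.Variable(tf.zeros([256]))
# uniform initialization, e.g. for an embedding table
embedding = tf.Variable(tf.random_uniform([1000, 64], -1.0, 1.0))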
############################################################################
2. Loss functions
--A-- Cross entropy H(p, q) measures the distance between two probability distributions; commonly used for classification problems.
-- y_ denotes the ground-truth labels
cross_entropy = -tf.reduce_mean(
    y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
Because cross entropy is usually combined with softmax regression, TensorFlow wraps the two in a single function:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
which gives the cross entropy after the softmax has been applied.
--B-- MSE (mean squared error), commonly used for regression problems
-- y_ denotes the ground-truth values
mse = tf.reduce_mean(tf.square(y_ - y))
------ Basic ops commonly used in custom loss functions
tf.reduce_sum(); tf.where() (formerly tf.select()); tf.greater()
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))
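As an example of a custom loss built from these ops, here is a hedged sketch of an asymmetric regression loss; the cost constants and the over-/under-prediction split are illustrative, not from the original notes (y is the prediction, y_ the ground truth, as above).
loss_more, loss_less = 10.0, 1.0          # illustrative asymmetric costs
custom_loss = tf.reduce_sum(
    tf.where(tf.greater(y, y_),
             (y - y_) * loss_more,        # cost when the model over-predicts
             (y_ - y) * loss_less))       # cost when the model under-predicts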
############################################################################
3. weights_with_L2_loss
def weights_with_loss(shape, wl=None):
    """
    Create weights with an L2 loss term and add that term to the "loss" collection.
    The total loss can later be computed with
        loss = tf.add_n(tf.get_collection("loss"), name='total_loss')
    weights_with_loss is generally not used for the first or the last layer; it is
    most common on fully-connected layers.
    :param shape: weights_shape
    :param wl: weights_loss_ratio (coefficient of the L2 penalty)
    :return: weights
    """
    w = tf.Variable(tf.truncated_normal(shape=shape, stddev=0.01, dtype=tf.float32))
    if wl is not None:
        weights_loss = tf.multiply(tf.nn.l2_loss(w), wl, name='weights_loss')
        tf.add_to_collection("loss", weights_loss)
    return w
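A minimal usage sketch; the layer sizes, the 0.004 ratio and the placeholder are illustrative assumptions.
flat = tf.placeholder(tf.float32, [None, 4096])              # flattened features (illustrative)
w_fc1 = weights_with_loss([4096, 384], wl=0.004)             # L2-regularized weights
b_fc1 = tf.Variable(tf.constant(0.1, shape=[384]))
fc1 = tf.nn.relu(tf.matmul(flat, w_fc1) + b_fc1)
# add the data loss (e.g. cross entropy) to the same collection, then sum everything:
# tf.add_to_collection("loss", cross_entropy)
total_loss = tf.add_n(tf.get_collection("loss"), name='total_loss')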
############################################################################
4. batch_normalization
    def batch_normalization(self, input, decay=0.9, eps=1e-5):
        """
        Batch Normalization
        Results in:
            * Reduced need for dropout
            * Less dependence on initial values (e.g. weights, biases)
            * Accelerated convergence
            * Ability to use a larger learning rate
        Usage: apply to the output of (i.e. after) conv layers
        Args: output of a convolution or fully-connected layer
        Returns: normalized batch
        """
        shape = input.get_shape().as_list()
        n_out = shape[-1]
        beta = tf.Variable(tf.zeros([n_out]))
        gamma = tf.Variable(tf.ones([n_out]))
        if len(shape) == 2:
            # fully-connected output: moments over the batch dimension only
            batch_mean, batch_var = tf.nn.moments(input, [0])
        else:
            # conv output: moments over batch, height and width
            batch_mean, batch_var = tf.nn.moments(input, [0, 1, 2])
        ema = tf.train.ExponentialMovingAverage(decay=decay)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        # batch statistics during training, moving averages at inference
        mean, var = tf.cond(self.train_phase, mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        return tf.nn.batch_normalization(input, mean, var, beta, gamma, eps)
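A usage sketch, assuming the surrounding class exposes a boolean train_phase placeholder (the names below are illustrative).
train_phase = tf.placeholder(tf.bool, name='train_phase')   # typically stored as self.train_phase
# Illustrative wiring: BN sits between the convolution and the non-linearity, e.g.
#   conv = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='SAME')
#   act  = tf.nn.relu(self.batch_normalization(conv))
# Feed train_phase=True for training steps (batch statistics + EMA update) and
# train_phase=False at inference (the stored moving averages are used instead).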
############################################################################
5. LRN
def LRN(x, R, alpha, beta, name=None, bias=1.0):
    """
    LRN (local response normalization), applied to the output of (i.e. after) conv layers
    :param x: input_tensor
    :param R: depth_radius
    :param alpha: alpha in the math formula
    :param beta: beta in the math formula
    :param name: op name
    :param bias: additive constant in the math formula
    :return: normalized tensor
    """
    return tf.nn.local_response_normalization(x, depth_radius=R, alpha=alpha,
                                              beta=beta, bias=bias, name=name)
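A usage sketch with hyperparameters in the range popularized by AlexNet and the TensorFlow CIFAR-10 example; treat the exact values and the pool1 tensor as illustrative assumptions.
# pool1 is assumed to be the output of a conv + max-pool stage
norm1 = LRN(pool1, R=4, alpha=0.001 / 9.0, beta=0.75, bias=1.0, name='norm1')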
############################################################################
6. Gradient descent
---- gradient descent & backpropagation
Gradient descent: mainly used to optimize the value of a single parameter (the gradient is just the first derivative).
Backpropagation: provides an efficient way to apply gradient descent to all parameters.
Notes:
(1) Gradient descent does not guarantee a global optimum.
(2) The loss function is the sum of losses over all training data, so full-batch gradient descent is very slow to compute.
Spectrum of update strategies (a training-loop sketch follows below):
full-batch gradient descent  <-->  mini-batch training (the compromise: compute the loss over one batch per step; this is how optimizers such as Adam are typically run)  <-->  SGD (one example per step)
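A minimal mini-batch training loop, sketched under assumed names (x, y_, loss, X_train, y_train are whatever the surrounding model defines; Adam is used here only as an example optimizer).
batch_size = 128
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(10000):
        # take the next slice of training data as the current batch
        start = (step * batch_size) % len(X_train)
        xs, ys = X_train[start:start + batch_size], y_train[start:start + batch_size]
        sess.run(train_step, feed_dict={x: xs, y_: ys})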
############################################################################
7. Learning rate
----- learning_rate determines how far the parameters move at each update
----- decayed_learning_rate:
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
    0.1, global_step, 100, 0.96, staircase=True)
.....
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
After every 100 steps the learning rate is multiplied by 0.96 (because staircase=True).
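For reference, the schedule produced by tf.train.exponential_decay is equivalent to this small helper (integer division because staircase=True):
def decayed_lr(step, initial_lr=0.1, decay_steps=100, decay_rate=0.96):
    # decayed_lr = initial_lr * decay_rate ** (step / decay_steps); staircase truncates the exponent
    return initial_lr * decay_rate ** (step // decay_steps)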
############################################################################
8. Fully-connected layer
---- classic fully-connected layer:
tf.nn.relu(tf.matmul(x, w) + biases)
---- fully-connected layers are usually combined with dropout to prevent overfitting (see the sketch below)
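A hedged sketch of a fully-connected layer followed by dropout; the sizes and the keep_prob placeholder are illustrative assumptions.
x = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)                      # e.g. 0.5 for training, 1.0 for inference
w = tf.Variable(tf.truncated_normal([1024, 256], stddev=0.1))
biases = tf.Variable(tf.constant(0.1, shape=[256]))
fc = tf.nn.relu(tf.matmul(x, w) + biases)
fc_drop = tf.nn.dropout(fc, keep_prob)                      # randomly zeroes activations, rescales the rest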
############################################################################
9. PCA (color augmentation)
import numpy as np
from matplotlib.pyplot import imshow

def RGB_PCA(images):
    pixels = images.reshape(-1, images.shape[-1])
    # sample 1,000,000 random pixels; np.random.randint keeps indices in range
    # (the original np.random.random_integers is deprecated and its upper bound is inclusive)
    idx = np.random.randint(0, pixels.shape[0], 1000000)
    pixels = [pixels[i] for i in idx]
    pixels = np.array(pixels, dtype=np.uint8).T
    m = np.mean(pixels) / 256.
    C = np.cov(pixels) / (256. * 256.)
    l, v = np.linalg.eig(C)          # eigenvalues / eigenvectors of the RGB covariance
    return l, v, m

def RGB_variations(image, eig_val, eig_vec):
    a = np.random.randn(3)           # random weight for each principal component
    v = np.array([a[0] * eig_val[0], a[1] * eig_val[1], a[2] * eig_val[2]])
    variation = np.dot(eig_vec, v)
    return image + variation

l, v, m = RGB_PCA(img)
img = RGB_variations(img, l, v)
imshow(img)