RNN (Recurrent Neural Networks) 02: Implementation in TensorFlow

Published: 2025/3/15 · Recurrent Neural Networks · 豆豆

The basic concepts of RNNs and LSTMs and the BPTT algorithm are covered in a separate post.
Reference articles:
- https://r2rt.com/recurrent-neural-networks-in-tensorflow-i.html
- https://r2rt.com/styles-of-truncated-backpropagation.html

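I. Implementing a binary example from scratch

1. Example description

(1) Data description

The input X is a binary sequence. At each time step t, $X_t$ is 1 with probability 50% and 0 with probability 50%, e.g. X = [1, 1, 0, 0, 1, 0, ...]

The output Y:
- At time step t, $Y_t$ has a base probability of 50% of being 1 and 50% of being 0;
- If $X_{t-3}$ is 1, the probability that $Y_t$ is 1 rises by 50 percentage points, to 100%;
- If $X_{t-8}$ is 1, the probability that $Y_t$ is 1 falls by 25 percentage points, to 25%;
  - So if both $X_{t-3}$ and $X_{t-8}$ are 1, $Y_t$ is 1 with probability 50% + 50% - 25% = 75%

The output therefore has two dependencies on the input: one 3 steps back and one 8 steps back.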

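(2) Loss function

The network is trained with the cross-entropy loss.
Because the example is simple, the expected cross-entropy in each situation can be computed directly from the data-generation rules.

[1] If the RNN learns neither dependency, it predicts correctly with probability 62.5% and the cross-entropy is about 0.66, computed as follows:
- The probability that $Y_t = 1$ in each of the four equally likely cases is

  $$
  P(Y_t = 1) =
  \begin{cases}
  0.5 + 0.5 - 0.25 = 0.75, & X_{t-3} = 1,\ X_{t-8} = 1 \\
  0.5 + 0.5 = 1, & X_{t-3} = 1,\ X_{t-8} = 0 \\
  0.5 - 0.25 = 0.25, & X_{t-3} = 0,\ X_{t-8} = 1 \\
  0.5, & X_{t-3} = 0,\ X_{t-8} = 0
  \end{cases}
  $$
- So the probability of correctly predicting 1 is (0.75 + 1 + 0.25 + 0.5) / 4 = 0.625
- So the cross-entropy is $-[p\log p + (1-p)\log(1-p)] = 0.66$ with p = 0.625 (natural logarithm)

[2] If the RNN learns only the first dependency (on $X_{t-3}$), it predicts with 87.5% accuracy in 50% of the cases and with 62.5% accuracy in the other 50%, and the cross-entropy is about 0.52
- X is generated at random with 0 and 1 equally likely, so over a large number of samples (law of large numbers) $X_{t-3} = 1$ half of the time. That corresponds to the top two cases in [1], where 1 is the correct prediction with probability (0.75 + 1) / 2 = 0.875. In the other half, $X_{t-3} = 0$ and $P(Y_t = 1) = (0.25 + 0.5) / 2 = 0.375$, so predicting 0 is correct 62.5% of the time, the same accuracy as in [1], because the second dependency is still unlearned
- Loss: -0.5 * (0.875 * np.log(0.875) + 0.125 * np.log(0.125)) - 0.5 * (0.625 * np.log(0.625) + 0.375 * np.log(0.375)) = 0.52

[3] If the RNN learns both dependencies, it is 100% correct in 25% of the cases, 50% correct in 25% of the cases, and 75% correct in the remaining 50%, and the cross-entropy is about 0.45
- In 1/4 of the cases $X_{t-3} = 1$ and $X_{t-8} = 0$: 100% correct
- In 1/4 of the cases $X_{t-3} = 0$ and $X_{t-8} = 0$: 50% correct
- In 1/2 of the cases the prediction is 75% correct: either both are 1 ($P(Y_t = 1) = 0.5 + 0.5 - 0.25 = 0.75$), or $X_{t-3} = 0$ and $X_{t-8} = 1$ ($P(Y_t = 0) = 0.75$)
- Loss: -0.50 * (0.75 * np.log(0.75) + 0.25 * np.log(0.25)) - 0.25 * (2 * 0.50 * np.log(0.50)) - 0.25 * (0) = 0.45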
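
The three reference values can be reproduced in a few lines. This is only a sketch of the arithmetic above: it uses the standard library instead of NumPy, and the names `binary_entropy` and `p_one` are illustrative, not part of the original code.

```python
import math

def binary_entropy(p):
    """Expected cross-entropy (in nats) when the true probability p is predicted."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome contributes no loss
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

# P(Y_t = 1) for each (X_{t-3}, X_{t-8}) pair, taken from the generation rules.
p_one = {(1, 1): 0.75, (1, 0): 1.0, (0, 1): 0.25, (0, 0): 0.5}

# [1] No dependency learned: a single prediction for all four cases.
loss_none = binary_entropy(sum(p_one.values()) / 4)

# [2] Only X_{t-3} learned: average over X_{t-8} within each X_{t-3} group.
loss_first = sum(
    0.5 * binary_entropy((p_one[(x3, 1)] + p_one[(x3, 0)]) / 2) for x3 in (0, 1)
)

# [3] Both learned: each of the four cases gets its own prediction.
loss_both = sum(0.25 * binary_entropy(p) for p in p_one.values())

print(round(loss_none, 2), round(loss_first, 2), round(loss_both, 2))  # 0.66 0.52 0.45
```

These are the loss levels to look for in the training curves: a plateau near 0.52 suggests only the 3-step dependency was learned, and one near 0.45 suggests both were.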

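2. Network structure

From the input vector $X_t$ at time t and the state vector $S_{t-1}$ from time t-1, the network computes the current state vector $S_t$ and the output probability vector $P_t$.
The labels are Y.
So we have:

$$
S_t = \tanh\big(W (X_t \oplus S_{t-1}) + b_s\big), \qquad P_t = \mathrm{softmax}(U S_t + b_p)
$$

- Here $\oplus$ denotes vector concatenation
- $W \in \mathbb{R}^{d \times (2+d)}$, $b_s \in \mathbb{R}^{d}$, $U \in \mathbb{R}^{2 \times d}$, $b_p \in \mathbb{R}^{2}$
  - d is the length of the state vector
  - W is a 2-D matrix because it multiplies the concatenation of $X_t$ and $S_{t-1}$; the 2 in its shape is the length of $X_t$ after one-hot encoding
  - U is the weight matrix of the final output prediction
- The initial state $S_{-1}$ is the zero vector

Note that a cell does not necessarily contain just one neuron unit; it contains n hidden units
- In the figure from the original post, state size = 4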
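
To make the shapes concrete, here is a minimal pure-Python sketch of one forward step under the definitions above. The weight values are arbitrary placeholders and `rnn_step` is an illustrative name; the actual TensorFlow implementation follows later in the post.

```python
import math

def rnn_step(x_onehot, s_prev, W, b_s, U, b_p):
    """One step: S_t = tanh(W (X_t ++ S_{t-1}) + b_s), P_t = softmax(U S_t + b_p)."""
    concat = x_onehot + s_prev                       # list concatenation, length 2 + d
    s_t = [math.tanh(sum(w * v for w, v in zip(row, concat)) + b)
           for row, b in zip(W, b_s)]                # W has d rows of length 2 + d
    logits = [sum(u * s for u, s in zip(row, s_t)) + b
              for row, b in zip(U, b_p)]             # U has 2 rows of length d
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]       # numerically stable softmax
    p_t = [e / sum(exps) for e in exps]
    return s_t, p_t

d = 4                                                # state size (illustrative)
W = [[0.1 * (i + j) for j in range(2 + d)] for i in range(d)]   # arbitrary weights
U = [[0.2 * (i - j) for j in range(d)] for i in range(2)]
b_s, b_p = [0.0] * d, [0.0] * 2

state = [0.0] * d                                    # S_{-1} is the zero vector
for bit in [1, 0, 1]:
    x_onehot = [1.0, 0.0] if bit == 0 else [0.0, 1.0]
    state, probs = rnn_step(x_onehot, state, W, b_s, U, b_p)

print(len(state), round(sum(probs), 6))
```

The state keeps length d from step to step, and `probs` is a two-element distribution over the classes 0 and 1.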

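3. How TensorFlow implements BPTT for RNNs

1) Truncated backpropagation

Suppose we train on a sequence of 1,000,000 elements. Feeding the whole sequence into the RNN and backpropagating through all of it easily leads to vanishing or exploding gradients.
The solution is truncated backpropagation: we cut the sequence into pieces of length num_steps and train on those.

2) The BPTT implementation in TensorFlow

The usual form of truncated backpropagation is: at the current time t, propagate the error back num_steps steps.
- The original post illustrates this with a sequence of length 6 and a truncation length of 3

TensorFlow's implementation does not work this way (also illustrated in the original post):
- It splits the length-6 sequence into two parts, each of length 3
- The final state computed by the first part is used as the initial state of the next part

So TensorFlow-style backpropagation does not really propagate every error back num_steps steps (compared with the usual approach, the dependencies it can capture are somewhat weaker)
- So to learn a dependency that reaches 8 steps back (as in our example), num_steps generally has to be set larger than 8
In addition, an experiment comparing the two styles (https://r2rt.com/styles-of-truncated-backpropagation.html) found that n steps in the usual implementation give results similar to a TensorFlow truncation length of 2n.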
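
The difference between the two styles can be sketched by counting, for each time step, how many steps (including the current one) an error at that step can flow back. The function names are illustrative and the counting is a simplification of what happens in the graph, not TensorFlow code.

```python
def tf_style_reach(seq_len, num_steps):
    """Reach of the error at time t when the sequence is cut into fixed chunks."""
    return [t % num_steps + 1 for t in range(seq_len)]

def sliding_reach(seq_len, num_steps):
    """Reach of the error at time t when every t gets its own num_steps window."""
    return [min(t + 1, num_steps) for t in range(seq_len)]

print(tf_style_reach(6, 3))   # [1, 2, 3, 1, 2, 3]
print(sliding_reach(6, 3))    # [1, 2, 3, 3, 3, 3]
```

In the chunked scheme only the last position of each chunk reaches back the full num_steps, and the average reach is (num_steps + 1) / 2. That is at least consistent with the observation that a TensorFlow truncation of 2n behaves like n steps of the usual scheme.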

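3) A TensorFlow-style implementation for this example

In the figure from the original post, num_steps = 5 and state_size = 4: the number of truncated backprop steps is 5, and state_size is the number of neurons in the cell.
If the number of truncation steps has to grow, state_size can be increased accordingly so that the state can record more information
- This is analogous to adding neurons to the hidden layer of a traditional neural network
The annotations in the figure are the shapes of the variables defined in the example code below, and can be used for cross-reference.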

4. Implementing the example RNN by hand

Full code: https://github.com/lawlite19/Blog-Back-Up/blob/master/code/rnn/rnn_implement.py

1) Implementation

Import the packages:

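    # Written against the TensorFlow 1.x API (tf.placeholder, tf.Session, tf.contrib).
    import numpy as np
    import tensorflow as tf
    from tensorflow.python import debug as tf_debug  # only used by the optional CLI debugger
    import matplotlib.pyplot as plt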

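Hyperparameters
- With num_steps = 5 the network can only remember 5 steps, so it can learn only one dependency (learning the second one needs at least 8 steps); we check whether the final cross-entropy ends up around 0.52

    # Hyperparameters
    num_steps = 5         # truncated backprop length
    batch_size = 200
    num_classes = 2       # binary input and output
    state_size = 4        # number of hidden units in the cell
    learning_rate = 0.1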
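Generate the data
- This follows exactly the rules described above

    def gen_data(size=1000000):
        """Generate data by the rules described in the article (1,000,000 points by default)."""
        X = np.array(np.random.choice(2, size=(size,)))
        Y = []
        # Generate Y according to the rules.
        # Note: for i < 8 the negative indices X[i-3] / X[i-8] wrap around to the end of X.
        for i in range(size):
            threshold = 0.5
            if X[i-3] == 1:
                threshold += 0.5
            if X[i-8] == 1:
                threshold -= 0.25
            if np.random.rand() > threshold:
                Y.append(0)
            else:
                Y.append(1)
        return X, np.array(Y)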
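
A quick way to sanity-check the generation rule is to measure the empirical frequency of $Y_t = 1$ in each of the four cases. The sketch below re-implements the rule with the standard library only; `gen_data_py`, `empirical_p_one`, the seed, and the sample size are illustrative choices, not part of the original code.

```python
import random

def gen_data_py(size, seed=0):
    """Pure-Python version of the generation rule (illustrative helper)."""
    rng = random.Random(seed)
    X = [rng.randint(0, 1) for _ in range(size)]
    Y = []
    for i in range(size):
        threshold = 0.5
        if X[i - 3] == 1:       # negative indices wrap around for small i
            threshold += 0.5
        if X[i - 8] == 1:
            threshold -= 0.25
        Y.append(0 if rng.random() > threshold else 1)
    return X, Y

def empirical_p_one(X, Y, x3, x8):
    """Fraction of Y_t == 1 among steps with X_{t-3} == x3 and X_{t-8} == x8."""
    ys = [Y[t] for t in range(8, len(X)) if X[t - 3] == x3 and X[t - 8] == x8]
    return sum(ys) / len(ys)

X, Y = gen_data_py(20000)
for case in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(case, round(empirical_p_one(X, Y, *case), 2))
```

The four frequencies should come out close to 0.75, 1.0, 0.25 and 0.5, matching the case analysis in the loss-function section.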

Generate the batches, since we train with SGD

    def gen_batch(raw_data, batch_size, num_steps):
        """Yield (x, y) mini-batches of shape (batch_size, num_steps)."""
        raw_x, raw_y = raw_data
        data_length = len(raw_x)
        batch_partition_length = data_length // batch_size  # -> 5000
        data_x = np.zeros([batch_size, batch_partition_length], dtype=np.int32)  # -> (200, 5000)
        data_y = np.zeros([batch_size, batch_partition_length], dtype=np.int32)  # -> (200, 5000)
        # Fill each row with one contiguous slice of the raw sequence.
        for i in range(batch_size):
            # each row takes batch_partition_length values, i.e. 5000
            data_x[i] = raw_x[batch_partition_length*i:batch_partition_length*(i+1)]
            data_y[i] = raw_y[batch_partition_length*i:batch_partition_length*(i+1)]
        epoch_size = batch_partition_length // num_steps  # -> 5000/5 = 1000 batches per epoch
        for i in range(epoch_size):
            x = data_x[:, i * num_steps:(i + 1) * num_steps]  # -> (200, 5)
            y = data_y[:, i * num_steps:(i + 1) * num_steps]
            # yield suspends the generator after each value; one full pass yields 1000 (x, y) pairs
            yield (x, y)

    def gen_epochs(n, num_steps):
        for i in range(n):
            yield gen_batch(gen_data(), batch_size, num_steps)
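
The batch layout is easier to see on a tiny sequence. The following sketch mirrors the slicing logic with plain lists; `gen_batch_py` and the sizes are illustrative.

```python
def gen_batch_py(raw, batch_size, num_steps):
    """Split a flat sequence into batch_size rows, then yield num_steps-wide windows."""
    row_len = len(raw) // batch_size
    rows = [raw[i * row_len:(i + 1) * row_len] for i in range(batch_size)]
    for k in range(row_len // num_steps):
        yield [row[k * num_steps:(k + 1) * num_steps] for row in rows]

batches = list(gen_batch_py(list(range(24)), batch_size=2, num_steps=3))
for b in batches:
    print(b)
# [[0, 1, 2], [12, 13, 14]]
# [[3, 4, 5], [15, 16, 17]]
# [[6, 7, 8], [18, 19, 20]]
# [[9, 10, 11], [21, 22, 23]]
```

Row i of each batch continues exactly where row i of the previous batch stopped, which is what makes it meaningful to carry the final state of one batch into the next.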

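Define the RNN inputs

Each input value has to be one-hot encoded.

tf.unstack splits n-dimensional data into several (n-1)-dimensional pieces, with axis selecting the dimension to split along. For example, 3-D data of shape (200, 5, 2) unstacked along axis=1 gives 5 pieces of 2-D data of shape (200, 2).

    # Placeholders
    x = tf.placeholder(tf.int32, [batch_size, num_steps], name="x")
    y = tf.placeholder(tf.int32, [batch_size, num_steps], name='y')
    init_state = tf.zeros([batch_size, state_size])
    # RNN inputs
    x_one_hot = tf.one_hot(x, num_classes)      # -> (batch_size, num_steps, num_classes)
    rnn_inputs = tf.unstack(x_one_hot, axis=1)  # num_steps tensors of shape (batch_size, num_classes)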
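
What one-hot encoding followed by unstacking along axis 1 produces can be sketched with nested lists. The helpers `one_hot` and `unstack_axis1` below are illustrative stand-ins, not the TensorFlow functions themselves.

```python
def one_hot(batch, num_classes):
    """(batch, steps) ints -> (batch, steps, num_classes) nested lists."""
    return [[[1.0 if c == v else 0.0 for c in range(num_classes)] for v in row]
            for row in batch]

def unstack_axis1(tensor3d):
    """(batch, steps, k) -> list of `steps` matrices, each of shape (batch, k)."""
    steps = len(tensor3d[0])
    return [[row[t] for row in tensor3d] for t in range(steps)]

x = [[1, 0, 1], [0, 0, 1]]                  # batch_size = 2, num_steps = 3
inputs = unstack_axis1(one_hot(x, 2))
print(len(inputs), len(inputs[0]), len(inputs[0][0]))   # 3 2 2
print(inputs[0])                            # [[0.0, 1.0], [1.0, 0.0]]
```

Each element of `inputs` holds one time step for the whole batch, which is the form the hand-written unrolling loop iterates over.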

Define the RNN cell (the key step)

The use of name_scope and variable_scope is covered separately.

    # Define the RNN cell: create the shared variables once
    with tf.variable_scope('rnn_cell'):
        W = tf.get_variable('W', [num_classes + state_size, state_size])
        b = tf.get_variable('b', [state_size], initializer=tf.constant_initializer(0.0))

    def rnn_cell(rnn_input, state):
        # reuse=True fetches the variables created above instead of creating new ones
        with tf.variable_scope('rnn_cell', reuse=True):
            W = tf.get_variable('W', [num_classes + state_size, state_size])
            b = tf.get_variable('b', [state_size], initializer=tf.constant_initializer(0.0))
        return tf.tanh(tf.matmul(tf.concat([rnn_input, state], 1), W) + b)

Add the cell to the computation graph

    # Unroll the RNN cell into the computation graph
    state = init_state
    rnn_outputs = []
    for rnn_input in rnn_inputs:
        state = rnn_cell(rnn_input, state)  # the state is fed back in on every iteration
        rnn_outputs.append(state)
    final_state = rnn_outputs[-1]  # the last state

Define the predictions, the loss function, and the optimizer

sparse_softmax_cross_entropy_with_logits takes integer class labels directly, so the labels do not have to be one-hot encoded by hand

    # Predictions, loss, optimization
    with tf.variable_scope('softmax'):
        W = tf.get_variable('W', [state_size, num_classes])
        b = tf.get_variable('b', [num_classes], initializer=tf.constant_initializer(0.0))
    logits = [tf.matmul(rnn_output, W) + b for rnn_output in rnn_outputs]
    predictions = [tf.nn.softmax(logit) for logit in logits]

    y_as_list = tf.unstack(y, num=num_steps, axis=1)
    losses = [tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logit)
              for logit, label in zip(logits, y_as_list)]
    total_loss = tf.reduce_mean(losses)
    train_step = tf.train.AdagradOptimizer(learning_rate).minimize(total_loss)
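
For a single example, the quantity this loss computes is $-\log(\mathrm{softmax}(\text{logits})_{\text{label}})$. The sketch below is a plain-Python illustration of that formula, not the TensorFlow implementation; the function name simply echoes the op.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]) for one example; label is an integer class index."""
    m = max(logits)                                   # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

print(round(sparse_softmax_cross_entropy([0.0, 0.0], 1), 4))   # 0.6931, i.e. ln 2
```

With equal logits the prediction is 50/50 and the loss is ln 2 ≈ 0.69, slightly above the 0.66 that an untrained-but-calibrated predictor would reach on this data.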

Train the network

    def train_rnn(num_epochs, num_steps, state_size=4, verbose=True):
        """Train the network and return the list of averaged training losses."""
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            # sess = tf_debug.LocalCLIDebugWrapperSession(sess)
            training_losses = []
            for idx, epoch in enumerate(gen_epochs(num_epochs, num_steps)):
                training_loss = 0
                training_state = np.zeros((batch_size, state_size))  # -> (200, 4)
                if verbose:
                    print('\nepoch', idx)
                for step, (X, Y) in enumerate(epoch):
                    # Feed the previous batch's final state in as this batch's initial state.
                    tr_losses, training_loss_, training_state, _ = \
                        sess.run([losses, total_loss, final_state, train_step],
                                 feed_dict={x: X, y: Y, init_state: training_state})
                    training_loss += training_loss_
                    if step % 100 == 0 and step > 0:
                        if verbose:
                            print('Average loss at step {0}: {1}'.format(step, training_loss/100))
                        training_losses.append(training_loss/100)
                        training_loss = 0
        return training_losses
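
Feeding `final_state` back in as `init_state` means the forward pass over consecutive batches is the same as one forward pass over the whole sequence; only the backward pass is cut at the chunk boundary. A scalar toy RNN makes this visible (`run_rnn` and its weights are made up for illustration):

```python
import math

def run_rnn(xs, state, w=0.5, u=0.8, b=0.1):
    """Scalar toy RNN: s_t = tanh(w * x_t + u * s_{t-1} + b). Returns the final state."""
    for x in xs:
        state = math.tanh(w * x + u * state + b)
    return state

sequence = [1, 0, 0, 1, 1, 0]

# Forward pass over the whole sequence at once.
full = run_rnn(sequence, 0.0)

# Forward pass in two chunks of num_steps = 3, carrying the state across.
carried = run_rnn(sequence[3:], run_rnn(sequence[:3], 0.0))

# Same second chunk, but with the state reset to zero at the boundary.
reset = run_rnn(sequence[3:], 0.0)

print(full == carried, full == reset)   # True False
```

If the state were reset at every batch, the network would see no information from before the chunk at all, so a dependency crossing the boundary could not even influence the prediction.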

Show the results

    training_losses = train_rnn(num_epochs=1, num_steps=num_steps, state_size=state_size)
    print(training_losses[0])
    plt.plot(training_losses)
    plt.show()

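2) Experimental results

num_steps = 5, state_size = 4
- The initial loss is about 0.66; the network ends up learning one dependency, with a final loss around 0.52

num_steps = 10, state_size = 16
- Both dependencies are learned, and the final loss approaches 0.45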

5. Implementation using TensorFlow's built-in cells

1) Using static_rnn

Replace our hand-written cell and the step that adds it to the computation graph with the following:

    cell = tf.contrib.rnn.BasicRNNCell(num_units=state_size)
    rnn_outputs, final_state = tf.contrib.rnn.static_rnn(cell=cell, inputs=rnn_inputs, initial_state=init_state)

2) Using dynamic_rnn

Here replacing only the cell is not enough. The RNN input
- is passed in directly in its 3-D form

    # RNN inputs
    rnn_inputs = tf.one_hot(x, num_classes)  # (batch_size, num_steps, num_classes)

Using dynamic_rnn

    cell = tf.contrib.rnn.BasicRNNCell(num_units=state_size)
    rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)

Predictions and loss

Because rnn_outputs is 3-D, it is first reshaped to 2-D, and after the matrix multiplication the result is reshaped back to 3-D: [batch_size, num_steps, num_classes]

    # rnn_outputs is 3-D: flatten it to 2-D for the matmul, then reshape the result back to
    # [batch_size, num_steps, num_classes]. W and b are the softmax variables defined earlier.
    logits = tf.reshape(tf.matmul(tf.reshape(rnn_outputs, [-1, state_size]), W) + b,
                        shape=[batch_size, num_steps, num_classes])
    predictions = tf.nn.softmax(logits)

    # y already has shape [batch_size, num_steps], so it can be used without unstacking.
    losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    total_loss = tf.reduce_mean(losses)
    train_step = tf.train.AdagradOptimizer(learning_rate).minimize(total_loss)
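
The reshape trick works because flattening the batch and time dimensions, multiplying once, and reshaping back gives the same numbers as multiplying every state vector on its own. A small list-based sketch (made-up values, bias omitted, `matmul` is an illustrative helper):

```python
def matmul(A, B):
    """Plain-Python matrix product of A (n x k) and B (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

batch_size, num_steps, state_size, num_classes = 2, 3, 2, 2
# outputs[i][t] is the state vector of example i at step t (made-up values).
outputs = [[[float(i + t), float(i - t)] for t in range(num_steps)]
           for i in range(batch_size)]
W = [[1.0, 2.0], [3.0, 4.0]]                # (state_size, num_classes)

# Flatten (batch, steps, state) -> (batch * steps, state), multiply once, reshape back.
flat = [vec for example in outputs for vec in example]
flat_logits = matmul(flat, W)
logits = [flat_logits[i * num_steps:(i + 1) * num_steps] for i in range(batch_size)]

# Reference: multiply each state vector separately.
per_step = [[matmul([vec], W)[0] for vec in example] for example in outputs]
print(logits == per_step)                   # True
```

The flattening here is row-major (all steps of example 0, then all steps of example 1), the same order tf.reshape uses, so the reshape back to [batch_size, num_steps, num_classes] puts every logit in its original position.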

Reference

https://r2rt.com/recurrent-neural-networks-in-tensorflow-i.html
https://r2rt.com/styles-of-truncated-backpropagation.html

https://web.stanford.edu/class/psych209a/ReadingsByDate/02_25/Williams%20Zipser95RecNets.pdf

Original post: http://lawlite.me/2017/06/16/RNN-%E5%BE%AA%E7%8E%AF%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C-02Tensorflow%E4%B8%AD%E7%9A%84%E5%AE%9E%E7%8E%B0/
