Implementing LSTM in TensorFlow: A Detailed Guide

Published: 2024/7/5

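I won't explain in detail what an LSTM is. Andrew Ng's video course covers it very well, and I wrote up the lecture content in "Andrew Ng's Sequence Models: Notes, Part 1". There are also many good explanations online, for example "Introduction to LSTM" and "Understanding LSTM Networks".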

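Understanding the idea is fairly easy, but I still ran into many problems once I started writing code. Most blog posts online do not explain clearly how the cell's parameters should be set. After reading a great many articles I finally worked it out, so I am writing it down to save others some detours.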

The figure above shows a single LSTM cell. It can be used in several RNN structures; the most common are probably one-to-many and many-to-many.


The many-to-many structure is described below:

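  • batch_size: the batch size, i.e. batch_size sentences are trained together.
  • time_steps: the number of time steps, i.e. the length of a sentence.
  • embedding_size: the length of the word vectors that make up a sentence.
  • hidden_size: the number of hidden units. An LSTM cell is itself built from neural networks (the figure above is one LSTM cell): each small yellow box is a neural network with hidden_size hidden units, so the whole cell has 4*hidden_size hidden units.
  • The outputs C and h of each LSTM cell are both vectors, and each has length hidden_size for that cell.
  • n_words: the number of words in the corpus.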
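To make the dimensions in the list above concrete, here is a minimal NumPy sketch (deliberately not TensorFlow). The hyperparameter values, the random embedding table, and the random word ids are all made up for illustration; only the variable names come from the list.

```python
import numpy as np

# Illustrative values only; any positive integers would do.
n_words = 15         # vocabulary size
embedding_size = 8   # length of each word vector
hidden_size = 8      # hidden units in each of the four gate networks
batch_size = 3       # sentences trained together
time_steps = 5       # words per sentence

rng = np.random.default_rng(0)

# Embedding table: one row (word vector) per word in the vocabulary.
embedding_table = rng.normal(size=(n_words, embedding_size))

# A batch of sentences, each a sequence of word ids.
word_ids = rng.integers(0, n_words, size=(batch_size, time_steps))

# Looking up the ids gives the LSTM input for the whole batch.
inputs = embedding_table[word_ids]

# Cell state C and hidden state h: one hidden_size vector per sentence.
c = np.zeros((batch_size, hidden_size))
h = np.zeros((batch_size, hidden_size))

# The four gate networks together have 4 * hidden_size units.
gate_units = 4 * hidden_size

print(inputs.shape)      # (3, 5, 8)
print(h.shape, c.shape)  # (3, 8) (3, 8)
print(gate_units)        # 32
```

The input is three-dimensional (batch, time, embedding), while the states C and h carry no time axis: they are overwritten at every time step.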
Implementation 1:

    # Requires TensorFlow 1.x (tf.contrib was removed in TensorFlow 2.x).
    import tensorflow as tf
    import numpy as np
    from tensorflow.contrib import rnn


    def add_layer(inputs, in_size, out_size, activation_function=None):
        # A single fully connected layer
        weights = tf.Variable(tf.random_normal([in_size, out_size]))
        biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
        wx_b = tf.matmul(inputs, weights) + biases
        if activation_function is None:
            outputs = wx_b
        else:
            outputs = activation_function(wx_b)
        return outputs


    n_words = 15
    embedding_size = 8
    hidden_size = 8  # hidden_size is usually set equal to embedding_size
    batch_size = 3
    time_steps = 5

    # Stand-in for the embedding matrix W
    w = tf.Variable(tf.random_normal([n_words, embedding_size], stddev=0.01))
    # Mock training data: 3 sentences of 5 words each, shape (3, 5, 1)
    sentence = tf.Variable(np.arange(15).reshape(batch_size, time_steps, 1))
    # Map each word id to a vector of size 8, shape (3, 5, 1, 8)
    input_s = tf.nn.embedding_lookup(w, sentence)
    input_s = tf.reshape(input_s, [-1, time_steps, embedding_size])  # shape (3, 5, 8)

    with tf.name_scope("LSTM"):
        lstm_cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True, name='lstm_layer')
        h_0 = tf.zeros([batch_size, hidden_size])  # shape (3, 8)
        c_0 = tf.zeros([batch_size, hidden_size])  # shape (3, 8)
        state = rnn.LSTMStateTuple(c=c_0, h=h_0)  # initial state
        outputs = []
        for i in range(time_steps):  # one step per word in the sentence
            if i > 0:
                # Cells with the same name share the same parameters w; enable
                # variable reuse to avoid problems caused by name clashes.
                tf.get_variable_scope().reuse_variables()
            # output: [batch_size, hidden_size], shape (3, 8)
            output, state = lstm_cell(input_s[:, i, :], state)
            # outputs: [time_steps, batch_size, hidden_size], shape (5, 3, 8)
            outputs.append(output)
        # path: [batch_size, hidden_size * time_steps], shape (3, 40)
        path = tf.concat(outputs, 1)
        # path_embedding: [batch_size, embedding_size]
        path_embedding = add_layer(path, time_steps * hidden_size, embedding_size)

    with tf.Session() as s:
        s.run(tf.global_variables_initializer())
        # The tensors are still small, so printing a few of them is an easy
        # way to see what each operation does.
        print(s.run(outputs))
        print(s.run(path_embedding))
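The shape comments for `outputs` and `path` are easier to check with a small NumPy sketch. It only imitates the list of per-step outputs with constant-filled arrays; `np.stack` and `np.concatenate` stand in for how the list would look as one tensor and for `tf.concat`, so this is an analogy rather than the TensorFlow graph itself.

```python
import numpy as np

batch_size, time_steps, hidden_size = 3, 5, 8

# One (batch_size, hidden_size) array per time step, like the `outputs` list.
# Step t is filled with the value t so the steps are easy to tell apart.
outputs = [np.full((batch_size, hidden_size), float(t)) for t in range(time_steps)]

# Viewed as a single array, the list has shape (time_steps, batch_size, hidden_size).
stacked = np.stack(outputs, axis=0)

# Concatenating along axis 1 lays the time steps side by side for each sentence.
path = np.concatenate(outputs, axis=1)

print(stacked.shape)  # (5, 3, 8)
print(path.shape)     # (3, 40)

# Columns 8..15 of every row of `path` hold that sentence's step-1 output.
assert np.array_equal(path[:, hidden_size:2 * hidden_size], outputs[1])
```

Concatenation keeps one row per sentence, which is why the following fully connected layer takes `time_steps * hidden_size` input features.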

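For example, suppose one batch holds 64 sentences of 20 words each, every word vector has length 200, and the hidden layer has 128 units.
The input tensor for one batch then has shape [64,20,200], h_t and c_t each have length 128 per sentence, and the parameter matrix of the LSTM cell has shape [128+200,4x128].
At time step 1, the first word of all 64 sentences is the input, i.e. a [64,200] matrix. It is concatenated with the previous hidden state, which gives [64,200+128], and multiplied by the [200+128,4x128] parameter matrix to produce a [64,4x128] result, so each yellow box outputs [64,128]. The outputs of the yellow boxes are then combined by elementwise operations that do not change the shape, so the result is still [64,128]: every sentence leaves the LSTM cell as a vector of length 128. This is why C_t and h_t are both vectors whose length is the cell's hidden_size, and why cell_output has shape [64,128].
The later time steps repeat the same operation, so outputs has shape [20,64,128].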
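The shape bookkeeping described above can be traced with a minimal NumPy sketch of one LSTM step applied over 20 time steps. This is a generic textbook formulation written for illustration: the order in which the four gate blocks are split and the absence of a forget-gate bias offset are assumptions here and need not match TensorFlow's internal layout.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, kernel, bias):
    """One LSTM time step for a whole batch."""
    # Concatenate input and previous hidden state: [batch, input + hidden].
    xh = np.concatenate([x_t, h_prev], axis=1)
    # One matrix multiply yields all four gate pre-activations: [batch, 4 * hidden].
    gates = xh @ kernel + bias
    # Split into four [batch, hidden] blocks (this ordering is an assumption).
    i, g, f, o = np.split(gates, 4, axis=1)
    # Elementwise updates keep the [batch, hidden] shape.
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t

batch_size, time_steps, embedding_size, hidden_size = 64, 20, 200, 128
rng = np.random.default_rng(0)

# Random stand-ins for the embedded sentences and the cell parameters.
inputs = rng.normal(size=(batch_size, time_steps, embedding_size))
kernel = rng.normal(scale=0.1, size=(embedding_size + hidden_size, 4 * hidden_size))
bias = np.zeros(4 * hidden_size)

h = np.zeros((batch_size, hidden_size))
c = np.zeros((batch_size, hidden_size))
outputs = []
for t in range(time_steps):
    # The same kernel and bias are reused at every time step.
    h, c = lstm_step(inputs[:, t, :], h, c, kernel, bias)
    outputs.append(h)
outputs = np.stack(outputs)

print(kernel.shape)   # (328, 512)
print(outputs.shape)  # (20, 64, 128)
```

Only the kernel depends on embedding_size; once the matrix multiply is done, every intermediate result has hidden_size columns per gate.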
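The softmax layer works as a fully connected layer: it maps outputs onto the vocab_size words, and the cross-entropy loss is computed on the result.
The loss is then used to update both the LSTM parameter matrix and the parameters of the fully connected layer.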
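The projection and loss described above can be sketched in NumPy as follows. The vocabulary size of 1000, the random stand-in for the LSTM outputs, and the names `w_out`, `b_out`, and `softmax_cross_entropy` are assumptions made for this example; the sketch computes the forward pass only and does not perform the parameter update.

```python
import numpy as np

def softmax_cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and integer class targets."""
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct word in every row.
    return -log_probs[np.arange(len(targets)), targets].mean()

time_steps, batch_size, hidden_size, vocab_size = 20, 64, 128, 1000
rng = np.random.default_rng(0)

# Random stand-ins for the LSTM outputs and the next-word ids.
outputs = rng.normal(size=(time_steps, batch_size, hidden_size))
targets = rng.integers(0, vocab_size, size=(time_steps, batch_size))

# One fully connected layer shared by every time step.
w_out = rng.normal(scale=0.01, size=(hidden_size, vocab_size))
b_out = np.zeros(vocab_size)

flat = outputs.reshape(-1, hidden_size)  # (20 * 64, 128)
logits = flat @ w_out + b_out            # (1280, 1000)
loss = softmax_cross_entropy(logits, targets.reshape(-1))

print(logits.shape)  # (1280, 1000)
print(loss)          # near log(1000), since untrained predictions are almost uniform
```

Flattening the time and batch axes lets a single matrix multiply score every word position at once.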

Implementation 2:

Test data: https://pan.baidu.com/s/1j9sgPmWUHM5boM5ekj3Q2w (extraction code: go3f)

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

    data = pd.read_excel("seq_data.xlsx")  # read the sequence data
    data = data.values[1:800]  # keep rows 1 to 799
    normalize_data = (data - np.mean(data)) / np.std(data)  # standardize the data
    s = np.std(data)
    m = np.mean(data)
    time_step = 96  # length of each sequence segment
    rnn_unit = 8  # number of hidden units
    lstm_layers = 2  # number of stacked cell layers
    batch_size = 7  # sequence segments per batch
    input_size = 1  # input dimension
    output_size = 1  # output dimension
    lr = 0.006  # learning rate

    train_x, train_y = [], []
    for i in range(len(data) - time_step - 1):
        x = normalize_data[i:i + time_step]
        y = normalize_data[i + 1:i + time_step + 1]
        train_x.append(x.tolist())
        train_y.append(y.tolist())

    X = tf.placeholder(tf.float32, [None, time_step, input_size])  # shape (?, time_step, input_size)
    Y = tf.placeholder(tf.float32, [None, time_step, output_size])  # shape (?, time_step, output_size)
    weights = {'in': tf.Variable(tf.random_normal([input_size, rnn_unit])),
               'out': tf.Variable(tf.random_normal([rnn_unit, 1]))}
    biases = {'in': tf.Variable(tf.constant(0.1, shape=[rnn_unit, ])),
              'out': tf.Variable(tf.constant(0.1, shape=[1, ]))}


    def lstm(batch):
        w_in = weights['in']
        b_in = biases['in']
        inputs = tf.reshape(X, [-1, input_size])
        input_rnn = tf.matmul(inputs, w_in) + b_in
        input_rnn = tf.reshape(input_rnn, [-1, time_step, rnn_unit])
        cell = tf.nn.rnn_cell.MultiRNNCell(
            [tf.nn.rnn_cell.BasicLSTMCell(rnn_unit) for i in range(lstm_layers)])
        init_state = cell.zero_state(batch, dtype=tf.float32)
        output_rnn, final_states = tf.nn.dynamic_rnn(
            cell, input_rnn, initial_state=init_state, dtype=tf.float32)
        output = tf.reshape(output_rnn, [-1, rnn_unit])
        w_out = weights['out']
        b_out = biases['out']
        pred = tf.matmul(output, w_out) + b_out
        return pred, final_states


    def train_lstm():
        global batch_size
        with tf.variable_scope("sec_lstm"):
            pred, _ = lstm(batch_size)
        loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1])))
        train_op = tf.train.AdamOptimizer(lr).minimize(loss)
        saver = tf.train.Saver(tf.global_variables())
        loss_list = []
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for i in range(100):  # increase the number of iterations for a better result
                start = 0
                end = start + batch_size
                while end < len(train_x):
                    _, loss_ = sess.run([train_op, loss],
                                        feed_dict={X: train_x[start:end], Y: train_y[start:end]})
                    start += batch_size
                    end = end + batch_size
                loss_list.append(loss_)
                if i % 10 == 0:
                    print("Number of iterations:", i, " loss:", loss_list[-1])
                    if i > 0 and loss_list[-2] > loss_list[-1]:
                        # Written on Windows 10, hence 'model_save1\\modle.ckpt';
                        # on Linux use 'model_save1/modle.ckpt'.
                        saver.save(sess, 'model_save1\\modle.ckpt')
            print("The train has finished")


    train_lstm()


    def prediction():
        with tf.variable_scope("sec_lstm", reuse=tf.AUTO_REUSE):
            pred, _ = lstm(1)
        saver = tf.train.Saver(tf.global_variables())
        with tf.Session() as sess:
            # Written on Windows 10, hence 'model_save1\\modle.ckpt';
            # on Linux use 'model_save1/modle.ckpt'.
            saver.restore(sess, 'model_save1\\modle.ckpt')
            predict = []
            for i in range(0, np.shape(train_x)[0]):
                next_seq = sess.run(pred, feed_dict={X: [train_x[i]]})
                predict.append(next_seq[-1])
            plt.figure()
            plt.plot(list(range(len(data))), data, color='b')
            plt.plot(list(range(time_step + 1, np.shape(train_x)[0] + 1 + time_step)),
                     [value * s + m for value in predict], color='r')
            plt.show()


    prediction()
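The data preparation in this listing is easy to misread, so here is a small NumPy sketch of just that part: standardizing the series, cutting it into windows whose targets are shifted by one step, and mapping values back to the original scale. The synthetic 20-value series and the helper name `make_windows` are made up for this example and replace the spreadsheet data.

```python
import numpy as np

def make_windows(series, time_step):
    """Build (input, target) pairs where the target is the input shifted by one step."""
    xs, ys = [], []
    for i in range(len(series) - time_step - 1):
        xs.append(series[i:i + time_step])
        ys.append(series[i + 1:i + time_step + 1])
    return np.array(xs), np.array(ys)

# Synthetic stand-in for the spreadsheet: 20 values in a single column.
data = np.arange(20, dtype=float).reshape(-1, 1)
m, s = np.mean(data), np.std(data)
normalized = (data - m) / s

train_x, train_y = make_windows(normalized, time_step=4)

print(train_x.shape, train_y.shape)  # (15, 4, 1) (15, 4, 1)

# Each target window is the input window moved forward by one step.
assert np.allclose(train_x[0, 1:], train_y[0, :-1])

# Predictions live on the normalized scale; value * s + m undoes it.
restored = train_y[0] * s + m
print(restored.ravel())  # approximately [1. 2. 3. 4.]
```

Because every target is the input shifted by one position, only the last element of each predicted window is a genuinely new value, which matches the listing keeping `next_seq[-1]` for the plot.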

