【PyTorch Neural Network Practice Cases】11 Training a Language Model with a Recurrent Neural Network and Making Simple Predictions
1 Steps of the Language Model
A brief overview: given some input text, the model continues outputting the sentence that follows.
1.1 Splitting the Requirement into Tasks
- (1) First feed the model a piece of text and have it output the single character that follows.
- (2) Take the character the model predicts, feed it back in as input, and let the model predict the next character; repeating this loop lets the RNN output a whole sentence (see the sketch after this list).
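Step (2) is a standard autoregressive loop. Below is a minimal sketch of the idea, assuming a model with the `model(char_tensor, hidden)` interface and `init_zero_state()` helper that are built later in this article; the greedy `argmax` pick here is a simplification, since the real code samples from a distribution instead:

```python
import torch

# Minimal sketch of autoregressive generation (assumed interface, not the
# final implementation): feed each prediction back in as the next input.
def generate(model, prime_idx, length):
    hidden = model.init_zero_state()               # start from a cleared RNN state
    inp = torch.LongTensor([[prime_idx]])          # index of the starting character
    result = [prime_idx]
    for _ in range(length):
        output, hidden = model(inp, hidden)        # predict the next character
        inp = output.argmax(dim=-1, keepdim=True)  # greedy pick (the real code samples)
        result.append(inp.item())
    return result  # character indices; the caller maps them back to characters
```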
1.2 Designing Functional Modules for the Task
- (1) The model must remember the semantics of the preceding text;
- (2) Given that semantic context plus one input character, it must output the next character.
1.3 Designing an Implementation Plan from the Functional Modules
The RNN model's interface can return two results: the predicted value and the current state.
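PyTorch's recurrent layers already expose exactly this interface. The standalone snippet below (illustrative only, not part of make_Language_model.py) shows `torch.nn.GRU` returning both the per-step outputs and the final hidden state:

```python
import torch

gru = torch.nn.GRU(input_size=8, hidden_size=16, num_layers=1)
x = torch.randn(5, 1, 8)    # a 5-step sequence: [sequence, batch, feature]
h0 = torch.zeros(1, 1, 16)  # initial state: [num_layers, batch, hidden]
output, hn = gru(x, h0)     # two results: per-step predictions and final state
print(output.shape)         # torch.Size([5, 1, 16]) -> prediction at every step
print(hn.shape)             # torch.Size([1, 1, 16]) -> state to carry forward
```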
2 Implementing the Language Model in Code
2.1 Preparing the Sample Data
Sample content:
在尘世的纷扰中,只要心头悬挂着远方的灯光,我们就会坚持不懈地走,理想为我们灌注了精神的蕴藉。所以,生活再平凡、再普通、再琐碎,我们都要坚持一种信念,默守一种精神,为自己积淀站立的信心,前行的气力。

(Roughly: "Amid the world's hustle, as long as a distant light hangs in our hearts, we will keep walking without pause; ideals pour spiritual sustenance into us. So however ordinary, plain, and trivial life may be, we must hold to a belief and quietly keep a spirit, accumulating the confidence to stand and the strength to move forward.")
2.1.1 Defining Basic Utility Functions --- make_Language_model.py (Part 1)
First import the required modules, then define the helper functions: get_ch_label() reads the text out of a file, and get_ch_label_v() converts the text into a vector of indices. The code is as follows:
```python
import numpy as np
import torch
import torch.nn.functional as F
import time
import random
from collections import Counter

# 1.1 Define the basic utility functions
RANDOM_SEED = 123
torch.manual_seed(RANDOM_SEED)
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def elapsed(sec):  # format an elapsed time in human-readable units
    if sec < 60:
        return str(sec) + " sec"
    elif sec < (60 * 60):
        return str(sec / 60) + " min"
    else:
        return str(sec / (60 * 60)) + " hour"

training_file = 'csv_list/wordstest.txt'  # the sample file

# Read the Chinese text from a single file
def get_ch_label(txt_file):  # extract the characters of the sample
    labels = ""
    with open(txt_file, 'rb') as f:
        for label in f:
            labels = labels + label.decode("gb2312", errors='ignore')
    return labels

# Read the Chinese text from multiple files
def readalltxt(txt_files):
    labels = []
    for txt_file in txt_files:
        target = get_ch_label(txt_file)
        labels.append(target)
    return labels

# Convert characters into index vectors; supports both files and in-memory strings
def get_ch_label_v(txt_file, word_num_map, txt_label=None):
    words_size = len(word_num_map)
    to_num = lambda word: word_num_map.get(word, words_size)
    if txt_file is not None:
        txt_label = get_ch_label(txt_file)
    # to_num() converts one character into its index; an unknown character maps
    # to words_size, i.e. one past the largest valid index
    labels_vector = list(map(to_num, txt_label))  # apply to_num() to every character
    return labels_vector
```

2.1.2 Sample Preprocessing --- make_Language_model.py (Part 2)
Sample preprocessing means reading the entire sample into training_data, collecting the full character table words, and generating the sample vector wordlabel along with word_num_map, which records the correspondence between characters and index values. The code is as follows:
```python
# 1.2 Sample preprocessing
training_data = get_ch_label(training_file)
print("Loading the training sample")
print("Sample length:", len(training_data))
counter = Counter(training_data)
words = sorted(counter)
words_size = len(words)
word_num_map = dict(zip(words, range(words_size)))
print("Vocabulary size:", words_size)
wordlabel = get_ch_label_v(training_file, word_num_map)
# Loading the training sample
# Sample length: 75
# Vocabulary size: 41 (after deduplication)
```

The output above shows that the sample file contains 75 characters in total, of which 41 remain after removing duplicates. These 41 characters serve as the vocabulary, establishing the mapping between characters and index values.
When training the model, every character is converted into its numeric index before being fed in. The model's output is a probability over these 41 characters; that is, each character is treated as one class.
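To make the mapping concrete, here is a small self-contained example (using a toy string rather than the real sample file) that builds a character table and index map with `Counter`, mirroring the preprocessing code above:

```python
from collections import Counter

text = "生活再平凡再普通"                      # toy sample: 8 characters, '再' repeated
counter = Counter(text)                       # character frequencies
words = sorted(counter)                       # deduplicated, sorted character table
word_num_map = dict(zip(words, range(len(words))))
print(len(text), len(words))                  # 8 7 -> 8 characters, 7 unique classes
print([word_num_map[ch] for ch in text])      # the text as a vector of class indices
```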
2.2 Implementation: Building the Recurrent Neural Network Model --- make_Language_model.py (Part 3)
The RNN model is built with a GRU: it receives the characters of a sequence one at a time and predicts the character at the next position.
The steps the model needs to complete are implemented in the code below.
2.2.1 Code Implementation
```python
# 1.3 Build the recurrent neural network (RNN) model
class GRURNN(torch.nn.Module):
    def __init__(self, word_size, embed_dim, hidden_dim, output_size, num_layers):
        super(GRURNN, self).__init__()
        self.num_layers = num_layers
        self.hidden_dim = hidden_dim
        self.embed = torch.nn.Embedding(word_size, embed_dim)
        # Define a multi-layer bidirectional GRU layer.
        # Prediction output: shape [sequence, batch, hidden_dim * 2];
        #   the last dimension is doubled because the RNN is bidirectional.
        # Sequence state: shape [num_layers * 2, batch, hidden_dim]
        self.gru = torch.nn.GRU(input_size=embed_dim,
                                hidden_size=hidden_dim,
                                num_layers=num_layers,
                                bidirectional=True)
        # Fully connected layer acting as the output layer: maps the GRU's
        # prediction to the final classification result
        self.fc = torch.nn.Linear(hidden_dim * 2, output_size)

    def forward(self, features, hidden):
        embeded = self.embed(features.view(1, -1))
        output, hidden = self.gru(embeded.view(1, 1, -1), hidden)
        output = self.fc(output.view(1, -1))
        return output, hidden

    def init_zero_state(self):
        # Initialize the GRU state. The state must be cleared before each
        # training iteration; the input sequence length is 1, so the second
        # argument of torch.zeros is 1.
        init_hidden = torch.zeros(self.num_layers * 2, 1, self.hidden_dim).to(DEVICE)
        return init_hidden
```

2.3 Implementation: Instantiating and Training the Model --- make_Language_model.py (Part 4)
```python
# 1.4 Instantiate the model class and train the model
EMBEDDING_DIM = 10  # embedding dimension
HIDDEN_DIM = 20     # hidden-layer dimension
NUM_LAYERS = 1      # number of layers
# Instantiate the model
model = GRURNN(words_size, EMBEDDING_DIM, HIDDEN_DIM, words_size, NUM_LAYERS)
model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

# Define the test function
def evaluate(model, prime_str, predict_len, temperature=0.8):
    hidden = model.init_zero_state().to(DEVICE)
    predicted = ""
    # Process the priming input to build up the semantic state
    for p in range(len(prime_str) - 1):
        _, hidden = model(prime_str[p], hidden)
        predicted = predicted + words[prime_str[p]]
    inp = prime_str[-1]  # the last input character
    predicted = predicted + words[inp]
    # Output predict_len predicted characters
    for p in range(predict_len):
        output, hidden = model(inp, hidden)  # feed the character and state into the model
        # Sample from a multinomial distribution. The temperature parameter and
        # the exponential rescale the model's output so that every value is
        # positive; torch.multinomial() would fail on negative weights.
        output_dist = output.data.view(-1).div(temperature).exp()
        inp = torch.multinomial(output_dist, 1)[0]  # draw one sample
        predicted = predicted + words[inp]  # convert the index back to a character
    return predicted

# Define the parameters and train
training_iters = 5000
display_step = 1000
n_input = 4
step = 0
offset = random.randint(0, n_input + 1)
end_offset = n_input + 1

while step < training_iters:  # train for the given number of iterations
    start_time = time.time()
    # Pick a fresh random offset once the current one runs past the end
    if offset > (len(training_data) - end_offset):
        offset = random.randint(0, n_input + 1)
    # Build the input sample
    inwords = wordlabel[offset:offset + n_input]
    inwords = np.reshape(np.array(inwords), [n_input, -1, 1])
    # Build the label sample (the input shifted by one character)
    out_onehot = wordlabel[offset + 1:offset + n_input + 1]
    hidden = model.init_zero_state()  # clear the RNN state
    optimizer.zero_grad()
    loss = 0.0
    inputs = torch.LongTensor(inwords).to(DEVICE)
    targets = torch.LongTensor(out_onehot).to(DEVICE)
    for c in range(n_input):  # feed the sample into the model one character at a time
        outputs, hidden = model(inputs[c], hidden)
        loss = loss + F.cross_entropy(outputs, targets[c].view(1))
    loss = loss / n_input
    loss.backward()
    optimizer.step()
    # Log progress
    if (step + 1) % display_step == 0:
        print(f'Time elapsed: {(time.time() - start_time)/60:.4f} min')
        print(f'step {step + 1} | Loss {loss.item():.2f}\n\n')
        with torch.no_grad():
            print(evaluate(model, inputs, 32), '\n')
        print(50 * '=')
    step = step + 1
    # After each iteration move the offset forward by n_input+1, so the samples
    # cover the text evenly; otherwise the characters near both ends of the
    # text would be trained on less often.
    offset = offset + (n_input + 1)

print("Finished!")
```

Note that the original post assigns `predicted + words[predict_len]` in the priming loop, which would append the same character repeatedly; the intended behavior, restored above, is to append each priming character `words[prime_str[p]]`.

2.4 Implementation: Running the Model to Generate Sentences --- make_Language_model.py (Part 5)
```python
# 1.5 Run the model to generate sentences
while True:
    prompt = "Enter a few characters: "
    sentence = input(prompt)
    inputword = sentence.strip()
    try:
        inputword = get_ch_label_v(None, word_num_map, inputword)
        keys = np.reshape(np.array(inputword), [len(inputword), -1, 1])
        # In get_ch_label_v(), a character not found in the dictionary is
        # assigned an out-of-range index; when evaluate() then calls the model,
        # there is no corresponding embedding vector, so an error is raised.
        model.eval()
        with torch.no_grad():
            sentence = evaluate(model, torch.LongTensor(keys).to(DEVICE), 32)
        print(sentence)
    except Exception:
        # Deliberate: when an input character is outside the model's
        # vocabulary, catch the resulting error instead of crashing
        print("I haven't learned that yet")
```

3 Full Code Listing --- make_Language_model.py
```python
import numpy as np
import torch
import torch.nn.functional as F
import time
import random
from collections import Counter

# 1.1 Define the basic utility functions
RANDOM_SEED = 123
torch.manual_seed(RANDOM_SEED)
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def elapsed(sec):  # format an elapsed time in human-readable units
    if sec < 60:
        return str(sec) + " sec"
    elif sec < (60 * 60):
        return str(sec / 60) + " min"
    else:
        return str(sec / (60 * 60)) + " hour"

training_file = 'csv_list/wordstest.txt'  # the sample file

# Read the Chinese text from a single file
def get_ch_label(txt_file):  # extract the characters of the sample
    labels = ""
    with open(txt_file, 'rb') as f:
        for label in f:
            labels = labels + label.decode("gb2312", errors='ignore')
    return labels

# Read the Chinese text from multiple files
def readalltxt(txt_files):
    labels = []
    for txt_file in txt_files:
        target = get_ch_label(txt_file)
        labels.append(target)
    return labels

# Convert characters into index vectors; supports both files and in-memory strings
def get_ch_label_v(txt_file, word_num_map, txt_label=None):
    words_size = len(word_num_map)
    to_num = lambda word: word_num_map.get(word, words_size)
    if txt_file is not None:
        txt_label = get_ch_label(txt_file)
    labels_vector = list(map(to_num, txt_label))  # apply to_num() to every character
    return labels_vector

# 1.2 Sample preprocessing
training_data = get_ch_label(training_file)
print("Loading the training sample")
print("Sample length:", len(training_data))
counter = Counter(training_data)
words = sorted(counter)
words_size = len(words)
word_num_map = dict(zip(words, range(words_size)))
print("Vocabulary size:", words_size)
wordlabel = get_ch_label_v(training_file, word_num_map)
# Loading the training sample
# Sample length: 75
# Vocabulary size: 41 (after deduplication)

# 1.3 Build the recurrent neural network (RNN) model
class GRURNN(torch.nn.Module):
    def __init__(self, word_size, embed_dim, hidden_dim, output_size, num_layers):
        super(GRURNN, self).__init__()
        self.num_layers = num_layers
        self.hidden_dim = hidden_dim
        self.embed = torch.nn.Embedding(word_size, embed_dim)
        # Bidirectional GRU: output is [sequence, batch, hidden_dim * 2],
        # state is [num_layers * 2, batch, hidden_dim]
        self.gru = torch.nn.GRU(input_size=embed_dim,
                                hidden_size=hidden_dim,
                                num_layers=num_layers,
                                bidirectional=True)
        self.fc = torch.nn.Linear(hidden_dim * 2, output_size)  # output layer

    def forward(self, features, hidden):
        embeded = self.embed(features.view(1, -1))
        output, hidden = self.gru(embeded.view(1, 1, -1), hidden)
        output = self.fc(output.view(1, -1))
        return output, hidden

    def init_zero_state(self):
        # Clear the GRU state before each training iteration
        init_hidden = torch.zeros(self.num_layers * 2, 1, self.hidden_dim).to(DEVICE)
        return init_hidden

# 1.4 Instantiate the model class and train the model
EMBEDDING_DIM = 10  # embedding dimension
HIDDEN_DIM = 20     # hidden-layer dimension
NUM_LAYERS = 1      # number of layers
model = GRURNN(words_size, EMBEDDING_DIM, HIDDEN_DIM, words_size, NUM_LAYERS)
model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

# Define the test function
def evaluate(model, prime_str, predict_len, temperature=0.8):
    hidden = model.init_zero_state().to(DEVICE)
    predicted = ""
    # Process the priming input to build up the semantic state
    for p in range(len(prime_str) - 1):
        _, hidden = model(prime_str[p], hidden)
        predicted = predicted + words[prime_str[p]]
    inp = prime_str[-1]  # the last input character
    predicted = predicted + words[inp]
    # Output predict_len predicted characters
    for p in range(predict_len):
        output, hidden = model(inp, hidden)
        # Temperature plus exp keeps all sampling weights positive for
        # torch.multinomial(), which then draws the next character
        output_dist = output.data.view(-1).div(temperature).exp()
        inp = torch.multinomial(output_dist, 1)[0]
        predicted = predicted + words[inp]
    return predicted

# Define the parameters and train
training_iters = 5000
display_step = 1000
n_input = 4
step = 0
offset = random.randint(0, n_input + 1)
end_offset = n_input + 1

while step < training_iters:
    start_time = time.time()
    # Pick a fresh random offset once the current one runs past the end
    if offset > (len(training_data) - end_offset):
        offset = random.randint(0, n_input + 1)
    inwords = wordlabel[offset:offset + n_input]
    inwords = np.reshape(np.array(inwords), [n_input, -1, 1])
    out_onehot = wordlabel[offset + 1:offset + n_input + 1]  # labels: input shifted by one
    hidden = model.init_zero_state()  # clear the RNN state
    optimizer.zero_grad()
    loss = 0.0
    inputs = torch.LongTensor(inwords).to(DEVICE)
    targets = torch.LongTensor(out_onehot).to(DEVICE)
    for c in range(n_input):  # feed the sample in one character at a time
        outputs, hidden = model(inputs[c], hidden)
        loss = loss + F.cross_entropy(outputs, targets[c].view(1))
    loss = loss / n_input
    loss.backward()
    optimizer.step()
    if (step + 1) % display_step == 0:
        print(f'Time elapsed: {(time.time() - start_time)/60:.4f} min')
        print(f'step {step + 1} | Loss {loss.item():.2f}\n\n')
        with torch.no_grad():
            print(evaluate(model, inputs, 32), '\n')
        print(50 * '=')
    step = step + 1
    # Move the offset forward by n_input+1 so samples cover the text evenly
    offset = offset + (n_input + 1)

print("Finished!")

# 1.5 Run the model to generate sentences
while True:
    prompt = "Enter a few characters: "
    sentence = input(prompt)
    inputword = sentence.strip()
    try:
        inputword = get_ch_label_v(None, word_num_map, inputword)
        keys = np.reshape(np.array(inputword), [len(inputword), -1, 1])
        model.eval()
        with torch.no_grad():
            sentence = evaluate(model, torch.LongTensor(keys).to(DEVICE), 32)
        print(sentence)
    except Exception:
        # Deliberate: catch the error raised for out-of-vocabulary input
        print("I haven't learned that yet")
```

Model result: the training is not great, but the model is still usable.