[Spark NLP] Chapter 15: Chatbots
Contents

Problem Statement and Constraints
Planning the Project
Designing the Solution
Implementing the Solution
Testing and Measuring the Solution
Business Metrics
Model-Centric Metrics
Review
Conclusion
When we discussed language models, we showed how to generate text. Building a chatbot is similar, except that we are modeling an exchange. Depending on how we approach the problem, this can make our requirements more complex, or actually simpler.
In this chapter, we will discuss some ways this can be modeled, and then we will build a program that uses a generative model to take input and generate responses. First, let's talk about what discourse is.
Morphology and syntax tell us how morphemes combine into words, and how words combine into phrases and sentences. Combining sentences into larger language acts is not as easy to model, yet there is clearly a notion of inappropriate sentence combinations. Let's look at some examples:
I went to the doctor, yesterday. It is just a sprained ankle.
I went to the doctor, yesterday. Mosquitoes have 47 teeth.
In the first example, the second sentence is obviously related to the first. From these two sentences, combined with common sense, we can infer that the speaker went to the doctor because of an ankle problem, and that the result was a sprain. The second example does not make sense. From a linguistic point of view, sentences are generated from concepts and then encoded into words and phrases. The concepts expressed by a sequence of sentences are connected to each other, so a sequence of sentences should be linked by related concepts. This is true whether there is one speaker or several in a conversation.
The pragmatics of a discourse is important to understanding how to model it. If we are modeling a customer-service exchange, the range of responses may be limited. These limited types of responses are often called intents. When building a customer-service chatbot, this greatly reduces the potential complexity. If we are modeling general conversation, things can become much more difficult. Language models learn what is likely to occur in a sequence, but they cannot learn to generate concepts. So our options are either to build some model that simulates likely sequences, or to find a way to cheat.
We can cheat by building canned responses for unrecognized intents. For example, if a user makes a statement our simple model does not expect, we can have it respond, "I'm sorry, I don't understand." If we are logging the conversations, we can use the exchanges that fell back to canned responses to extend the set of intents we cover.
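The canned-response fallback described above can be sketched as a simple keyword-based intent matcher. The intents, keywords, and response wording below are hypothetical examples, not from the chapter:

```python
import re

# Hypothetical intents and trigger keywords for a customer-service bot.
INTENT_KEYWORDS = {
    "billing": {"bill", "invoice", "charge", "refund"},
    "shipping": {"ship", "delivery", "track", "package"},
}

# The canned fallback for anything the bot does not recognize.
FALLBACK = "I'm sorry, I don't understand."

def respond(utterance):
    """Return a canned response for the first matching intent, else the fallback."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if tokens & keywords:
            return "It sounds like you have a {} question. Let me help with that.".format(intent)
    return FALLBACK
```

Utterances that trigger the fallback are exactly the ones worth logging, since they show which intents are missing from the table.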
In the example we cover here, we will build a program that models the discourse as a whole. Essentially, it is a language model; the difference lies in how we use it.
This chapter differs from the previous chapters in that it does not use Spark. Spark is great for batch processing large amounts of data; it is not as good for interactive applications. Also, recurrent neural networks can take a very long time to train on large amounts of data, so in this chapter we work with a small data set. If you have the right hardware, you can change the NLTK processing to use Spark NLP.
Problem Statement and Constraints
We are going to build a story-writing tool. The idea is to help someone write an original story in the style of the Grimm fairy tales. This model will be considerably more complex than the earlier language model, in the sense that it contains many more parameters. The program will be a script that asks for an input sentence and generates a new sentence. The user then takes that sentence, modifies and corrects it, and enters it.
What is the problem we are trying to solve?
We need a system that recommends the next sentence in a story. We must also recognize the limitations of text-generation techniques, so we need to keep the user in the loop. Therefore, we need a model that can generate relevant text and a system that lets us review the output.
What constraints are there?
First, we need a model with two notions of context: the previous sentence and the current sentence. We do not need to worry much about performance, because the program interacts with a human. This may seem counterintuitive, since most interactive systems require rather low latency. However, if you consider what this program produces, waiting one to three seconds for a response is not unreasonable.
How do we solve the problem within these constraints?
We will build a neural network for generating text, specifically an RNN, as described in Chapters 4 and 8. We could learn word embeddings within this model, but instead we will use prebuilt embeddings. This will help us train the model faster.
Planning the Project
Most of the work in this project will be developing the model. Once we have the model, we will build a simple script that we can use to write our own Grimm-style fairy tale. Once we have developed this script, the model could potentially be used to drive a Twitter bot or a Slackbot.
In a real production setting for text generation, we would want to monitor the quality of the generated text. This would allow us to improve the generated text by developing more targeted training data.
Designing the Solution
If you recall the language model we built earlier, we used three layers. We fed in a fixed-size window of characters and predicted the next character. Now we need to find a way to take a larger span of text into account. There are a few options.
Many RNN architectures include a layer for learning word embeddings. That would only require us to learn more parameters, so we will use a pretrained GloVe model instead. Also, we will build the model at the token level, rather than at the character level as before.
We could make the window size much larger than the average sentence. The benefit is that we keep the same model architecture. The downside is that our LSTM layer would have to maintain information over very long distances. Alternatively, we could use an architecture of the kind used for machine translation.
Let's consider the concatenation approach.
The current input will be a window over the sentence, so for every window of a given sentence, we use the same context vector. The benefit of this approach is that it can scale to multiple sentences. The downside is that the model has to learn to balance near and distant information.
Let's consider the stateful approach.
By reducing the influence of the previous sentence, this approach helps make training easier. However, it is a double-edged sword, because the context gives us less information. This is the approach we will use.
Implementing the Solution
Let's start with the imports. This chapter relies on Keras.
```python
from collections import Counter
import pickle as pkl

import nltk
import numpy as np
import pandas as pd

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense, CuDNNLSTM
from keras.layers.merge import Concatenate
import keras.utils as ku
import keras.preprocessing as kp
import tensorflow as tf

np.random.seed(1)
tf.set_random_seed(2)
```

Let's also define some special tokens for the beginnings and ends of sentences, as well as for unknown tokens.
```python
START = '>'
END = '###'
UNK = '???'
```

Now we can load the data. We need to replace some special characters.
```python
with open('grimms_fairytales.txt', encoding='UTF-8') as fp:
    text = fp.read()

text = text\
    .replace('\t', ' ')\
    .replace('“', '"')\
    .replace('”', '"')\
    .replace('‘', "'")\
    .replace('’', "'")
```

Now we can process the text into tokenized sentences.
```python
sentences = nltk.tokenize.sent_tokenize(text)
sentences = [s.strip() for s in sentences]
sentences = [
    [t.lower() for t in nltk.tokenize.wordpunct_tokenize(s)]
    for s in sentences
]

word_counts = Counter([t for s in sentences for t in s])
word_counts = pd.Series(word_counts)
vocab = [START, END, UNK] + list(sorted(word_counts.index))
```

We need to define some hyperparameters for our model.
- dim is the size of the token embeddings
- w is the size of the window we will use
- max_len is the sentence length we will use
- units is the size of the state vector for the LSTM
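The assignments themselves do not appear in this excerpt; judging from the GloVe file used below and the shapes printed later (context (42, 50), window (10, 50), LSTM output size 200), they would be:

```python
dim = 50      # glove.6B.50d embeddings are 50-dimensional
w = 10        # sliding-window width over the current sentence
max_len = 42  # context sentences are padded/truncated to this length
units = 200   # size of the LSTM state vector
```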
Now, let's load the GloVe embeddings.
```python
glove = {}
with open('glove.6B/glove.6B.50d.txt', encoding='utf-8') as fp:
    for line in fp:
        token, embedding = line.split(maxsplit=1)
        if token in vocab:
            embedding = np.fromstring(embedding, 'f', sep=' ')
            glove[token] = embedding

vocab = list(sorted(glove.keys()))
vocab_size = len(vocab)
```

We also need a lookup for the one-hot-encoded outputs.
```python
i2t = dict(enumerate(vocab))
t2i = {t: i for i, t in i2t.items()}

token_oh = ku.to_categorical(np.arange(vocab_size))
token_oh = {t: token_oh[i, :] for t, i in t2i.items()}
```

Now we can define some utility functions.
We need to pad the ends of sentences; otherwise, we would not be able to learn from the last words in each sentence.
```python
def pad_sentence(sentence, length):
    # Truncate to length, then pad with END tokens,
    # e.g. pad_sentence(['a', 'b'], 4) -> ['a', 'b', '###', '###']
    sentence = sentence[:length]
    if len(sentence) < length:
        sentence += [END] * (length - len(sentence))
    return sentence
```

We also need to convert sentences into matrices.
```python
def sent2mat(sentence, embedding):
    # Look up each token's embedding, falling back to the UNK vector
    mat = [embedding.get(t, embedding[UNK]) for t in sentence]
    return np.array(mat)
```

We need a function that turns a sequence into a sequence of sliding windows.
```python
def slide_seq(seq, w):
    window = []
    target = []
    for i in range(len(seq) - w - 1):
        window.append(seq[i:i+w])
        target.append(seq[i+w])
    return window, target
```
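A quick worked example makes the windowing concrete (the function is repeated here so the snippet stands alone):

```python
def slide_seq(seq, w):
    # Collect each width-w window and the token that follows it.
    window = []
    target = []
    for i in range(len(seq) - w - 1):
        window.append(seq[i:i+w])
        target.append(seq[i+w])
    return window, target

windows, targets = slide_seq(['a', 'b', 'c', 'd', 'e'], 2)
# windows == [['a', 'b'], ['b', 'c']], targets == ['c', 'd']
# Note the loop bound means the final token is never used as a target;
# the END padding added to each sentence makes this mostly harmless.
```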
Now we can build our input matrices. We will have two: one for the context and one for the current sentence.

```python
Xc = []
Xi = []
Y = []

for i in range(len(sentences) - 1):
    context_sentence = pad_sentence(sentences[i], max_len)
    xc = sent2mat(context_sentence, glove)
    input_sentence = [START] * (w-1) + sentences[i+1] + [END] * (w-1)
    for window, target in zip(*slide_seq(input_sentence, w)):
        xi = sent2mat(window, glove)
        y = token_oh.get(target, token_oh[UNK])
        Xc.append(np.copy(xc))
        Xi.append(xi)
        Y.append(y)

Xc = np.array(Xc)
Xi = np.array(Xi)
Y = np.array(Y)

print('context sentence: ', xc.shape)
print('input sentence: ', xi.shape)
print('target sentence: ', y.shape)
```

```
context sentence:  (42, 50)
input sentence:  (10, 50)
target sentence:  (4407,)
```

Let's build our model.
```python
input_c = Input(shape=(max_len, dim,), dtype='float32')
lstm_c, h, c = LSTM(units, return_state=True)(input_c)

input_i = Input(shape=(w, dim,), dtype='float32')
lstm_i = LSTM(units)(input_i, initial_state=[h, c])

out = Dense(vocab_size, activation='softmax')(lstm_i)

model = Model(input=[input_c, input_i], output=[out])
print(model.summary())
```

```
Model: "model_1"
__________________________________________________________________________
Layer (type)          Output Shape          Param #   Connected to
==========================================================================
input_1 (InputLayer)  (None, 42, 50)        0
__________________________________________________________________________
input_2 (InputLayer)  (None, 10, 50)        0
__________________________________________________________________________
lstm_1 (LSTM)         [(None, 200), (None,  200800    input_1[0][0]
__________________________________________________________________________
lstm_2 (LSTM)         (None, 200)           200800    input_2[0][0]
                                                      lstm_1[0][1]
                                                      lstm_1[0][2]
__________________________________________________________________________
dense_1 (Dense)       (None, 4407)          885807    lstm_2[0][0]
==========================================================================
Total params: 1,287,407
Trainable params: 1,287,407
Non-trainable params: 0
__________________________________________________________________________
None
```

```python
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
```

Now we can train our model. Depending on your hardware, this can take about four minutes per epoch on a CPU. This is our most complex model so far, with nearly 1.3 million parameters.
```
Epoch 1/10
145061/145061 [==============================] - 241s 2ms/step - loss: 3.7840 - accuracy: 0.3894
...
Epoch 10/10
145061/145061 [==============================] - 244s 2ms/step - loss: 1.8933 - accuracy: 0.5645
```

Once we have trained the model, we can try generating some sentences. This function takes a context sentence and an input sentence (we can simply supply a single word to start). The function appends tokens to the input sentence until the END token is generated or the maximum allowed length is reached.
```python
def generate_sentence(context_sentence, input_sentence, max_len=100):
    context_sentence = [
        t.lower() for t in nltk.tokenize.wordpunct_tokenize(context_sentence)]
    context_sentence = pad_sentence(context_sentence, max_len)
    context_vector = sent2mat(context_sentence, glove)

    input_sentence = [
        t.lower() for t in nltk.tokenize.wordpunct_tokenize(input_sentence)]
    input_sentence = [START] * (w-1) + input_sentence
    input_sentence = input_sentence[:w]
    output_sentence = input_sentence

    input_vector = sent2mat(input_sentence, glove)
    predicted_vector = model.predict([[context_vector], [input_vector]])
    predicted_token = i2t[np.argmax(predicted_vector)]
    output_sentence.append(predicted_token)

    i = 0
    while predicted_token != END and i < max_len:
        input_sentence = input_sentence[1:w] + [predicted_token]
        input_vector = sent2mat(input_sentence, glove)
        predicted_vector = model.predict([[context_vector], [input_vector]])
        predicted_token = i2t[np.argmax(predicted_vector)]
        output_sentence.append(predicted_token)
        i += 1
    return output_sentence
```

Because we need to supply the first word of each new sentence, we can simply sample from the first words found in the corpus. Let's save the distribution of first words as JSON.
```python
first_words = Counter([s[0] for s in sentences])
first_words = pd.Series(first_words)
first_words = first_words / first_words.sum()
first_words.to_json('grimm-first-words.json')

with open('glove-dict.pkl', 'wb') as out:
    pkl.dump(glove, out)
with open('vocab.pkl', 'wb') as out:
    pkl.dump(i2t, out)
```

Let's see what is generated without any human intervention.
```python
context_sentence = '''
In old times, when wishing was having, there lived a King whose daughters
were all beautiful, but the youngest was so beautiful that the sun itself,
which has seen so much, was astonished whenever it shone in her face.
'''.strip().replace('\n', ' ')

input_sentence = np.random.choice(first_words.index, p=first_words)

for _ in range(10):
    print(context_sentence, END)
    output_sentence = generate_sentence(context_sentence, input_sentence, max_len)
    output_sentence = ' '.join(output_sentence[w-1:-1])
    context_sentence = output_sentence
    input_sentence = np.random.choice(first_words.index, p=first_words)
print(output_sentence, END)
```

```
In old times, when wishing was having, there lived a King whose daughters were all beautiful, but the youngest was so beautiful that the sun itself, which has seen so much, was astonished whenever it shone in her face. ###
" what do you desire ??? ###
the king ' s son , however , was still beautiful , and a little chair there ' s blood and so that she is alive ??? ###
the king ' s son , however , was still beautiful , and the king ' s daughter was only of silver , and the king ' s son came to the forest , and the king ' s son seated himself on the leg , and said , " i will go to church , and you shall be have lost my life ??? ###
" what are you saying ??? ###
cannon - maiden , and the king ' s daughter was only a looker - boy . ###
but the king ' s daughter was humble , and said , " you are not afraid ??? ###
then the king said , " i will go with you ??? ###
" i will go with you ??? ###
he was now to go with a long time , and the bird threw in the path , and the strong of them were on their of candles and bale - plants . ###
then the king said , " i will go with you ??? ###
```

This model will not pass a Turing test any time soon, which is why we need a human in the loop. Let's build our script. First, let's save the model.
```python
model.save('grimm-model')
```

Our script needs access to some of our utility functions, as well as the hyperparameters, such as dim and w.
```python
%%writefile fairywriter.py
"""
This script helps you generate a fairy tale.
"""

import pickle as pkl

import nltk
import numpy as np
import pandas as pd

from keras.models import load_model
import keras.utils as ku
import keras.preprocessing as kp
import tensorflow as tf

START = '>'
END = '###'
UNK = '???'

FINISH_CMDS = ['finish', 'f']
BACK_CMDS = ['back', 'b']
QUIT_CMDS = ['quit', 'q']
CMD_PROMPT = ' | '.join(','.join(c) for c in [FINISH_CMDS, BACK_CMDS, QUIT_CMDS])
QUIT_PROMPT = '"{}" to quit'.format('" or "'.join(QUIT_CMDS))
ENDING = ['THE END']

def pad_sentence(sentence, length):
    sentence = sentence[:length]
    if len(sentence) < length:
        sentence += [END] * (length - len(sentence))
    return sentence

def sent2mat(sentence, embedding):
    mat = [embedding.get(t, embedding[UNK]) for t in sentence]
    return np.array(mat)

def generate_sentence(context_sentence, input_sentence, vocab, max_len=100,
                      hparams=(42, 50, 10)):
    max_len, dim, w = hparams
    context_sentence = [
        t.lower() for t in nltk.tokenize.wordpunct_tokenize(context_sentence)]
    context_sentence = pad_sentence(context_sentence, max_len)
    context_vector = sent2mat(context_sentence, glove)
    input_sentence = [
        t.lower() for t in nltk.tokenize.wordpunct_tokenize(input_sentence)]
    input_sentence = [START] * (w-1) + input_sentence
    input_sentence = input_sentence[:w]
    output_sentence = input_sentence
    input_vector = sent2mat(input_sentence, glove)
    predicted_vector = model.predict([[context_vector], [input_vector]])
    predicted_token = vocab[np.argmax(predicted_vector)]
    output_sentence.append(predicted_token)
    i = 0
    while predicted_token != END and i < max_len:
        input_sentence = input_sentence[1:w] + [predicted_token]
        input_vector = sent2mat(input_sentence, glove)
        predicted_vector = model.predict([[context_vector], [input_vector]])
        predicted_token = vocab[np.argmax(predicted_vector)]
        output_sentence.append(predicted_token)
        i += 1
    return output_sentence

if __name__ == '__main__':
    model = load_model('grimm-model')
    (_, max_len, dim), (_, w, _) = model.get_input_shape_at(0)
    hparams = (max_len, dim, w)
    first_words = pd.read_json('grimm-first-words.json', typ='series')
    with open('glove-dict.pkl', 'rb') as fp:
        glove = pkl.load(fp)
    with open('vocab.pkl', 'rb') as fp:
        vocab = pkl.load(fp)

    print("Let's write a story!")
    title = input('Give me a title ({}) '.format(QUIT_PROMPT))
    story = [title]
    context_sentence = title
    input_sentence = np.random.choice(first_words.index, p=first_words)
    if title.lower() in QUIT_CMDS:
        exit()
    print(CMD_PROMPT)
    while True:
        input_sentence = np.random.choice(first_words.index, p=first_words)
        generated = generate_sentence(context_sentence, input_sentence,
                                      vocab, hparams=hparams)
        generated = ' '.join(generated)
        # The model proposes a suggested sentence
        print('Suggestion:', generated)
        # The user replies with the sentence they want to add.
        # They can modify the suggestion or write their own; this is the
        # sentence that will be used to make the next suggestion.
        sentence = input('Sentence: ')
        if sentence.lower() in QUIT_CMDS:
            story = []
            break
        elif sentence.lower() in FINISH_CMDS:
            story.append(np.random.choice(ENDING))
            break
        elif sentence.lower() in BACK_CMDS:
            if len(story) == 1:
                print('You are at the beginning')
                continue
            story = story[:-1]
            context_sentence = story[-1]
            continue
        else:
            story.append(sentence)
            context_sentence = sentence

    print('\n'.join(story))
    print('exiting...')
```

Let's run our script. I used it by reading the suggestions and adding elements from them to the next line. A more sophisticated model might generate sentences that can be edited and added directly, but this one is not quite there.
```
%run fairywriter.py
Let's write a story!
Give me a title ("quit" or "q" to quit) The Wolf Goes Home
finish,f | back,b | quit,q
Suggestion: > > > > > > > > > and when they had walked for the time , and the king ' s son seated himself on the leg , and said , " i will go to church , and you shall be have lost my life ??? ###
Sentence: There was once a prince who got lost in the woods on the way to a church.
Suggestion: > > > > > > > > > she was called hans , and as the king ' s daughter , who was so beautiful than the children , who was called clever elsie . ###
Sentence: The prince was called Hans, and he was more handsome than the boys.
Suggestion: > > > > > > > > > no one will do not know what to say , but i have been compelled to you ??? ###
Sentence: The Wolf came along and asked, "does no one know where are?"
Suggestion: > > > > > > > > > there was once a man who had a daughter who had three daughters , and he had a child and went , the king ' s daughter , and said , " you are growing and thou now , i will go and fetch
Sentence: The Wolf had three daughters, and he said to the prince, "I will help you return home if you take one of my daughters as your betrothed."
Suggestion: > > > > > > > > > but the king ' s daughter was humble , and said , " you are not afraid ??? ###
Sentence: The prince asked, "are you not afraid that she will be killed as soon as we return home?"
Suggestion: > > > > > > > > > i will go and fetch the golden horse ??? ###
Sentence: The Wolf said, "I will go and fetch a golden horse as dowry."
Suggestion: > > > > > > > > > one day , the king ' s daughter , who was a witch , and lived in a great forest , and the clouds of earth , and in the evening , came to the glass mountain , and the king ' s son
Sentence: The Wolf went to find the forest witch that she might conjure a golden horse.
Suggestion: > > > > > > > > > when the king ' s daughter , however , was sitting on a chair , and sang and reproached , and said , " you are not to be my wife , and i will take you to take care of your ??? ###
Sentence: The witch reproached the wolf saying, "you come and ask me such a favor with no gift yourself?"
Suggestion: > > > > > > > > > then the king said , " i will go with you ??? ###
Sentence: So the wolf said, "if you grant me this favor, I will be your servant."
Suggestion: > > > > > > > > > he was now to go with a long time , and the other will be polluted , and we will leave you ??? ###
Sentence: f
The Wolf Goes Home
There was once a prince who got lost in the woods on the way to a church.
The prince was called Hans, and he was more handsome than the boys.
The Wolf came along and asked, "does no one know where are?"
The Wolf had three daughters, and he said to the prince, "I will help you return home if you take one of my daughters as your betrothed."
The prince asked, "are you not afraid that she will be killed as soon as we return home?"
The Wolf said, "I will go and fetch a golden horse as dowry."
The Wolf went to find the forest witch that she might conjure a golden horse.
The witch reproached the wolf saying, "you come and ask me such a favor with no gift yourself?"
So the wolf said, "if you grant me this favor, I will be your servant."
THE END
exiting...
```

You can train for additional epochs to get better suggestions, but beware of overfitting. If you overfit this model, it will produce worse results when given context and input it does not recognize.
Now that we have a model we can interact with, the next step is to integrate it with a chatbot system. Most systems require some server that serves the model. The specifics depend on your chatbot platform.
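Most chatbot platforms ultimately POST the user's message to an endpoint you control. A framework-independent sketch of the request-handling core is below; the model call is stubbed out here, and in practice it would wrap generate_sentence from the script above:

```python
import json

def suggest_stub(context, seed_word):
    # Stand-in for the trained model; a real handler would tokenize the
    # context and call generate_sentence instead.
    return "{} ...".format(seed_word)

def handle_request(body, suggest=suggest_stub):
    """Parse a JSON request body and return a JSON response string."""
    payload = json.loads(body)
    context = payload.get("context", "")
    seed = payload.get("seed", "the")
    return json.dumps({"suggestion": suggest(context, seed)})
```

A function with this shape can be mounted behind whatever web framework or serverless runtime your chatbot platform expects.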
Testing and Measuring the Solution
More than with most applications, measuring a chatbot depends on the ultimate purpose of the product. Let's consider the different kinds of metrics we would use for measurement.
Business Metrics
If you are building a chatbot to support customer service, the business metrics will center on the customer experience. If you are building a chatbot for entertainment purposes, as is the case here, there are no obvious business metrics. However, if the entertainment chatbot is used for marketing, you can use marketing metrics.
Model-Centric Metrics
It is difficult to measure live interactions in the same way the model is measured in training. In training, we know the "correct" answer, but because of the interactive nature of the model, there is no clear correct answer in live use. To measure the live model, you will need to manually label conversations.
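For a human-in-the-loop tool like this one, a simple proxy metric is how often users accept, or only lightly edit, the suggested sentence. A sketch, assuming we have logged (suggestion, final_sentence) pairs; the log format and the 0.5 threshold are assumptions, not from the chapter:

```python
def acceptance_rate(pairs, threshold=0.5):
    """Fraction of suggestions whose token overlap with the user's final
    sentence meets the threshold (Jaccard similarity on lowercased tokens)."""
    accepted = 0
    for suggestion, final in pairs:
        a = set(suggestion.lower().split())
        b = set(final.lower().split())
        if a and b and len(a & b) / len(a | b) >= threshold:
            accepted += 1
    return accepted / len(pairs) if pairs else 0.0

logs = [
    ("the king went to the forest", "the king went to the forest"),  # accepted verbatim
    ("i will go with you", "the wolf ran away"),                     # rewritten entirely
]
# acceptance_rate(logs) == 0.5
```

Tracking this rate over time gives a rough signal of whether retraining improved the suggestions, without requiring full manual labeling of every conversation.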
現(xiàn)在讓我們談?wù)劵A(chǔ)設(shè)施。
Review
When reviewing a chatbot, you need the normal reviews required for any project. The additional requirement is putting the chatbot in front of people who act as proxies for real users. As with any application that requires user interaction, user testing is central.
Conclusion
In this chapter, we learned how to build a model for an interactive application. There are many different kinds of chatbots. The example we saw here was based on a language model, but we could also build a recommendation model. It all depends on what kind of interaction you expect. In our case, we were entering and receiving full sentences. If your application has a restricted set of responses, your task becomes easier.