Ghost Writing with TensorFlow
Music has long been considered one of the most influential and powerful art forms. Artists use it to express raw emotion and pass that emotion on to the listener.
As a music fan myself, it was only natural to wonder how difficult it would be to generate lyrics using recurrent neural networks (RNNs). I really enjoy rap and hip hop, so I chose to work with artists in those genres. It was also a good fit because there is existing research on rap lyric generation.
Recurrent neural networks can be used for many language modeling tasks, such as chat bots, predictive keyboards, and language translation. They work well for text generation because they operate on sequential data, which matters here because we need to preserve the context of a sentence or, in this case, a verse.
An RNN works by looking at the previous elements of a sequence to predict the next one. Let’s say we have an RNN trained to perform text prediction on your phone’s keyboard (you know, the word predictions that pop up as you type). Based on previous messages I’ve typed, I could input something like “Wezley is super …” and the neural network will take that sequence and return a set of predicted next words, such as “cool”, “smart”, and “funny”.
Overview of Architectures
To add to this experiment, I wanted to train different recurrent neural network architectures to perform the rap lyric generation. I chose SimpleRNN, Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Convolutional Neural Network + Long Short-Term Memory (CNN+LSTM) based architectures. I chose these so that we can test each architecture against the others and determine which performs best on the task. We don’t know if one model will outperform the other unless we try, right?
The SimpleRNN architecture served mostly as a baseline for judging how the other architectures perform. A SimpleRNN is not very good for this specific task because of the vanishing gradient problem. This means that it won’t be very useful for remembering context throughout a bar or verse, because it loses early information about the sequence the further into the sequence we go. This leads to incoherent verses that you’ll see later on in the article. If you are curious and want a TL;DR of how the model performed: we get verses such as “I am, what stone private bedroom now” or “And how the low changed up last gas guitar thing.” Both of these verses were generated from a dataset of Drake lyrics. Neither of them makes much sense. However, I’d argue that they’re still fire bars.
The Gated Recurrent Unit architecture was the next architecture I tested. The GRU differs from the SimpleRNN by being able to remember information from a little further back in the sequence. It accomplishes this with two gates, a reset gate and an update gate. These gates control whether the previous sequence information continues through the network or gets updated with the most recent step. I’ll go a little more in-depth on this later in the article.
The Long Short-Term Memory architecture was another architecture that was tested for this project. The LSTM differs from the SimpleRNN by, again, being able to remember information from further back in the sequence. The LSTM has an advantage over the GRU in that it can remember longer sequences because it is a little more complex. The LSTM has three gates, instead of two, that control the information it forgets, carries on in the sequence, and updates from the latest step. Again, the LSTM will be covered a little more in-depth later on in the article.
The final architecture I tested was a mixture of a convolutional neural network and a long short-term memory RNN. I threw this one in as a thought experiment based on a paper I read that used a C-LSTM architecture for text classification (see the references in the Colab notebook). I wondered if the CNN would allow the LSTM to generalize a bar and better understand the stylistic elements of an artist. While it was fun to see a CNN in a text generation problem, I didn’t notice much of a difference between this and the LSTM model.
Obtaining the Dataset
With a defined set of architectures created, I set out to find the dataset I wanted to use for this problem.
The dataset didn’t really matter to me, so long as it contained lyrics from prominent artists. I wanted to generate lyrics based off of artists I listen to often. This was so I could recognize if the model was able to generate similar lyrics. Don’t worry though! I didn’t determine a model’s performance solely off of what I thought sounded good. I also used a set of metrics that have been described in recent literature on the subject.
The dataset I found is hosted on Kaggle and was provided by Paul Mooney.
This dataset was great because it contained lyrics from many of the rap/hip hop artists that I listen to. It also didn’t have any weird characters and took care of some of the censoring of explicit lyrics.
Preparing the Data
With the dataset in hand, I set out to load and prepare the data for training.
The first thing I did was load in the data and finish censoring it. I used a preexisting Python library to perform the censorship so that I didn’t have to create a “naughty words” list manually. Unfortunately the library didn’t censor every word, so I apologize if you stumble across something explicit in the published notebook for this article.
With the lyrics read in and censored, I went ahead and split them into an array of bars. I didn’t do any other processing to the bars, but in the future I may try this again and add <start> and <end> tags to each bar. This way the model can possibly learn when to end the sequence. For now, I had it generate bars of randomized lengths and the results were good enough for the initial experiment.
Once I finished splitting the data, I created a Markov model utilizing the markovify Python library. The Markov model will be used to generate the beginning sequences for each bar. This will help us ensure that the beginning of the sequence is somewhat coherent before passing it to the trained models. The models will then take the sequence and finish generating the lyrics for the bar.
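The article builds its Markov model with markovify. As a dependency-free illustration of the same idea, the sketch below builds a first-order, word-level Markov chain by hand and samples a four-word seed from it. The function names `build_chain` and `generate_seed` and the toy bars are assumptions made for illustration; they are not taken from the notebook.

```python
import random
from collections import defaultdict


def build_chain(bars):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    starts = []
    for bar in bars:
        words = bar.split()
        if not words:
            continue
        starts.append(words[0])  # remember which words open a bar
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain, starts


def generate_seed(chain, starts, length=4, rng=random):
    """Sample up to `length` words, starting from a word that opens a bar."""
    word = rng.choice(starts)
    seed = [word]
    # Stop early if the current word was never followed by anything.
    while len(seed) < length and chain.get(word):
        word = rng.choice(chain[word])
        seed.append(word)
    return " ".join(seed)


# Hypothetical toy corpus standing in for the artist's bars.
bars = [
    "i got it on my own",
    "i know you want it all",
    "you got it all on me",
]
chain, starts = build_chain(bars)
print(generate_seed(chain, starts, rng=random.Random(0)))
```

Every adjacent word pair in the seed was seen in the corpus, which is what keeps the opening of each bar locally coherent before the RNN takes over.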
The next step was to tokenize the lyrics so that they would be in a format that the models could understand. Tokenization is actually a pretty cool process: it builds a dictionary that ties each word to an ID, then turns each bar into an array of the corresponding word IDs. There is an example of this in the published notebook, but here’s another one:
For example, let’s say we were to tokenize the following sentences:
“Wezley is cool”
“You are cool”
“TensorFlow is very cool”
The following sequences would be produced:
[1, 2, 3]
[4, 5, 3]
[6, 2, 7, 3]
Where the word dictionary is:
{‘Wezley’: 1, ‘is’: 2, ‘cool’: 3, ‘You’: 4, ‘are’: 5, ‘TensorFlow’: 6, ‘very’: 7}
As-is, these sequences can’t be fed into a model since they are of different lengths. To fix this, we add padding to the front of the arrays.
With padding we get:
[0, 1, 2, 3]
[0, 4, 5, 3]
[6, 2, 7, 3]
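The worked example above can be reproduced in a few lines of plain Python. This is a sketch that makes each step visible, not the notebook's code: the helper names are made up, IDs are assigned in order of first appearance to match the dictionary shown above, and ID 0 is reserved for padding.

```python
def build_word_index(sentences):
    """Assign IDs from 1 in order of first appearance; 0 is kept for padding."""
    word_index = {}
    for sentence in sentences:
        for word in sentence.split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
    return word_index


def texts_to_sequences(sentences, word_index):
    """Replace every word with its ID."""
    return [[word_index[word] for word in sentence.split()] for sentence in sentences]


def pad_front(sequences, pad_id=0):
    """Left-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [[pad_id] * (max_len - len(seq)) + seq for seq in sequences]


sentences = ["Wezley is cool", "You are cool", "TensorFlow is very cool"]
word_index = build_word_index(sentences)
sequences = texts_to_sequences(sentences, word_index)
padded = pad_front(sequences)

print(word_index)
print(padded)  # [[0, 1, 2, 3], [0, 4, 5, 3], [6, 2, 7, 3]]
```

Padding at the front keeps the most recent words right next to the position being predicted, which suits a model that reads the sequence left to right.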
With the bars tokenized, I was finally able to create my X and y data for training. The train_X data consisted of an entire bar, minus the last word. The train_y data was the last word in the bar.
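A minimal sketch of that split, using the padded sequences from the tokenization example. The function name is hypothetical, and the one-hot encoding of the label is an assumption that follows from the use of categorical cross-entropy later in the article.

```python
def split_features_and_labels(padded_sequences, vocab_size):
    """Use all but the last word as input and the last word as a one-hot label."""
    train_X, train_y = [], []
    for seq in padded_sequences:
        train_X.append(seq[:-1])
        one_hot = [0] * vocab_size
        one_hot[seq[-1]] = 1
        train_y.append(one_hot)
    return train_X, train_y


padded = [[0, 1, 2, 3], [0, 4, 5, 3], [6, 2, 7, 3]]
vocab_size = 8  # seven words plus the padding ID 0

train_X, train_y = split_features_and_labels(padded, vocab_size)
print(train_X)     # [[0, 1, 2], [0, 4, 5], [6, 2, 7]]
print(train_y[0])  # [0, 0, 0, 1, 0, 0, 0, 0]
```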
Looking to the future, in addition to adding the <start> and <end> tags to the bars, I want to try changing the way I’m splitting the training data. Maybe the next version will predict an entire bar based on the previous bar. That’ll be a project for another day though.
Defining the Models
With the data imported and split into the train_X and train_y sets, it’s time to define the model architectures and begin training.
First up is the SimpleRNN architecture! The SimpleRNN will give a good baseline against the GRU, LSTM, and CNN+LSTM architectures.
The SimpleRNN unit can be expressed mathematically as:
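A common way to write this recurrence is the standard Elman form with a tanh activation. The weight names W_x, W_h and the bias b are notation assumed here, not symbols taken from the article:

```latex
h(t) = \tanh\bigl( W_x \, x(t) + W_h \, h(t-1) + b \bigr)
```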
Where h(t) is expressed as the hidden state at a given point in time t. As you can see in the equation, the SimpleRNN relies on the previous hidden state h(t-1) and the current input x(t) to give us the current hidden state.
The SimpleRNN is great because of its ability to work with sequence data. The shortfall is in its simplicity. The SimpleRNN is unable to remember data from further back in the sequence because it suffers from the vanishing gradient problem. The problem shows up as sequences get longer: the gradient that reaches the earlier states shrinks with every step it is propagated back through, so those states have a harder time being expressed. There is no mechanism in a SimpleRNN to help it keep track of previous states.
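The shrinking gradient can be seen numerically with a scalar toy RNN. The sketch below multiplies the per-step derivatives d h(t) / d h(t-1) = w_h · (1 − h(t)²) along the sequence; the weights and inputs are arbitrary values chosen for illustration.

```python
import math


def gradient_through_time(w_h, w_x, inputs, h0=0.0):
    """Return |d h(t) / d h(0)| after each step of a scalar tanh RNN."""
    h = h0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
        # Chain rule: multiply by d h(t) / d h(t-1) = w_h * (1 - h(t)^2).
        grad *= w_h * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes


magnitudes = gradient_through_time(w_h=0.5, w_x=1.0, inputs=[1.0] * 20)
for step in (1, 5, 10, 20):
    print(step, magnitudes[step - 1])
```

Because every factor is smaller than one in magnitude here, the influence of the first state on the twentieth is vanishingly small, which is the effect described above.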
Figure: Vanishing gradient visualization.
In code, the SimpleRNN network looks like:
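A minimal Keras sketch of such a network is shown below. The embedding size of 64 and the 128 recurrent units are placeholder values, not the notebook's settings, and `build_simple_rnn` is a hypothetical helper name. The optimizer, learning rate, and loss follow the choices described in the article.

```python
import tensorflow as tf


def build_simple_rnn(vocab_size, embed_dim=64, hidden_units=128):
    """Embedding -> SimpleRNN -> softmax over the vocabulary."""
    model = tf.keras.Sequential([
        # Turns each word ID into a dense vector of size embed_dim.
        tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
        # Reads the bar left to right and returns the final hidden state.
        tf.keras.layers.SimpleRNN(hidden_units),
        # One probability per word in the vocabulary.
        tf.keras.layers.Dense(vocab_size, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_simple_rnn(vocab_size=50)
probabilities = model(tf.zeros((2, 5), dtype=tf.int32))
print(probabilities.shape)
```

Swapping `tf.keras.layers.SimpleRNN` for `tf.keras.layers.GRU` or `tf.keras.layers.LSTM` gives the corresponding variants of this sketch.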
Figure: SimpleRNN architecture.
The data being fed into the network is only an N×T array (samples by time steps), whereas the SimpleRNN expects an N×T×D array. We correct this by adding an embedding layer to supply the D dimension. The embedding layer transforms each input into a dense vector that can be fed into the SimpleRNN cells. For more information on the embedding layer, see the TensorFlow documentation.
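The change of shape can be illustrated without TensorFlow: an embedding is a lookup table with one row of D numbers per word ID. In this sketch the table is filled with random values, whereas a real embedding layer learns them during training; the helper names are hypothetical.

```python
import random


def make_embedding_table(vocab_size, embed_dim, seed=0):
    """One row of embed_dim small random numbers per word ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(embed_dim)]
            for _ in range(vocab_size)]


def embed(batch, table):
    """Replace every word ID with its row from the table."""
    return [[table[word_id] for word_id in sequence] for sequence in batch]


batch = [[0, 1, 2], [0, 4, 5], [6, 2, 7]]  # N = 3 bars, T = 3 tokens each
table = make_embedding_table(vocab_size=8, embed_dim=4)  # D = 4
embedded = embed(batch, table)

print(len(embedded), len(embedded[0]), len(embedded[0][0]))  # 3 3 4
```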
I’m utilizing the Adam optimizer with a learning rate of 0.001. I’m using categorical cross-entropy as my loss function. Categorical cross-entropy is being used because we are trying to classify the next word in the sequence given the previous steps.
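To make the loss concrete, here is a small plain-Python sketch of softmax followed by categorical cross-entropy for a three-word vocabulary. The logits are made-up numbers; the point is that the loss is low when the model puts high probability on the true next word and high otherwise.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def categorical_cross_entropy(one_hot_target, probabilities, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(one_hot_target, probabilities))


probs = softmax([2.0, 0.5, 0.1])  # the model favours word 0
loss_right = categorical_cross_entropy([1, 0, 0], probs)
loss_wrong = categorical_cross_entropy([0, 0, 1], probs)
print(loss_right, loss_wrong)
```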
Figure: SimpleRNN cell.
Next up is the network utilizing the Gated Recurrent Unit.
The GRU improves upon the SimpleRNN cell by introducing a reset gate and an update gate. At a high level, these gates decide which information from previous states we want to retain and which we want to discard.
The GRU is expressed as:
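One widely used formulation of the GRU is given below. The weight matrices W, U and biases b are notation assumed here, σ is the logistic sigmoid, and ⊙ is element-wise multiplication. Some references swap the roles of z(t) and 1 − z(t) in the last line, so treat the exact convention as an assumption:

```latex
\begin{aligned}
z(t) &= \sigma\bigl( W_z \, x(t) + U_z \, h(t-1) + b_z \bigr) \\
r(t) &= \sigma\bigl( W_r \, x(t) + U_r \, h(t-1) + b_r \bigr) \\
\tilde{h}(t) &= \tanh\bigl( W_h \, x(t) + U_h \, ( r(t) \odot h(t-1) ) + b_h \bigr) \\
h(t) &= \bigl( 1 - z(t) \bigr) \odot h(t-1) + z(t) \odot \tilde{h}(t)
\end{aligned}
```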
Where z(t) is the update gate, r(t) is the reset gate, and h(t) is the hidden cell state.
Here’s how the GRU looks in action:
Figure: GRU cell.
Here is how the GRU network is constructed in TensorFlow:
Again, I’m utilizing Adam for the optimizer and categorical cross-entropy as the loss function.
The Long Short Term Memory architecture was the next to be utilized.
The long short-term memory cell has an advantage over the SimpleRNN and GRU cells in that it is able to retain even more information from further back in the sequence. The LSTM utilizes three gates, as opposed to the GRU’s two, and retains a cell state throughout the network. The GRU is known to have a speed advantage over the LSTM, in that it is able to generalize faster and uses fewer parameters. However, the LSTM tends to take the cake when it comes to retaining more contextual data throughout a sequence.
The LSTM cell can be expressed as:
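The standard LSTM formulation, written with the same symbols the article uses for the gates and states, is shown below. The weight matrices W, U and biases b are notation assumed here, σ is the logistic sigmoid, and ⊙ is element-wise multiplication:

```latex
\begin{aligned}
f(t) &= \sigma\bigl( W_f \, x(t) + U_f \, h(t-1) + b_f \bigr) \\
i(t) &= \sigma\bigl( W_i \, x(t) + U_i \, h(t-1) + b_i \bigr) \\
o(t) &= \sigma\bigl( W_o \, x(t) + U_o \, h(t-1) + b_o \bigr) \\
\tilde{c}(t) &= \tanh\bigl( W_c \, x(t) + U_c \, h(t-1) + b_c \bigr) \\
c(t) &= f(t) \odot c(t-1) + i(t) \odot \tilde{c}(t) \\
h(t) &= o(t) \odot \tanh\bigl( c(t) \bigr)
\end{aligned}
```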
Where f(t) represents the forget gate, which determines how much of the previous state to forget. Then i(t) represents the input gate, which determines how much of the new information we will add to the cell state. The o(t) is the output gate, which determines which information progresses to the next hidden state. The cell state is represented by c(t), and the hidden state is h(t).
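The role of each gate is easier to see in a single step written out by hand. The sketch below runs one LSTM step with scalar input and state; the parameter values are arbitrary and `lstm_step` is a hypothetical helper, not code from the notebook.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step for scalar input and state.

    `params[name]` holds (w_x, w_h, b) for the gates "f", "i", "o"
    and for the candidate cell value "c".
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("f"))            # how much of c_prev to keep
    i = sigmoid(pre_activation("i"))            # how much new information to add
    o = sigmoid(pre_activation("o"))            # how much of the cell to expose
    c_candidate = math.tanh(pre_activation("c"))
    c = f * c_prev + i * c_candidate            # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c


params = {
    "f": (0.5, 0.1, 1.0),
    "i": (0.6, -0.2, 0.0),
    "o": (0.4, 0.3, 0.0),
    "c": (0.9, 0.2, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state passes through a step almost unchanged, which is how the LSTM carries context across many steps.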
Here is a visualization of data progressing through an LSTM cell:
Figure: LSTM cell.
See below for the implementation in code:
The final architecture I wanted to test was a combination of a convolutional neural network and an LSTM.
This network was a thought experiment to see how the results would differ from the LSTM, GRU, and SimpleRNN. I was actually surprised at some of the verses it was able to put out.
Here is the code for the architecture:
Generating Fire with the Models
Creating the models for this project was only about half of the work. The other half was generating song lyrics utilizing the trained model.
In my opinion, this is where the project became really fun. I was able to take the models I trained and utilize them for a non-trivial task.
This project was heavily inspired by “Evaluating Creative Language Generation: The Case of Rap Lyric Ghostwriting” by Peter Potash, Alexey Romanov, and Anna Rumshisky. With that in mind, I’m going to use some of the methods outlined in their paper to evaluate the output of the models against the original lyrics from the artist.
The methods I’m utilizing to evaluate bars and generate raps are: comprehension score, rhyme index, and lyrical uniqueness. I’ll discuss how I calculated these shortly.
A high level overview of how I’m generating songs can be described as:
- Utilize the Markov model to generate the first four words of a bar
- Take the output of the Markov model and feed it into the RNN
- Evaluate the output of the RNN against the original lyrics for uniqueness, similar rhyme index, and similar comprehension score
- Either throw out the bar (if it’s trash), or add it to the song (if it’s fire)
Fairly simple, right?
Let’s jump into the code of how this is done.
First, I have a function named generate_rap. This function handles the main functionality of generating a rap song. generate_rap takes in the model I want to use to generate the rap (SimpleRNN, GRU, LSTM, or CNN+LSTM), the max bar length, how many bars we want in the rap, score thresholds, and how many tries we want for generating a fire bar. The score thresholds define how well the bar scores before it is considered fire — in this case, the closer to 0 the bar is, the more fire it is. Here is how the function looks in code:
As you can see, we generate a random bar and score it based on the artist’s average rhyme index, average comprehension, and the uniqueness of the bar. If the bar meets the score threshold, it graduates into the final song. If the algorithm fails to generate a fire bar within the defined max tries, it puts the best-scoring bar in the song and moves on.
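The control flow described above can be sketched as follows. This is not the notebook's function: `make_bar` and `score` are hypothetical callables standing in for the Markov-seeded RNN generation and for score_bar, and the default thresholds and minimum bar length of four words are assumptions.

```python
import random


def generate_rap(make_bar, score, max_bar_length=10, num_bars=4,
                 min_score=-0.5, max_score=0.5, tries=5, rng=random):
    """Build a song bar by bar, keeping the best attempt when none qualifies."""
    song = []
    for _ in range(num_bars):
        best_bar, best_score = None, None
        for _ in range(tries):
            length = rng.randint(4, max_bar_length)  # randomized bar length
            bar = make_bar(length)
            bar_score = score(bar)
            if min_score <= bar_score <= max_score:
                best_bar = bar  # fire: within the thresholds, stop trying
                break
            # Otherwise remember the attempt whose score is closest to 0.
            if best_score is None or abs(bar_score) < abs(best_score):
                best_bar, best_score = bar, bar_score
        song.append(best_bar)
    return song


# Toy stand-ins so the sketch runs on its own.
words = ["yeah", "up", "now", "crazy", "on", "me"]
demo_rng = random.Random(0)


def make_bar(length):
    return " ".join(demo_rng.choice(words) for _ in range(length))


def score(bar):
    tokens = bar.split()
    return len(set(tokens)) / len(tokens) - 0.8


song = generate_rap(make_bar, score, rng=demo_rng)
print("\n".join(song))
```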
Within generate_rap I’m utilizing another function named generate_bar. This function takes in a seed phrase, the model we are using to generate the sequence, and the sequence’s length. generate_bar will then tokenize the seed phrase and feed it into the provided model until the sequence hits the desired length, then return the output. Here is the code:
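A sketch of that loop is shown below. `predict_next` is a hypothetical stand-in for the trained model: it takes the token IDs so far and returns one probability per word ID. The greedy arg-max choice and the handling of unknown words are assumptions; a real Keras model would also need the input padded to the training length before prediction.

```python
def generate_bar(seed_phrase, predict_next, word_index, length):
    """Extend `seed_phrase` one word at a time until it has `length` words."""
    index_word = {idx: word for word, idx in word_index.items()}
    # Words missing from the vocabulary are dropped in this sketch.
    token_ids = [word_index[w] for w in seed_phrase.split() if w in word_index]
    while len(token_ids) < length:
        probabilities = predict_next(token_ids)
        # Greedy choice; index 0 is skipped so padding is never emitted.
        next_id = max(range(1, len(probabilities)),
                      key=probabilities.__getitem__)
        token_ids.append(next_id)
    return " ".join(index_word[i] for i in token_ids)


word_index = {"i": 1, "got": 2, "it": 3, "yeah": 4}


def toy_model(token_ids):
    """Toy stand-in: always favours the next word ID, wrapping around."""
    probs = [0.0] * (len(word_index) + 1)
    probs[token_ids[-1] % len(word_index) + 1] = 1.0
    return probs


print(generate_bar("i got", toy_model, word_index, length=6))  # i got it yeah i got
```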
To score the bars, I’m utilizing a function named score_bar. This function takes in the bar we want to score, the artist’s original lyrics, the artist’s average comprehension score, and the artist’s average rhyme index. score_bar calculates the input bar’s comprehension score, rhyme index, and uniqueness index then scores the bar.
The bar’s score can be positive or negative, with 0 being the best score a bar can achieve. A score of 0 means that the bar has the same rhyme index and comprehension score as the artist’s averages while remaining completely distinct from the original artist’s lyrics. A perfect score of 0 is practically impossible to achieve, which is why we define min and max thresholds.
The score_bar function looks like:
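The exact formula is not spelled out in the text, so the following is only one scoring rule that is consistent with the description: a signed sum of the gaps to the artist's averages plus a penalty for overlap with the original lyrics. Unlike the article's function, this sketch takes the three metrics as precomputed numbers, and it assumes uniqueness lies in [0, 1] with 1 meaning completely unique.

```python
def score_bar(bar_rhyme, bar_comprehension, bar_uniqueness,
              artist_avg_rhyme, artist_avg_comprehension):
    """Signed distance from the ideal bar; 0 is best (assumed formula)."""
    rhyme_gap = bar_rhyme - artist_avg_rhyme
    comprehension_gap = bar_comprehension - artist_avg_comprehension
    uniqueness_gap = bar_uniqueness - 1.0  # 0 when the bar is fully unique
    return rhyme_gap + comprehension_gap + uniqueness_gap


# A bar that matches the artist exactly and copies nothing scores 0.
print(score_bar(0.5, 2.0, 1.0, artist_avg_rhyme=0.5, artist_avg_comprehension=2.0))
# A bar that rhymes less, reads lower, and overlaps with existing bars is negative.
print(score_bar(0.4, 1.9, 0.9, artist_avg_rhyme=0.5, artist_avg_comprehension=2.0))
```

One caveat of a signed sum is that gaps of opposite sign can cancel each other out; summing absolute gaps would avoid that at the cost of losing the sign.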
To calculate the rhyme index of a bar, I’m utilizing the method as described in “Evaluating Creative Language Generation: The Case of Rap Lyric Ghostwriting.” Rhyme index is calculated by taking the number of rhymed syllables and dividing that by the total number of syllables in the bar or song. Here is that implementation in code:
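The ratio itself is simple; the hard part is deciding which syllables rhyme, which the paper does on phonetic transcriptions. The sketch below is a deliberately crude, spelling-based approximation: each run of vowels counts as one syllable, and a syllable counts as rhymed when the same vowel run also appears in another word of the bar. Both rules are assumptions made to keep the example self-contained.

```python
import re
from collections import Counter

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def rhyme_index(bar):
    """Rhymed syllables divided by total syllables (spelling-based estimate)."""
    words = re.findall(r"[a-z']+", bar.lower())
    per_word = [VOWEL_GROUP.findall(word) for word in words]
    # Count in how many different words each vowel group occurs.
    word_counts = Counter(group for groups in per_word for group in set(groups))
    total = sum(len(groups) for groups in per_word)
    if total == 0:
        return 0.0
    rhymed = sum(1 for groups in per_word for group in groups
                 if word_counts[group] > 1)
    return rhymed / total


print(rhyme_index("cat sat mat"))  # 1.0
print(rhyme_index("the cat sat on a mat"))
```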
For comparing the uniqueness of the generated bar, I’m computing the cosine distance between the generated bar and all of the artist’s bars. I’m then getting the average distance to compute the total uniqueness score. Here is how that looks:
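A minimal sketch of that computation is shown below. It represents each bar as a bag-of-words count vector, which is an assumption; the article does not say how the bars were vectorized. A distance of 0 means two bars use the same words in the same proportions and 1 means they share no words.

```python
import math
from collections import Counter


def cosine_distance(text_a, text_b):
    """1 - cosine similarity between bag-of-words count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    if norm == 0:
        return 1.0  # an empty bar shares nothing with anything
    return 1.0 - dot / norm


def uniqueness(generated_bar, artist_bars):
    """Average cosine distance between the new bar and every original bar."""
    distances = [cosine_distance(generated_bar, bar) for bar in artist_bars]
    return sum(distances) / len(distances)


artist_bars = ["i got it on my own", "you got it all on me"]
print(uniqueness("i got it on my own", artist_bars))
print(uniqueness("private bedroom guitar thing", artist_bars))  # 1.0
```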
The Results
With all of this I was finally able to generate a full rap utilizing the four models I trained. After generating the rap, I took the generated song and calculated the rhyme index and comprehension scores. Surprisingly the full song still remained fairly close to the original artist’s rhyme index and comprehension score.
Here are some of the outputs when training off of Drake lyrics.
The SimpleRNN:
Generated rap with avg rhyme density: 0.5030674846625767 and avg readability of: 2.0599999999999996
Rap Generated with SimpleRNN:
Now you’re throwing me baby know it know
Look I gotta started with you hook drake
I swear it happened no tellin’ yeah yeah
....
The GRU:
Generated rap with avg rhyme density: 0.5176470588235295 and avg readability of: 1.9449999999999998
Rap Generated with GRU:
That's why I died everything big crazy on me
Who keepin' score up yeah yeah yeah yeah
I've loved and you everything big crazy on me on
....
The LSTM:
Generated rap with avg rhyme density: 0.3684210526315789 and avg readability of: 1.9749999999999996
Rap Generated with LSTM:
Get the **** lick alone same that wait now
up ****, see what uh huh heart thing up yeah
Despite the things though up up up up yeah yeah
....
The CNN+LSTM:
Generated rap with avg rhyme density: 0.33519553072625696 and avg readability of: 2.2599999999999993
Rap Generated with CNN+LSTM:
They still out know play through now out out
I got it dedicate dedicate you yeah
I've been waiting much much aye aye days aye aye
....
For the full lyrics and list of references, take a look at the Google Colab notebook. Also feel free to try it yourself and change the artist for the style you want to mimic.
As far as the SimpleRNN vs GRU vs LSTM vs CNN+LSTM experiment goes, I would say that the LSTM tended to have the best results. The CNN+LSTM had too many repetitive words in a bar, and I think this has to do with the CNN generalizing the sequence as a whole. The SimpleRNN and GRU produced pretty incoherent bars, and their rhyme densities were really far off from the original artist.
That's it! Let me know what you think in the comments. I’d love to build upon this project in the future. If you have any suggestions for things I need to change to get better results, let me know! Thank you for reading.
Check out my GitHub for the code to this project, and other cool projects!
Translated from: https://towardsdatascience.com/ghost-writing-with-tensorflow-49e77e26978f