

24. TextCNN: Principles and Implementation

Published: 2024/9/16

1.1 The TextCNN Model

  • In 2014, Yoon Kim applied convolutional neural networks (CNNs) to text classification, using multiple convolution kernels of different sizes to extract key information from a sentence, which captures local correlations more effectively.

1.2 The TextCNN Pipeline

  • Embedding: the word-embedding layer, which maps each word to a vector representation
  • Convolution: the convolution layer, which extracts features from the words
  • MaxPooling: the pooling layer, which turns sentences of different lengths into fixed-length representations
  • FullConnection: the fully connected layer, which outputs a score for each class
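The four stages above can be sketched end to end in a few lines; all dimensions and hyperparameters here are invented purely for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, emb_dim, seq_len, num_classes = 100, 16, 5, 2
x = torch.randint(0, vocab_size, (3, seq_len))   # batch of 3 token-id sequences

emb = nn.Embedding(vocab_size, emb_dim)
conv = nn.Conv2d(1, 4, kernel_size=(3, emb_dim)) # 4 filters, each spanning 3 words
fc = nn.Linear(4, num_classes)

h = emb(x).unsqueeze(1)                    # Embedding:      [3, 1, 5, 16]
h = F.relu(conv(h)).squeeze(3)             # Convolution:    [3, 4, 3]
h = F.max_pool1d(h, h.size(2)).squeeze(2)  # MaxPooling:     [3, 4]
logits = fc(h)                             # FullConnection: [3, 2]
print(logits.shape)  # torch.Size([3, 2])
```

Note how max pooling removes the dependence on sentence length: whatever `seq_len` is, the vector entering the fully connected layer has one value per filter.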

1.3 Channels

  • In images, the (R, G, B) components can serve as different channels.

  • For text, the input channels are usually different embedding methods (for example, word2vec variants such as CBOW or Skip-Gram).
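As a hedged sketch of this idea (the two embedding sources and all dimensions are invented), two embeddings of the same sentence can be stacked as channels, exactly the layout `Conv2d` expects for multi-channel images:

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim = 4, 7, 50
# Pretend these come from two different embedding methods for the same tokens,
# e.g. one static table and one fine-tuned table (random here for illustration).
static_emb = torch.randn(batch_size, seq_len, emb_dim)
tuned_emb = torch.randn(batch_size, seq_len, emb_dim)

# Stack along dim=1 to get [batch, channels=2, seq_len, emb_dim].
x = torch.stack([static_emb, tuned_emb], dim=1)

# A filter of height 2 reads 2-word windows across both channels at once.
conv = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=(2, emb_dim))
out = conv(x)
print(out.shape)  # torch.Size([4, 3, 6, 1])
```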

1.4 One-Dimensional Convolution

  • Text is one-dimensional data, so the convolution used in TextCNN is one-dimensional: the kernel spans the full embedding width and slides only along the word sequence.

  • One-dimensional convolution needs filters with different kernel_size values to obtain receptive fields of different widths.
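A minimal sketch of this point using `nn.Conv1d` (dimensions invented): `Conv1d` treats the embedding dimension as input channels, and filters of width 2, 3, and 4 see 2-, 3-, and 4-word windows respectively. Max-over-time pooling then reduces each filter's variable-length output to a single value:

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim = 4, 10, 8
# Conv1d expects [batch, channels(=emb_dim), seq_len].
x = torch.randn(batch_size, emb_dim, seq_len)

convs = nn.ModuleList(
    [nn.Conv1d(emb_dim, 6, kernel_size=k) for k in (2, 3, 4)]
)
features = []
for conv in convs:
    h = torch.relu(conv(x))              # [4, 6, seq_len - k + 1]
    features.append(h.max(dim=2).values) # max over time: [4, 6]

# Concatenate features from all kernel widths into one sentence vector.
sentence_vec = torch.cat(features, dim=1)
print(sentence_vec.shape)  # torch.Size([4, 18])
```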

1.5 Code

  • Step 1: import the libraries

```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
```
  • Step 2: model initialization

```python
class TextCNN(nn.Module):
    def __init__(self):
        super(TextCNN, self).__init__()
        self.num_filters_total = num_filters * len(filter_sizes)
        self.W = nn.Embedding(vocab_size, embedding_size)
        self.Weight = nn.Linear(self.num_filters_total, num_classes, bias=False)
        self.Bias = nn.Parameter(torch.ones([num_classes]))
        # One Conv2d per filter size; each filter spans the full embedding width
        self.filter_list = nn.ModuleList([
            nn.Conv2d(1, num_filters, (size, embedding_size))
            for size in filter_sizes
        ])
```
  • Step 3: forward computation

```python
    def forward(self, X):
        embedded_chars = self.W(X)  # [batch_size, sequence_length, embedding_size]
        # add channel(=1): [batch_size, 1, sequence_length, embedding_size]
        embedded_chars = embedded_chars.unsqueeze(1)
        pooled_outputs = []
        for i, conv in enumerate(self.filter_list):
            # conv output: [batch_size, num_filters, sequence_length - filter_size + 1, 1]
            h = F.relu(conv(embedded_chars))
            # max-over-time pooling collapses the time dimension to 1
            mp = nn.MaxPool2d((sequence_length - filter_sizes[i] + 1, 1))
            # pooled: [batch_size, 1, 1, num_filters]
            pooled = mp(h).permute(0, 3, 2, 1)
            pooled_outputs.append(pooled)
        # concatenate along the last axis: [batch_size, 1, 1, num_filters * len(filter_sizes)]
        h_pool = torch.cat(pooled_outputs, len(filter_sizes))
        h_pool_flat = torch.reshape(h_pool, [-1, self.num_filters_total])
        model = self.Weight(h_pool_flat) + self.Bias  # [batch_size, num_classes]
        return model
```
  • Step 4: main function

```python
if __name__ == '__main__':
    embedding_size = 2        # dimensionality of the word vectors
    sequence_length = 3       # sentence length
    num_classes = 2           # number of output classes
    filter_sizes = [2, 2, 2]  # convolution kernel heights
    num_filters = 3           # number of filters per kernel size

    # 3-word sentences (sequence_length is 3)
    sentences = ["i love you", "he loves me", "she likes baseball",
                 "i hate you", "sorry for that", "this is awful"]
    labels = [1, 1, 1, 0, 0, 0]  # 1 is good, 0 is not good.

    # 1. Build the vocabulary
    word_list = " ".join(sentences).split()
    word_list = list(set(word_list))
    word_dict = {w: i for i, w in enumerate(word_list)}
    vocab_size = len(word_dict)

    # 2. Build the model
    model = TextCNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    inputs = torch.LongTensor([np.asarray([word_dict[n] for n in sen.split()]) for sen in sentences])
    targets = torch.LongTensor([out for out in labels])  # class indices for CrossEntropyLoss

    # 3. Train
    for epoch in range(5000):
        optimizer.zero_grad()
        output = model(inputs)
        # output: [batch_size, num_classes], targets: [batch_size] (LongTensor, not one-hot)
        loss = criterion(output, targets)
        if (epoch + 1) % 1000 == 0:
            print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.6f}'.format(loss))
        loss.backward()
        optimizer.step()

    # Test
    test_text = 'sorry hate you'
    tests = [np.asarray([word_dict[n] for n in test_text.split()])]
    test_batch = torch.LongTensor(tests)

    # Predict
    predict = model(test_batch).data.max(1, keepdim=True)[1]
    if predict[0][0] == 0:
        print(test_text, "is Bad Mean...")
    else:
        print(test_text, "is Good Mean!!")
```
