
nn.Embedding, nn.LSTM, and nn.Linear in PyTorch

Published: 2024/3/12

Implementing an LSTM network in PyTorch is straightforward. The three basic building blocks are nn.Embedding, nn.LSTM, and nn.Linear.

The basic skeleton is:

import torch.nn as nn
import torch.nn.functional as F


class LSTMModel(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMModel, self).__init__()
        self.hidden_dim = hidden_dim
        # vocab_size is the size of the vocabulary in use
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # The LSTM takes the word embeddings as input; its output has size hidden_dim
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # nn.Linear maps the LSTM output to the target (tag) space
        self.linear = nn.Linear(hidden_dim, tagset_size)

    def forward(self, sentence):
        # sentence: 1-D LongTensor of token indices (one sequence, batch size 1)
        embeds = self.word_embeddings(sentence)
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
        linear_out = self.linear(lstm_out.view(len(sentence), -1))
        # The original leaves the score function unspecified;
        # log_softmax is used here as one common choice (an assumption).
        score = F.log_softmax(linear_out, dim=1)
        return score
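To make the data flow in the skeleton concrete, here is a minimal sketch that pushes one toy sentence through the three modules and prints the shape at each step. The sizes (a vocabulary of 10, a 5-token sentence, and so on) are illustrative assumptions, not values from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, chosen only to keep the shapes readable)
vocab_size, embedding_dim, hidden_dim, tagset_size = 10, 6, 8, 3

word_embeddings = nn.Embedding(vocab_size, embedding_dim)
lstm = nn.LSTM(embedding_dim, hidden_dim)
linear = nn.Linear(hidden_dim, tagset_size)

# One sentence made of 5 token indices
sentence = torch.tensor([1, 4, 2, 7, 3], dtype=torch.long)

embeds = word_embeddings(sentence)                       # (5, 6)
lstm_in = embeds.view(len(sentence), 1, -1)              # (5, 1, 6): seq_len, batch=1, input_size
lstm_out, _ = lstm(lstm_in)                              # (5, 1, 8)
linear_out = linear(lstm_out.view(len(sentence), -1))    # (5, 3)

print(embeds.shape, lstm_in.shape, lstm_out.shape, linear_out.shape)
```

The view calls only insert or drop a batch dimension of size 1, which is why they are safe here; with a real batch the dimensions would need to be reordered rather than reshaped.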

If you are short on time, you can skip straight to the summary at the end.

CLASS torch.nn.Embedding(num_embeddings: int, embedding_dim: int, padding_idx: Optional[int] = None, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, _weight: Optional[torch.Tensor] = None)

First, the definition from the official PyTorch documentation:

A simple lookup table that stores embeddings of a fixed dictionary and size.

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.

Next, the parameters needed to initialize the module:

Parameters

  • num_embeddings (int) – size of the dictionary of embeddings. This is the size of the vocabulary in use.

  • embedding_dim (int) – the size of each embedding vector, i.e. the dimensionality of the word-embedding vectors.

The remaining parameters are all optional:

  • padding_idx (int, optional) – If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index. In other words, it specifies which index is used for padding, and that index's vector starts out as all zeros.

  • max_norm (float, optional) – If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm.

  • norm_type (float, optional) – The p of the p-norm to compute for the max_norm option. Default 2.

  • scale_grad_by_freq (boolean, optional) – If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default False.

  • sparse (bool, optional) – If True, gradient w.r.t. weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients.
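The effect of padding_idx can be checked directly. The sketch below, under the assumption that index 0 is used for padding, shows that the padding row starts as zeros and receives no gradient, while a regular token's row does.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(10, 3, padding_idx=0)

# The row at padding_idx starts out as all zeros.
pad_row_before = embedding.weight[0].detach().clone()

# Index 0 is padding; 2 and 5 are regular tokens.
batch = torch.tensor([[0, 2, 0, 5]])
loss = embedding(batch).sum()
loss.backward()

grad = embedding.weight.grad
print(pad_row_before)  # the padding vector
print(grad[0])         # gradient of the padding row
print(grad[2])         # gradient of a regular token's row
```

Because the padding row gets a zero gradient, an optimizer step leaves it at zero, so padded positions keep contributing a constant zero vector.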

Variables

  • Embedding.weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim) initialized from N(0, 1)
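Since the module is a lookup table over its weight matrix, calling it with indices gives the same result as indexing weight with those indices. A small sketch to illustrate (the index values are arbitrary):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(10, 3)

indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])

looked_up = embedding(indices)        # forward pass of the module
indexed = embedding.weight[indices]   # plain tensor indexing into the weight matrix

print(embedding.weight.shape)
print(looked_up.shape)
print(torch.equal(looked_up, indexed))
```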


Shape:

  • Input: (*), LongTensor of arbitrary shape containing the indices to extract. In the documentation version quoted here the input must be a LongTensor; its shape can be anything.

  • Output: (*, H), where * is the input shape and H = embedding_dim


Examples:

>>> # an Embedding module containing 10 tensors of size 3
>>> embedding = nn.Embedding(10, 3)
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],   # 1
         [-0.6431,  0.0748,  0.6969],   # 2
         [ 1.4970,  1.3448, -0.9685],   # 4
         [-0.3677, -2.7265, -0.1685]],  # 5

        [[ 1.4970,  1.3448, -0.9685],   # 4
         [ 0.4362, -0.4004,  0.9400],   # 3
         [-0.6431,  0.0748,  0.6969],   # 2
         [ 0.9124, -2.3616,  1.1151]]]) # 9

>>> # example with padding_idx
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> input = torch.LongTensor([[0,2,0,5]])
>>> embedding(input)
tensor([[[ 0.0000,  0.0000,  0.0000],   # 0
         [ 0.1535, -2.0309,  0.9315],   # 2
         [ 0.0000,  0.0000,  0.0000],   # 0
         [-0.1655,  0.9897,  0.0635]]]) # 5

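The module is initialized with 10 embedding vectors, each of size 3.

The input has size (2, 4): 2 is the batch size, i.e. the number of sequences, and 4 is the length of each sequence. In general the input is (batch_size, sequence_length).

The output has size (2, 4, 3): 3 is embedding_dim, the length of each embedding vector.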


CLASS torch.nn.LSTM(*args, **kwargs)

I will skip the introduction to LSTMs themselves; see https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Parameters

  • input_size – The number of expected features in the input x. This is the length of the feature vector of each input sample, and corresponds to embedding_dim in nn.Embedding.

  • hidden_size – The number of features in the hidden state h

  • num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two LSTMs together to form a stacked LSTM, with the second LSTM taking in outputs of the first LSTM and computing the final results. Default: 1

  • bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

  • batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False

  • dropout – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Default: 0

  • bidirectional – If True, becomes a bidirectional LSTM. Default: False
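The parameters that most often cause shape confusion are batch_first, num_layers, and bidirectional. The following sketch, with arbitrary illustrative sizes, shows how each of them changes the shapes of output and h_n.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, seq_len, input_size, hidden_size = 3, 5, 10, 20

x_seq_first = torch.randn(seq_len, batch, input_size)

# Default layout: (seq_len, batch, feature)
default_lstm = nn.LSTM(input_size, hidden_size)
out, (h_n, c_n) = default_lstm(x_seq_first)

# batch_first=True: input and output are (batch, seq_len, feature);
# h_n and c_n keep the (num_layers * num_directions, batch, hidden_size) layout.
bf_lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
out_bf, (h_bf, _) = bf_lstm(x_seq_first.transpose(0, 1).contiguous())

# Two stacked layers, bidirectional: the output feature size doubles,
# and h_n has num_layers * 2 entries along its first dimension.
bi_lstm = nn.LSTM(input_size, hidden_size, num_layers=2, bidirectional=True)
out_bi, (h_bi, _) = bi_lstm(x_seq_first)

print(out.shape, h_n.shape)
print(out_bf.shape, h_bf.shape)
print(out_bi.shape, h_bi.shape)
```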


Inputs: input, (h_0, c_0)

  • input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack_padded_sequence() or torch.nn.utils.rnn.pack_sequence() for details. Note the layout of the main input, (seq_len, batch, input_size): batch is the second dimension.

  • h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. If the LSTM is bidirectional, num_directions should be 2, else it should be 1.

  • c_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch.

    If (h_0, c_0) is not provided, both h_0 and c_0 default to zero.
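A quick way to confirm the default is to compare a call without initial states against a call with explicit zero tensors. This is a minimal sketch using the same sizes as the documentation example that follows.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(10, 20, 2)        # input_size=10, hidden_size=20, num_layers=2
x = torch.randn(5, 3, 10)        # (seq_len, batch, input_size)

# No initial state given: h_0 and c_0 default to zeros.
out_default, _ = lstm(x)

# Explicit zero states of shape (num_layers * num_directions, batch, hidden_size)
h0 = torch.zeros(2, 3, 20)
c0 = torch.zeros(2, 3, 20)
out_explicit, _ = lstm(x, (h0, c0))

print(torch.allclose(out_default, out_explicit))
```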

Outputs: output, (h_n, c_n)

  • output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.

    For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.

  • h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len.

    Like output, the layers can be separated using h_n.view(num_layers, num_directions, batch, hidden_size) and similarly for c_n.

  • c_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t = seq_len.
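The relationship between output and h_n is easier to see in code. The sketch below (sizes are illustrative) checks that, for a unidirectional LSTM, the last time step of output is the last layer's h_n, and that in the bidirectional case the view calls described above line up the forward and backward directions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
x = torch.randn(seq_len, batch, input_size)

# Unidirectional: output[-1] is the final hidden state of the last layer.
uni = nn.LSTM(input_size, hidden_size, num_layers)
out_uni, (h_uni, c_uni) = uni(x)
uni_match = torch.allclose(out_uni[-1], h_uni[-1], atol=1e-6)

# Bidirectional: separate directions and layers with view.
bi = nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True)
out_bi, (h_bi, c_bi) = bi(x)
out_dirs = out_bi.view(seq_len, batch, 2, hidden_size)
h_layers = h_bi.view(num_layers, 2, batch, hidden_size)

# The forward direction finishes at the last time step,
# the backward direction finishes at the first time step.
fwd_match = torch.allclose(out_dirs[-1, :, 0], h_layers[-1, 0], atol=1e-6)
bwd_match = torch.allclose(out_dirs[0, :, 1], h_layers[-1, 1], atol=1e-6)

print(uni_match, fwd_match, bwd_match)
```

One practical consequence: in a bidirectional model, output[-1] alone is not a full summary of the sequence, because its backward half has only seen the last token.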

Examples:

>>> rnn = nn.LSTM(10, 20, 2)  # input_size: 10, hidden_size: 20, num_layers: 2
>>> input = torch.randn(5, 3, 10)  # sequence length: 5, batch size: 3, input size: 10
>>> h0 = torch.randn(2, 3, 20)
>>> c0 = torch.randn(2, 3, 20)
>>> output, (hn, cn) = rnn(input, (h0, c0))
>>> output.size()
torch.Size([5, 3, 20])

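The LSTM is initialized with input size 10, hidden size 20, and 2 layers.

The input has shape (5, 3, 10): sequence length 5, batch size 3, input size 10.

The output has shape (5, 3, 20): sequence length 5, batch size 3, and output size 20. The output size is num_directions (1 or 2) * hidden_size, which here is 1 * 20.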
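The input description also mentions packed variable-length sequences. As a hedged sketch of how that might look, the example below pads three sequences of assumed lengths 5, 3, and 2 to a common length, packs them, and unpacks the result. The lengths and sizes are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
lstm = nn.LSTM(input_size=10, hidden_size=20)

# Padded batch of shape (seq_len=5, batch=3, input_size=10).
# The true lengths are sorted in descending order, which the default
# enforce_sorted=True of pack_padded_sequence requires.
padded = torch.randn(5, 3, 10)
lengths = torch.tensor([5, 3, 2])

packed = pack_padded_sequence(padded, lengths)
packed_out, (h_n, c_n) = lstm(packed)

# Back to a padded tensor; positions past each sequence's length are zeros.
output, out_lengths = pad_packed_sequence(packed_out)

print(output.shape)
print(out_lengths)
```

With packed input, h_n holds each sequence's hidden state at its own last valid step rather than at the padded final step.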


CLASS torch.nn.Linear(in_features: int, out_features: int, bias: bool = True)

Parameters

  • in_features – size of each input sample, i.e. the number of features in each input

  • out_features – size of each output sample, i.e. the number of features in each output

  • bias – If set to False, the layer will not learn an additive bias. Default: True

Shape:

  • Input: (N, *, H_in) where * means any number of additional dimensions and H_in = in_features

  • Output: (N, *, H_out) where all but the last dimension are the same shape as the input and H_out = out_features
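Because only the last dimension has to match in_features, nn.Linear can be applied directly to a 3-D LSTM output without flattening it first. The sketch below (shapes are illustrative) also checks the layer against the explicit computation x W^T + b.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.Linear(20, 30)

# A 3-D input, e.g. an LSTM output of shape (seq_len, batch, hidden_size)
x = torch.randn(5, 3, 20)
y = m(x)  # the linear map is applied to the last dimension only

# The same computation written out: weight has shape (out_features, in_features)
manual = x @ m.weight.t() + m.bias

print(y.shape)
print(torch.allclose(y, manual, atol=1e-5))
```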

>>> m = nn.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> output = m(input)  # the input has 128 samples, each of length 20
>>> print(output.size())  # the output likewise has 128 samples, each of length 30
torch.Size([128, 30])


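Summary:

nn.Embedding:

Initialization requires num_embeddings, the number of entries in the vocabulary in use, and embedding_dim, the length of the generated word-embedding vectors.

Input: (batch_size, sequence_size)

Output: (batch_size, sequence_size, embedding_dim). After reordering the dimensions to (sequence_len, batch_size, embedding_dim), or when the LSTM is created with batch_first=True, this can be fed directly to the LSTM.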
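One caveat when changing the shape for the LSTM: with a batch larger than 1, the dimensions have to be swapped with transpose (or permute), not reinterpreted with view. A view produces a tensor of the right shape but mixes tokens from different sequences. A minimal sketch with made-up token indices:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(10, 4)

tokens = torch.tensor([[1, 2, 3],
                       [4, 5, 6]])      # (batch_size=2, sequence_size=3)
embeds = emb(tokens)                    # (2, 3, 4)

# Correct: swap the batch and sequence dimensions.
seq_first = embeds.transpose(0, 1)      # (3, 2, 4)

# Same shape, but the memory is merely re-read in a different grouping.
wrong = embeds.view(3, 2, 4)

# seq_first[t, b] is the embedding of tokens[b, t]
ok = torch.equal(seq_first[1, 0], emb.weight[tokens[0, 1]])
same = torch.equal(wrong, seq_first)

print(seq_first.shape, wrong.shape)
print(ok, same)
```

The view in the single-sentence skeleton is harmless only because its batch dimension has size 1.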


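nn.LSTM:

Initialization requires input_size, the length of the feature vector of each input sample (corresponding to embedding_dim in nn.Embedding); hidden_size, the length of the hidden-state vector; and num_layers, the number of stacked LSTM layers.

Input: (sequence_len, batch_size, input_size)

Output: (sequence_len, batch_size, num_directions * hidden_size)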


nn.Linear:

Initialization requires in_features and out_features.

Input: (batch_size, in_features)

Output: (batch_size, out_features)
