

BERT论文阅读(二): CG-BERT:Conditional Text Generation with BERT for Generalized Few-shot Intent Detection

Published: 2025/4/5

Contents

The proposed method

Input Representation

The Encoder

The Decoder

Fine-tuning


The task: discriminate over a joint label space consisting of both existing intents, which have enough labeled data, and novel intents, which have only a few examples per class.

==> Conditional Text Generation with BERT

The proposed method

CG-BERT adopts the CVAE (Conditional Variational AutoEncoder) framework and incorporates BERT into both the encoder and the decoder.

  • the encoder: encodes the utterance x and its intent y together into a latent variable z and models the (approximate) posterior distribution q(z|x,y), where y is the condition in the CVAE model ==> the encoder models the data distribution of the few-shot intents

  • the decoder: decodes z and the intent y together to reconstruct the input utterance x ==> masked attention restricts which tokens may attend to each other, preserving the left-to-right, autoregressive property that text generation requires

  • to generate new utterances for a novel intent y, we sample the latent variable z from the prior distribution p(z|y) and use the decoder to decode z and y into new utterances

This makes it possible to generate more utterances for a novel intent by sampling from the learned distribution.
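A rough sketch of this generation step: sample z from the standard-Gaussian prior p(z|y) and decode it together with the intent. The `decode` function here is a hypothetical stand-in for the trained CG-BERT decoder, not the paper's implementation.

```python
import numpy as np

# Hypothetical stand-in for the trained decoder: the real model maps (z, intent)
# to an utterance; here we just derive a pseudo-utterance id from z.
def decode(z, intent):
    return f"{intent}_utt_{np.argmax(z)}"

def generate_utterances(intent, latent_dim=64, n_samples=5, seed=0):
    rng = np.random.default_rng(seed)
    utterances = []
    for _ in range(n_samples):
        # prior p(z|y) is a multivariate standard Gaussian, so sample z ~ N(0, I)
        z = rng.standard_normal(latent_dim)
        utterances.append(decode(z, intent))
    return utterances

new_utts = generate_utterances("book_flight", n_samples=3)
print(len(new_utts))  # 3
```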

Input Representation

input: intent + utterance text sentences (concatenated)

Sentence S1: [CLS] token + intent y + [SEP] token --> the first (intent) sentence

Sentence S2: utterance x + [SEP] --> the second (utterance) sentence

whole input: S1 + S2

[CLS]: serves as the representation of the whole input

latent variable z: the embedding of [CLS] is encoded into the latent variable z

Text is tokenized into subword units by WordPiece.

embedding: obtained for each token --> token embeddings, position embeddings, segment embeddings

a given token's input representation: constructed by summing these three embeddings; the whole input (with a total length of T tokens) is denoted H0.
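A minimal numpy sketch of this input construction, with toy token ids and randomly initialized lookup tables standing in for BERT's learned embeddings (the ids and sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, n_segments, d = 100, 16, 2, 8

# Randomly initialized tables standing in for BERT's learned embeddings.
token_emb = rng.standard_normal((vocab_size, d))
pos_emb = rng.standard_normal((max_len, d))
seg_emb = rng.standard_normal((n_segments, d))

# Toy ids for "[CLS] intent [SEP] utterance [SEP]"; real ids come from WordPiece.
token_ids = np.array([1, 10, 2, 40, 41, 42, 2])   # hypothetical ids
segment_ids = np.array([0, 0, 0, 1, 1, 1, 1])     # S1 = intent, S2 = utterance
positions = np.arange(len(token_ids))

# Each token's input representation is the sum of the three embeddings.
H0 = token_emb[token_ids] + pos_emb[positions] + seg_emb[segment_ids]
print(H0.shape)  # (7, 8)
```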

The Encoder

models the distribution of diverse utterances for a given (few-shot) intent

to obtain deep bidirectional context information <-- model the attention between the intent tokens and the utterance tokens

the input representation: H0 (the summed embeddings above)

multiple self-attention heads:

output of the previous layer --> projected into a triple of queries, keys and values (Q, K, V)

embedding of the [CLS] token in the 6-th transformer block --> sentence-level representation

sentence-level representation --> a latent vector z, where the prior distribution p(z|y) is a multivariate standard Gaussian distribution

μ and σ of the Gaussian distribution q(z|x,y) = N(μ, σ²I) --> used to sample z
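Sampling z from N(μ, σ²I) is typically done with the reparameterization trick so that gradients flow through μ and σ; a minimal sketch (the μ and log σ² heads on the [CLS] embedding are assumed here, not taken from the paper):

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    so the sampling step stays differentiable w.r.t. mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros(4)        # would come from a linear head on the [CLS] embedding
log_var = np.zeros(4)   # log sigma^2, from a second linear head
z = sample_latent(mu, log_var, rng)
print(z.shape)  # (4,)
```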

The Decoder

aims to reconstruct the input utterance x using the latent variable z and the intent y

a residual connection from the input representation H0 (combining z with H0) ==> forms the decoder input H6'

left-to-right manner ==> masked attention

the attention mask --> helps the transformer blocks fit the conditional text generation task

not full bidirectional attention over the input ==> instead, a mask matrix determines whether a pair of tokens can attend to each other

updated attention:

Attention(Q, K, V) = softmax(QKᵀ/√dk + M) V, where M_ij = 0 if token i is allowed to attend to token j and −∞ otherwise
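The mask matrix M can be sketched as follows, assuming a UniLM-style seq-to-seq layout (condition tokens attend bidirectionally within the condition; utterance tokens attend to the condition plus their own left context); the paper's exact layout may differ:

```python
import numpy as np

def seq2seq_attention_mask(len_cond, len_utt):
    """Build M with M[i, j] = 0 if token i may attend to token j, -inf otherwise.
    Condition (intent) tokens see only the condition, bidirectionally; utterance
    tokens see the condition plus their left context, keeping generation
    left-to-right."""
    T = len_cond + len_utt
    M = np.full((T, T), -np.inf)
    M[:, :len_cond] = 0.0                  # every row may attend to the condition
    for i in range(len_cond, T):
        M[i, len_cond:i + 1] = 0.0         # utterance rows: left-to-right only
    return M

M = seq2seq_attention_mask(3, 4)
# M is added to QK^T / sqrt(d_k) before the softmax
print(M[2, 5] == -np.inf, M[5, 4] == 0.0)  # True True
```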

the output of the 12-th transformer block in the decoder is H12; the latent variable z also has an embedding.

To further increase the impact of z and alleviate the vanishing-latent-variable problem, the embedding of z is concatenated with the embeddings of all the tokens.

Two fully-connected layers with a layer normalization produce the final representation Hf.

to predict the next token at position t+1 <-- the embedding in Hf at position t
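A toy numpy sketch of this step, with random weights in place of the learned fully-connected layers and output head (all shapes here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, dz, vocab = 5, 8, 4, 50

H12 = rng.standard_normal((T, d))   # decoder output (12th block), toy values
z = rng.standard_normal(dz)         # latent vector

# Concatenate z's embedding with every token embedding to strengthen its impact.
H_cat = np.concatenate([H12, np.tile(z, (T, 1))], axis=1)    # (T, d + dz)

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

# Two fully-connected layers (random weights here) with layer normalization.
W1 = rng.standard_normal((d + dz, d))
W2 = rng.standard_normal((d, d))
Hf = layer_norm(np.maximum(H_cat @ W1, 0) @ W2)              # (T, d)

# The embedding at position t is fed to an output head to predict token t+1.
W_out = rng.standard_normal((d, vocab))
logits_next = Hf[2] @ W_out         # logits for the token at position 3
print(Hf.shape, logits_next.shape)  # (5, 8) (50,)
```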

Fine-tuning

The model is first learned from existing intents with enough labeled data and then fine-tuned to improve its performance on the few-shot intents.

reference: Cross-Lingual Natural Language Generation via Pre-training
