
Today's arXiv Picks | 14 New EMNLP 2021 Papers


?關(guān)于?#今日arXiv精選?

這是「AI 學(xué)術(shù)前沿」旗下的一檔欄目,編輯將每日從arXiv中精選高質(zhì)量論文,推送給讀者。

Effective Sequence-to-Sequence Dialogue State Tracking

Comment: Accepted at EMNLP 2021

Link: http://arxiv.org/abs/2108.13990

Abstract

Sequence-to-sequence models have been applied to a wide variety of NLP tasks, but how to properly use them for dialogue state tracking has not been systematically investigated. In this paper, we study this problem from the perspectives of pre-training objectives as well as the formats of context representations. We demonstrate that the choice of pre-training objective makes a significant difference to the state tracking quality. In particular, we find that masked span prediction is more effective than auto-regressive language modeling. We also explore using Pegasus, a span prediction-based pre-training objective for text summarization, for the state tracking model. We found that pre-training for the seemingly distant summarization task works surprisingly well for dialogue state tracking. In addition, we found that while recurrent state context representation works also reasonably well, the model may have a hard time recovering from earlier mistakes. We conducted experiments on the MultiWOZ 2.1-2.4 data sets with consistent observations.
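Editor's note: the paper frames state tracking as plain sequence-to-sequence prediction and compares pre-training objectives (masked span prediction vs. auto-regressive language modeling). The sketch below only illustrates how a dialogue turn and previous state might be serialized into a source/target pair; the slot names and format are illustrative assumptions, not the paper's exact scheme.

```python
# Illustrative serialization for seq2seq dialogue state tracking.
# Slot names, separators, and the overall format are assumptions for
# illustration, not the paper's specification.

def serialize_turn(history, previous_state):
    """Flatten dialogue history and the previous belief state into a
    single source string for a seq2seq model."""
    context = " ".join(f"[{speaker}] {utt}" for speaker, utt in history)
    state = " ; ".join(f"{slot}={value}" for slot, value in previous_state.items())
    return f"state: {state} context: {context}"

history = [
    ("user", "I need a cheap hotel in the north."),
    ("system", "Okay, for how many nights?"),
    ("user", "Three nights, starting Friday."),
]
previous_state = {"hotel-price": "cheap", "hotel-area": "north"}

source = serialize_turn(history, previous_state)
# Target: the updated belief state, generated token by token by the decoder.
target = "hotel-price=cheap ; hotel-area=north ; hotel-stay=3 ; hotel-day=friday"
print(source)
print(target)
```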

Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools

Comment: Accepted to EMNLP 2021 System Demonstrations

Link: http://arxiv.org/abs/2108.13961

Abstract

In the language domain, as in other domains, neural explainability takes an ever more important role, with feature attribution methods on the forefront. Many such methods require considerable computational resources and expert knowledge about implementation details and parameter choices. To facilitate research, we present Thermostat which consists of a large collection of model explanations and accompanying analysis tools. Thermostat allows easy access to over 200k explanations for the decisions of prominent state-of-the-art models spanning across different NLP tasks, generated with multiple explainers. The dataset took over 10k GPU hours (>one year) to compile; compute time that the community now saves. The accompanying software tools allow to analyse explanations instance-wise but also accumulatively on corpus level. Users can investigate and compare models, datasets and explainers without the need to orchestrate implementation details. Thermostat is fully open source, democratizes explainability research in the language domain, circumvents redundant computations and increases comparability and replicability.

Robust Retrieval Augmented Generation for Zero-shot Slot Filling

Comment: Accepted at EMNLP 2021. arXiv admin note: substantial text overlap with arXiv:2104.08610

Link: http://arxiv.org/abs/2108.13934

Abstract

Automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in AI. One way to make headway for this problem is through advancements in a related task known as slot filling. In this task, given an entity query in form of [Entity, Slot, ?], a system is asked to fill the slot by generating or extracting the missing value exploiting evidence extracted from relevant passage(s) in the given document collection. The recent works in the field try to solve this task in an end-to-end fashion using retrieval-based language models. In this paper, we present a novel approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models. Our model reports large improvements on both T-REx and zsRE slot filling datasets, improving both passage retrieval and slot value generation, and ranking at the top-1 position in the KILT leaderboard. Moreover, we demonstrate the robustness of our system showing its domain adaptation capability on a new variant of the TACRED dataset for slot filling, through a combination of zero/few-shot learning. We release the source code and pre-trained models.
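Editor's note: the retriever side of this approach builds on dense passage retrieval trained with hard negatives. The sketch below shows the standard in-batch contrastive objective commonly used for such retrievers, with the slot-filling query shaped like the [Entity, Slot, ?] template from the abstract; it is a plausible instantiation, not the paper's exact training recipe.

```python
import torch
import torch.nn.functional as F

# Hypothetical DPR-style contrastive loss with mined hard negatives.
query = "Albert Einstein [SEP] educated at"  # stands in for an [Entity, Slot, ?] query

def retrieval_loss(q_emb, pos_emb, hard_neg_emb):
    """q_emb: (B, d) query embeddings; pos_emb: (B, d) gold passages;
    hard_neg_emb: (B, d) mined hard negatives. Passages belonging to other
    queries in the batch act as additional (in-batch) negatives."""
    candidates = torch.cat([pos_emb, hard_neg_emb], dim=0)   # (2B, d)
    scores = q_emb @ candidates.t()                          # (B, 2B) similarity matrix
    labels = torch.arange(q_emb.size(0))                     # gold passage of query i is column i
    return F.cross_entropy(scores, labels)

q = torch.randn(4, 128)
pos = torch.randn(4, 128)
neg = torch.randn(4, 128)
print(retrieval_loss(q, pos, neg))
```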

Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning

Comment: Accepted by EMNLP2021 main conference

Link: http://arxiv.org/abs/2108.13888

Abstract

Pre-Trained Models have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. When the triggers are activated, even the fine-tuned model will predict pre-defined labels, causing a security threat. These backdoors generated by the poisoning methods can be erased by changing hyper-parameters during fine-tuning or detected by finding the triggers. In this paper, we propose a stronger weight-poisoning attack method that introduces a layerwise weight poisoning strategy to plant deeper backdoors; we also introduce a combinatorial trigger that cannot be easily detected. The experiments on text classification tasks show that previous defense methods cannot resist our weight-poisoning method, which indicates that our method can be widely applied and may provide hints for future model robustness studies.

When Retriever-Reader Meets Scenario-Based Multiple-Choice Questions

Comment: 10 pages, accepted to Findings of EMNLP 2021

Link: http://arxiv.org/abs/2108.13875

Abstract

Scenario-based question answering (SQA) requires retrieving and reading paragraphs from a large corpus to answer a question which is contextualized by a long scenario description. Since a scenario contains both keyphrases for retrieval and much noise, retrieval for SQA is extremely difficult. Moreover, it can hardly be supervised due to the lack of relevance labels of paragraphs for SQA. To meet the challenge, in this paper we propose a joint retriever-reader model called JEEVES where the retriever is implicitly supervised only using QA labels via a novel word weighting mechanism. JEEVES significantly outperforms a variety of strong baselines on multiple-choice questions in three SQA datasets.

Contrastive Domain Adaptation for Question Answering using Limited Text Corpora

Comment: Accepted to EMNLP 2021

Link: http://arxiv.org/abs/2108.13854

Abstract

Question generation has recently shown impressive results in customizing question answering (QA) systems to new domains. These approaches circumvent the need for manually annotated training data from the new domain and, instead, generate synthetic question-answer pairs that are used for training. However, existing methods for question generation rely on large amounts of synthetically generated datasets and costly computational resources, which render these techniques widely inaccessible when the text corpora is of limited size. This is problematic as many niche domains rely on small text corpora, which naturally restricts the amount of synthetic data that can be generated. In this paper, we propose a novel framework for domain adaptation called contrastive domain adaptation for QA (CAQA). Specifically, CAQA combines techniques from question generation and domain-invariant learning to answer out-of-domain questions in settings with limited text corpora. Here, we train a QA system on both source data and generated data from the target domain with a contrastive adaptation loss that is incorporated in the training objective. By combining techniques from question generation and domain-invariant learning, our model achieved considerable improvements compared to state-of-the-art baselines.
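Editor's note: the abstract describes adding a contrastive adaptation term to the usual QA loss so that source-domain and (synthetic) target-domain representations become domain-invariant. One common way to instantiate such a term is a discrepancy penalty between the two feature distributions; the sketch below uses a simple mean-embedding discrepancy and is an assumption about the general shape of the objective, not the paper's exact loss.

```python
import torch

def adaptation_term(source_feats, target_feats):
    """Toy domain-discrepancy term: squared distance between the mean hidden
    representations of a source-domain batch and a synthetic target-domain
    batch. source_feats, target_feats: (B, d) encoder features."""
    return (source_feats.mean(dim=0) - target_feats.mean(dim=0)).pow(2).sum()

def total_loss(qa_loss, source_feats, target_feats, lam=0.1):
    # Overall objective: standard QA loss plus a weighted adaptation term.
    return qa_loss + lam * adaptation_term(source_feats, target_feats)

qa_loss = torch.tensor(1.7)
print(total_loss(qa_loss, torch.randn(8, 256), torch.randn(8, 256)))
```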

Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience

Comment: EMNLP 2021 Pre-print

Link: http://arxiv.org/abs/2108.13759

Abstract

Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive performance when adapted into a range of natural language processing tasks. An open problem is how to improve the faithfulness of explanations (rationales) for the predictions of these models. In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task. In this way, we aim to help BERT not to forget assigning importance to informative input tokens when making predictions by proposing SaLoss; an auxiliary loss function for guiding the multi-head attention mechanism during training to be close to salient information extracted a priori using TextRank. Experiments for explanation faithfulness across five datasets, show that models trained with SaLoss consistently provide more faithful explanations across four different feature attribution methods compared to vanilla BERT. Using the rationales extracted from vanilla BERT and SaLoss models to train inherently faithful classifiers, we further show that the latter result in higher predictive performance in downstream tasks.
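Editor's note: SaLoss is described as an auxiliary loss that pushes multi-head attention toward word salience computed a priori with TextRank. A minimal hedged sketch of such a term is below; the KL formulation, the head pooling, and the weighting factor are illustrative assumptions rather than the paper's exact definition.

```python
import torch
import torch.nn.functional as F

def salience_loss(attn, salience, eps=1e-9):
    """attn: (B, H, T) attention each head assigns to the T input tokens
    (pooled over heads below). salience: (B, T) a-priori word salience
    scores (e.g., from TextRank), normalized to a distribution."""
    attn_dist = attn.mean(dim=1)                                  # (B, T) pool over heads
    attn_dist = attn_dist / (attn_dist.sum(-1, keepdim=True) + eps)
    sal_dist = salience / (salience.sum(-1, keepdim=True) + eps)
    return F.kl_div((attn_dist + eps).log(), sal_dist, reduction="batchmean")

def training_loss(task_loss, attn, salience, alpha=1.0):
    # Task loss (e.g., cross-entropy) plus the attention-guidance term.
    return task_loss + alpha * salience_loss(attn, salience)

attn = torch.rand(2, 12, 20)       # batch of 2, 12 heads, 20 tokens
salience = torch.rand(2, 20)
print(training_loss(torch.tensor(0.9), attn, salience))
```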

Plan-then-Generate: Controlled Data-to-Text Generation via Planning

Comment: Accepted to Findings of EMNLP 2021

Link: http://arxiv.org/abs/2108.13740

Abstract

Recent developments in neural networks have led to the advance in data-to-text generation. However, the lack of ability of neural models to control the structure of generated output can be limiting in certain real-world applications. In this study, we propose a novel Plan-then-Generate (PlanGen) framework to improve the controllability of neural data-to-text models. Extensive experiments and analyses are conducted on two benchmark datasets, ToTTo and WebNLG. The results show that our model is able to control both the intra-sentence and inter-sentence structure of the generated output. Furthermore, empirical comparisons against previous state-of-the-art methods show that our model improves the generation quality as well as the output diversity as judged by human and automatic evaluations.
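Editor's note: as the name suggests, plan-then-generate approaches run in two stages: a planner first predicts a content plan (for example, an ordering of the input records), and a generator then conditions on both the linearized data and that plan. The sketch below only illustrates how such inputs might be strung together; the special tokens and plan format are hypothetical, not PlanGen's actual interface.

```python
# Hypothetical linearization for a plan-then-generate pipeline.
# Special tokens and the plan format are assumptions for illustration.

records = {"name": "Blue Spice", "eatType": "coffee shop", "area": "riverside"}

# Stage 1: a content planner would predict an ordering of the record keys.
plan = ["area", "name", "eatType"]  # pretend planner output

# Stage 2: the generator conditions on the linearized table plus the plan.
table_str = " ".join(f"[{k}] {v}" for k, v in records.items())
plan_str = " ".join(f"[{k}]" for k in plan)
generator_input = f"table: {table_str} plan: {plan_str}"
print(generator_input)
# A plan-following output would realize the records in the planned order,
# e.g. "On the riverside, Blue Spice is a coffee shop."
```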

Automatic Rule Generation for Time Expression Normalization

Comment: Accepted to Findings of EMNLP 2021

Link: http://arxiv.org/abs/2108.13658

Abstract

The understanding of time expressions includes two sub-tasks: recognition and normalization. In recent years, significant progress has been made in the recognition of time expressions while research on normalization has lagged behind. Existing SOTA normalization methods highly rely on rules or grammars designed by experts, which limits their performance on emerging corpora, such as social media texts. In this paper, we model time expression normalization as a sequence of operations to construct the normalized temporal value, and we present a novel method called ARTime, which can automatically generate normalization rules from training data without expert interventions. Specifically, ARTime automatically captures possible operation sequences from annotated data and generates normalization rules on time expressions with common surface forms. The experimental results show that ARTime can significantly surpass SOTA methods on the Tweets benchmark, and achieves competitive results with existing expert-engineered rule methods on the TempEval-3 benchmark.

Discretized Integrated Gradients for Explaining Language Models

Comment: Accepted in EMNLP 2021

Link: http://arxiv.org/abs/2108.13654

Abstract

As a prominent attribution-based explanation algorithm, Integrated Gradients (IG) is widely adopted due to its desirable explanation axioms and the ease of gradient computation. It measures feature importance by averaging the model's output gradient interpolated along a straight-line path in the input data space. However, such straight-line interpolated points are not representative of text data due to the inherent discreteness of the word embedding space. This questions the faithfulness of the gradients computed at the interpolated points and consequently, the quality of the generated explanations. Here we propose Discretized Integrated Gradients (DIG), which allows effective attribution along non-linear interpolation paths. We develop two interpolation strategies for the discrete word embedding space that generates interpolation points that lie close to actual words in the embedding space, yielding more faithful gradient computation. We demonstrate the effectiveness of DIG over IG through experimental and human evaluations on multiple sentiment classification datasets. We provide the source code of DIG to encourage reproducible research.
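Editor's note: for reference, vanilla IG attributes feature i as (x_i - x'_i) times the path integral of the model's gradient along the straight line from a baseline x' to the input x, usually approximated with a Riemann sum. The sketch below implements that standard approximation; DIG differs in that it replaces the straight-line interpolation points with points anchored near real word embeddings, which is not shown here.

```python
import torch

def integrated_gradients(model, x, baseline, steps=50):
    """Standard IG via a left Riemann sum over a straight-line path.
    model: maps a (1, d) input to a scalar output; x, baseline: (1, d)."""
    alphas = torch.linspace(0.0, 1.0, steps + 1)[:-1]   # positions along the path
    total_grad = torch.zeros_like(x)
    for alpha in alphas:
        point = baseline + alpha * (x - baseline)       # interpolated input
        point.requires_grad_(True)
        out = model(point).sum()
        total_grad += torch.autograd.grad(out, point)[0]
    avg_grad = total_grad / steps
    return (x - baseline) * avg_grad                    # per-feature attribution

# Toy usage with a linear "model" over 4 features.
w = torch.tensor([0.5, -1.0, 2.0, 0.0])
model = lambda inp: inp @ w
x = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
baseline = torch.zeros_like(x)
print(integrated_gradients(model, x, baseline))  # ~= x * w for a linear model
```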

T3-Vis: a visual analytic framework for Training and fine-Tuning Transformers in NLP

Comment: 10 pages, 4 figures, accepted to EMNLP 2021 System Demonstration

Link: http://arxiv.org/abs/2108.13587

Abstract

Transformers are the dominant architecture in NLP, but their training and fine-tuning is still very challenging. In this paper, we present the design and implementation of a visual analytic framework for assisting researchers in such process, by providing them with valuable insights about the model's intrinsic properties and behaviours. Our framework offers an intuitive overview that allows the user to explore different facets of the model (e.g., hidden states, attention) through interactive visualization, and allows a suite of built-in algorithms that compute the importance of model components and different parts of the input sequence. Case studies and feedback from a user focus group indicate that the framework is useful, and suggest several improvements.

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation

Comment: Code is at https://github.com/Adaxry/ss_on_decoding_steps. To appear in EMNLP-2021 main conference. arXiv admin note: text overlap with arXiv:2107.10427

Link: http://arxiv.org/abs/2108.12963

Abstract

Scheduled sampling is widely used to mitigate the exposure bias problem for neural machine translation. Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the gap between training and inference. However, vanilla scheduled sampling is merely based on training steps and equally treats all decoding steps. Namely, it simulates an inference scene with uniform error rates, which disobeys the real inference scene, where larger decoding steps usually have higher error rates due to error accumulations. To alleviate the above discrepancy, we propose scheduled sampling methods based on decoding steps, increasing the selection chance of predicted tokens with the growth of decoding steps. Consequently, we can more realistically simulate the inference scene during training, thus better bridging the gap between training and inference. Moreover, we investigate scheduled sampling based on both training steps and decoding steps for further improvements. Experimentally, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. Additionally, our approaches also generalize well to the text summarization task on two popular benchmarks.
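Editor's note: the key change relative to vanilla scheduled sampling is that the probability of feeding back a model prediction grows with the decoding step t rather than only with the training step. A toy monotone schedule and the token-selection rule are sketched below; the exact functional forms explored in the paper may differ, so treat this exponential schedule as illustrative.

```python
import math
import random

def prob_use_prediction(t, k=0.05):
    """Toy schedule: the chance of feeding the model's own prediction back in
    grows with the decoding step t (illustrative exponential form)."""
    return 1.0 - math.exp(-k * t)

def choose_input_token(gold_token, predicted_token, t):
    # With probability p(t), simulate inference by using the model prediction;
    # otherwise keep teacher forcing with the ground-truth token.
    if random.random() < prob_use_prediction(t):
        return predicted_token
    return gold_token

for t in (1, 10, 50):
    print(t, round(prob_use_prediction(t), 3))  # later steps sample predictions more often
```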

Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation

Comment: EMNLP21-Findings

Link: http://arxiv.org/abs/2108.12582

Abstract

Despite the remarkable performance of large-scale generative models in open-domain conversation, they are known to be less practical for building real-time conversation systems due to high latency. On the other hand, retrieval models could return responses with much lower latency but show inferior performance to the large-scale generative models since the conversation quality is bounded by the pre-defined response set. To take advantage of both approaches, we propose a new training method called G2R (Generative-to-Retrieval distillation) that preserves the efficiency of a retrieval model while leveraging the conversational ability of a large-scale generative model by infusing the knowledge of the generative model into the retrieval model. G2R consists of two distinct techniques of distillation: the data-level G2R augments the dialogue dataset with additional responses generated by the large-scale generative model, and the model-level G2R transfers the response quality score assessed by the generative model to the score of the retrieval model by the knowledge distillation loss. Through extensive experiments including human evaluation, we demonstrate that our retrieval-based conversation system trained with G2R shows a substantially improved performance compared to the baseline retrieval model while showing significantly lower inference latency than the large-scale generative models.
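Editor's note: the model-level part of G2R transfers the generator's response-quality score into the retrieval model's score via a knowledge-distillation loss. A hedged sketch of such a score-distillation term (soft matching between the two score distributions over candidate responses) is given below; the temperature and exact formulation are assumptions.

```python
import torch
import torch.nn.functional as F

def score_distillation_loss(retrieval_scores, generator_scores, tau=1.0):
    """Match the retrieval model's score distribution over candidate responses
    to the (softened) distribution implied by the generative model's quality
    scores. Shapes: (B, C) for B contexts and C candidate responses each."""
    teacher = F.softmax(generator_scores / tau, dim=-1)
    student_log = F.log_softmax(retrieval_scores / tau, dim=-1)
    return F.kl_div(student_log, teacher, reduction="batchmean") * tau ** 2

retrieval_scores = torch.randn(4, 8)   # retriever's logits for 8 candidates
generator_scores = torch.randn(4, 8)   # e.g., generator log-likelihoods as quality scores
print(score_distillation_loss(retrieval_scores, generator_scores))
```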

Few-Shot Table-to-Text Generation with Prototype Memory

Comment: Accepted to Findings of EMNLP 2021

Link: http://arxiv.org/abs/2108.12516

Abstract

Neural table-to-text generation models have achieved remarkable progress on an array of tasks. However, due to the data-hungry nature of neural models, their performances strongly rely on large-scale training examples, limiting their applicability in real-world applications. To address this, we propose a new framework: Prototype-to-Generate (P2G), for table-to-text generation under the few-shot scenario. The proposed framework utilizes the retrieved prototypes, which are jointly selected by an IR system and a novel prototype selector to help the model bridging the structural gap between tables and texts. Experimental results on three benchmark datasets with three state-of-the-art models demonstrate that the proposed framework significantly improves the model performance across various evaluation metrics.
