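Anaphora resolution in stanford-nlp using Python
Here is one possible solution that works on the data structure output by CoreNLP, which provides all the information needed. It is not a complete solution and will probably need to be extended to handle all cases, but it is a good starting point.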
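For orientation, the following is a hand-built sketch of the parts of the JSON output that the solution reads. The values are illustrative rather than real server output, and only a subset of the fields CoreNLP returns is shown. The point to notice is that `sentNum` and `startIndex` are 1-based, which is why the solution subtracts 1 when indexing.

```python
# Hand-built stand-in for a CoreNLP JSON response (illustrative values,
# only the fields used by the solution are included).
sample_output = {
    'sentences': [
        {'tokens': [
            {'index': 1, 'word': 'Tom', 'lemma': 'Tom', 'pos': 'NNP', 'after': ' '},
            {'index': 2, 'word': 'sleeps', 'lemma': 'sleep', 'pos': 'VBZ', 'after': ''},
            {'index': 3, 'word': '.', 'lemma': '.', 'pos': '.', 'after': ' '},
        ]},
        {'tokens': [
            {'index': 1, 'word': 'He', 'lemma': 'he', 'pos': 'PRP', 'after': ' '},
            {'index': 2, 'word': 'snores', 'lemma': 'snore', 'pos': 'VBZ', 'after': ''},
            {'index': 3, 'word': '.', 'lemma': '.', 'pos': '.', 'after': ''},
        ]},
    ],
    # One entry per coreference chain, keyed by a chain id string.
    'corefs': {
        '1': [
            {'text': 'Tom', 'type': 'PROPER', 'sentNum': 1, 'startIndex': 1},
            {'text': 'He', 'type': 'PRONOMINAL', 'sentNum': 2, 'startIndex': 1},
        ],
    },
}

# Locate the token that the second mention points at (both indices are 1-based).
mention = sample_output['corefs']['1'][1]
token = sample_output['sentences'][mention['sentNum'] - 1]['tokens'][mention['startIndex'] - 1]
print(token['word'])
```

The solution itself, run against a live server: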
from pycorenlp import StanfordCoreNLP

# A CoreNLP server must already be running at this address
nlp = StanfordCoreNLP('http://localhost:9000')


def resolve(corenlp_output):
    """ Transfer the word form of the antecedent to its associated pronominal anaphor(s) """
    for coref in corenlp_output['corefs']:
        mentions = corenlp_output['corefs'][coref]
        antecedent = mentions[0]  # the antecedent is the first mention in the coreference chain
        for j in range(1, len(mentions)):
            mention = mentions[j]
            if mention['type'] == 'PRONOMINAL':
                # get the attributes of the target mention in the corresponding sentence
                # (sentNum and startIndex are 1-based)
                target_sentence = mention['sentNum']
                target_token = mention['startIndex'] - 1
                # transfer the antecedent's word form to the appropriate token in the sentence
                corenlp_output['sentences'][target_sentence - 1]['tokens'][target_token]['word'] = antecedent['text']


def print_resolved(corenlp_output):
    """ Print the "resolved" output """
    possessives = ['hers', 'his', 'their', 'theirs']
    for sentence in corenlp_output['sentences']:
        for token in sentence['tokens']:
            output_word = token['word']
            # check lemmas as well as tags for possessive pronouns in case of tagging errors
            if token['lemma'] in possessives or token['pos'] == 'PRP$':
                output_word += "'s"  # add the possessive morpheme
            output_word += token['after']
            print(output_word, end='')


text = "Tom and Jane are good friends. They are cool. He knows a lot of things and so does she. His car is red, but " \
       "hers is blue. It is older than hers. The big cat ate its dinner."

output = nlp.annotate(text, properties={'annotators': 'dcoref', 'outputFormat': 'json', 'ner.useSUTime': 'false'})

resolve(output)

print('Original:', text)
print('Resolved: ', end='')
print_resolved(output)
This gives the following output:
Original: Tom and Jane are good friends. They are cool. He knows a lot of things and so does she. His car is red, but hers is blue. It is older than hers. The big cat ate its dinner.
Resolved: Tom and Jane are good friends. Tom and Jane are cool. Tom knows a lot of things and so does Jane. Tom's car is red, but Jane's is blue. His car is older than Jane's. The big cat ate The big cat's dinner.
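As you can see, this solution does not correct the case when a pronoun has a sentence-initial (title-case) antecedent: the last sentence comes out with "The big cat" instead of "the big cat". The right fix depends on the category of the antecedent. Common noun antecedents need lowercasing, while proper noun antecedents do not.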
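One way to approach the case correction is sketched below. The helper `antecedent_form` is hypothetical (it is not part of the solution above), and it assumes that the POS tag of the antecedent's first token is a good enough proxy for "proper noun vs. common noun"; that assumption will fail for some antecedents. The sample data is hand-built, not real CoreNLP output.

```python
def antecedent_form(antecedent, pronoun, sentences):
    """Return the antecedent's text with its first letter cased for the pronoun's position.

    Hypothetical helper: 'antecedent' and 'pronoun' are coref mention dicts,
    'sentences' is the 'sentences' list of the CoreNLP output.
    """
    text = antecedent['text']
    # A sentence-initial pronoun is replaced by a capitalised form.
    if pronoun['startIndex'] == 1:
        return text[0].upper() + text[1:]
    # Otherwise, look at the POS tag of the antecedent's first token.
    first = sentences[antecedent['sentNum'] - 1]['tokens'][antecedent['startIndex'] - 1]
    if first['pos'] in ('NNP', 'NNPS'):
        return text  # proper nouns keep their capital
    return text[0].lower() + text[1:]


# Hand-built stand-in for the last test sentence (illustrative values only).
sentences = [{'tokens': [
    {'word': 'The', 'pos': 'DT'}, {'word': 'big', 'pos': 'JJ'},
    {'word': 'cat', 'pos': 'NN'}, {'word': 'ate', 'pos': 'VBD'},
    {'word': 'its', 'pos': 'PRP$'}, {'word': 'dinner', 'pos': 'NN'},
    {'word': '.', 'pos': '.'},
]}]
antecedent = {'text': 'The big cat', 'sentNum': 1, 'startIndex': 1}
pronoun = {'text': 'its', 'sentNum': 1, 'startIndex': 5}

print(antecedent_form(antecedent, pronoun, sentences))  # the big cat
```

In resolve, the value written to the token would then be the result of this helper rather than antecedent['text'].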
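Some other ad hoc processing may be necessary, as with the possessives in my test sentence. The code also presupposes that you will not want to reuse the original output tokens, since it modifies them in place. A way around this is to make a copy of the original data structure, or to create a new attribute and change the print_resolved function accordingly.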
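The copying option could look like the following minimal sketch. The function name `resolve_copy` is hypothetical, and the sample dictionary is a hand-built stand-in for CoreNLP output with only the fields needed here. It uses `copy.deepcopy` from the standard library so that the nested token dictionaries are duplicated rather than shared.

```python
import copy


def resolve_copy(corenlp_output):
    """Return a resolved deep copy and leave the original output untouched."""
    resolved = copy.deepcopy(corenlp_output)
    for mentions in resolved['corefs'].values():
        antecedent = mentions[0]  # same assumption as resolve: first mention is the antecedent
        for mention in mentions[1:]:
            if mention['type'] == 'PRONOMINAL':
                tokens = resolved['sentences'][mention['sentNum'] - 1]['tokens']
                tokens[mention['startIndex'] - 1]['word'] = antecedent['text']
    return resolved


# Hand-built stand-in for CoreNLP output (illustrative values only).
sample = {
    'sentences': [
        {'tokens': [{'word': 'Tom'}, {'word': 'sleeps'}, {'word': '.'}]},
        {'tokens': [{'word': 'He'}, {'word': 'snores'}, {'word': '.'}]},
    ],
    'corefs': {
        '1': [
            {'text': 'Tom', 'type': 'PROPER', 'sentNum': 1, 'startIndex': 1},
            {'text': 'He', 'type': 'PRONOMINAL', 'sentNum': 2, 'startIndex': 1},
        ],
    },
}

resolved = resolve_copy(sample)
print(sample['sentences'][1]['tokens'][0]['word'])    # He
print(resolved['sentences'][1]['tokens'][0]['word'])  # Tom
```

With this approach print_resolved needs no changes, because it is simply called on the returned copy. The cost is holding two full copies of the annotation in memory.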
Correcting any resolution errors is yet another challenge!