

自然语言15_Part of Speech Tagging with NLTK

Published: 2024/6/21
This article introduces Part of Speech Tagging with NLTK (part 15 of a natural-language series) and is shared here as a reference.



https://www.pythonprogramming.net/part-of-speech-tagging-nltk-tutorial/?completed=/stemming-nltk-tutorial/


# -*- coding: utf-8 -*-
"""
Created on Sun Nov 13 09:14:13 2016

@author: daxiong
"""
import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer

# Training data
train_text = state_union.raw("2005-GWBush.txt")
# Test data
sample_text = state_union.raw("2006-GWBush.txt")

'''Punkt is designed to learn parameters (a list of abbreviations, etc.)
unsupervised from a corpus similar to the target domain. The pre-packaged
models may therefore be unsuitable: use PunktSentenceTokenizer(text) to
learn parameters from the given text.
'''
# Train the Punkt sentence tokenizer on the 2005 address
custom_sent_tokenizer = PunktSentenceTokenizer(train_text)
# After training, use the tokenizer to split the 2006 address into sentences
tokenized = custom_sent_tokenizer.tokenize(sample_text)

'''
nltk.pos_tag(["fire"])  # pos_tag takes a list of words
Out[19]: [('fire', 'NN')]
'''

# Tag each word of the first five sentences with its part of speech
def process_content():
    try:
        for i in tokenized[0:5]:
            words = nltk.word_tokenize(i)
            tagged = nltk.pos_tag(words)
            print(tagged)
    except Exception as e:
        print(str(e))

process_content()


One of the more powerful aspects of the NLTK module is the Part of Speech tagging that it can do for you. This means labeling words in a sentence as nouns, adjectives, verbs...etc. Even more impressive, it also labels by tense, and more. Here's a list of the tags, what they mean, and some examples:

POS tag list:

CC    coordinating conjunction
CD    cardinal digit
DT    determiner
EX    existential there (like: "there is" ... think of it like "there exists")
FW    foreign word
IN    preposition/subordinating conjunction
JJ    adjective    'big'
JJR   adjective, comparative    'bigger'
JJS   adjective, superlative    'biggest'
LS    list marker    1)
MD    modal    could, will
NN    noun, singular    'desk'
NNS   noun, plural    'desks'
NNP   proper noun, singular    'Harrison'
NNPS  proper noun, plural    'Americans'
PDT   predeterminer    'all the kids'
POS   possessive ending    parent's
PRP   personal pronoun    I, he, she
PRP$  possessive pronoun    my, his, hers
RB    adverb    very, silently
RBR   adverb, comparative    better
RBS   adverb, superlative    best
RP    particle    give up
TO    to    go 'to' the store
UH    interjection    errrrrrrrm
VB    verb, base form    take
VBD   verb, past tense    took
VBG   verb, gerund/present participle    taking
VBN   verb, past participle    taken
VBP   verb, singular present, non-3rd person    take
VBZ   verb, 3rd person singular present    takes
WDT   wh-determiner    which
WP    wh-pronoun    who, what
WP$   possessive wh-pronoun    whose
WRB   wh-adverb    where, when

How might we use this? While we're at it, we're going to cover a new sentence tokenizer, called the PunktSentenceTokenizer. This tokenizer is capable of unsupervised machine learning, so you can actually train it on any body of text that you use. First, let's get some imports out of the way that we're going to use:

import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer

Now, let's create our training and testing data:

train_text = state_union.raw("2005-GWBush.txt")
sample_text = state_union.raw("2006-GWBush.txt")

One is the 2005 State of the Union address and the other is the 2006 address, both from former President George W. Bush.

Next, we can train the Punkt tokenizer like:

custom_sent_tokenizer = PunktSentenceTokenizer(train_text)

Then we can actually tokenize, using:

tokenized = custom_sent_tokenizer.tokenize(sample_text)

Now we can finish up this part of speech tagging script by creating a function that will run through and tag all of the parts of speech per sentence like so:

def process_content():
    try:
        for i in tokenized[:5]:
            words = nltk.word_tokenize(i)
            tagged = nltk.pos_tag(words)
            print(tagged)
    except Exception as e:
        print(str(e))

process_content()

The output should be a list of tuples, where the first element in the tuple is the word, and the second is the part of speech tag. It should look like:

[('PRESIDENT', 'NNP'), ('GEORGE', 'NNP'), ('W.', 'NNP'), ('BUSH', 'NNP'), ("'S", 'POS'), ('ADDRESS', 'NNP'), ('BEFORE', 'NNP'), ('A', 'NNP'), ('JOINT', 'NNP'), ('SESSION', 'NNP'), ('OF', 'NNP'), ('THE', 'NNP'), ('CONGRESS', 'NNP'), ('ON', 'NNP'), ('THE', 'NNP'), ('STATE', 'NNP'), ('OF', 'NNP'), ('THE', 'NNP'), ('UNION', 'NNP'), ('January', 'NNP'), ('31', 'CD'), (',', ','), ('2006', 'CD'), ('THE', 'DT'), ('PRESIDENT', 'NNP'), (':', ':'), ('Thank', 'NNP'), ('you', 'PRP'), ('all', 'DT'), ('.', '.')] [('Mr.', 'NNP'), ('Speaker', 'NNP'), (',', ','), ('Vice', 'NNP'), ('President', 'NNP'), ('Cheney', 'NNP'), (',', ','), ('members', 'NNS'), ('of', 'IN'), ('Congress', 'NNP'), (',', ','), ('members', 'NNS'), ('of', 'IN'), ('the', 'DT'), ('Supreme', 'NNP'), ('Court', 'NNP'), ('and', 'CC'), ('diplomatic', 'JJ'), ('corps', 'NNS'), (',', ','), ('distinguished', 'VBD'), ('guests', 'NNS'), (',', ','), ('and', 'CC'), ('fellow', 'JJ'), ('citizens', 'NNS'), (':', ':'), ('Today', 'NN'), ('our', 'PRP$'), ('nation', 'NN'), ('lost', 'VBD'), ('a', 'DT'), ('beloved', 'VBN'), (',', ','), ('graceful', 'JJ'), (',', ','), ('courageous', 'JJ'), ('woman', 'NN'), ('who', 'WP'), ('called', 'VBN'), ('America', 'NNP'), ('to', 'TO'), ('its', 'PRP$'), ('founding', 'NN'), ('ideals', 'NNS'), ('and', 'CC'), ('carried', 'VBD'), ('on', 'IN'), ('a', 'DT'), ('noble', 'JJ'), ('dream', 'NN'), ('.', '.')] [('Tonight', 'NNP'), ('we', 'PRP'), ('are', 'VBP'), ('comforted', 'VBN'), ('by', 'IN'), ('the', 'DT'), ('hope', 'NN'), ('of', 'IN'), ('a', 'DT'), ('glad', 'NN'), ('reunion', 'NN'), ('with', 'IN'), ('the', 'DT'), ('husband', 'NN'), ('who', 'WP'), ('was', 'VBD'), ('taken', 'VBN'), ('so', 'RB'), ('long', 'RB'), ('ago', 'RB'), (',', ','), ('and', 'CC'), ('we', 'PRP'), ('are', 'VBP'), ('grateful', 'JJ'), ('for', 'IN'), ('the', 'DT'), ('good', 'NN'), ('life', 'NN'), ('of', 'IN'), ('Coretta', 'NNP'), ('Scott', 'NNP'), ('King', 'NNP'), ('.', '.')] [('(', 'NN'), ('Applause', 'NNP'), ('.', '.'), (')', ':')] 
[('President', 'NNP'), ('George', 'NNP'), ('W.', 'NNP'), ('Bush', 'NNP'), ('reacts', 'VBZ'), ('to', 'TO'), ('applause', 'VB'), ('during', 'IN'), ('his', 'PRP$'), ('State', 'NNP'), ('of', 'IN'), ('the', 'DT'), ('Union', 'NNP'), ('Address', 'NNP'), ('at', 'IN'), ('the', 'DT'), ('Capitol', 'NNP'), (',', ','), ('Tuesday', 'NNP'), (',', ','), ('Jan', 'NNP'), ('.', '.')]

At this point, we can begin to derive meaning, but there is still some work to do. The next topic that we're going to cover is chunking, which is where we group words, based on their parts of speech, into hopefully meaningful groups.


Reposted from: https://www.cnblogs.com/webRobot/p/6080069.html

