Chinese High-Frequency Word Statistics and Word Cloud Generation
Reposted from: https://www.jianshu.com/p/2052d21a704c
The code is as follows:
# -*- coding: utf-8 -*-
# Ported to Python 3: the reload(sys)/sys.setdefaultencoding() hack and unicode()
# are no longer needed, and scipy.misc.imread (removed from recent SciPy releases)
# is replaced by Pillow + NumPy.
from os import path

import jieba.analyse
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from wordcloud import WordCloud, ImageColorGenerator

if __name__ == "__main__":
    mpl.rcParams['font.sans-serif'] = ['FangSong']
    # mpl.rcParams['axes.unicode_minus'] = False

    # read the input text as UTF-8
    with open("testing.txt", encoding="utf-8") as f:
        content = f.read()

    # tags extraction based on TF-IDF algorithm
    tags = jieba.analyse.extract_tags(content, topK=100, withWeight=False)
    text = " ".join(tags)

    # read the mask
    d = path.dirname(path.abspath(__file__))
    trump_coloring = np.array(Image.open(path.join(d, "Trump.jpg")))

    # font_path must point to a font that contains Chinese glyphs
    wc = WordCloud(font_path='/home/appleyuchi/.local/share/fonts/HanYiQuanTangshiTJ.ttf',
                   background_color="white", max_words=300, mask=trump_coloring,
                   max_font_size=40, random_state=42)
    # generate word cloud
    wc.generate(text)

    # generate color from image
    image_colors = ImageColorGenerator(trump_coloring)
    # the original computed image_colors but never applied it; recolor() applies it here
    plt.imshow(wc.recolor(color_func=image_colors), interpolation="bilinear")
    plt.axis("off")
    plt.show()
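The script above ranks keywords by TF-IDF and passes each keyword to the word cloud only once, so the word sizes in the image probably do not reflect how often a word occurs. For the "high-frequency word statistics" in the title, a plain frequency count is one alternative. The following is a minimal sketch, not the original author's method: `top_word_frequencies`, `min_length`, and the sample tokens are hypothetical names and data introduced for illustration, and the input is assumed to be already segmented (for example with `jieba.lcut(content)`).

```python
from collections import Counter


def top_word_frequencies(tokens, top_k=100, min_length=2, stopwords=frozenset()):
    """Count tokens and return the top_k most frequent ones as a {word: count} dict.

    Tokens shorter than min_length (after stripping whitespace) and tokens
    listed in stopwords are skipped.
    """
    cleaned = (tok.strip() for tok in tokens)
    counts = Counter(
        tok for tok in cleaned
        if len(tok) >= min_length and tok not in stopwords
    )
    return dict(counts.most_common(top_k))


# Hypothetical, already-segmented input; in practice the token list would
# come from a segmenter such as jieba.lcut(content).
sample_tokens = ["词云", "中文", "词云", "的", "统计", "词云", "中文"]
frequencies = top_word_frequencies(sample_tokens, top_k=2)
print(frequencies)
```

The resulting dict can be handed to `WordCloud.generate_from_frequencies(frequencies)` in place of `wc.generate(text)`, so that font sizes follow the counts.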