wordcloud:
- Install the module:
pip install wordcloud
- Basic usage:
WordCloud(font_path, background_color, width, height, max_words).generate(xxx)
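font_path: path to the font file (for example a .ttf) used to draw the words
collocations: whether to include two-word collocations (bigrams); defaults to True, which can make the same word appear more than once
background_color: background color
width: canvas width
height: canvas height
max_words: maximum number of words to display
generate: builds the word cloud from a text string (pass the text itself, not a file path)
- Example: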
from wordcloud import WordCloud

# Read the source text as UTF-8
with open("xxx.txt", encoding="utf-8") as r:
    txt = r.read()
# Build the word cloud from the text string
wordcloud = WordCloud(font_path="xxx.ttf", collocations=False, background_color="black", width=800, height=600, max_words=50).generate(txt)
# Preview the result as a PIL image
img = wordcloud.to_image()
img.show()
# Save the rendered image to disk
wordcloud.to_file("xxx.jpg")
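If you would rather control the word counts yourself than rely on generate (for instance, to sidestep the repeated words that collocations can introduce), one possible approach is to count the words first and pass the counts to WordCloud.generate_from_frequencies. This is only a minimal sketch: count_words and cloud_from_counts are hypothetical helper names, and splitting on whitespace is an assumption that suits space-separated text only.

```python
from collections import Counter


def count_words(text):
    """Count whitespace-separated words, ignoring case (hypothetical helper)."""
    return Counter(word.lower() for word in text.split())


def cloud_from_counts(counts, font_path=None):
    """Build a WordCloud from a {word: count} mapping (hypothetical helper)."""
    # Imported here so count_words() works even without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(font_path=font_path, background_color="black",
                      width=800, height=600, max_words=50)
    # generate_from_frequencies skips text processing and uses the counts as given.
    return cloud.generate_from_frequencies(counts)


counts = count_words("apple banana apple cherry banana apple")
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]
```

Calling cloud_from_counts(counts).to_file("xxx.jpg") would then save the image in the same way as the example above.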
jieba:
- Install the module:
pip install jieba
- Basic format:
jieba.analyse.extract_tags(xxx, topK, withWeight, allowPOS)
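xxx: the text to process
topK: number of keywords to return, ordered from most to least important
withWeight: whether to also return the weight of each keyword
allowPOS: parts of speech to extract, e.g. n for nouns and v for verbs; passed as a tuple
- Example: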
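import jieba.analyse
from wordcloud import WordCloud

# Placeholder: put the text to analyze here (an empty string yields no keywords)
text = ""
# Extract keywords, keeping only nouns ("n") and verbs ("v")
seg_list = jieba.analyse.extract_tags(text, allowPOS=("n", "v"))
# generate expects a single string, so join the keywords with spaces
txt_str = " ".join(seg_list)
wordcloud = WordCloud(font_path="xxx.ttf", collocations=False, background_color="black", width=800, height=600, max_words=50).generate(txt_str)
# Preview the image, then save it
img = wordcloud.to_image()
img.show()
wordcloud.to_file("xxx.jpg")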
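The example above discards keyword weights, so every keyword appears once in the joined string and is drawn at a similar size. A possible variation, sketched here under assumptions, sets withWeight=True so that extract_tags returns (keyword, weight) pairs, then hands those weights to WordCloud.generate_from_frequencies. The helper names weights_to_frequencies and keyword_cloud are hypothetical, and the sample weights are invented purely for illustration.

```python
def weights_to_frequencies(pairs):
    """Turn [(word, weight), ...] into {word: weight} (hypothetical helper)."""
    return {word: weight for word, weight in pairs}


def keyword_cloud(text, font_path, top_k=50):
    """Extract weighted keywords with jieba and draw them (hypothetical helper)."""
    # Imported lazily so weights_to_frequencies() can be used on its own.
    import jieba.analyse
    from wordcloud import WordCloud

    # With withWeight=True each result is a (keyword, weight) tuple.
    pairs = jieba.analyse.extract_tags(text, topK=top_k, withWeight=True,
                                       allowPOS=("n", "v"))
    if not pairs:
        raise ValueError("no keywords extracted; is the text empty?")
    cloud = WordCloud(font_path=font_path, background_color="black",
                      width=800, height=600, max_words=top_k)
    # Font size now follows the jieba weight rather than a raw word count.
    return cloud.generate_from_frequencies(weights_to_frequencies(pairs))


# Hand-written pairs in the shape extract_tags returns (weights are made up).
sample = [("data", 0.8), ("model", 0.5)]
print(weights_to_frequencies(sample))  # {'data': 0.8, 'model': 0.5}
```

The early ValueError is a design choice in this sketch: it reports an empty keyword list before WordCloud is asked to draw nothing.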