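Finished the software course-project defense on Tuesday and the Youth League and student union meeting on Wednesday; next up are the computer organization lab and the algorithms course-project deadline. Junior year is a bit too full 😄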
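The main errors so far: transformers has no xBertTokenizer, and sklearn would not install. For now I have switched to BertTokenizer as a substitute; I have not yet checked the official docs for the differences.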
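A quick way to see which imports are still broken before running anything is to ask Python whether it can locate each module. This is only a minimal sketch: the helper name `missing_modules` and the `required` list are made up for illustration and are not part of the project. One thing worth keeping in mind: the import name `sklearn` is provided by the PyPI package `scikit-learn`, though I am not sure that is what caused the install failure here.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of the top-level imports this setup seems to need.
required = ["transformers", "sklearn", "stanfordcorenlp"]

for name in missing_modules(required):
    print(f"missing: {name}")
```

Anything printed here still needs to be installed in the active environment.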
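Today the goal is just to get preprocess.py running; I'll go through the other files one by one later. Setting up this environment has worn me out.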
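---------------------------------------------------------Afternoon_laptop is almost out of battery------------------------------------------
I did not change the JDK configuration and followed a blog tutorial on installing CoreNLP. Chinese still throws an error, but never mind, I don't need Chinese for now. Here is the English test code and its output: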
from stanfordcorenlp import StanfordCoreNLP
# Point the wrapper at the local CoreNLP directory; this starts a Java server in the background.
nlp = StanfordCoreNLP(r'E:\SCI\SynFue_Demo\core_nlp\stanford-corenlp-latest\stanford-corenlp-4.3.1')
sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
# Run each annotator on the same sentence and print the result.
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))
print('Dependency Parsing:', nlp.dependency_parse(sentence))
# Shut down the background server so the Java process does not linger.
nlp.close()
Output (I had earlier edited the code bundled in the package, so some odd debug lines are mixed in):
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Tokenize: ['Guangdong', 'University', 'of', 'Foreign', 'Studies', 'is', 'located', 'in', 'Guangzhou', '.']
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Part of Speech: [('Guangdong', 'NNP'), ('University', 'NNP'), ('of', 'IN'), ('Foreign', 'NNP'), ('Studies', 'NNPS'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), ('Guangzhou', 'NNP'), ('.', '.')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ner%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Named Entities: [('Guangdong', 'ORGANIZATION'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('Foreign', 'ORGANIZATION'), ('Studies', 'ORGANIZATION'), ('is', 'O'), ('located', 'O'), ('in', 'O'), ('Guangzhou', 'CITY'), ('.', 'O')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%2Cparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Constituency Parsing: (ROOT
(S
(NP
(NP (NNP Guangdong) (NNP University))
(PP (IN of)
(NP (NNP Foreign) (NNPS Studies))))
(VP (VBZ is)
(VP (VBN located)
(PP (IN in)
(NP (NNP Guangzhou)))))
(. .)))
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27depparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Dependency Parsing: [('ROOT', 0, 7), ('compound', 2, 1), ('nsubj:pass', 7, 2), ('case', 5, 3), ('compound', 5, 4), ('nmod', 2, 5), ('aux:pass', 7, 6), ('case', 9, 8), ('obl', 7, 9), ('punct', 7, 10)]
sa= ('::1', 9002, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=zh
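The dependency triples in the output above are hard to read as bare indices. Judging from that output, each triple looks like (relation, head index, dependent index) with 1-based token positions and 0 standing for ROOT. Under that assumption, here is a small sketch that maps the indices back to words; `readable_edges` is a helper name I made up, and the data is copied from the output above.

```python
# Tokens and dependency triples copied from the output above.
tokens = ['Guangdong', 'University', 'of', 'Foreign', 'Studies',
          'is', 'located', 'in', 'Guangzhou', '.']
deps = [('ROOT', 0, 7), ('compound', 2, 1), ('nsubj:pass', 7, 2),
        ('case', 5, 3), ('compound', 5, 4), ('nmod', 2, 5),
        ('aux:pass', 7, 6), ('case', 9, 8), ('obl', 7, 9), ('punct', 7, 10)]


def readable_edges(tokens, deps):
    """Replace 1-based token indices with words; index 0 is treated as ROOT."""
    words = ['ROOT'] + list(tokens)
    return [(rel, words[head], words[dep]) for rel, head, dep in deps]


edges = readable_edges(tokens, deps)
for rel, head, dep in edges:
    print(f'{rel}({head}, {dep})')
```

This only relabels what the parser already returned, so it works offline without the CoreNLP server running.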
CoreNLP usage documentation
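23:00 Well, the dependency edges won't come out and it throws an error. I'll work on it tomorrow, planning to follow this reference.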