[C++知识库] 科研训练第五周：关于《Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Ric》的复现—

开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> C++知识库 -> 科研训练第五周：关于《Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Ric》的复现—— -> 正文阅读

[C++知识库]科研训练第五周：关于《Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Ric》的复现——

周二结束软件课设的答辩，周三结束团学会议，接着是组原实验以及算法课设DDL
大三过分充实😄

遇到的报错的主要是transformsers的没有xBertTokenizer的问题以及sklearn没法安装的问题
暂时改用了BertTokenizer作为替代，还没找官方文档看区别

今天先跑出来preprocess.py，其他的文件看再逐个筛查8，环境真的配麻了

---------------------------------------------------------下午_电脑快没电啦------------------------------------------

没有改JDK配置，参考博客CoreNLP安装教程
但是中文还是会报错
在这里插入图片描述
不管了，暂时也用不到中文
附上英文的测试代码以及输出结果：

from stanfordcorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP(r'E:\SCI\SynFue_Demo\core_nlp\stanford-corenlp-latest\stanford-corenlp-4.3.1')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print ('Tokenize:', nlp.word_tokenize(sentence))
print ('Part of Speech:', nlp.pos_tag(sentence))
print ('Named Entities:', nlp.ner(sentence))
print ('Constituency Parsing:', nlp.parse(sentence))#语法树
print ('Dependency Parsing:', nlp.dependency_parse(sentence))#依存句法
nlp.close() # Do not forget to close! The backend server will consume a lot memery

输出结果（因为之前动了包里面自带的代码，导致有奇怪的语句混入）

sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Tokenize: ['Guangdong', 'University', 'of', 'Foreign', 'Studies', 'is', 'located', 'in', 'Guangzhou', '.']
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Part of Speech: [('Guangdong', 'NNP'), ('University', 'NNP'), ('of', 'IN'), ('Foreign', 'NNP'), ('Studies', 'NNPS'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), ('Guangzhou', 'NNP'), ('.', '.')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ner%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Named Entities: [('Guangdong', 'ORGANIZATION'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('Foreign', 'ORGANIZATION'), ('Studies', 'ORGANIZATION'), ('is', 'O'), ('located', 'O'), ('in', 'O'), ('Guangzhou', 'CITY'), ('.', 'O')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%2Cparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Constituency Parsing: (ROOT
  (S
    (NP
      (NP (NNP Guangdong) (NNP University))
      (PP (IN of)
        (NP (NNP Foreign) (NNPS Studies))))
    (VP (VBZ is)
      (VP (VBN located)
        (PP (IN in)
          (NP (NNP Guangzhou)))))
    (. .)))
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27depparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Dependency Parsing: [('ROOT', 0, 7), ('compound', 2, 1), ('nsubj:pass', 7, 2), ('case', 5, 3), ('compound', 5, 4), ('nmod', 2, 5), ('aux:pass', 7, 6), ('case', 9, 8), ('obl', 7, 9), ('punct', 7, 10)]
sa= ('::1', 9002, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=zh