Middleware versions used in this walkthrough:
elasticsearch:7.13.3
ik:7.13.2
springboot:2.1.3.RELEASE
1. Download and install the IK Chinese analyzer. Download URL:
https://github.com/medcl/elasticsearch-analysis-ik/releases
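2. After downloading, unzip it into the Elasticsearch plugins directory.
3. Edit IK's version descriptor. The IK release used here (7.13.2) is older than the ES release (7.13.3), so change the version number in the IK descriptor to the current ES version.
4. Start ES again and use an ES GUI tool to create an index, then add a mapping to the index just created. Viewing the mapping afterwards confirms that it was applied. Next, check what the analyzers do; any tool works, and Postman is used here. ik_smart produces a coarse segmentation that follows natural Chinese usage, while ik_max_word produces an exhaustive segmentation that emits every candidate word.
This completes the basic ES + IK setup.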
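Steps 3 and 4 can be sketched in a few lines of Java. This is a minimal illustration, not the exact procedure from the original screenshots. It assumes that the IK plugin's `plugin-descriptor.properties` carries an `elasticsearch.version` key that has to match the running ES version, that the index has a text field named `info` (the field used later in this post), and that `ik_max_word` is used for indexing with `ik_smart` for searching, which is one common pairing rather than something this post prescribes. The class and method names (`IkSetupSketch`, `patchLines`, `mappingBody`, `analyzeBody`, `quote`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class IkSetupSketch {

    // Step 3: return the descriptor lines with elasticsearch.version set to esVersion.
    // Comments and all other keys are passed through unchanged.
    static List<String> patchLines(List<String> descriptorLines, String esVersion) {
        List<String> patched = new ArrayList<>();
        for (String line : descriptorLines) {
            if (line.trim().startsWith("elasticsearch.version=")) {
                patched.add("elasticsearch.version=" + esVersion);
            } else {
                patched.add(line);
            }
        }
        return patched;
    }

    // Step 4: JSON body for PUT /<index>/_mapping.
    // The analyzer pairing is an assumption, not taken from the post.
    static String mappingBody(String field) {
        return "{\"properties\":{" + quote(field) + ":{"
                + "\"type\":\"text\","
                + "\"analyzer\":\"ik_max_word\","
                + "\"search_analyzer\":\"ik_smart\"}}}";
    }

    // Step 4: JSON body for POST /_analyze, to inspect how an analyzer splits a text.
    static String analyzeBody(String analyzer, String text) {
        return "{\"analyzer\":" + quote(analyzer) + ",\"text\":" + quote(text) + "}";
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Reading the descriptor with `Files.readAllLines`, passing the lines through `patchLines(lines, "7.13.3")`, and writing them back with `Files.write` covers step 3. The two JSON bodies can be pasted into Postman, once with `ik_smart` and once with `ik_max_word`, to compare the two analyzers.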
5. Integrate ES into Spring Boot. Update the dependencies in the Spring Boot pom:
<!-- Dependencies for integrating Elasticsearch with Spring Boot -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.6.2</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
Add the following to the yml configuration file:
spring:
  data:
    elasticsearch:
      cluster-name: elasticsearch
      cluster-nodes: 127.0.0.1:9200
6. Insert data from a Java program
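// Client bean injected by Spring
@Autowired
private RestHighLevelClient restHighLevelClient;

// Index one document with id "1" into tongyuan6_index.
// The three-argument constructor with a type ("_doc") is deprecated in ES 7.x.
IndexRequest indexRequest = new IndexRequest("tongyuan6_index", "_doc", "1");
User user = new User();
user.setName("李娜");
user.setAge("25");
user.setInfo("全国人民大团结万岁,中国的交通很发达,道路很漂亮,非常畅通?");
// Serialize the object to JSON (fastjson) and use it as the document source
indexRequest.source(JSONObject.toJSONString(user), XContentType.JSON);
IndexResponse index = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
System.out.println(JSONObject.toJSONString(index));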
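The snippet above relies on a `User` class that this post does not show. A minimal sketch of what it might look like, assuming three String properties that match the setters called above (`age` is a String only because the snippet calls `setAge("25")`); the real class may well differ:

```java
// Hypothetical POJO matching the setters used in the indexing snippet.
// JSON serializers such as fastjson read the values through the getters.
class User {
    private String name;
    private String age;   // kept as a String because the snippet passes "25"
    private String info;  // the text field that is analyzed and searched later

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getAge() { return age; }
    public void setAge(String age) { this.age = age; }

    public String getInfo() { return info; }
    public void setInfo(String info) { this.info = info; }
}
```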
7. Tokenized, highlighted search from a Java program
public List<JSONObject> searchContentListByES(PmcBimComment pmcBimComment) {
    List<JSONObject> blogList = new ArrayList<JSONObject>();
    try {
        SearchRequest searchRequest = new SearchRequest("tongyuan6_index");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

        // Highlight matches in every field, wrapped in a red <span>
        HighlightBuilder highlightBuilder = new HighlightBuilder().field("*").requireFieldMatch(false);
        highlightBuilder.preTags("<span style=\"color:red\">");
        highlightBuilder.postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);

        // Full-text match on "info"; the query text is analyzed before matching
        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("info", "济南高新区");
        sourceBuilder.query(matchQueryBuilder).timeout(new TimeValue(60, TimeUnit.SECONDS));
        searchRequest.source(sourceBuilder);

        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        for (SearchHit hit : response.getHits()) {
            Map<String, Object> source = hit.getSourceAsMap();
            HighlightField infoField = hit.getHighlightFields().get("info");
            if (infoField != null) {
                // Replace the raw "info" value with the concatenated highlighted fragments
                StringBuilder highlighted = new StringBuilder();
                for (Text fragment : infoField.fragments()) {
                    highlighted.append(fragment.string());
                }
                source.put("info", highlighted.toString());
            }
            blogList.add(new JSONObject(source));
        }
    } catch (IOException e) {
        // Do not swallow the failure silently; at minimum record it
        e.printStackTrace();
    }
    return blogList;
}
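To try the same search from Postman, it helps to see roughly what request body the builder calls above amount to. The sketch below assembles such a body by hand with plain JDK string handling. It is an approximation written for illustration, not the exact JSON the client serializes, and the names `SearchBodySketch`, `searchBody`, and `quote` are hypothetical.

```java
class SearchBodySketch {

    // Builds a search body with a match query on one field plus highlighting
    // on all fields, mirroring the builder calls in the method above.
    static String searchBody(String field, String text, String preTag, String postTag) {
        return "{"
                + "\"timeout\":\"60s\","
                + "\"query\":{\"match\":{" + quote(field) + ":" + quote(text) + "}},"
                + "\"highlight\":{"
                + "\"require_field_match\":false,"
                + "\"pre_tags\":[" + quote(preTag) + "],"
                + "\"post_tags\":[" + quote(postTag) + "],"
                + "\"fields\":{\"*\":{}}"
                + "}}";
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Calling `searchBody("info", "济南高新区", "<span style=\"color:red\">", "</span>")` gives a body that can be sent as `POST /tongyuan6_index/_search`. The response then carries a `highlight` section per hit, which is what the Java loop merges back into the `info` field.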
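Search results: the query "济南高新区" is clearly split, following Chinese usage, into "济南" and "高新区", and each term is matched and highlighted separately. When a single returned field contains "高新区" several times, every occurrence is highlighted.
8. That's it, all done.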