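A recent project required statistics on users' top 5 hot search terms. I decided to use Elasticsearch to record each search keyword together with the user information, and then use an aggregation to count how often each keyword was searched and return the 5 most frequent ones.
Elasticsearch version: 7.0.0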
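First, define the mapping for the index that stores the keywords and user information. The index must already exist before this request is sent (it can be created with PUT http://localhost:9200/hotwords_test):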
POST http://localhost:9200/hotwords_test/_mapping
{
  "properties": {
    "search_txt": {
      "type": "keyword"
    },
    "user_name": {
      "type": "text",
      "analyzer": "keyword"
    },
    "happend_time": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss"
    }
  }
}
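The format of the happend_time field decides which timestamp strings the index will accept. As a minimal sketch, assuming the intended pattern is yyyy-MM-dd HH:mm:ss, the class below uses java.time to format and check timestamps on the client side before they are indexed. The class name HappendTimeFormat and its methods are hypothetical and not part of the original code, and Elasticsearch's own date parsing may differ in edge cases.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the original article.
class HappendTimeFormat {
    // Assumed to be the same pattern as the "format" of happend_time in the mapping.
    static final DateTimeFormatter PATTERN =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Formats a timestamp as a string in the mapping's pattern. */
    static String format(LocalDateTime time) {
        return PATTERN.format(time);
    }

    /** Returns true if the text can be parsed with the mapping's pattern. */
    static boolean matches(String text) {
        try {
            LocalDateTime.parse(text, PATTERN);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDateTime.of(2021, 10, 17, 15, 11, 30)));
        System.out.println(matches("2021-10-17 15:11:30"));
        System.out.println(matches("2021/10/17 15:11"));
    }
}
```

Checking a value with matches before sending it is one way to catch a malformed timestamp early instead of waiting for the index request to be rejected.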
The test data is inserted into the index through the RestHighLevelClient. First, add the Maven dependencies:
<dependencies>
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.48</version>
    </dependency>
</dependencies>
Code for indexing the test data:
import com.alibaba.fastjson.JSONObject;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
// The search and aggregation imports are used by the aggregation query shown later
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import java.io.IOException;

public class ElasticsearchTesl {
    public static final String host = "localhost";
    public static final Integer port = 9200;
    public static final String index = "hotwords_test";

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client even if indexing throws
        try (RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
                new HttpHost(host, port, "http")))) {
            // One search event: the keyword, who searched, and when
            JSONObject data = new JSONObject();
            data.put("search_txt", "大枣");
            data.put("user_name", "test");
            data.put("happend_time", "2021-10-17 15:11:30");
            String docId = indexDoc(client, index, data);
            System.out.println(docId);
        }
    }

    /** Indexes one document and returns its generated id, or null if the request failed. */
    public static String indexDoc(RestHighLevelClient client, String index, JSONObject data) {
        IndexRequest request = new IndexRequest(index);
        // JSONObject implements Map<String, Object>, so it is passed as the document source
        request.source(data);
        try {
            IndexResponse response = client.index(request, RequestOptions.DEFAULT);
            return response.getId();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}
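After running the program several times with different search terms, the index contains the test documents (the screenshot of the index contents is not reproduced here).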
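Instead of editing and rerunning main by hand, the test documents could also be generated in a loop. The following is only a sketch under stated assumptions: the class TestDataSketch and its methods buildDoc and buildDocs are hypothetical, and the keyword counts in main are borrowed from the console output shown later. It builds plain maps with the three fields of the mapping and does not talk to Elasticsearch itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original article.
class TestDataSketch {
    /** Builds one document with the three fields defined in the mapping. */
    static Map<String, Object> buildDoc(String searchTxt, String userName, String happendTime) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("search_txt", searchTxt);
        doc.put("user_name", userName);
        doc.put("happend_time", happendTime);
        return doc;
    }

    /** Builds as many documents per keyword as the map asks for, all for one user. */
    static List<Map<String, Object>> buildDocs(String userName,
                                               Map<String, Integer> timesPerKeyword,
                                               String happendTime) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : timesPerKeyword.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                docs.add(buildDoc(entry.getKey(), userName, happendTime));
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        Map<String, Integer> times = new LinkedHashMap<>();
        times.put("大枣", 2);
        times.put("榴莲", 3);
        for (Map<String, Object> doc : buildDocs("test", times, "2021-10-17 15:11:30")) {
            System.out.println(doc);
        }
    }
}
```

Each generated map could then be wrapped with new JSONObject(doc) and passed to the indexDoc method above, assuming the fastjson constructor that accepts a Map<String, Object>.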
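Next comes the aggregation query. For a single user, it counts how many times each kind of fruit was searched and outputs the 5 most searched ones.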
// Runs inside main() while the client is still open.
// Terms aggregation: one bucket per distinct search_txt value, keeping the 5 largest buckets
AggregationBuilder aggregationBuilder = AggregationBuilders
        .terms("value_count").field("search_txt").size(5);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.aggregation(aggregationBuilder);
// Only count documents that belong to the user "test"
sourceBuilder.query(QueryBuilders.termQuery("user_name", "test"));
SearchRequest searchRequest = new SearchRequest(index);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// The response holds a single terms aggregation; print each bucket as keyword:count
Aggregations aggregations = searchResponse.getAggregations();
for (Aggregation a : aggregations) {
    Terms terms = (Terms) a;
    for (Terms.Bucket bucket : terms.getBuckets()) {
        System.out.println(bucket.getKeyAsString() + ":" + bucket.getDocCount());
    }
}
The console output is as follows:
甘蔗:4
芒果:4
榴莲:3
大枣:2
桃子:2
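To make clear what the term query and the terms aggregation compute together, here is a minimal in-memory sketch of the same logic: filter by user, count per keyword, sort by count in descending order, and keep the first few entries. It is not how Elasticsearch implements the aggregation. The class HotWordsSketch, its methods, and the sample events are made up for illustration, and breaking ties by keyword in ascending order is an assumption about the default bucket ordering.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration, not part of the original article.
class HotWordsSketch {
    /** One logged search: who searched and for what. */
    static final class SearchEvent {
        final String userName;
        final String searchTxt;

        SearchEvent(String userName, String searchTxt) {
            this.userName = userName;
            this.searchTxt = searchTxt;
        }
    }

    /** Counts the keywords searched by one user and returns the most frequent ones. */
    static Map<String, Long> topKeywords(List<SearchEvent> events, String userName, int size) {
        // Filter by user (the term query) and count per keyword (the terms buckets)
        Map<String, Long> counts = new HashMap<>();
        for (SearchEvent event : events) {
            if (event.userName.equals(userName)) {
                counts.merge(event.searchTxt, 1L, Long::sum);
            }
        }
        // Sort by count descending; ties are broken by keyword ascending (an assumption)
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Keep only the first `size` entries, preserving the sorted order
        Map<String, Long> top = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : sorted.subList(0, Math.min(size, sorted.size()))) {
            top.put(entry.getKey(), entry.getValue());
        }
        return top;
    }

    /** Made-up sample data: two users searching a handful of keywords. */
    static List<SearchEvent> sampleEvents() {
        List<SearchEvent> events = new ArrayList<>();
        String[] testSearches = {"mango", "peach", "mango", "plum", "fig", "peach", "mango", "plum"};
        for (String keyword : testSearches) {
            events.add(new SearchEvent("test", keyword));
        }
        events.add(new SearchEvent("guest", "fig"));
        events.add(new SearchEvent("guest", "fig"));
        return events;
    }

    public static void main(String[] args) {
        topKeywords(sampleEvents(), "test", 3)
                .forEach((keyword, count) -> System.out.println(keyword + ":" + count));
    }
}
```

With the sample events, the searches of the user guest do not affect the counts for test, which mirrors the role of the term query on user_name in the real request.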