[Big Data] elasticsearch: using the top_hits aggregation to get grouped lists

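Business requirements

  • Return a grouped list: the query result is not one flat list, but documents grouped by a given field
  • Return the first 5 documents of each group, or all of them if the group has fewer than 5
  • Sort by time, newest first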
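
The three rules above can be illustrated without a cluster. The following is a minimal in-memory sketch of the intended semantics, not the Elasticsearch implementation; `GroupTopN`, `Doc`, and `topPerGroup` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupTopN {

    /** Hypothetical stand-in for an indexed document. */
    static final class Doc {
        final String group;        // value of the field used for grouping
        final long publishMillis;  // publish time as epoch milliseconds
        final String title;

        Doc(String group, long publishMillis, String title) {
            this.group = group;
            this.publishMillis = publishMillis;
            this.title = title;
        }
    }

    /** Groups docs by their group key and keeps at most 'limit' docs per group, newest first. */
    static Map<String, List<Doc>> topPerGroup(List<Doc> docs, int limit) {
        Map<String, List<Doc>> grouped = new LinkedHashMap<>();
        for (Doc d : docs) {
            grouped.computeIfAbsent(d.group, k -> new ArrayList<>()).add(d);
        }
        for (Map.Entry<String, List<Doc>> e : grouped.entrySet()) {
            List<Doc> sorted = e.getValue();
            // Time descending
            sorted.sort(Comparator.comparingLong((Doc d) -> d.publishMillis).reversed());
            // Groups with fewer than 'limit' docs are returned whole
            if (sorted.size() > limit) {
                e.setValue(new ArrayList<>(sorted.subList(0, limit)));
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            docs.add(new Doc("2", i, "doc-" + i));
        }
        docs.add(new Doc("1", 3, "only-one"));
        for (Map.Entry<String, List<Doc>> e : topPerGroup(docs, 5).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " docs");
        }
    }
}
```

In Elasticsearch the grouping step corresponds to a terms aggregation and the per-group "newest N" step to a top_hits sub-aggregation.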

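Implementation

  • elasticsearch provides the top_hits aggregation, which returns the top few documents of each bucket
  • Combining two aggregation builders, TermsAggregationBuilder and TopHitsAggregationBuilder, produces the required result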

Java code

  • The code is as follows:
    @Override
    public Map<String, Object> aggregateTopHitsByField(PolicyContentQuerySynVO queryVO) {
        Map<String, Object> result = new HashedMap();
        SearchRequestBuilder builder = elasticsearchTemplate.getClient()
                .prepareSearch(POLICY_INDEX_NAME).setTypes(POLICY_TYPE_NAME);
        // Build a SearchCondition from the query parameters in PolicyContentQuerySynVO
        SearchCondition searchCondition = searchCondition(queryVO);
        // Query conditions
        builder.setQuery(searchCondition.getQueryBuilder());
        builder.setFetchSource(false);
        TermsAggregationBuilder aggregation = AggregationBuilders.terms(queryVO.getAggregateKey()+"_arr")
                .field(queryVO.getAggregateKey());
        if(queryVO.getAggregateOrder() != null){
            aggregation.order(Terms.Order.count(queryVO.getAggregateOrder()));
        }
        if(null != queryVO.getAggregateSize()){
            aggregation.size(queryVO.getAggregateSize());
        }else {
            aggregation.size(2000);
        }
        // top_hits returns the documents of each group
        TopHitsAggregationBuilder topHitsAggregationBuilder = AggregationBuilders.topHits(queryVO.getAggregateKey());
        if(null != queryVO.getAggregateTopHitSize()){
            topHitsAggregationBuilder.size(queryVO.getAggregateTopHitSize());
        }else {
            topHitsAggregationBuilder.size(10);
        }
        // Add sorting
        String sortField = queryVO.getSortField();
        if(Ls.isEmpty(sortField)){
            sortField = SORT_FIELD;
        }
        if("desc".equals(queryVO.getSortOrder())){
            topHitsAggregationBuilder.sort(sortField, SortOrder.DESC);
        }else {
            topHitsAggregationBuilder.sort(sortField, SortOrder.ASC);
        }
        aggregation.subAggregation(topHitsAggregationBuilder);
        builder = builder.addAggregation(aggregation).setSize(0);
        logger.info("aggregateTopHitsByField:: builder:\n{}",builder);
        SearchResponse response = builder.get();
        if(response != null){
            String bucketKey = queryVO.getAggregateKey()+"_arr";
            Terms terms = response.getAggregations().get(bucketKey);
            Map<String,Long> countMap = new LinkedHashMap<>();
            Map<String,List<Object>> infoMap = new LinkedHashMap<>();
            for (Terms.Bucket entry1 : terms.getBuckets()) {
                String key = entry1.getKey().toString();          // bucket key
                Long docCount = entry1.getDocCount();            // Doc count
                countMap.put(key,docCount);
                // Look up the top_hits sub-aggregation by the name it was registered under
                TopHits topHits = entry1.getAggregations().get(queryVO.getAggregateKey());
                List<Object> topResult = new ArrayList<>();
                for(SearchHit hit : topHits.getHits()){
                    Map<String, Object> resultMap = new HashedMap();
                    for (Map.Entry<String, Object> entry : hit.getSource().entrySet()) {
                        if(entry.getValue() == null){
                            continue;
                        }
                        resultMap.put(entry.getKey(),String.valueOf(entry.getValue()));
                    }
                    topResult.add(resultMap);
                }
                infoMap.put(key,topResult);
            }
            result.put(bucketKey + "_count",countMap);
            result.put(bucketKey + "_info",infoMap);
        }

        return result;
    }

Kibana request body

  • The query can also be tested with a tool such as Kibana, or directly with curl
GET /policylibrary/info/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "isPublish": {
              "value": 1,
              "boost": 1
            }
          }
        },
        {
          "term": {
            "topicIds.keyword": {
              "value": 100041,
              "boost": 1
            }
          }
        }
      ],
      "disable_coord": false,
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "_source": false,
  "aggregations": {
    "level_arr": {
      "terms": {
        "field": "level",
        "size": 2000,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          {
            "_count": "desc"
          },
          {
            "_term": "asc"
          }
        ]
      },
      "aggregations": {
        "level": {
          "top_hits": {
            "from": 0,
            "size": 6,
            "version": false,
            "explain": false,
            "sort": [
              {
                "publishDate": {
                  "order": "desc"
                }
              }
            ],
            "_source" : {
              "includes" : [],
              "excludes" : ["content"]
            }
          }
        }
      }
    }
  }
}
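
Many fields in the request above (for example "boost": 1, "from": 0, "explain": false) appear to be default values serialized by the Java client. As a rough sketch, the essential terms + top_hits structure can be assembled with plain string concatenation. `TopHitsRequestBody` and `build` are hypothetical names, the query clause is omitted for brevity, and the inputs are assumed to be plain field names that need no JSON escaping.

```java
class TopHitsRequestBody {

    /**
     * Builds a minimal request body: a terms aggregation named "<groupField>_arr"
     * with a top_hits sub-aggregation named "<groupField>", mirroring the naming
     * used in the request shown above.
     */
    static String build(String groupField, int maxGroups, int hitsPerGroup,
                        String sortField, boolean descending) {
        String order = descending ? "desc" : "asc";
        return "{\n"
            + "  \"size\": 0,\n"
            + "  \"aggregations\": {\n"
            + "    \"" + groupField + "_arr\": {\n"
            + "      \"terms\": { \"field\": \"" + groupField + "\", \"size\": " + maxGroups + " },\n"
            + "      \"aggregations\": {\n"
            + "        \"" + groupField + "\": {\n"
            + "          \"top_hits\": {\n"
            + "            \"size\": " + hitsPerGroup + ",\n"
            + "            \"sort\": [ { \"" + sortField + "\": { \"order\": \"" + order + "\" } } ]\n"
            + "          }\n"
            + "        }\n"
            + "      }\n"
            + "    }\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        // Group by "level", up to 2000 groups, 5 newest documents per group
        System.out.println(build("level", 2000, 5, "publishDate", true));
    }
}
```

The printed body can be pasted after a `GET /policylibrary/info/_search` line in Kibana to compare against the full request.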

Sample response

  • The response is:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "level_arr": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 2,
          "doc_count": 2,
          "level": {
            "hits": {
              "total": 2,
              "max_score": null,
              "hits": [
                {
                  "_index": "policylibrary",
                  "_type": "info",
                  "_id": "100000911",
                  "_score": null,
                  "_source": {
                    "publishDate": "2021-05-27 19:16:13",
                    "title": "安徽省人民政府办公厅关于推进社会保障卡建设的通知",
                    "url": "http://192.168.1.73:7381/site/tpl/5021?id=100000911"
                  },
                  "sort": [
                    1622142973000
                  ]
                },
                {
                  "_index": "policylibrary",
                  "_type": "info",
                  "_id": "100000651",
                  "_score": null,
                  "_source": {
                    "publishDate": "2021-03-15 15:15:25",
                    "title": "安徽省人民政府办公厅关于印发支持凹凸棒基新材料产业发展若干政策的通知",
                    "url": "http://192.168.1.73:7381/site/tpl/5021?id=100000651"
                  },
                  "sort": [
                    1615821325000
                  ]
                }
              ]
            }
          }
        },
        {
          "key": 1,
          "doc_count": 1,
          "level": {
            "hits": {
              "total": 1,
              "max_score": null,
              "hits": [
                {
                  "_index": "policylibrary",
                  "_type": "info",
                  "_id": "100000901",
                  "_score": null,
                  "_source": {
                    "publishDate": "2021-05-27 19:11:35",
                    "title": "人力资源和社会保障部关于印发“中华人民共和国社会保障卡”管理办法的通知",
                    "url": "http://192.168.x.xx:7381/site/tpl/5021?id=100000901"
                  },
                  "sort": [
                    1622142695000
                  ]
                }
              ]
            }
          }
        },
        {
          "key": 3,
          "doc_count": 1,
          "level": {
            "hits": {
              "total": 1,
              "max_score": null,
              "hits": [
                {
                  "_index": "policylibrary",
                  "_type": "info",
                  "_id": "100000931",
                  "_score": null,
                  "_source": {
                    "publishDate": "2021-07-05 15:31:25",
                    "title": "建党100周年",
                    "url": "http://192.168.x.xx:7381/site/tpl/5021?id=100000931"
                  },
                  "sort": [
                    1625499085000
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
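
To make the shape of the value returned by aggregateTopHitsByField easier to see, here is a small sketch that applies the same flattening to plain Java collections instead of Elasticsearch response types. `BucketFlattener`, `Bucket`, and `flatten` are hypothetical names; the flattening rules (skip null values, convert every value to a string) follow the Java method above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BucketFlattener {

    /** Hypothetical stand-in for one terms bucket and the _source maps of its top hits. */
    static final class Bucket {
        final String key;
        final long docCount;
        final List<Map<String, Object>> hits;

        Bucket(String key, long docCount, List<Map<String, Object>> hits) {
            this.key = key;
            this.docCount = docCount;
            this.hits = hits;
        }
    }

    /** Produces "<bucketKey>_count" (key -> doc count) and "<bucketKey>_info" (key -> hit rows). */
    static Map<String, Object> flatten(String bucketKey, List<Bucket> buckets) {
        Map<String, Long> countMap = new LinkedHashMap<>();
        Map<String, List<Map<String, String>>> infoMap = new LinkedHashMap<>();
        for (Bucket b : buckets) {
            countMap.put(b.key, b.docCount);
            List<Map<String, String>> rows = new ArrayList<>();
            for (Map<String, Object> source : b.hits) {
                Map<String, String> row = new LinkedHashMap<>();
                for (Map.Entry<String, Object> e : source.entrySet()) {
                    if (e.getValue() != null) {  // null values are skipped
                        row.put(e.getKey(), String.valueOf(e.getValue()));
                    }
                }
                rows.add(row);
            }
            infoMap.put(b.key, rows);
        }
        Map<String, Object> result = new LinkedHashMap<>();
        result.put(bucketKey + "_count", countMap);
        result.put(bucketKey + "_info", infoMap);
        return result;
    }

    public static void main(String[] args) {
        // Bucket keys and doc counts taken from the sample response; hits left empty for brevity
        List<Bucket> buckets = new ArrayList<>();
        buckets.add(new Bucket("2", 2L, new ArrayList<>()));
        buckets.add(new Bucket("1", 1L, new ArrayList<>()));
        buckets.add(new Bucket("3", 1L, new ArrayList<>()));
        System.out.println(flatten("level_arr", buckets).get("level_arr_count"));
    }
}
```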
Added: 2021-08-16 11:48:52  Updated: 2021-08-16 11:51:01
 