
Gulimall Development Notes 6: Elasticsearch and Product Listing

Elasticsearch is a distributed, open-source search and analytics engine.

1 Basic Concepts

Index
Document
Mapping
Inverted index
Tokenization
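
Two of the concepts listed above, tokenization and the inverted index, can be illustrated with a toy sketch. The class name and the naive tokenizer (lowercase, split on non-alphanumeric characters) are assumptions made for illustration; real Elasticsearch analyzers are far more elaborate.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index: maps each term to the ids of the documents that contain it.
final class TinyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Naive tokenizer (an assumption): lowercase, then split on non-alphanumeric characters.
    static String[] tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(term -> !term.isEmpty())
                .toArray(String[]::new);
    }

    // Tokenizes the text and records the document id under every term.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, key -> new TreeSet<>()).add(docId);
        }
    }

    // Returns the ids of all documents containing the term (empty if none).
    SortedSet<Integer> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }
}
```

A search for a term is then a single map lookup instead of a scan over every document.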

2 Basic Operations

2.1 Operation Types

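create: create a document

PUT student/_create/1
{"name": "Li Ming"}
The request fails with a version conflict if a document with this id already exists. (In Elasticsearch 7.x a document is addressed as <index>/_doc/<id>, <index>/_create/<id> or <index>/_update/<id>.)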

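index: index a document

POST student/_doc/1
{"name": "Li Ming"}
Unlike create, index replaces the whole document if the id already exists. An index request sent without an id (POST student/_doc) generates an id automatically.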

update: update a document

POST student/_update/1
{"doc": {"name": "Li Ming"} }
If the contents of doc change nothing in the stored document, the result of the operation is noop.

delete: delete a document

DELETE student/_doc/1

noop: no operation; reported when a request changes nothing
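
The difference between these operations can be sketched with a toy in-memory model. ToyIndex is a hypothetical class that only mimics the result values Elasticsearch reports; "conflict" and "not_found" are labels chosen here for the cases in which Elasticsearch answers with an HTTP 409 or 404 error.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the per-document write operations. It imitates the reported
// results only; it says nothing about how Elasticsearch stores data.
final class ToyIndex {
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    // create: succeeds only when the id is not taken yet.
    String create(String id, Map<String, ?> source) {
        if (docs.containsKey(id)) {
            return "conflict"; // Elasticsearch rejects this with HTTP 409
        }
        docs.put(id, new HashMap<>(source));
        return "created";
    }

    // index: creates the document or replaces the whole existing one.
    String index(String id, Map<String, ?> source) {
        boolean existed = docs.put(id, new HashMap<>(source)) != null;
        return existed ? "updated" : "created";
    }

    // update: merges a partial document; if nothing changes the result is noop.
    String update(String id, Map<String, ?> partialDoc) {
        Map<String, Object> current = docs.get(id);
        if (current == null) {
            return "not_found"; // Elasticsearch rejects this with HTTP 404
        }
        Map<String, Object> merged = new HashMap<>(current);
        merged.putAll(partialDoc);
        if (merged.equals(current)) {
            return "noop";
        }
        docs.put(id, merged);
        return "updated";
    }

    // delete: removes the document if it exists.
    String delete(String id) {
        return docs.remove(id) != null ? "deleted" : "not_found";
    }
}
```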

2.2 Retrieval

GET student/_doc/1

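2.3 Commands

bulk: run several operations in one request

POST student/_bulk
{"create": {"_id": "1"} }
{"name": "Li Ming", "ranking": 2}
{"index": {"_id": "2"} }
{"name": "Another Li Ming", "ranking": 5}
{"update": {"_id": "2"} }
{"doc": {"name": "Li Xiaoming", "ranking": 5} }
{"delete": {"_id": "1"} }
An update action takes its partial document under doc, and a delete action has no body line. ranking is sent as a number so that it can be sorted on.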
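
The body of a _bulk request is newline-delimited JSON: one action line, optionally followed by one payload line, and every line (including the last) ends with a newline. A minimal sketch of assembling such a body follows; BulkBody is a hypothetical helper, and it assumes that ids need no JSON escaping and that each payload is already serialized onto a single line.

```java
import java.util.ArrayList;
import java.util.List;

// Collects the lines of a _bulk request body (newline-delimited JSON).
final class BulkBody {
    private final List<String> lines = new ArrayList<>();

    // create and index actions are followed by the full document source.
    BulkBody create(String id, String sourceJson) {
        return action("create", id, sourceJson);
    }

    BulkBody index(String id, String sourceJson) {
        return action("index", id, sourceJson);
    }

    // update actions wrap the partial document in a "doc" object.
    BulkBody update(String id, String partialDocJson) {
        return action("update", id, "{\"doc\": " + partialDocJson + "}");
    }

    // delete actions have no payload line.
    BulkBody delete(String id) {
        return action("delete", id, null);
    }

    private BulkBody action(String type, String id, String payload) {
        lines.add("{\"" + type + "\": {\"_id\": \"" + id + "\"}}");
        if (payload != null) {
            lines.add(payload);
        }
        return this;
    }

    // Terminates every line with a newline, as the bulk API requires.
    String build() {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```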

search: search

GET student/_search?q=*&sort=ranking:asc
or
GET student/_search
{
    "query": {"match_all": { } },
    "sort": [ {"ranking": "asc"} ]
}
The second form is the usual choice: it uses the Query DSL (domain-specific language) that Elasticsearch provides.
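
The same Query DSL body can also be sent from code. The sketch below uses the JDK's java.net.http client (Java 11+) rather than an Elasticsearch client library, purely to stay dependency-free; StudentSearch and its method names are hypothetical. POST is used because _search accepts the body with either GET or POST.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class StudentSearch {
    // Query DSL body: match every document and sort by ranking, ascending.
    static final String MATCH_ALL_SORTED =
            "{\"query\": {\"match_all\": {}}, \"sort\": [{\"ranking\": \"asc\"}]}";

    // Builds the HTTP request without sending it.
    static HttpRequest searchRequest(String baseUrl, String index, String queryJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(queryJson))
                .build();
    }

    // Sends the request and returns the raw JSON response; needs a reachable cluster.
    static String search(String baseUrl, String index, String queryJson) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(searchRequest(baseUrl, index, queryJson), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With a node reachable at the address used elsewhere in these notes, StudentSearch.search("http://192.168.56.10:9200", "student", StudentSearch.MATCH_ALL_SORTED) would return the response JSON as a string.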

mapping: mappings

View:
GET student/_mapping
Add fields:
PUT student/_mapping
{
    "properties": {
        "name": {"type":"text"},
        "age": {"type":"integer"},
        "email": {"type":"keyword"}
    }
}
Modify:
Not supported. The mapping of an existing field cannot be changed; create a new index with the desired mapping and migrate the data.

reindex: data migration

POST _reindex
{
    "source": {"index": "student"},
    "dest": {"index": "new_student"}
}

analyze: text analysis (tokenization)

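2.4 Query DSL

query: queries

match: full-text match. The query text is analyzed into terms, the terms are looked up in the inverted index, and every hit gets a relevance score.
match_phrase: phrase match. The query text is still analyzed, but a document matches only if the terms appear next to each other in the same order.
multi_match: match against several fields; the query text is analyzed.
match_all: matches every document.
term: exact match on a single value, without analysis. Suited to numeric, boolean and keyword fields; avoid it on text fields, whose indexed terms differ from the original string.
bool: compound query that combines several query clauses.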
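
As an illustration of how a bool query combines other queries, the sketch below assembles one from a match clause and a term clause. BoolQuery and its helpers are hypothetical, and values are concatenated without JSON escaping, so this is only suitable for simple literals. The must and filter keys are standard bool clauses: must clauses contribute to the score, filter clauses only include or exclude documents.

```java
import java.util.List;

final class BoolQuery {
    // Joins already-serialized clauses into a single bool query body.
    static String bool(List<String> must, List<String> filter) {
        return "{\"query\": {\"bool\": {"
                + "\"must\": [" + String.join(", ", must) + "], "
                + "\"filter\": [" + String.join(", ", filter) + "]}}}";
    }

    // Full-text clause: the text is analyzed before matching.
    static String match(String field, String text) {
        return "{\"match\": {\"" + field + "\": \"" + text + "\"}}";
    }

    // Exact-value clause: the value is compared without analysis.
    static String term(String field, long value) {
        return "{\"term\": {\"" + field + "\": " + value + "}}";
    }
}
```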

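aggs: aggregations

Give the aggregation a name of your choice, then choose its type.
Common aggregation types include terms (one bucket per distinct field value) and avg (average); see the official documentation for the full list.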
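
What terms and avg compute can be shown on plain Java collections. This is only an in-memory analogy with hypothetical names, not how Elasticsearch evaluates aggregations; in particular, Elasticsearch orders terms buckets by document count by default, whereas the sketch orders them by key.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class AggSketch {
    // terms: one bucket per distinct value, holding the number of documents with that value.
    static Map<String, Long> terms(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(value -> value, TreeMap::new, Collectors.counting()));
    }

    // avg: arithmetic mean of a numeric field; NaN here when there are no values.
    static double avg(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
    }
}
```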

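Other

sort: sort order.
from: offset of the first result to return.
size: maximum number of results to return.
_source: fields to return; all fields by default.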
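
These parameters are typically combined for pagination. A minimal sketch, assuming 1-based page numbers and the student index used earlier; PageRequest is a hypothetical helper.

```java
final class PageRequest {
    // Converts a 1-based page number into the zero-based "from" offset.
    static int from(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    // Builds a search body that pages through all students ordered by ranking
    // and returns only the name field of each hit.
    static String body(int pageNum, int pageSize) {
        return "{\"query\": {\"match_all\": {}}, "
                + "\"sort\": [{\"ranking\": \"asc\"}], "
                + "\"from\": " + from(pageNum, pageSize) + ", "
                + "\"size\": " + pageSize + ", "
                + "\"_source\": [\"name\"]}";
    }
}
```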

3 Extensions

3.1 Visualization

Install Kibana.

3.2 Chinese Tokenization

Install the IK analysis plugin.

3.3 Custom Extension Dictionary

[Note] This requires the IK analysis plugin and Nginx.
In the nginx/html/ directory, create an es/ directory, then create a text file inside es/:
vi my_tokens.txt
Add the custom words to my_tokens.txt, one word per line.

In the IKAnalyzer.cfg.xml file under elasticsearch/plugins/ik/config/, add:
<entry key="remote_ext_dict">http://192.168.56.10/es/my_tokens.txt</entry>

Restart Elasticsearch: docker restart elasticsearch
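
If the word list is maintained by a program rather than edited by hand, the file can be generated. The sketch below assumes the format the IK plugin documents for remote dictionaries: UTF-8 text with one word per line, separated by "\n". TokenFile is a hypothetical helper, and the words in the usage example are placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class TokenFile {
    // Renders the dictionary: one trimmed word per line, blank entries skipped.
    static String render(List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            String trimmed = word.trim();
            if (!trimmed.isEmpty()) {
                sb.append(trimmed).append('\n');
            }
        }
        return sb.toString();
    }

    // Writes the dictionary as UTF-8, e.g. to nginx/html/es/my_tokens.txt.
    static void write(Path file, List<String> words) {
        try {
            Files.write(file, render(words).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, TokenFile.write(Path.of("nginx/html/es/my_tokens.txt"), List.of("谷粒商城")) would produce a file with a single entry.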

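3.4 Spring Boot Integration

[Note] This section may no longer apply because of version changes; check the official documentation.
Usually a separate project is created to work with Elasticsearch.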

Add the Maven dependency:
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>
Inside the <properties> element, add the following so that the Elasticsearch version managed by Spring Boot matches the client:
<elasticsearch.version>7.4.2</elasticsearch.version>

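Register a bean in the container:
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GulimallElasticSearchConfig{
    // Exposes a client that talks to the Elasticsearch node over HTTP.
    @Bean
    public RestHighLevelClient esRestClient(){
        RestClientBuilder builder = RestClient.builder(new HttpHost("192.168.56.10", 9200, "http"));
        return new RestHighLevelClient(builder);
    }
}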

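Test code:
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest
public class GulimallSearchApplicationTests{
    @Autowired
    private RestHighLevelClient client;

    // Prints the injected client to confirm that the bean was created.
    @Test
    public void contextLoads(){
        System.out.println(client);
    }
}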

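4 Application

4.1 Product Listing

Listing a product essentially means saving its data to Elasticsearch.

Requirements Analysis

Products are listed per SPU, and each SPU may contain several SKUs.
SPU information:
SPU id; brand id, brand name, brand logo; category id, category name; product attributes. Each product attribute has an attribute id, an attribute name and an attribute value.
SKU information:
SKU id, SPU id, SKU title, SKU price, SKU image, sales count, whether it is in stock, and hot score.
The listing flow in short:
Given an SPU id, query the list of its SKUs and upload the result to Elasticsearch.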

Model Design

import java.math.BigDecimal;
import java.util.List;
import lombok.Data;

@Data
public class SkuEsModel{
    private Long skuId;
    private Long spuId;
    private String skuTitle;
    private BigDecimal skuPrice;
    private String skuImg;
    private Long saleCount;
    private Boolean hasStock;
    private Long hotScore;
    private Long brandId;
    private String brandName;
    private String brandImg;
    private Long catalogId;
    private String catalogName;
    private List<Attrs> attrs;
    @Data
    public static class Attrs{
        private Long attrId;
        private String attrName;
        private String attrValue;
    }
}

Index Design

PUT product
{
    "mappings": {
        "properties": {
            "skuId": {"type": "long"},
            "spuId": {"type": "keyword"},
            "skuTitle": {"type": "text", "analyzer": "ik_smart"},
            "skuPrice": {"type": "keyword"},
            "skuImg": {"type": "keyword", "index": false, "doc_values": false},
            "saleCount": {"type": "long"},
            "hasStock": {"type": "boolean"},
            "hotScore": {"type": "long"},
            "brandId": {"type": "long"},
            "brandName": {"type": "keyword", "index": false, "doc_values": false},
            "brandImg": {"type": "keyword", "index": false, "doc_values": false},
            "catalogId": {"type": "long"},
            "catalogName": {"type": "keyword", "index": false, "doc_values": false},
            "attrs": {
                "type": "nested",
                "properties": {
                    "attrId": {"type": "long"},
                    "attrName": {"type": "keyword", "index": false, "doc_values": false},
                    "attrValue": {"type": "keyword"}
                }
            }
        }
    }
}

API Design

1 List a product
POST /product/spuinfo/{spuId}/up

2 Check whether SKUs are in stock
POST /ware/waresku/hasstock

3 Save product data to Elasticsearch
POST /search/save/product

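API Implementation

POST /product/spuinfo/{spuId}/up
1 Prepare the data.
Using the SPU id, query the database for the SPU information: brand id, brand name, brand logo, category id, category name, and product attributes.
Query the list of SKUs that belong to the SPU; each SKU carries: SKU id, SPU id, SKU title, SKU price, SKU image, and sales count.
Call the /ware/waresku/hasstock endpoint remotely to find out whether each SKU is in stock.
Compute the hot score of each SKU with a hot-score algorithm. To keep the flow simple, every SKU gets a hot score of 0.
2 Build the SkuEsModel objects.
Iterate over the SKU list, create one SkuEsModel per SKU, and set its properties.
3 Send the list of SkuEsModel objects to Elasticsearch.
Call the /search/save/product endpoint remotely.
4 Update the SPU status.
New --> Listed.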
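
Step 2 above can be sketched as follows. Sku is a hypothetical stand-in for a SKU row loaded from the database, EsModel is a trimmed-down SkuEsModel, and treating a SKU that is missing from the stock lookup as out of stock is an assumption of this sketch, not something the original specifies.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ListingSketch {
    // Hypothetical stand-in for one SKU row loaded from the database.
    static final class Sku {
        final long skuId;
        final long spuId;
        final String title;
        final BigDecimal price;

        Sku(long skuId, long spuId, String title, BigDecimal price) {
            this.skuId = skuId;
            this.spuId = spuId;
            this.title = title;
            this.price = price;
        }
    }

    // Trimmed-down SkuEsModel containing only the fields set below.
    static final class EsModel {
        long skuId;
        long spuId;
        String skuTitle;
        BigDecimal skuPrice;
        boolean hasStock;
        long hotScore;
    }

    // Creates one model per SKU, merging in the stock flags from the ware service.
    static List<EsModel> toEsModels(List<Sku> skus, Map<Long, Boolean> stockBySkuId) {
        List<EsModel> models = new ArrayList<>();
        for (Sku sku : skus) {
            EsModel model = new EsModel();
            model.skuId = sku.skuId;
            model.spuId = sku.spuId;
            model.skuTitle = sku.title;
            model.skuPrice = sku.price;
            // Assumption: a SKU without a stock entry counts as out of stock.
            model.hasStock = stockBySkuId.getOrDefault(sku.skuId, false);
            // Simplification from the text: every SKU gets a hot score of 0.
            model.hotScore = 0L;
            models.add(model);
        }
        return models;
    }
}
```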

POST /ware/waresku/hasstock
Accepts a list of SKU ids and returns a list of SkuHasStockVo objects.
The SkuHasStockVo class is defined as follows:
import lombok.Data;

@Data
public class SkuHasStockVo{
    private Long skuId;
    private Boolean hasStock;
}
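
The core of this endpoint might look like the sketch below. The availableBySkuId map is a hypothetical stand-in for whatever the warehouse database returns (for instance total stock minus locked stock per SKU), and the vo class is re-declared without Lombok so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class StockCheck {
    // Plain-Java equivalent of SkuHasStockVo.
    static final class SkuHasStockVo {
        final Long skuId;
        final Boolean hasStock;

        SkuHasStockVo(Long skuId, Boolean hasStock) {
            this.skuId = skuId;
            this.hasStock = hasStock;
        }
    }

    // Returns one vo per requested SKU id; a SKU is in stock when its available quantity is positive.
    static List<SkuHasStockVo> hasStock(List<Long> skuIds, Map<Long, Long> availableBySkuId) {
        List<SkuHasStockVo> result = new ArrayList<>();
        for (Long skuId : skuIds) {
            // SKUs unknown to the warehouse are treated as having no stock.
            long available = availableBySkuId.getOrDefault(skuId, 0L);
            result.add(new SkuHasStockVo(skuId, available > 0));
        }
        return result;
    }
}
```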

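POST /search/save/product
[Note] The index and its mapping must be created in Elasticsearch first.
Following the official Elasticsearch documentation, call the bulk API to insert the documents in one batch.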
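
A dependency-free sketch of that batch insert, using the JDK's java.net.http client instead of an Elasticsearch client library. ProductBulk is hypothetical; it assumes that each document is already serialized onto a single line of JSON and that the SKU id serves as the document _id, so that listing the same SKU again overwrites the earlier document.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;

final class ProductBulk {
    // Builds the NDJSON body: an index action line followed by the document, per SKU.
    static String body(Map<Long, String> skuJsonById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Long, String> entry : skuJsonById.entrySet()) {
            sb.append("{\"index\": {\"_id\": \"").append(entry.getKey()).append("\"}}\n");
            sb.append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Builds (without sending) the bulk request against the product index.
    static HttpRequest request(String baseUrl, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/product/_bulk"))
                .header("Content-Type", "application/x-ndjson")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The bulk response contains an errors flag and one result per item, so the caller should inspect it rather than rely on the HTTP status alone.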