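Preface
Performing full SQL-style joins in a distributed system like Elasticsearch is prohibitively expensive. Instead, Elasticsearch offers two forms of join that are designed to scale horizontally.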
1.Nested Query
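A nested query requires first defining a field of type nested in the mapping and then searching it with the nested query. (In my view this approach is of limited value: if you are going to use a nested field anyway, why not simply create top-level fields that carry the same meaning?) It is not covered in detail here.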
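For readers who still want to see the shape of the two steps just mentioned (declare the nested type, then query it), here is a minimal sketch. The index name my_nested_index, the type name my_doc, and the field name children are hypothetical and not part of the original example; the mapping uses the 5.x-style layout with a type name, matching the rest of this article.

```json
# Hypothetical index: each parent document embeds its children as nested objects
PUT my_nested_index
{
  "mappings": {
    "my_doc": {
      "properties": {
        "name": { "type": "keyword" },
        "children": {
          "type": "nested",
          "properties": {
            "name": { "type": "keyword" },
            "age": { "type": "integer" }
          }
        }
      }
    }
  }
}

# Both conditions must hold on the same nested child object
GET my_nested_index/_search
{
  "query": {
    "nested": {
      "path": "children",
      "query": {
        "bool": {
          "must": [
            { "term": { "children.name": "xiaoming" } },
            { "range": { "children.age": { "gt": 10 } } }
          ]
        }
      }
    }
  }
}
```

One point in favor of the nested type: with an array of plain objects, Elasticsearch flattens the values, so a name from one child and an age from another could satisfy the same query. The nested type keeps each object's fields matched together.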
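2.Has Child Query and Has Parent Query
In SQL, a join normally spans two tables. Likewise, a parent-child query spans two types, and both types must belong to the same index. (Multiple types per index are officially discouraged: indices created in 6.0 or later can contain only one type, and the _parent meta field was replaced by the join field type, so the example below applies to 5.x and earlier.) Here is an example: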
PUT my_index1
{
"mappings": {
"my_parent": {
"properties": {
"parentId" :{
"type": "keyword"
},
"name" :{
"type": "keyword"
},
"age" :{
"type": "integer"
}
}
},
"my_child": {
"_parent": {
"type": "my_parent"
},
"properties": {
"childId" :{
"type": "keyword"
},
"name" :{
"type": "keyword"
},
"age" :{
"type": "integer"
}
}
}
}
}
The mapping of the newly created index:
"mappings": {
"my_child": {
"_parent": {
"type": "my_parent"
},
"_routing": {
"required": true
},
"properties": {
"age": {
"type": "integer"
},
"childId": {
"type": "keyword"
},
"name": {
"type": "keyword"
},
"parentId": {
"type": "keyword"
}
}
},
"my_parent": {
"properties": {
"age": {
"type": "integer"
},
"name": {
"type": "keyword"
},
"parentId": {
"type": "keyword"
}
}
}
}
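Two things stand out:
- my_child has a _parent meta field whose "type": "my_parent" establishes the parent-child relationship between the two types.
- my_child has a _routing meta field with "required": true, so the parent-child link of each individual document is established through routing.
Next, insert two parent documents: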
PUT my_index1/my_parent/parent100
{
"parentId": "parent100",
"name": "zhangsan",
"age": "45"
}
PUT my_index1/my_parent/parent200
{
"parentId": "parent200",
"name": "lily",
"age": "42"
}
Then insert the corresponding child documents:
PUT my_index1/my_child/1?parent=parent100
{
"childId": "child100",
"name": "xiaoming",
"age": "14"
}
POST my_index1/my_child/2?parent=parent100
{
"childId": "child200",
"name": "xiaohong",
"age": "17"
}
POST my_index1/my_child/3?parent=parent200
{
"childId": "child300",
"name": "lucy",
"age": "21"
}
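Because routing is required for my_child, reading a child document back by ID also needs the parent value, so that the request reaches the shard where the parent lives. A minimal sketch, assuming the 5.x-style index and the documents created above:

```json
# Fetch child 1; the parent parameter supplies the routing value
GET my_index1/my_child/1?parent=parent100
```

Leaving out the parent (or routing) parameter should cause the request to be rejected with a routing-missing error rather than return the document.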
The relationships between the documents are as follows:
| Parent (parentId, name, age) | Children (childId, name, age) |
|---|---|
| parent100, zhangsan, 45 | child100, xiaoming, 14; child200, xiaohong, 17 |
| parent200, lily, 42 | child300, lucy, 21 |
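On 6.x and later, where an index holds a single type, the same relationships would be modeled with a join field instead of _parent. The following is a sketch in the typeless 7.x syntax, not part of the original example: the index name my_index2 and the field name relation are hypothetical, and only one parent and one child are shown.

```json
# Hypothetical index: a join field declares my_parent as the parent of my_child
PUT my_index2
{
  "mappings": {
    "properties": {
      "relation": {
        "type": "join",
        "relations": { "my_parent": "my_child" }
      },
      "name": { "type": "keyword" },
      "age": { "type": "integer" }
    }
  }
}

# Parent document: only names its side of the relation
PUT my_index2/_doc/parent100
{
  "name": "zhangsan",
  "age": 45,
  "relation": { "name": "my_parent" }
}

# Child document: must be routed to the parent's shard and must name its parent
PUT my_index2/_doc/1?routing=parent100
{
  "name": "xiaoming",
  "age": 14,
  "relation": { "name": "my_child", "parent": "parent100" }
}
```

With such a mapping, has_child and has_parent keep the same request shape; their type parameter then refers to the relation name (my_child or my_parent) rather than to a mapping type.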
Query examples:
- 1. Query parent documents by a condition on their children
has_child ---- find the parent document of the child xiaoming
GET my_index1/my_parent/_search
{
"query": {
"has_child": {
"type": "my_child",
"query": {
"term": {
"name": {
"value": "xiaoming"
}
}
}
}
}
}
Result:
"hits": [
{
"_index": "my_index1",
"_type": "my_parent",
"_id": "parent100",
"_score": 1,
"_source": {
"parentId": "parent100",
"name": "zhangsan",
"age": "45"
}
}
]
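- 1. Query parent documents by a condition on their children (second example)
has_child ---- find the parent documents that have a child older than 10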
GET my_index1/my_parent/_search
{
"query": {
"has_child": {
"type": "my_child",
"query": {
"range": {
"age": {
"gt": "10"
}
}
}
}
}
}
Result:
"hits": [
{
"_index": "my_index1",
"_type": "my_parent",
"_id": "parent100",
"_score": 1,
"_source": {
"parentId": "parent100",
"name": "zhangsan",
"age": "45"
}
},
{
"_index": "my_index1",
"_type": "my_parent",
"_id": "parent200",
"_score": 1,
"_source": {
"parentId": "parent200",
"name": "lily",
"age": "42"
}
}
]
- 2. Query child documents by a condition on their parent
has_parent ---- find the child documents of zhangsan
GET my_index1/my_child/_search
{
"query": {
"has_parent": {
"type": "my_parent",
"query": {
"term": {
"name": "zhangsan"
}
}
}
}
}
Result:
"hits": [
{
"_index": "my_index1",
"_type": "my_child",
"_id": "1",
"_score": 1,
"_routing": "parent100",
"_parent": "parent100",
"_source": {
"childId": "child100",
"name": "xiaoming",
"age": "14"
}
},
{
"_index": "my_index1",
"_type": "my_child",
"_id": "2",
"_score": 1,
"_routing": "parent100",
"_parent": "parent100",
"_source": {
"childId": "child200",
"name": "xiaohong",
"age": "17"
}
}
]
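- 2. Query child documents by a condition on their parent (second example)
has_parent ---- find the child documents whose parent is younger than 43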
GET my_index1/my_child/_search
{
"query": {
"has_parent": {
"type": "my_parent",
"query": {
"range": {
"age": {
"lt": "43"
}
}
}
}
}
}
Result:
"hits": [
{
"_index": "my_index1",
"_type": "my_child",
"_id": "3",
"_score": 1,
"_routing": "parent200",
"_parent": "parent200",
"_source": {
"childId": "child300",
"name": "lucy",
"age": "21"
}
}
]
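- 3. Combined query example:
Finally, note that the results of has_parent and has_child can be narrowed further with additional conditions, which gives real filtering: use has_parent or has_child as one clause of a bool query. Here is an example (other cases follow the same pattern).
Find the child documents of zhangsan whose age is greater than 15.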
GET my_index1/my_child/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"age": {
"gt": "15"
}
}
},
{
"has_parent": {
"type": "my_parent",
"query": {
"term": {
"name": "zhangsan"
}
}
}
}
]
}
}
}
Result:
"hits": [
{
"_index": "my_index1",
"_type": "my_child",
"_id": "2",
"_score": 2,
"_routing": "parent100",
"_parent": "parent100",
"_source": {
"childId": "child200",
"name": "xiaohong",
"age": "17"
}
}
]
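Since the goal here is filtering rather than ranking, the two clauses could also be placed under filter instead of must. This is a sketch under the assumption that relevance scores are not needed; it reuses the index and documents from above.

```json
# Same two conditions as the must version, but evaluated in filter context
GET my_index1/my_child/_search
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "age": { "gt": 15 } } },
        {
          "has_parent": {
            "type": "my_parent",
            "query": { "term": { "name": "zhangsan" } }
          }
        }
      ]
    }
  }
}
```

Filter clauses do not contribute to _score, so the same child document should come back but its score no longer carries meaning. In exchange, Elasticsearch can skip score computation for these clauses.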
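Has Child Query and Has Parent Query are slow; the official documentation advises against using them when performance matters. The has_child query also accepts min_children and max_children parameters, which restrict how many matching child documents a parent must have.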
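As an illustration of min_children, the following sketch asks for parents that have at least two children. It reuses the index from the example above; the inner match_all query is an assumption chosen so that every child counts toward the limit.

```json
# Parents with at least 2 matching children
GET my_index1/my_parent/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "min_children": 2,
      "query": { "match_all": {} }
    }
  }
}
```

With the sample data above, this should match only parent100, which has two children, while parent200 has one. Adding "max_children": 1 instead of min_children would select the opposite set.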