2017-07-21

Elasticsearch nested aggregation returns duplicate results

Given this mapping:

PUT pizzas
{
  "mappings": {
    "pizza": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "types": {
          "type": "nested",
          "properties": {
            "topping": {
              "type": "keyword"
            },
            "base": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

And this data:

PUT pizzas/pizza/1
{
  "name": "meat",
  "types": [
    {
      "topping": "bacon",
      "base": "normal"
    },
    {
      "topping": "pepperoni",
      "base": "normal"
    }
  ]
}

PUT pizzas/pizza/2
{
  "name": "veg",
  "types": [
    {
      "topping": "broccoli",
      "base": "normal"
    }
  ]
}

If I run this nested aggregation query:

GET pizzas/_search
{
  "size": 0,
  "aggs": {
    "types_agg": {
      "nested": {
        "path": "types"
      },
      "aggs": {
        "base_agg": {
          "terms": {
            "field": "types.base"
          }
        }
      }
    }
  }
}

I get this result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "types_agg": {
      "doc_count": 3,
      "base_agg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "normal",
            "doc_count": 3
          }
        ]
      }
    }
  }
}

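I expected my aggregation to return a doc_count of 2, because only two documents match my query. But apparently, because it is an inverted index, it finds 3 results and therefore reports 3 documents.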

Is there a way to make it return the count of unique documents?

(Tested on Elasticsearch 5.4.3)


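This is how I understand it: inside a nested aggregation, results are returned in the context of the nested type, so that is where the aggregator does its counting. You need to add a reverse_nested to get back to the root. See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-reverse-nested-aggregation.html – user3775217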

Answer


I found the answer shortly after posting the question.

Change the aggregation query to:

GET pizzas/_search
{
  "size": 0,
  "aggs": {
    "types_agg": {
      "nested": {
        "path": "types"
      },
      "aggs": {
        "base_agg": {
          "terms": {
            "field": "types.base"
          },
          "aggs": {
            "top_reverse_nested": {
              "reverse_nested": {}
            }
          }
        }
      }
    }
  }
}

The part that was added is the reverse_nested sub-aggregation:

"aggs": {
  "top_reverse_nested": {
    "reverse_nested": {}
  }
}

This produces the result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "types_agg": {
      "doc_count": 3,
      "base_agg": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "normal",
            "doc_count": 3,
            "top_reverse_nested": {
              "doc_count": 2
            }
          }
        ]
      }
    }
  }
}

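Adding reverse_nested to the query is the key part: it joins back to the root of the document, so each bucket counts only unique parent documents.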
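To see why the two counts differ, here is a minimal Python sketch of the counting logic. It is only a model of the behaviour, not how Elasticsearch is implemented: the function names `nested_terms_count` and `reverse_nested_count` are hypothetical, and the data is the two pizzas from the question.

```python
from collections import defaultdict

# The two documents from the question, keyed by document id.
pizzas = {
    "1": {"name": "meat", "types": [
        {"topping": "bacon", "base": "normal"},
        {"topping": "pepperoni", "base": "normal"},
    ]},
    "2": {"name": "veg", "types": [
        {"topping": "broccoli", "base": "normal"},
    ]},
}


def nested_terms_count(docs, path, field):
    """Count nested objects per term (models terms inside a nested aggregation)."""
    counts = defaultdict(int)
    for doc in docs.values():
        for nested in doc[path]:
            counts[nested[field]] += 1
    return dict(counts)


def reverse_nested_count(docs, path, field):
    """Count distinct parent documents per term (models reverse_nested)."""
    parents = defaultdict(set)
    for doc_id, doc in docs.items():
        for nested in doc[path]:
            parents[nested[field]].add(doc_id)
    return {term: len(ids) for term, ids in parents.items()}


print(nested_terms_count(pizzas, "types", "base"))    # {'normal': 3}
print(reverse_nested_count(pizzas, "types", "base"))  # {'normal': 2}
```

The first function counts every entry in `types`, which is why the "meat" pizza contributes twice; the second collapses those entries back to their parent document id.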

You can read more about reverse_nested in the Elasticsearch documentation.
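If the response is consumed from code, the unique count sits one level deeper than the bucket's own doc_count. A small sketch of pulling it out, assuming the response has already been parsed into a Python dict; the helper name `unique_doc_counts` is hypothetical, and the aggregation names are the ones used in the query above.

```python
def unique_doc_counts(response):
    """Map each base_agg bucket key to its top_reverse_nested doc_count."""
    buckets = response["aggregations"]["types_agg"]["base_agg"]["buckets"]
    return {b["key"]: b["top_reverse_nested"]["doc_count"] for b in buckets}


# Trimmed copy of the response from the answer (only the aggregations part).
response = {
    "aggregations": {
        "types_agg": {
            "doc_count": 3,
            "base_agg": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {
                        "key": "normal",
                        "doc_count": 3,
                        "top_reverse_nested": {"doc_count": 2},
                    }
                ],
            },
        }
    }
}

print(unique_doc_counts(response))  # {'normal': 2}
```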