2016-03-31

ElasticSearch: accessing outer document fields from within a nested aggregation query

I have the following mapping:

{ 
    "dynamic": "strict", 
    "properties": { 
     "id": { 
      "type": "string" 
     }, 
     "title": { 
      "type": "string" 
     }, 
     "things": { 
      "type": "nested", 
      "properties": { 
       "id": { 
        "type": "long" 
       }, 
       "something": { 
        "type": "long" 
       } 
      } 
     } 
    } 
} 

I insert documents as follows (Python script):

from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumption: a node reachable on the default localhost:9200
index_name = "test"   # index and type names as they appear in the results below
doc_type = "article"

body = {"id": 1, "title": "one", "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]}
es.create(index_name, doc_type=doc_type, body=body, id=1)

body = {"id": 2, "title": "two", "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]}
es.create(index_name, doc_type=doc_type, body=body, id=2)

body = {"id": 3, "title": "three", "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]}
es.create(index_name, doc_type=doc_type, body=body, id=3)

I run the following aggregation query:

{ 
    "query": { 
    "match_all": {} 
    }, 
    "aggs": { 
    "things": { 
     "aggs": { 
     "num_articles": { 
      "terms": { 
      "field": "things.id", 
      "size": 0 
      }, 
      "aggs": { 
      "articles": { 
       "top_hits": { 
       "size": 50 
       } 
      } 
      } 
     } 
     }, 
     "nested": { 
     "path": "things" 
     } 
    } 
    }, 
    "size": 0 
} 

(So I want to count how many times each "thing" occurs and, for each thing, list the articles in which it appears.)
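To make the intended result concrete, here is a minimal plain-Python sketch (no Elasticsearch involved) that computes the same grouping over the three sample documents. The function name `articles_per_thing` is made up for illustration; it only shows the shape of the desired output, not how Elasticsearch computes it.

```python
from collections import defaultdict

# The three sample documents from the insert script above.
docs = [
    {"id": 1, "title": "one",
     "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]},
    {"id": 2, "title": "two",
     "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]},
    {"id": 3, "title": "three",
     "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]},
]


def articles_per_thing(documents):
    """Map each thing id to the (id, title) pairs of the articles containing it."""
    grouped = defaultdict(list)
    for doc in documents:
        for thing in doc["things"]:
            grouped[thing["id"]].append((doc["id"], doc["title"]))
    return dict(grouped)


result = articles_per_thing(docs)
for thing_id, articles in sorted(result.items()):
    # Thing id, number of occurrences, and the articles it occurs in.
    print(thing_id, len(articles), articles)
```

Each thing id ends up with a count and a list of articles carrying their top-level id and title together, which is what the aggregation is meant to return.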

The query produces:

"key": 1000, 
"doc_count": 3, 
"articles": { 
    "hits": { 
     "total": 3, 
     "max_score": 1, 
     "hits": [{ 
      "_index": "test", 
      "_type": "article", 
      "_id": "2", 
      "_nested": { 
       "field": "things", 
       "offset": 0 
      }, 
      "_score": 1, 
      "_source": { 
       "id": 1000, 
       "something": 43 
      } 
     }, { 
      "_index": "test", 
      "_type": "article", 
      "_id": "1", 
      "_nested": { 
       "field": "things", 
       "offset": 0 
      }, 
      "_score": 1, 
      "_source": { 
       "id": 1000, 
       "something": 33 
      } 

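...... (and so on)

What I would like is for each hit to also list all the fields from the "outer" or top-level document, i.e. in this case id and title.

Is this actually possible? If so, how?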

Answers

I'm not sure if this is what you're looking for, but let's give it a try:

{ 
    "query": { 
    "match_all": {} 
    }, 
    "aggs": { 
    "nested_things": { 
     "nested": { 
     "path": "things" 
     }, 
     "aggs": { 
     "num_articles": { 
      "terms": { 
      "field": "things.id", 
      "size": 0 
      }, 
      "aggs": { 
      "articles": { 
       "top_hits": { 
       "size": 50 
       } 
      }, 
      "reverse_things": { 
       "reverse_nested": {}, 
       "aggs": { 
       "title": { 
        "terms": { 
        "field": "title", 
        "size": 0 
        } 
       }, 
       "id": { 
        "terms": { 
        "field": "id", 
        "size": 0 
        } 
       } 
       } 
      } 
      } 
     } 
     } 
    } 
    } 
} 

This produces something like this:

  "buckets": [ 
       { 
        "key": 1000, 
        "doc_count": 3, 
        "reverse_things": { 
        "doc_count": 3, 
        "id": { 
         "buckets": [ 
          { 
           "key": "1", 
           "doc_count": 1 
          }, 
          { 
           "key": "2", 
           "doc_count": 1 
          }, 
          { 
           "key": "3", 
           "doc_count": 1 
          } 
         ] 
        }, 
        "title": { 
         ... 
        } 
        }, 
        "articles": { 
        "hits": { 
         "total": 3, 
         "max_score": 1, 
         "hits": [ 
          { 
           "_index": "test", 
           "_type": "article", 
           "_id": "AVPOgQQjgDGxUAMojyuY", 
           "_nested": { 
           "field": "things", 
           "offset": 0 
           }, 
           "_score": 1, 
           "_source": { 
           "id": 1000, 
           "something": 53 
           } 
          }, 
          ... 
Very close...

The problem is that the "reverse_things" part lists the ids and the titles, but not in the same order. The keys for id are 1, 2, 3: "id": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [{ "key": "1", "doc_count": 1 }, { "key": "2", "doc_count": 1 }, { "key": "3", "doc_count": 1 }] },

But the keys for title are one, three, two: "title": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [{ "key": "one", "doc_count": 1 }, { "key": "three", "doc_count": 1 }, { "key": "two", "doc_count": 1 }] } If the ordering could be forced to match the original articles, this would work. Thanks @kristian-ferkić, by the way...
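One possible way around the ordering mismatch raised in the comments, offered as an untested sketch rather than a confirmed solution: the two sibling terms aggregations are ordered independently, so instead of aligning them, the title aggregation can be nested under the id aggregation so that each id bucket carries its own title. The helper names `build_query` and `pair_ids_with_titles` and the hand-written `mock_response` are made up for illustration; the mock is not real Elasticsearch output. The `"size": 0` setting on terms follows the queries in this thread and may not be accepted by later Elasticsearch versions.

```python
def build_query():
    """Aggregation body: per things.id, the parent ids, each with its title."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {"nested_things": {
            "nested": {"path": "things"},
            "aggs": {"num_articles": {
                "terms": {"field": "things.id", "size": 0},
                "aggs": {"reverse_things": {
                    "reverse_nested": {},
                    "aggs": {"id": {
                        "terms": {"field": "id", "size": 0},
                        # title sits under id, so the two stay paired
                        "aggs": {"title": {
                            "terms": {"field": "title", "size": 0}}},
                    }},
                }},
            }},
        }},
    }


def pair_ids_with_titles(response):
    """Return {thing id: [(article id, [titles]), ...]} from a response."""
    things = response["aggregations"]["nested_things"]["num_articles"]["buckets"]
    pairs = {}
    for thing in things:
        id_buckets = thing["reverse_things"]["id"]["buckets"]
        pairs[thing["key"]] = [
            (b["key"], [t["key"] for t in b["title"]["buckets"]])
            for b in id_buckets
        ]
    return pairs


def _id_bucket(doc_id, title):
    # Hypothetical helper that builds one mock id bucket with a nested title.
    return {"key": doc_id, "doc_count": 1,
            "title": {"buckets": [{"key": title, "doc_count": 1}]}}


# Hand-written mock of the expected response shape (an assumption, not real output).
mock_response = {"aggregations": {"nested_things": {
    "doc_count": 6,
    "num_articles": {"buckets": [
        {"key": 1000, "doc_count": 3, "reverse_things": {
            "doc_count": 3,
            "id": {"buckets": [_id_bucket("1", "one"),
                               _id_bucket("2", "two"),
                               _id_bucket("3", "three")]}}},
    ]},
}}}

pairs = pair_ids_with_titles(mock_response)
print(pairs)
```

With the elasticsearch-py client used in the question, the body would presumably be sent as `es.search(index=index_name, body=build_query())`. Since each top_hits entry already reports the parent document's `_id`, fetching the parent documents by those ids in a second request would be another option.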