2016-02-23 131 views

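ElasticSearch filter on aggregation without affecting aggregation counts

We use ElasticSearch to find offers based on 5 fields, such as some 'free text', the offer state and the client name. We also need to aggregate on two of those fields: client name and offer state. So when someone enters some free text and we find, say, 10 documents with state closed and 8 with state open, the 'state filter' should show closed (10) and open (8).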

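The problem is that when I select the state 'closed' for inclusion in the filter, the aggregation count for open changes to 0. I want it to remain 8. How can I prevent a filter on the aggregated field from affecting the aggregation itself?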

Here is the first search, for example a search for "java":

{ 
    "query": { 
     "bool": { 
      "filter": [ 
      ], 
      "must": { 
       "simple_query_string": { 
        "query" : "java" 
       } 
      } 
     } 
    }, 
    "aggs": { 
     "OFFER_STATE_F": { 
      "terms": { 
       "size": 0, 
       "field": "offer_state_f", 
       "min_doc_count": 0 
      } 
     } 
    }, 
    "from": 0, 
    "size": 1, 
    "fields": ["offer_id_ft", "offer_state_f"] 
} 

The result looks like this:

{ 
    "hits": { 
    "total": 960, 
    "max_score": 0.89408284000000005, 
    "hits": [ 
     { 
     "_type": "offer", 
     "_index": "select", 
     "_id": "40542", 
     "fields": { 
      "offer_id_ft": [ 
      "40542" 
      ], 
      "offer_state_f": [ 
      "REJECTED" 
      ] 
     }, 
     "_score": 0.89408284000000005 
     } 
    ] 
    }, 
    "_shards": { 
    "total": 5, 
    "successful": 5, 
    "failed": 0 
    }, 
    "timed_out": false, 
    "aggregations": { 
    "OFFER_STATE_F": { 
     "buckets": [ 
     { 
      "key": "REJECTED", 
      "doc_count": 778 
     }, 
     { 
      "key": "ACCEPTED", 
      "doc_count": 130 
     }, 
     { 
      "key": "CANCELED", 
      "doc_count": 22 
     }, 
     { 
      "key": "WITHDRAWN", 
      "doc_count": 13 
     }, 
     { 
      "key": "LONGLIST", 
      "doc_count": 12 
     }, 
     { 
      "key": "SHORTLIST", 
      "doc_count": 5 
     }, 
     { 
      "key": "INTAKE", 
      "doc_count": 0 
     } 
     ], 
     "doc_count_error_upper_bound": 0, 
     "sum_other_doc_count": 0 
    } 
    }, 
    "took": 2 
} 

As you can see, the OFFER_STATE_F buckets sum to the total number of hits (960). Now I include a state in the query, say 'ACCEPTED', so my query becomes:

{ 
    "query": { 
     "bool": { 
      "filter": [ 
       { 
        "bool": { 
         "should": [ 
          { 
           "term": { 
            "offer_state_f": "ACCEPTED" 
           } 
          } 
         ] 
        } 
       }    
      ], 
      "must": { 
       "simple_query_string": { 
        "query" : "java" 
       } 
      } 
     } 
    }, 
    "aggs": { 
     "OFFER_STATE_F": { 
      "terms": { 
       "size": 0, 
       "field": "offer_state_f", 
       "min_doc_count": 0 
      } 
     } 
    }, 
    "from": 0, 
    "size": 1, 
    "fields": ["offer_id_ft", "offer_state_f"] 
} 

What I want is 130 hits, with the OFFER_STATE_F buckets still summing to 960, but what I get is this:

{ 
    "hits": { 
    "total": 130, 
    "max_score": 0.89408284000000005, 
    "hits": [ 
     { 
     "_type": "offer", 
     "_index": "select", 
     "_id": "16884", 
     "fields": { 
      "offer_id_ft": [ 
      "16884" 
      ], 
      "offer_state_f": [ 
      "ACCEPTED" 
      ] 
     }, 
     "_score": 0.89408284000000005 
     } 
    ] 
    }, 
    "_shards": { 
    "total": 5, 
    "successful": 5, 
    "failed": 0 
    }, 
    "timed_out": false, 
    "aggregations": { 
    "OFFER_STATE_F": { 
     "buckets": [ 
     { 
      "key": "ACCEPTED", 
      "doc_count": 130 
     }, 
     { 
      "key": "CANCELED", 
      "doc_count": 0 
     }, 
     { 
      "key": "INTAKE", 
      "doc_count": 0 
     }, 
     { 
      "key": "LONGLIST", 
      "doc_count": 0 
     }, 
     { 
      "key": "REJECTED", 
      "doc_count": 0 
     }, 
     { 
      "key": "SHORTLIST", 
      "doc_count": 0 
     }, 
     { 
      "key": "WITHDRAWN", 
      "doc_count": 0 
     } 
     ], 
     "doc_count_error_upper_bound": 0, 
     "sum_other_doc_count": 0 
    } 
    }, 
    "took": 10 
} 

As you can see, only the ACCEPTED bucket is filled; all the others are 0.

Answers

0

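OK, I found the answer with the help of a colleague, and the thing is, Val was right. +1 for him. What I had done was put ALL of my query filters in the post_filter, and that was the problem. Only the filters on the fields I want to aggregate on belong in the post_filter; the other filters stay in the query. Thus: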

{ 
    "query": { 
     "bool": { 
      "filter": [ 
      { 
       "term": { 
        "broker_f": "false" 
       } 
      } 
      ], 
      "must": { 
       "simple_query_string": { 
        "query" : "java" 
       } 
      } 
     } 
    }, 
    "aggs": { 
     "OFFER_STATE_F": { 
      "terms": { 
       "size": 0, 
       "field": "offer_state_f", 
       "min_doc_count": 0 
      } 
     } 
    }, 
    "post_filter" : { 
     "bool": { 
      "should": [ 
       { 
        "term": { 
         "offer_state_f": "SHORTLIST" 
        } 
       } 
      ] 
     } 
    }, 
    "from": 0, 
    "size": 1, 
    "fields": ["offer_id_ft", "offer_state_f"] 
} 

And now the result is correct:

{ 
    "hits": { 
    "total": 5, 
    "max_score": 0.76667790000000002, 
    "hits": [ 
     { 
     "_type": "offer", 
     "_index": "select", 
     "_id": "24454", 
     "fields": { 
      "offer_id_ft": [ 
      "24454" 
      ], 
      "offer_state_f": [ 
      "SHORTLIST" 
      ] 
     }, 
     "_score": 0.76667790000000002 
     } 
    ] 
    }, 
    "_shards": { 
    "total": 5, 
    "successful": 5, 
    "failed": 0 
    }, 
    "timed_out": false, 
    "aggregations": { 
    "OFFER_STATE_F": { 
     "buckets": [ 
     { 
      "key": "REJECTED", 
      "doc_count": 777 
     }, 
     { 
      "key": "ACCEPTED", 
      "doc_count": 52 
     }, 
     { 
      "key": "CANCELED", 
      "doc_count": 22 
     }, 
     { 
      "key": "LONGLIST", 
      "doc_count": 12 
     }, 
     { 
      "key": "WITHDRAWN", 
      "doc_count": 12 
     }, 
     { 
      "key": "SHORTLIST", 
      "doc_count": 5 
     }, 
     { 
      "key": "INTAKE", 
      "doc_count": 0 
     } 
     ], 
     "doc_count_error_upper_bound": 0, 
     "sum_other_doc_count": 0 
    } 
    }, 
    "took": 4 
} 
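
The split described in this answer can be sketched as a small request builder. This is only an illustration under assumptions: the function name `build_search_body` and its parameters are made up here, and the body is not sent to a cluster. It keeps filters on non-aggregated fields inside the query and moves the selection on the aggregated field into `post_filter`, mirroring the query shown above.

```python
import json


def build_search_body(text, query_filters, facet_field, selected_values):
    """Build a request body; selections on the aggregated field go in post_filter."""
    body = {
        "query": {
            "bool": {
                # Filters on fields that are NOT aggregated; these narrow the counts too.
                "filter": list(query_filters),
                "must": {"simple_query_string": {"query": text}},
            }
        },
        "aggs": {
            facet_field.upper(): {
                # "size": 0 and "min_doc_count": 0 mirror the queries in this thread.
                "terms": {"size": 0, "field": facet_field, "min_doc_count": 0}
            }
        },
    }
    if selected_values:
        # One term clause per selected value; "should" gives OR semantics.
        body["post_filter"] = {
            "bool": {
                "should": [{"term": {facet_field: v}} for v in selected_values]
            }
        }
    return body


body = build_search_body(
    "java",
    query_filters=[{"term": {"broker_f": "false"}}],
    facet_field="offer_state_f",
    selected_values=["SHORTLIST"],
)
print(json.dumps(body, indent=2))
```

With no selected values the `post_filter` key is left out entirely, so the hits and the bucket counts then cover the same set of documents.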

Don't forget the +1 "for him" ;-) – Val

1

You need to move your filter into the post_filter section instead of the query section.

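That way, the filter is applied after the aggregations have been computed: you aggregate over the whole set of documents matched by the query, but only get back the hits that match the filter.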
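
The ordering can be illustrated with a toy, in-memory model. This is plain Python, not Elasticsearch; the documents and the `search` helper are invented for the illustration, and the only point is where each filter sits relative to the aggregation step.

```python
from collections import Counter

# Toy stand-ins for indexed offers; the values are made up for illustration.
DOCS = [
    {"id": 1, "text": "java developer", "offer_state_f": "ACCEPTED"},
    {"id": 2, "text": "java architect", "offer_state_f": "REJECTED"},
    {"id": 3, "text": "java tester", "offer_state_f": "REJECTED"},
    {"id": 4, "text": "python developer", "offer_state_f": "ACCEPTED"},
]


def search(docs, text, query_states=None, post_states=None):
    """Model the order of operations: query, then aggregations, then post filter."""
    # 1. The query (including any filter placed inside it) selects the matching set.
    matched = [d for d in docs if text in d["text"]]
    if query_states:
        matched = [d for d in matched if d["offer_state_f"] in query_states]
    # 2. Aggregations are computed over everything the query matched.
    buckets = Counter(d["offer_state_f"] for d in matched)
    # 3. A post filter narrows only the returned hits, after aggregation.
    hits = matched
    if post_states:
        hits = [d for d in matched if d["offer_state_f"] in post_states]
    return [d["id"] for d in hits], dict(buckets)


# Same state selection, placed in the query versus applied as a post filter.
in_query = search(DOCS, "java", query_states={"ACCEPTED"})
as_post = search(DOCS, "java", post_states={"ACCEPTED"})
print(in_query)
print(as_post)
```

Both calls return the same single hit, but only the post-filter variant still counts the two REJECTED documents in its buckets.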


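Hi, this does not have the desired effect. I want to search on 'free text', count the occurrences of each state/client name in the results, and then use those states/names as multi-select filters to narrow the results. But the multi-select should be an 'OR'-wise filter. Any suggestions? – JointEffort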


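Sorry, I must have misunderstood the question. Let me think about it, unless someone comes up with a good solution in the meantime. Maybe if you could share what you have now, that would help to picture it. – Val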
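
For the multi-select case raised in the comments, one possible approach (not taken from this thread) is to combine `post_filter` with a `filter` aggregation per facet, so that each facet is counted under every selection except its own. The sketch below only builds the request body. The helper names are made up, `client_name_f` is a hypothetical field name (the thread never gives the client-name field), and `terms` queries are assumed for the OR semantics within one facet.

```python
def selection_clauses(selections, skip=None):
    """One `terms` clause per facet with selected values; `terms` matches ANY value (OR)."""
    return [
        {"terms": {field: values}}
        for field, values in selections.items()
        if values and field != skip
    ]


def all_of(clauses):
    """AND the clauses together, or match everything when there are none."""
    return {"bool": {"filter": clauses}} if clauses else {"match_all": {}}


def build_faceted_body(text, facet_fields, selections):
    aggs = {}
    for field in facet_fields:
        aggs[field] = {
            # Count this facet under every selection EXCEPT its own.
            "filter": all_of(selection_clauses(selections, skip=field)),
            "aggs": {"values": {"terms": {"field": field, "min_doc_count": 0}}},
        }
    body = {
        "query": {"simple_query_string": {"query": text}},
        "aggs": aggs,
    }
    clauses = selection_clauses(selections)
    if clauses:
        # Hits must satisfy every facet's selection.
        body["post_filter"] = all_of(clauses)
    return body


body = build_faceted_body(
    "java",
    facet_fields=["offer_state_f", "client_name_f"],  # client_name_f is hypothetical
    selections={
        "offer_state_f": ["ACCEPTED", "SHORTLIST"],
        "client_name_f": ["Acme"],  # made-up client name
    },
)
```

In this body the state counts are restricted by the selected client but not by the selected states, and the client counts are restricted by the selected states but not by the selected client, while the returned hits satisfy both selections.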