2015-11-17 40 views

Select TOP + GROUP BY + SORT in Elasticsearch?

Assume the following stockInWarehouse mapping:

{ 
    product_db: { 
    mappings: { 
     stockInWarehouse: { 
     properties: { 
      sku: { 
      type: "string" 
      }, 
      arrivalTime: { 
      type: "date", 
      format: "dateOptionalTime" 
      } 
     } 
     } 
    } 
    } 
} 

The stockInWarehouse data looks like this:

{ 
    "hits": { 
    "total": 5, 
    "hits": [ 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "1", 
     "_source": { 
      "sku": "item 1", 
      "arrivalTime": "2015-11-11T19:00:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "2", 
     "_source": { 
      "sku": "item 2", 
      "arrivalTime": "2015-11-12T19:00:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "3", 
     "_source": { 
      "sku": "item 1", 
      "arrivalTime": "2015-11-12T19:35:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "4", 
     "_source": { 
      "sku": "item 1", 
      "arrivalTime": "2015-11-13T19:56:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "5", 
     "_source": { 
      "sku": "item 3", 
      "arrivalTime": "2015-11-15T19:56:10.231Z" 
     } 
     } 
    ] 
    } 
} 

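What I am trying to do is fetch the top documents by arrivalTime (that is, the most recent documents), but I want them to be distinct on another field (sku), so that each available sku appears only once. The expected result looks like this: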

{ 
    "hits": { 
    "total": 3, 
    "hits": [ 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "5", 
     "_source": { 
      "sku": "item 3", 
      "arrivalTime": "2015-11-15T19:56:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "4", 
     "_source": { 
      "sku": "item 1", 
      "arrivalTime": "2015-11-13T19:56:10.231Z" 
     } 
     }, 
     { 
     "_index": "product_db", 
     "_type": "stockInWarehouse", 
     "_id": "2", 
     "_source": { 
      "sku": "item 2", 
      "arrivalTime": "2015-11-12T19:00:10.231Z" 
     } 
     } 
    ] 
    } 
} 

If I sort by arrivalTime, the resulting sku list will be item 3, item 1, item 1, item 2, item 1 (with duplicates). If I sort by sku, the result list will not reflect the correct arrival-time order.

Is this kind of query possible in Elasticsearch? How can I achieve this?

My guess is that you could do this with a bucket aggregation that groups by sku, combined with a standard sort on arrival time. Have you tried that? – Phil

Answer

How about this?

{ 
    "size": 0, 
    "aggs": { 
    "terms_agg": { 
     "terms": { 
     "field": "sku", 
     "size": 100, 
     "order": { 
      "max_date_agg": "desc" 
     } 

     }, 
     "aggs": { 
     "max_date_agg": { 
      "max": { 
      "field": "arrivalTime" 
      } 
     } 
     } 
    } 
    } 
} 

Assuming you have many products, I set size: 100.

Note that you need to add index: not_analyzed to the sku field in your mapping. This is the result of the query:

"aggregations": { 
     "terms_agg": { 
     "doc_count_error_upper_bound": 0, 
     "sum_other_doc_count": 0, 
     "buckets": [ 
      { 
       "key": "item 3", 
       "doc_count": 1, 
       "max_date_agg": { 
        "value": 1447617370231, 
        "value_as_string": "2015-11-15T19:56:10.231Z" 
       } 
      }, 
      { 
       "key": "item 1", 
       "doc_count": 3, 
       "max_date_agg": { 
        "value": 1447444570231, 
        "value_as_string": "2015-11-13T19:56:10.231Z" 
       } 
      }, 
      { 
       "key": "item 2", 
       "doc_count": 1, 
       "max_date_agg": { 
        "value": 1447354810231, 
        "value_as_string": "2015-11-12T19:00:10.231Z" 
       } 
      } 
     ] 
     } 
    } 
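On the not_analyzed note above: with the default analyzer, a string value such as "item 1" is split into the tokens "item" and "1", so the terms aggregation would bucket on those tokens rather than on the whole sku. A minimal sketch of the adjusted mapping for the stockInWarehouse type, assuming an Elasticsearch version that still uses the string type (newer versions use the keyword type instead):

```json
{
  "stockInWarehouse": {
    "properties": {
      "sku": {
        "type": "string",
        "index": "not_analyzed"
      },
      "arrivalTime": {
        "type": "date",
        "format": "dateOptionalTime"
      }
    }
  }
}
```

The index setting of an existing field generally cannot be changed in place, so you will most likely have to recreate the index with this mapping and reindex the documents.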

I hope it helps!
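The aggregation above returns only the sku keys and their latest arrivalTime, not the documents themselves. If you also need the full document per sku, as in the expected result from the question, one option is to add a top_hits sub-aggregation. This is only a sketch: the aggregation names by_sku, latest_arrival and latest_doc are made up for illustration, and it assumes a version that supports top_hits and a not_analyzed sku field.

```json
{
  "size": 0,
  "aggs": {
    "by_sku": {
      "terms": {
        "field": "sku",
        "size": 100,
        "order": { "latest_arrival": "desc" }
      },
      "aggs": {
        "latest_arrival": {
          "max": { "field": "arrivalTime" }
        },
        "latest_doc": {
          "top_hits": {
            "size": 1,
            "sort": [
              { "arrivalTime": { "order": "desc" } }
            ]
          }
        }
      }
    }
  }
}
```

Each bucket should then carry a latest_doc.hits.hits array holding the single most recent document for that sku, while the buckets themselves stay ordered by latest_arrival descending (item 3, item 1, item 2 for the sample data).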