2016-06-09

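Elasticsearch - stats aggregation on the latest document per unique record

We have about 300,000 unique users/customers. There is one document per order, so we have millions of documents.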

Each order document looks like this (timestamp is the epoch time at which the order was placed):

{
    "customer_id" : 1001,
    "order_amount" : 15.00,
    "timestamp" : 1465450000
}

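I need a stats aggregation computed over the latest order of each unique customer (customer_id). That is, for every customer only the most recent order amount should feed into the stats metric, and all older orders should be ignored.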
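To make the requirement concrete, here is a minimal client-side sketch in Python. It is not an Elasticsearch query. It assumes the orders are already available as a list of dicts with the three fields shown above, and the function name `latest_order_stats` is hypothetical.

```python
def latest_order_stats(orders):
    """Compute count/min/max/avg/sum over each customer's most recent order only.

    `orders` is assumed to be an iterable of dicts with the keys
    customer_id, order_amount and timestamp (epoch seconds).
    """
    latest = {}  # customer_id -> most recent order seen so far
    for order in orders:
        current = latest.get(order["customer_id"])
        if current is None or order["timestamp"] > current["timestamp"]:
            latest[order["customer_id"]] = order

    amounts = [o["order_amount"] for o in latest.values()]
    if not amounts:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = sum(amounts)
    return {
        "count": len(amounts),
        "min": min(amounts),
        "max": max(amounts),
        "avg": total / len(amounts),
        "sum": total,
    }


if __name__ == "__main__":
    sample = [
        {"customer_id": 1001, "order_amount": 13.00, "timestamp": 1465440000},
        {"customer_id": 1001, "order_amount": 15.00, "timestamp": 1465450000},
        {"customer_id": 1002, "order_amount": 14.50, "timestamp": 1465450500},
    ]
    # Only the 15.00 and 14.50 orders contribute; the older 13.00 order is ignored.
    print(latest_order_stats(sample))
```

The resulting count equals the number of distinct customers, not the number of orders, which is the difference from a plain stats aggregation over all documents.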

Is this possible in Elasticsearch?

Answer


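If I understand your requirement correctly, the following should work. Since we control the query, we can restrict the data set in any way we like. In my example I simply required timestamp >= 1365440000: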

{ 
    "size": 0, 
    "query": { 
     "constant_score": { 
      "filter": { 
       "range": { 
        "timestamp": { 
         "gte": 1365440000 
        } 
       } 
      } 
     } 
    }, 
    "aggs": { 
     "customers": { 
      "terms": { 
       "field": "customer_id" 
      }, 
      "aggs": { 
       "order_stats": { 
        "stats": { 
         "field": "order_amount" 
        } 
       } 
      } 
     } 
    } 
} 

Here is the result:

{ 
    "took": 32, 
    "timed_out": false, 
    "_shards": { 
     "total": 5, 
     "successful": 5, 
     "failed": 0 
    }, 
    "hits": { 
     "total": 8, 
     "max_score": 0, 
     "hits": [] 
    }, 
    "aggregations": { 
     "customers": { 
     "doc_count_error_upper_bound": 0, 
     "sum_other_doc_count": 0, 
     "buckets": [ 
     { 
      "key": 1001, 
      "doc_count": 4, 
      "order_stats": { 
       "count": 4, 
       "min": 13, 
       "max": 15, 
       "avg": 13.875, 
       "sum": 55.5 
      } 
     }, 
     { 
      "key": 1002, 
      "doc_count": 4, 
      "order_stats": { 
       "count": 4, 
       "min": 13.5, 
       "max": 15.5, 
       "avg": 14.625, 
       "sum": 58.5 
      } 
      } 
     ] 
     } 
    } 
} 

Hope this helps.


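This doesn't work. You are using a hard-coded timestamp range filter, which is not what I'm looking for. The query needs to look at the latest document for each customer – user3658423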
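One possible direction for the per-customer requirement raised in this comment, sketched in Python under stated assumptions: a `terms` aggregation on `customer_id` with a `top_hits` sub-aggregation, sorted by `timestamp` descending and limited to one hit, returns each customer's latest order, and the statistics are then computed client-side from the response. The function names, the aggregation names (`customers`, `latest_order`) and the `max_customers` parameter are illustrative and do not come from the thread. The sketch only builds the request body and parses a response dict; sending the request is left to whatever client is in use.

```python
def build_latest_order_request(max_customers=10):
    """Build a search body that returns the newest order per customer_id.

    max_customers is a hypothetical knob for the terms bucket count; with
    hundreds of thousands of customers the buckets would likely have to be
    fetched in pages rather than in one request.
    """
    return {
        "size": 0,  # no top-level hits, only aggregations
        "aggs": {
            "customers": {
                "terms": {"field": "customer_id", "size": max_customers},
                "aggs": {
                    "latest_order": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{"timestamp": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


def stats_from_response(response):
    """Compute stats over the single latest order found in each customer bucket."""
    amounts = []
    for bucket in response["aggregations"]["customers"]["buckets"]:
        hits = bucket["latest_order"]["hits"]["hits"]
        if hits:
            amounts.append(hits[0]["_source"]["order_amount"])
    if not amounts:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = sum(amounts)
    return {
        "count": len(amounts),
        "min": min(amounts),
        "max": max(amounts),
        "avg": total / len(amounts),
        "sum": total,
    }
```

Unlike the range-filtered query in the answer, no timestamp is hard-coded here: the sort inside `top_hits` selects the newest order per bucket. The trade-off is that the final statistics are computed outside Elasticsearch.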