2015-04-16 138 views
2

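MongoDB aggregation framework: sum values for the latest date in each month

I receive data from a service once a week and put it into a collection. Each record has an amount, a projectNo, and a dataDate timestamp. Using the aggregation framework, I sum the amount and group by projectNo and dataDate: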

db.collection.aggregate([ 
    {$project: {projectNo: 1, bdgtAppd: 1, dataDate: 1}}, 
    {$group: {_id: { 
       projectNo: "$projectNo", 
       dataDate: "$dataDate" 
       }, 
      amount: {$sum: "$bdgtAppd"}} 
    }, 
    {$project: {_id:0, 
       projectNo:"$_id.projectNo", 
       dataDate:"$_id.dataDate", 
       amount:"$amount" 
       } 
    }, 
    {$sort: {projectNo:1,dataDate:1}} 
]) 

This produces the following output:

[{ 
    "amount" : 7887, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-01-02T08:00:00.000Z" 
}, { 
    "amount" : 137947, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-01-16T08:00:00.000Z" 
}, { 
    "amount" : 137947, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-01-23T08:00:00.000Z" 
}, { 
    "amount" : 137947, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-01-30T08:00:00.000Z" 
}, { 
    "amount" : 130060, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-02-06T08:00:00.000Z" 
}, { 
    "amount" : 130060, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-02-13T08:00:00.000Z" 
}, { 
    "amount" : 130060, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-02-20T08:00:00.000Z" 
}] 

What I need to do now is limit the returned data to only the last date in each month:

[{ 
    "amount" : 137947, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-01-30T08:00:00.000Z" 
}, { 
    "amount" : 130060, 
    "projectNo" : "5544A", 
    "dataDate" : "2015-02-27T08:00:00.000Z" 
}] 

Edit: a sample document from the collection:

{ 
    "_id" : ObjectId("5527e724fc53ec16bc5fe57a"), 
    "projectNo" : "5544G", 
    "cpfoNo" : "1448R", 
    "cpfoDate" : ISODate("2014-10-20T07:00:00Z"), 
    "description" : "INC 6 CO 176 - Booster Pump", 
    "pcoNo" : "1510", 
    "approvedAmount" : null, 
    "days" : null, 
    "remarks" : null, 
    "itemNo" : "0005", 
    "costCode" : "5030.09900.0000.0000", 
    "itemTitle" : "Painting - Hasson", 
    "bdgtEst" : 0.0, 
    "bdgtProp" : 745.0, 
    "bdgtAprv" : 745.0, 
    "bdgtAppd" : 745.0, 
    "dataDate" : ISODate("2014-12-12T08:00:00Z") 
} 
+0

Can you show us your documents? – styvane

+0

@Michael did you get a chance to look at this? – Splitty

Answers

1

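Thanks to @chidrian for getting me started. This is the solution that worked for me. Projecting the month and year keys may be an extra step, but it works.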

db.collection.aggregate([{
    // Sum bdgtAppd for each project and data date.
    "$group": {
     "_id": {
      "projectNo": "$projectNo",
      "dataDate": "$dataDate"
     },
     "amount": {
      "$sum": "$bdgtAppd"
     }
    }
}, {
    "$project": {
     "_id": 0,
     "projectNo": "$_id.projectNo",
     "dataDate": "$_id.dataDate",
     "amount": 1
    }
}, {
    // Add month and year keys for the per-month grouping.
    "$project": {
     "_id": 0,
     "projectNo": "$projectNo",
     "amount": 1,
     "dataDate": 1,
     "month": {
      "$month": "$dataDate"
     },
     "year": {
      "$year": "$dataDate"
     }
    }
}, {
    // Order by date so that $last picks the latest date in each month.
    "$sort": {
     "projectNo": 1,
     "dataDate": 1
    }
}, {
    "$group": {
     "_id": {
      "projectNo": "$projectNo",
      "month": "$month",
      "year": "$year"
     },
     "dataDate": {
      "$last": "$dataDate"
     },
     "amount": {
      "$last": "$amount"
     }
    }
}, {
    // After the $group above, projectNo lives under _id.
    "$sort": {
     "_id.projectNo": 1,
     "dataDate": 1
    }
}, {
    "$project": {
     "_id": 0,
     "projectNo": "$_id.projectNo",
     "dataDate": 1,
     "amount": 1
    }
}])
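
On the "extra step" mentioned in this answer: `$group` accepts expressions inside its `_id`, so the month and year keys can probably be computed there instead of in a separate `$project` stage. The following is only a sketch of that idea and has not been run against the asker's data. The variable name `pipeline` is introduced here for illustration, and `$month` and `$year` are assumed to use their default UTC behaviour.

```javascript
// Sketch of a shorter pipeline; in the mongo shell it would be passed as
// db.collection.aggregate(pipeline).
const pipeline = [
  // Sum bdgtAppd for each project and data date.
  { $group: {
      _id: { projectNo: "$projectNo", dataDate: "$dataDate" },
      amount: { $sum: "$bdgtAppd" }
  } },
  // Order by date so that $last below sees the latest date last.
  { $sort: { "_id.projectNo": 1, "_id.dataDate": 1 } },
  // Keep the last date (and its summed amount) per project, year and month.
  { $group: {
      _id: {
        projectNo: "$_id.projectNo",
        year: { $year: "$_id.dataDate" },
        month: { $month: "$_id.dataDate" }
      },
      dataDate: { $last: "$_id.dataDate" },
      amount: { $last: "$amount" }
  } },
  // Flatten the result to the shape asked for in the question.
  { $project: { _id: 0, projectNo: "$_id.projectNo", dataDate: 1, amount: 1 } },
  { $sort: { projectNo: 1, dataDate: 1 } }
];

console.log(JSON.stringify(pipeline, null, 2));
```

The `$sort` before the second `$group` matters: `$last` is only meaningful when the incoming documents are in a defined order.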
1

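There is no need for the initial $project pipeline stage. Simply start with the $group step, and the following pipeline stages will produce the desired result: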

db.collection.aggregate([ 
    { 
     "$group": { 
      "_id": { 
       "projectNo": "$projectNo", 
       "dataDate": "$dataDate" 
      }, 
      "amount": {"$sum": "$bdgtAppd"}    
     }  
    }, 
    { 
     "$project": { 
      "_id": 0,    
      "projectNo": "$_id.projectNo", 
      "dataDate": "$_id.dataDate", 
      "amount": 1 
     } 
    }, 
    { 
     "$group": { 
      "_id": "$projectNo",       
      "dataDate": {"$first" : "$dataDate"}, 
      "amount": {"$first" : "$amount"}   
     } 
    }, 
    { 
     "$project": { 
      "_id": 0,    
      "projectNo": "$_id", 
      "dataDate": 1, 
      "amount": 1 
     } 
    } 
]); 

With the following sample documents (only the relevant fields are included, as a minimal test case):

db.collection.insert([ 
    /* 0 */ 
    { 
     "projectNo" : "5544A", 
     "bdgtAppd" : 3, 
     "dataDate" : ISODate("2015-01-02T08:00:00.000Z") 
    }, 

    /* 1 */ 
    { 
     "projectNo" : "5544A", 
     "bdgtAppd" : 7, 
     "dataDate" : ISODate("2015-01-28T08:00:00.000Z") 
    }, 

    /* 2 */ 
    { 
     "projectNo" : "5544A", 
     "bdgtAppd" : 5, 
     "dataDate" : ISODate("2015-01-28T08:00:00.000Z") 
    }, 

    /* 3 */ 
    { 
     "projectNo" : "5544B", 
     "bdgtAppd" : 15, 
     "dataDate" : ISODate("2015-02-13T08:00:00.000Z") 
    }, 

    /* 4 */ 
    { 
     "projectNo" : "5544G", 
     "bdgtAppd" : 10, 
     "dataDate" : ISODate("2015-02-27T08:00:00.000Z") 
    }, 

    /* 5 */ 
    { 
     "projectNo" : "5544G", 
     "bdgtAppd" : 25, 
     "dataDate" : ISODate("2015-02-27T08:00:00.000Z") 
    }, 
]); 

the above aggregation produces:

/* 0 */ 
{ 
    "result" : [ 
     { 
      "dataDate" : ISODate("2015-01-28T08:00:00.000Z"), 
      "amount" : 12, 
      "projectNo" : "5544A" 
     }, 
     { 
      "dataDate" : ISODate("2015-02-13T08:00:00.000Z"), 
      "amount" : 15, 
      "projectNo" : "5544B" 
     }, 
     { 
      "dataDate" : ISODate("2015-02-27T08:00:00.000Z"), 
      "amount" : 35, 
      "projectNo" : "5544G" 
     } 
    ], 
    "ok" : 1 
} 
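
Note that the pipeline in this answer groups on projectNo alone and uses `$first` without a preceding `$sort`, so it yields one row per project and depends on input order. To make the intended "last date per project and month" semantics concrete without a MongoDB server, here is a plain JavaScript sketch. `lastPerMonth` is a hypothetical helper, not a MongoDB API, and the fourth sample row (2015-02-06) is an invented document added only so that one project spans two months.

```javascript
// Hypothetical helper: emulates in memory what the aggregation should return.
function lastPerMonth(docs) {
  // Step 1: sum bdgtAppd per (projectNo, dataDate).
  const sums = new Map();
  for (const d of docs) {
    const key = d.projectNo + "|" + d.dataDate.toISOString();
    const entry = sums.get(key) ||
      { projectNo: d.projectNo, dataDate: d.dataDate, amount: 0 };
    entry.amount += d.bdgtAppd;
    sums.set(key, entry);
  }
  // Step 2: per (projectNo, UTC year, UTC month), keep the latest date.
  const latest = new Map();
  for (const e of sums.values()) {
    const key = [e.projectNo, e.dataDate.getUTCFullYear(),
                 e.dataDate.getUTCMonth()].join("|");
    const current = latest.get(key);
    if (!current || e.dataDate > current.dataDate) latest.set(key, e);
  }
  // Sort by project, then by date.
  return [...latest.values()].sort((a, b) =>
    a.projectNo.localeCompare(b.projectNo) || a.dataDate - b.dataDate);
}

const sample = [
  { projectNo: "5544A", bdgtAppd: 3,  dataDate: new Date("2015-01-02T08:00:00.000Z") },
  { projectNo: "5544A", bdgtAppd: 7,  dataDate: new Date("2015-01-28T08:00:00.000Z") },
  { projectNo: "5544A", bdgtAppd: 5,  dataDate: new Date("2015-01-28T08:00:00.000Z") },
  { projectNo: "5544A", bdgtAppd: 4,  dataDate: new Date("2015-02-06T08:00:00.000Z") }, // invented row
  { projectNo: "5544B", bdgtAppd: 15, dataDate: new Date("2015-02-13T08:00:00.000Z") },
  { projectNo: "5544G", bdgtAppd: 10, dataDate: new Date("2015-02-27T08:00:00.000Z") },
  { projectNo: "5544G", bdgtAppd: 25, dataDate: new Date("2015-02-27T08:00:00.000Z") }
];

const result = lastPerMonth(sample);
console.log(result);
```

With the invented row included, project 5544A should appear twice, once for January and once for February, which a grouping on projectNo alone cannot produce.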
+0

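Thanks for the tip. I thought limiting the keys to only the ones I care about would speed things up a little, but I forgot that $group does the same thing. This aggregation gets me very close: in some cases it is correct, while in others it picks the first dataDate of the month. I feel there is a way to do this in one pass, but I'm afraid I will have to fetch the dates first and then pass them to a $match. – Splitty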

+0

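@Splitty No worries. I tried it with a few test documents and could not come up with a query with fewer pipeline stages that still meets the requirements you gave. The key point is how you do the initial grouping; in the above you want to group on the project and the date. – chridam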

+0

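If possible, could you provide a minimal test case (sample documents and the expected result from the sample collection)? Perhaps we can find another approach. – chridam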