2015-12-14 207 views

Mongo aggregation: group by multiple values

I have a Mongo query in which I want to use $group in effectively the same way as GROUP BY in SQL.

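This isn't working for me unless I set the _id of the new document to one of the grouping fields, which doesn't suit my case, and I also can't get the values I want, which come from potentially three documents that I am merging together in Mongo.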

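In SQL, I would write something like this to illustrate the grouping and selection that I am using as the basis for my aggregation in Mongo: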

SELECT entity_id, connection_id, cycle_id, objectOriginAPI,accountBalance 
FROM raw_originBusinessData 
WHERE objectStatus = 'UNPROCESSED' 
AND (objectOriginAPI = 'Profit & Loss' 
OR objectOriginAPI = 'Balance Sheet' 
OR objectOriginAPI = 'Bank Summary') 
GROUP BY entity_id, connection_id, cycle_id; 

I have rewritten and simplified what my Mongo script does with the embedded arrays.

db.getCollection('raw_originBusinessData').aggregate([ 
{ "$match": { 
    objectStatus : "UNPROCESSED" 
    , $or: [ 
    { objectOriginAPI : "Profit & Loss"} 
    ,{objectOriginAPI : "Balance Sheet"} 
    ,{objectOriginAPI : "Bank Summary"} 
    ]} 
}, 
     // don't worry about this, this is all good 
{ "$unwind": "$objectRawOriginData.Reports" } 
,{ "$unwind": "$objectRawOriginData.Reports.Rows" } 
,{ "$unwind": "$objectRawOriginData.Reports.Rows.Rows" }, 

     // this is where I believe I'm having my problem 
{ "$group": {"_id": "$entity_id" 
     // , "$connection_id" 
     // , "objectCycleID" 
, "accountBalances": { "$push": "$objectRawOriginData.Reports.Rows.Rows.Cells.Value" } 
}}, 
{$project: {objectClass: {$literal: "Source Data"} 
, objectCategory: {$literal: "Application"} 
, objectType: {$literal: "Account Balances"} 
, objectOrigin: {$literal: "Xero"} 
, entity_ID: "$_id" 
, connection_ID: "$connection_ID" 
, accountBalances: "$accountBalances"} 
} 
] 
     // ,{$out: "std_sourceBusinessData"} 
) 

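So each of the documents that I merge into a single document has the same entity_id, connection_id and cycle_id, which I want to carry into the new document. I also want to make sure the new document has its own unique object_id.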

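Many thanks for your help - the Mongo documentation does not describe any $group component for this other than _id, but if I don't set _id to the field I want to group by (in the script above it is set to entity_id), it doesn't group correctly.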

Answer

In short, _id needs to be a "composite" value that includes all three "sub-keys":

{ "$group":{ 
    "_id": { 
     "entity_id": "$entity_id",
     "connection_id": "$connection_id", 
     "objectCycleID": "$objectCycleID" 
    }, 
    "accountBalances": { 
     "$push": "$objectRawOriginData.Reports.Rows.Rows.Cells.Value" 
    } 
}}, 
{ "$project": { 
    "_id": 0, 
    "objectClass": { "$literal": "Source Data" }, 
    "objectCategory": { "$literal": "Application"}, 
    "objectType": { "$literal": "Account Balances"}, 
    "objectOrigin": { "$literal": "Xero"}, 
    "entity_ID": "$_id.entity_id", 
    "connection_ID": "$_id.connection_id", 
    "accountBalances": "$accountBalances" 
}} 

Then of course, referencing any of those values in the later $project requires you to use the $_id prefix, since _id is now the parent key.
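For reference, here is a minimal sketch of how the whole pipeline might look once the compound _id is in place. It reuses the field names from the question and the stages above, with two assumptions on my part: the `$or` over a single field is written as the equivalent `$in`, and the output name `cycle_ID` is hypothetical, added only to show how the third sub-key would be projected.

```javascript
// Sketch only: field names come from the question; "cycle_ID" is an assumed output name.
var pipeline = [
  { "$match": {
      "objectStatus": "UNPROCESSED",
      // $in on one field is equivalent to the $or list used in the question
      "objectOriginAPI": { "$in": ["Profit & Loss", "Balance Sheet", "Bank Summary"] }
  }},
  { "$unwind": "$objectRawOriginData.Reports" },
  { "$unwind": "$objectRawOriginData.Reports.Rows" },
  { "$unwind": "$objectRawOriginData.Reports.Rows.Rows" },
  { "$group": {
      // Compound key: one output document per distinct combination of the three values
      "_id": {
        "entity_id": "$entity_id",
        "connection_id": "$connection_id",
        "objectCycleID": "$objectCycleID"
      },
      "accountBalances": {
        "$push": "$objectRawOriginData.Reports.Rows.Rows.Cells.Value"
      }
  }},
  { "$project": {
      "_id": 0,
      "objectClass": { "$literal": "Source Data" },
      "objectCategory": { "$literal": "Application" },
      "objectType": { "$literal": "Account Balances" },
      "objectOrigin": { "$literal": "Xero" },
      // Grouped values now live under _id, hence the "$_id." prefix
      "entity_ID": "$_id.entity_id",
      "connection_ID": "$_id.connection_id",
      "cycle_ID": "$_id.objectCycleID",
      "accountBalances": "$accountBalances"
  }}
];

// In the mongo shell this would be run as:
// db.getCollection('raw_originBusinessData').aggregate(pipeline)
```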

As with any MongoDB document, _id can be anything that is a valid BSON object. So in this case, the combination means "group on all of these field values".
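To make "group on all of these field values" concrete, here is a rough emulation of that behaviour in plain JavaScript, with no MongoDB involved. The sample documents, the `value` field and the helper `groupByCompoundKey` are all hypothetical, invented for illustration; this is not how the server implements $group, only a sketch of the effect.

```javascript
// Hypothetical, already-unwound sample documents (values invented for illustration).
var docs = [
  { entity_id: 1, connection_id: 10, objectCycleID: 100, value: "A" },
  { entity_id: 1, connection_id: 10, objectCycleID: 100, value: "B" },
  { entity_id: 1, connection_id: 11, objectCycleID: 100, value: "C" }
];

// Emulates a $group whose _id is an object made of three fields,
// with accountBalances built by pushing each document's value.
function groupByCompoundKey(input) {
  var groups = {};
  input.forEach(function (doc) {
    var id = {
      entity_id: doc.entity_id,
      connection_id: doc.connection_id,
      objectCycleID: doc.objectCycleID
    };
    // Documents land in the same group only if all three values match
    var key = JSON.stringify(id);
    if (!groups[key]) {
      groups[key] = { _id: id, accountBalances: [] };
    }
    groups[key].accountBalances.push(doc.value);
  });
  return Object.keys(groups).map(function (k) { return groups[k]; });
}

var result = groupByCompoundKey(docs);
console.log(JSON.stringify(result, null, 2));
```

The first two sample documents share all three key values and end up in one group, while the third differs in connection_id and forms a group of its own. Grouping on entity_id alone, as in the question, would have merged all three.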

This is great and makes a lot of sense - it works, you're a star!