2014-03-31 90 views

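MongoDB aggregation - group by distinct pairs

I need to group the records in a particular collection by unordered distinct pairs of fields (sender, recipient), using the PyMongo driver. For example, the pairs (sender_field_value, recipient_field_value) and (recipient_field_value, sender_field_value) are considered equal.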

My aggregation pipeline

groups = base.flow.records.aggregate([
    {'$match': {'$or': [
        {'sender': _id},
        {'recipient': _id}
    ]}},
    {'$group': {
        '_id': {
            'sender': '$sender',
            'recipient': '$recipient',
        },
        'data_id': {'$max': '$_id'}
    }},
    {'$limit': 20}
])

applied to

{ "_id" : ObjectId("533950ca9c3b6222569520c2"), "recipient" : ObjectId("533950ca9c3b6222569520c1"), "sender" : ObjectId("533950ca9c3b6222569520c0") } 
{ "_id" : ObjectId("533950ca9c3b6222569520c4"), "recipient" : ObjectId("533950ca9c3b6222569520c0"), "sender" : ObjectId("533950ca9c3b6222569520c1") } 

produces the following

{'ok': 1.0, 
'result': [ 
    {'_id': {'recipient': ObjectId('533950ca9c3b6222569520c0'), 'sender': ObjectId('533950ca9c3b6222569520c1')}, 
    'data_id': ObjectId('533950ca9c3b6222569520c4')}, 
    {'_id': {'recipient': ObjectId('533950ca9c3b6222569520c1'), 'sender': ObjectId('533950ca9c3b6222569520c0')}, 
    'data_id': ObjectId('533950ca9c3b6222569520c2')} 
    ] 
} 

But the desired result is just

{'ok': 1.0, 
'result': [ 
    {'_id': {'recipient': ObjectId('533950ca9c3b6222569520c0'), 'sender': ObjectId('533950ca9c3b6222569520c1')}, 
    'data_id': ObjectId('533950ca9c3b6222569520c4')} 
    ] 
} 

What is the correct pipeline?


Perhaps show the data that should reduce to this result. –


@NeilLunn Updated, but I think it's too small to help – vaultah

Answer


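The trick to grouping by distinct pairs is to pass the same 'thing' to the $group _id in both cases. I would do it with an ordinary comparison (you may come up with something that fits your case better; my solution does not work if your senders and recipients are not directly comparable):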

{$project : { 
    "_id" : 1, 
    "groupId" : {"$cond" : [{"$gt" : ['$sender', '$recipient']}, {big : "$sender", small : "$recipient"}, {big : "$recipient", small : "$sender"}]} 
}}, 
{$group: { 
    '_id': "$groupId", 
    'data_id': { 
     '$max': '$_id' 
    } 
}} 
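
To see why the comparison collapses both orderings into one group, here is a rough pure-Python analogue of the $project and $group stages. It is only a sketch: `canonical_pair` and `group_latest` are hypothetical helper names, and plain hex strings stand in for the ObjectId values from the question (assuming they order the same way the ObjectIds do).

```python
def canonical_pair(sender, recipient):
    """Return the same (big, small) key for (a, b) and (b, a),
    mirroring the $cond / $gt expression in the $project stage."""
    if sender > recipient:
        return (sender, recipient)
    return (recipient, sender)


def group_latest(records):
    """Mimic $group with {'$max': '$_id'}: keep the largest _id per pair."""
    latest = {}
    for rec in records:
        key = canonical_pair(rec["sender"], rec["recipient"])
        if key not in latest or rec["_id"] > latest[key]:
            latest[key] = rec["_id"]
    return latest


# Hex strings standing in for the ObjectIds of the two sample documents.
records = [
    {"_id": "533950ca9c3b6222569520c2",
     "recipient": "533950ca9c3b6222569520c1",
     "sender": "533950ca9c3b6222569520c0"},
    {"_id": "533950ca9c3b6222569520c4",
     "recipient": "533950ca9c3b6222569520c0",
     "sender": "533950ca9c3b6222569520c1"},
]

result = group_latest(records)
print(result)
```

Both sample documents map to the same key, so only one group remains, and it carries the larger `_id`.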

The complete aggregation pipeline looks like:

{$match : { 
    '$or': [{'sender': userId},{'recipient': userId}] 
}}, 
{$project : { 
    "_id" : 1, 
    "groupId" : {"$cond" : [{"$gt" : ['$sender', '$recipient']}, {big : "$sender", small : "$recipient"}, {big : "$recipient", small : "$sender"}]} 
}}, 
{$group: { 
    '_id': "$groupId", 
    'data_id': { 
     '$max': '$_id' 
    } 
}}, 
{$limit: 20} 
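
Since the question uses PyMongo, the same pipeline can be written as Python dicts with every key quoted. This is a minimal sketch: `build_pair_pipeline` is a hypothetical helper name, and the collection `base.flow.records` and the variable `_id` are taken from the question.

```python
def build_pair_pipeline(user_id, limit=20):
    """Build the stages above as a list of Python dicts for PyMongo."""
    return [
        # Keep only records where the user is the sender or the recipient.
        {"$match": {"$or": [{"sender": user_id}, {"recipient": user_id}]}},
        # Emit an order-independent key: the larger value always goes to 'big'.
        {"$project": {
            "_id": 1,
            "groupId": {"$cond": [
                {"$gt": ["$sender", "$recipient"]},
                {"big": "$sender", "small": "$recipient"},
                {"big": "$recipient", "small": "$sender"},
            ]},
        }},
        # One group per unordered pair, keeping the largest _id.
        {"$group": {"_id": "$groupId", "data_id": {"$max": "$_id"}}},
        {"$limit": limit},
    ]


pipeline = build_pair_pipeline("some-user-id")  # placeholder id for illustration

# With a live connection (not run here), this would be used as in the question:
# groups = base.flow.records.aggregate(build_pair_pipeline(_id))
```

Note that recent PyMongo versions return a cursor from `aggregate()` rather than the `{'ok': ..., 'result': [...]}` dict shown in the question, so the groups would be read by iterating over it.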

That's clever. Thanks. –