Multiple aggregations in a single operation in MongoDB

I have an item collection containing the following documents.

{ "item" : "i1", "category" : "c1", "brand" : "b1" } 
{ "item" : "i2", "category" : "c2", "brand" : "b1" } 
{ "item" : "i3", "category" : "c1", "brand" : "b2" } 
{ "item" : "i4", "category" : "c2", "brand" : "b1" } 
{ "item" : "i5", "category" : "c1", "brand" : "b2" }

I want separate aggregation results: a count by category and a count by brand. Note that this is not a count by the (category, brand) pair.

I am able to do this with map-reduce using the following code.

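// Emit one key per grouping dimension so categories and brands are counted separately.
var map = function () {
    emit({ type: "category", category: this.category }, 1);
    emit({ type: "brand", brand: this.brand }, 1);
};
// Sum the 1s emitted for each key.
var reduce = function (key, values) {
    return Array.sum(values);
};
db.item.mapReduce(map, reduce, { out: { inline: 1 } });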

The result is:

{ 
     "results" : [ 
       { 
         "_id" : { 
           "type" : "brand", 
           "brand" : "b1" 
         }, 
         "value" : 3 
       }, 
       { 
         "_id" : { 
           "type" : "brand", 
           "brand" : "b2" 
         }, 
         "value" : 2 
       }, 
       { 
         "_id" : { 
           "type" : "category", 
           "category" : "c1" 
         }, 
         "value" : 3 
       }, 
       { 
         "_id" : { 
           "type" : "category", 
           "category" : "c2" 
         }, 
         "value" : 2 
       } 
     ], 
     "timeMillis" : 21, 
     "counts" : { 
       "input" : 5, 
       "emit" : 10, 
       "reduce" : 4, 
       "output" : 4 
     }, 
     "ok" : 1, 
}
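To make the emit/reduce flow concrete, here is a minimal plain-JavaScript sketch that mimics what the map and reduce functions above do on the five sample documents. It does not call MongoDB. The `simulateMapReduce` helper, passing `emit` as an argument, and the JSON-string key encoding are all assumptions made for illustration, and `Array.sum` (a mongo shell helper) is replaced by a plain loop.

```javascript
// Sample documents from the question.
var docs = [
    { item: "i1", category: "c1", brand: "b1" },
    { item: "i2", category: "c2", brand: "b1" },
    { item: "i3", category: "c1", brand: "b2" },
    { item: "i4", category: "c2", brand: "b1" },
    { item: "i5", category: "c1", brand: "b2" }
];

// Hypothetical helper (not a MongoDB API): runs map over every document,
// collects emitted values per key, then reduces each key's values.
function simulateMapReduce(docs, map, reduce) {
    var groups = {}; // serialized key -> { key: ..., values: [...] }
    function emit(key, value) {
        var k = JSON.stringify(key);
        if (!groups[k]) {
            groups[k] = { key: key, values: [] };
        }
        groups[k].values.push(value);
    }
    docs.forEach(function (doc) {
        map.call(doc, emit); // `this` is the document, as in the shell
    });
    return Object.keys(groups).map(function (k) {
        return { _id: groups[k].key, value: reduce(groups[k].key, groups[k].values) };
    });
}

// Same logic as the question's map, with emit passed in explicitly.
function map(emit) {
    emit({ type: "category", category: this.category }, 1);
    emit({ type: "brand", brand: this.brand }, 1);
}

// Stand-in for Array.sum(values).
function reduce(key, values) {
    var total = 0;
    values.forEach(function (v) { total += v; });
    return total;
}

var results = simulateMapReduce(docs, map, reduce);
console.log(JSON.stringify(results, null, 2));
```

Each document emits twice, which is why the real output reports 10 emits for 5 inputs and 4 distinct keys.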

I can get the same results by issuing two separate aggregation commands, as follows.

db.item.aggregate({$group:{_id:"$category",count:{$sum:1}}}) 
db.item.aggregate({$group:{_id:"$brand",count:{$sum:1}}})
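For comparison, the results of those two commands can be combined client-side into the same shape as the map-reduce output. The sketch below is only an illustration: `mergeCounts` is a hypothetical helper, and the two input arrays are hard-coded to look like the rows the two `$group` commands would return for the sample data.

```javascript
// Hypothetical helper: tags each $group row with its type and
// concatenates both lists into map-reduce-style { _id, value } rows.
function mergeCounts(categoryRows, brandRows) {
    var merged = [];
    categoryRows.forEach(function (row) {
        merged.push({ _id: { type: "category", category: row._id }, value: row.count });
    });
    brandRows.forEach(function (row) {
        merged.push({ _id: { type: "brand", brand: row._id }, value: row.count });
    });
    return merged;
}

// Rows shaped like the output of the two $group commands on the sample data.
var categoryRows = [ { _id: "c1", count: 3 }, { _id: "c2", count: 2 } ];
var brandRows = [ { _id: "b1", count: 3 }, { _id: "b2", count: 2 } ];

var merged = mergeCounts(categoryRows, brandRows);
console.log(JSON.stringify(merged, null, 2));
```

In the 2.4 shell the rows would presumably come from the `result` field of each `aggregate` return value. This still costs two round trips to the server.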

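Is there any way to do the same thing with the aggregation framework using a single aggregate command?

I have simplified my case here; in reality I need to group by fields inside an array of subdocuments. Assume the structure above is what I have after $unwind.

This is a real-time query (someone is waiting for the response), although on a smaller dataset, so execution time is very important.

I am using MongoDB 2.4.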

2014-04-30 Poorna

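On a large dataset I would say your current mapReduce approach is the best one, because this aggregation technique does not scale well to large data. But at a reasonably small size it is likely just what you need: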

db.item.aggregate([ 
    { "$group": { 
     "_id": null, 
     "categories": { "$push": "$category" }, 
     "brands": { "$push": "$brand" } 
    }}, 
    { "$project": { 
     "_id": { 
      "categories": "$categories", 
      "brands": "$brands" 
     }, 
     "categories": 1 
    }}, 
    { "$unwind": "$categories" }, 
    { "$group": { 
     "_id": { 
      "brands": "$_id.brands", 
      "category": "$categories" 
     }, 
     "count": { "$sum": 1 } 
    }}, 
    { "$group": { 
     "_id": "$_id.brands", 
     "categories": { "$push": { 
      "category": "$_id.category", 
      "count": "$count" 
     }}, 
    }}, 
    { "$project": { 
     "_id": "$categories", 
     "brands": "$_id" 
    }}, 
    { "$unwind": "$brands" }, 
    { "$group": { 
     "_id": { 
      "categories": "$_id", 
      "brand": "$brands" 
     }, 
     "count": { "$sum": 1 } 
    }}, 
    { "$group": { 
     "_id": null, 
     "categories": { "$first": "$_id.categories" }, 
     "brands": { "$push": { 
      "brand": "$_id.brand", 
      "count": "$count" 
     }} 
    }} 
])

The output is not quite the same as the mapReduce output, and you could add a few more stages to change the format, but this should be usable:

{ 
    "_id" : null, 
    "categories" : [ 
      { 
        "category" : "c2", 
        "count" : 2 
      }, 
      { 
        "category" : "c1", 
        "count" : 3 
      } 
    ], 
    "brands" : [ 
      { 
        "brand" : "b2", 
        "count" : 2 
      }, 
      { 
        "brand" : "b1", 
        "count" : 3 
      } 
    ] 
}

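As you can see, this involves a fair bit of shuffling between arrays in order to group each set of either "categories" or "brands" within the same pipeline. Again, I would say this does not work well for large data, but for something like "items in an order" it would probably do nicely.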
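The idea behind the pipeline above can be sketched in plain JavaScript: first collapse everything into one document holding two arrays, then count the values of each array in turn. This is only an illustration of the data flow, not a reimplementation of the aggregation framework; `countValues` is a hypothetical helper, and the order of rows in each array is not meant to match the server's output.

```javascript
// Sample documents from the question.
var docs = [
    { item: "i1", category: "c1", brand: "b1" },
    { item: "i2", category: "c2", brand: "b1" },
    { item: "i3", category: "c1", brand: "b2" },
    { item: "i4", category: "c2", brand: "b1" },
    { item: "i5", category: "c1", brand: "b2" }
];

// Equivalent of the first $group on _id: null with two $push accumulators.
var grouped = { categories: [], brands: [] };
docs.forEach(function (doc) {
    grouped.categories.push(doc.category);
    grouped.brands.push(doc.brand);
});

// Hypothetical helper: roughly what $unwind + $group ($sum: 1) + $push
// achieve for one array, returning rows like { category: "c1", count: 3 }.
function countValues(values, fieldName) {
    var counts = {};
    values.forEach(function (v) {
        counts[v] = (counts[v] || 0) + 1;
    });
    return Object.keys(counts).map(function (v) {
        var row = {};
        row[fieldName] = v;
        row.count = counts[v];
        return row;
    });
}

// Final document shaped like the pipeline's single result.
var summary = {
    _id: null,
    categories: countValues(grouped.categories, "category"),
    brands: countValues(grouped.brands, "brand")
};
console.log(JSON.stringify(summary, null, 2));
```

The intermediate `grouped` object holds every value of both fields at once, which is the reason the approach is a poor fit for large inputs.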

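Of course, as you say you have simplified things somewhat, so that null grouping key in the first group will either be something else, or the input will be narrowed down by an earlier $match stage so that the null case applies, which is probably what you want to do.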
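As a rough sketch of the $match variant, the filter stage can simply be placed in front of the existing stages. The `withMatch` helper and the `orderId` field are hypothetical and only illustrate the shape of the pipeline; the real filter depends on the actual documents.

```javascript
// Hypothetical helper: returns a new pipeline with a $match stage in front,
// leaving the original array of stages untouched.
function withMatch(filter, stages) {
    return [ { "$match": filter } ].concat(stages);
}

// First stage of the pipeline from the answer; the remaining stages would follow.
var countStages = [
    { "$group": {
        "_id": null,
        "categories": { "$push": "$category" },
        "brands": { "$push": "$brand" }
    }}
];

// "orderId" is an assumed field name, used only for illustration.
var pipeline = withMatch({ "orderId": 42 }, countStages);
console.log(JSON.stringify(pipeline, null, 2));
```

Filtering first keeps the arrays built by the first $group small, which is where this approach spends most of its memory.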

2014-04-30 10:09:10

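Great! It works in theory! But nine pipeline stages is neither intuitive nor easy to manage. It is like doing a self-join several times, which is memory- and process-intensive. In a quick measurement it took 3 times longer than calling aggregate twice. For me it is not the right approach, because my use case is not just the items in one order: I need to do this across orders in a given time range, and also compute quantities, price sums and so on. – Poorna

@Poorna Yes, that may be so, but I did add that disclaimer at the start, and the main problem will always be size; large arrays are a big performance problem. But I also note that anything beyond what you actually asked about was not really your question, was it? So if you want to solve your real problem, you would be better off posting a question that actually asks about it. –

I like your solution; before posting my question I could not think of anything closer. I was just explaining why it does not suit my case. – Poorna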
