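Depending on what output you want, map-reduce is definitely an option. Here is a simple example that pulls out the unique IDs from your documents and counts how often each one occurs: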
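// map: emit every entry of both arrays as a key, with an initial count of 1
map = function () {
    this.data.people.forEach(function (id) {
        emit(id, {count: 1});
    });
    this.data.guys.forEach(function (id) {
        emit(id, {count: 1});
    });
};

// reduce: sum the counts emitted for a single key
reduce = function (key, values) {
    var total = 0;
    values.forEach(function (value) {
        total += value.count;
    });
    return {count: total};
};

// run the job and write the output to the 'result' collection
db.test.mapReduce(map, reduce, {out: 'result'});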
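One detail worth keeping in mind about the reducer above: MongoDB may call the reduce function more than once for the same key and feed an earlier result back in as one of the values, so the returned object needs the same shape as the emitted value. Here is a rough sketch in plain JavaScript (it runs in Node.js without a database; `all`, `partial` and `combined` are just illustrative names) that checks this property:

```javascript
// Same reducer as in the answer, repeated so the snippet runs on its own.
function reduce(key, values) {
  var total = 0;
  values.forEach(function (value) {
    total += value.count;
  });
  return { count: total };
}

// Three emitted values reduced in a single pass...
var all = reduce("k", [{ count: 1 }, { count: 1 }, { count: 1 }]);

// ...and the same values reduced in two steps, feeding a partial result back in.
var partial = reduce("k", [{ count: 1 }, { count: 1 }]);
var combined = reduce("k", [partial, { count: 1 }]);

console.log(all.count, combined.count); // prints: 3 3
```

Returning a bare number instead of `{count: total}` would break the second step, because `value.count` would then be undefined for the partial result.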
If your data set is:
{ "_id" : 1, "data" : { "people" : [ { "id" : "234323432" }, { "id" : "44213126" }, { "id" : "1321452" } ], "guys" : [ { "id" : "521452" }, { "id" : "92321452" } ] } }
{ "_id" : 2, "data" : { "people" : [ { "id" : "234323432" }, { "id" : "44213126" }, { "id" : "1321452" } ], "guys" : [ { "id" : "521452" }, { "id" : "92321452" } ] } }
{ "_id" : 3, "data" : { "people" : [ { "id" : "234323432" }, { "id" : "44213126" }, { "id" : "1321452" } ], "guys" : [ { "id" : "521452" }, { "id" : "92321452" } ] } }
then running:
db.test.mapReduce(map, reduce, {out: 'result'});
will produce a collection named 'result' with the following contents:
{ "_id" : { "id" : "1321452" }, "value" : { "count" : 3 } }
{ "_id" : { "id" : "234323432" }, "value" : { "count" : 3 } }
{ "_id" : { "id" : "44213126" }, "value" : { "count" : 3 } }
{ "_id" : { "id" : "521452" }, "value" : { "count" : 3 } }
{ "_id" : { "id" : "92321452" }, "value" : { "count" : 3 } }
You can reshape the above into whatever form suits how you want to present or process the data, but hopefully this helps you get going.
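For comparison, the same tally can also be computed in plain JavaScript, for example on the client after fetching the documents. This is only a sketch, assuming the three sample documents above are already loaded into an array; `makeDoc`, `docs`, `countIds` and `counts` are illustrative names, not part of the MongoDB API.

```javascript
// Build one sample document with the same shape as the data set above.
function makeDoc(docId) {
  return {
    _id: docId,
    data: {
      people: [{ id: "234323432" }, { id: "44213126" }, { id: "1321452" }],
      guys: [{ id: "521452" }, { id: "92321452" }]
    }
  };
}

var docs = [1, 2, 3].map(makeDoc);

// Count how many times each id appears across both arrays of every document.
function countIds(documents) {
  var counts = {};
  documents.forEach(function (doc) {
    doc.data.people.concat(doc.data.guys).forEach(function (entry) {
      counts[entry.id] = (counts[entry.id] || 0) + 1;
    });
  });
  return counts;
}

var counts = countIds(docs);
console.log(counts); // every id maps to 3 for the sample data
```

Whether this is preferable to map-reduce depends on how much data has to be transferred to the client in the first place.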
Sure, I could do it that way, but that would bypass the beauty of doing this in MongoDB – acheruns 2012-03-06 14:30:34
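Well, that depends on your definition of beauty. If you ask Mongo to do this, the server does more of the work. Since you are going to download all the data anyway, it may be worth handing the processing burden to the client. It really depends on your architecture. – Eduardo 2012-03-06 15:07:05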