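Does MongoDB ignore readPreference when running MapReduce on a sharded cluster?
My goal is to have my Map-Reduce jobs always run on the secondaries of the shards of my MongoDB cluster. To achieve this, I set readPreference to secondary and the out parameter of the MapReduce command to inline. This works fine on a non-sharded replica set: the job runs on a secondary. On a sharded cluster, however, the job runs on the primary.
Can someone explain why this happens, or point to any relevant documentation? I couldn't find anything in the relevant documentation.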
public static final String mapfunction = "function() { emit(this.custid, this.txnval); }";
public static final String reducefunction = "function(key, values) { return Array.sum(values); }";
...
private void mapReduce() {
...
MapReduceIterable<Document> iterable = collection.mapReduce(mapfunction, reducefunction);
...
}
...
Builder options = MongoClientOptions.builder().readPreference(ReadPreference.secondary());
MongoClientURI uri = new MongoClientURI(MONGO_END_POINT, options);
MongoClient client = new MongoClient(uri);
...
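For readers unfamiliar with the two JavaScript functions above, here is a minimal plain-Java sketch of what they compute together: one running total of txnval per custid. It is only an illustration of the map/reduce semantics. The Txn class, the sumByCustomer helper, and the sample values are hypothetical, and the sketch does not talk to MongoDB or involve read preferences at all.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapReduceSketch {
    /** Hypothetical stand-in for one document in the txns collection. */
    static final class Txn {
        final String custid;
        final double txnval;

        Txn(String custid, double txnval) {
            this.custid = custid;
            this.txnval = txnval;
        }
    }

    /**
     * Emulates the map step (emit custid -> txnval) followed by the
     * reduce step (Array.sum over the values of each key).
     */
    static Map<String, Double> sumByCustomer(List<Txn> txns) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Txn t : txns) {
            // merge() adds txnval to the running total for this custid,
            // or starts a new total if the key has not been seen yet.
            totals.merge(t.custid, t.txnval, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Txn> sample = List.of(
                new Txn("c1", 10.0),
                new Txn("c2", 5.0),
                new Txn("c1", 2.5));
        System.out.println(sumByCustomer(sample)); // prints {c1=12.5, c2=5.0}
    }
}
```

Because the reduce function is a plain sum, partial totals from different shards can be combined in any order, which is why the same functions work on both the replica set and the sharded cluster.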
Log from the secondary when this is executed on a replica set:
2016-11-23T15:05:26.735+0000 I COMMAND [conn671] command test.txns command: mapReduce { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true } planSummary: COUNT keyUpdates:0 writeConflicts:0 numYields:7 reslen:4331 locks:{ Global: { acquireCount: { r: 44 } }, Database: { acquireCount: { r: 3, R: 19 } }, Collection: { acquireCount: { r: 3 } } } protocol:op_query 124ms
Sharded collection (the log further below is from the Shard-0 primary):
mongos> db.txns.getShardDistribution()
Shard Shard-0 at Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017
data : 498KiB docs : 9474 chunks : 3
estimated data per chunk : 166KiB
estimated docs per chunk : 3158
Shard Shard-1 at Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017
data : 80KiB docs : 1526 chunks : 3
estimated data per chunk : 26KiB
estimated docs per chunk : 508
Totals
data : 579KiB docs : 11000 chunks : 6
Shard Shard-0 contains 86.12% data, 86.12% docs in cluster, avg obj size on shard : 53B
Shard Shard-1 contains 13.87% data, 13.87% docs in cluster, avg obj size on shard : 53B
Log:
2016-11-24T08:46:30.828+0000 I COMMAND [conn357] command test.$cmd command: mapreduce.shardedfinish { mapreduce.shardedfinish: { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true, $queryOptions: { $readPreference: { mode: "secondary" } } }, inputDB: "test", shardedOutputCollection: "tmp.mrs.txns_1479977190_0", shards: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 123, timing: { mapTime: 51, emitLoop: 116, reduceTime: 9, mode: "mixed", total: 123 }, counts: { input: 9474, emit: 9474, reduce: 909, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 71, timing: { mapTime: 8, emitLoop: 63, reduceTime: 4, mode: "mixed", total: 71 }, counts: { input: 1526, emit: 1526, reduce: 197, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } } }, shardCounts: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { input: 9474, emit: 9474, reduce: 909, output: 101 }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { input: 1526, emit: 1526, reduce: 197, output: 101 } }, counts: { emit: 11000, input: 11000, output: 202, reduce: 1106 } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4368 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 115ms
2016-11-24T08:46:30.830+0000 I COMMAND [conn46] CMD: drop test.tmp.mrs.txns_1479977190_0
Any pointers on the expected behavior would be really useful. Thanks.
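I wrote a blog post about the reason for this, which is an important limitation for anyone looking to run their MR jobs on MongoDB: https://scalegrid.io/blog/mongodb-performance-running-mongodb-map-reduce-operations-on-secondaries/ – Vaibhaw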