
We have a 3-shard MongoDB cluster; each shard contains a 3-node replica set, and the MongoDB version we use is 3.2.6. We have a large database of about 230 GB that contains roughly 5,500 collections. We found that about 2,300 of the collections are unbalanced, while the other 3,200 are evenly distributed across the 3 shards. The MongoDB shard balancer is not working properly, and a large number of moveChunk errors are reported.


Below is the output of sh.status() (the whole output is too large, so I am only posting part of it):

mongos> sh.status() 
--- Sharding Status --- 
    sharding version: { 
    "_id" : 1, 
    "minCompatibleVersion" : 5, 
    "currentVersion" : 6, 
    "clusterId" : ObjectId("57557345fa5a196a00b7c77a") 
} 
    shards: 
    { "_id" : "shard1", "host" : "shard1/10.25.8.151:27018,10.25.8.159:27018" } 
    { "_id" : "shard2", "host" : "shard2/10.25.2.6:27018,10.25.8.178:27018" } 
    { "_id" : "shard3", "host" : "shard3/10.25.2.19:27018,10.47.102.176:27018" } 
    active mongoses: 
    "3.2.6" : 1 
    balancer: 
    Currently enabled: yes 
    Currently running: yes 
     Balancer lock taken at Sat Sep 03 2016 09:58:58 GMT+0800 (CST) by iZ23vbzyrjiZ:27017:1467949335:-2109714153:Balancer 
    Collections with active migrations: 
     bdtt.normal_20131017 started at Sun Sep 18 2016 17:03:11 GMT+0800 (CST) 
    Failed balancer rounds in last 5 attempts: 0 
    Migration Results for the last 24 hours: 
     1490 : Failed with error 'aborted', from shard2 to shard3 
     1490 : Failed with error 'aborted', from shard2 to shard1 
     14 : Failed with error 'data transfer error', from shard2 to shard1 
    databases: 
    { "_id" : "bdtt", "primary" : "shard2", "partitioned" : true } 
     bdtt.normal_20160908 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard2 142 
      too many chunks to print, use verbose if you want to force print 
     bdtt.normal_20160909 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard1 36 
       shard2 42 
       shard3 46 
      too many chunks to print, use verbose if you want to force print 
     bdtt.normal_20160910 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard1 34 
       shard2 32 
       shard3 32 
      too many chunks to print, use verbose if you want to force print 
     bdtt.normal_20160911 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard1 30 
       shard2 32 
       shard3 32 
      too many chunks to print, use verbose if you want to force print 
     bdtt.normal_20160912 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard2 126 
      too many chunks to print, use verbose if you want to force print 
     bdtt.normal_20160913 
      shard key: { "_id" : "hashed" } 
      unique: false 
      balancing: true 
      chunks: 
       shard2 118 
      too many chunks to print, use verbose if you want to force print 
    } 

The collection "normal_20160913" is unbalanced. Below is the result of getShardDistribution() for this collection:

mongos> db.normal_20160913.getShardDistribution() 

Shard shard2 at shard2/10.25.2.6:27018,10.25.8.178:27018 
data : 4.77GiB docs : 203776 chunks : 118 
estimated data per chunk : 41.43MiB 
estimated docs per chunk : 1726 

Totals 
data : 4.77GiB docs : 203776 chunks : 118 
Shard shard2 contains 100% data, 100% docs in cluster, avg obj size on shard : 24KiB 

The balancer process is running, and the chunk size is the default value (64 MB):

mongos> sh.isBalancerRunning() 
true 
mongos> use config 
switched to db config 
mongos> db.settings.find() 
{ "_id" : "chunksize", "value" : NumberLong(64) } 
{ "_id" : "balancer", "stopped" : false } 

I also found many moveChunk errors in the mongos logs, which may be the reason why some collections are unbalanced. Here is the latest portion of them:

2016-09-19T14:25:25.427+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:25:59.620+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:25:59.644+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:35:02.701+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:35:02.728+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:42:18.232+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:42:18.256+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:42:27.101+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:42:27.112+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 
2016-09-19T14:43:41.889+0800 I SHARDING [conn37136926] moveChunk result: { ok: 0.0, errmsg: "Not starting chunk migration because another migration is already in progress", code: 117 } 

I tried the moveChunk command manually; it returns the same error:

mongos> sh.moveChunk("bdtt.normal_20160913", {_id:ObjectId("57d6d107edac9244b6048e65")}, "shard3") 
{ 
    "cause" : { 
     "ok" : 0, 
     "errmsg" : "Not starting chunk migration because another migration is already in progress", 
     "code" : 117 
    }, 
    "code" : 117, 
    "ok" : 0, 
    "errmsg" : "move failed" 
} 

I wonder whether creating too many collections is overwhelming the migration process? About 60-80 new collections are created every day.

I need help answering the questions below; any hint would be great:

  1. Why are some collections unbalanced? Is it related to the large number of newly created collections?
  2. Is there any command that can check the details of the migration jobs being processed? I got a lot of error logs showing that some migration job is running, but I cannot find which one is running.

You have 'data transfer error' during migration, which may point to a problem at the network layer. Also, can you provide more details about each shard? I see each shard only contains two nodes (a minimum of three data-bearing nodes is recommended). Is this intentional? –

Answers


I am going to guess here, but my guess is that your collections are very unbalanced and are currently being balanced by chunk migration (which may take a long time). Hence your manual chunk migration is queued, but not executed right away.

Here are a few points that may explain this a bit more:

  • One chunk at a time: MongoDB chunk migration happens through a queue mechanism; only one chunk is migrated at a time.
  • Balancer lock: The balancer lock information may give you more insight into what is currently being migrated (see the sketch after this list). You should also be able to see the corresponding chunk-migration entries in the log files.
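
If it helps, here is a minimal sketch of how that lock and the most recent migrations can be inspected from a mongos via the config database (the filters and the limit of 5 are illustrative choices, not something from the original post):

mongos> use config
switched to db config
mongos> // locks currently held, e.g. by the balancer or an in-flight moveChunk
mongos> db.locks.find({ state: { $gte: 2 } }).pretty()
mongos> // most recent moveChunk activity recorded by the cluster
mongos> db.changelog.find({ what: /moveChunk/ }).sort({ time: -1 }).limit(5).pretty()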

One option you have is to do some pre-splitting on your collections. The pre-splitting process essentially configures an empty collection to start out balanced and to avoid becoming unbalanced in the first place, because once collections are out of balance the chunk migration process may not be your friend (see the sketch just below).
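
For the hashed shard key used here, one hedged way to do that (the collection name bdtt.normal_20160920 and the chunk count of 128 are made-up placeholders) is to request the initial chunks when the collection is sharded, so the empty chunks are spread across the shards before any data arrives:

mongos> // numInitialChunks pre-creates and distributes empty chunks; hashed shard key + empty collection only
mongos> db.adminCommand({ shardCollection: "bdtt.normal_20160920", key: { _id: "hashed" }, numInitialChunks: 128 })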

Also, you may want to revisit your shard key. You may have done something wrong with your shard key that is causing a lot of the imbalance.

Also, your data size does not look too big to me to warrant a sharded configuration. Remember, never go for a sharded configuration unless you are forced to by your data size/working-set size attributes, because sharding is not free (you are probably already feeling the pain).


Answering my own question: finally we found the root cause. It is exactly the same issue as this one, "MongoDB balancer timeout with delayed replica", caused by an abnormal replica set configuration. When this issue occurred, our replica set configuration was as follows:

shard1:PRIMARY> rs.conf() 
{ 
    "_id" : "shard1", 
    "version" : 3, 
    "protocolVersion" : NumberLong(1), 
    "members" : [ 
     { 
      "_id" : 0, 
      "host" : "10.25.8.151:27018", 
      "arbiterOnly" : false, 
      "buildIndexes" : true, 
      "hidden" : false, 
      "priority" : 1, 
      "tags" : { 

      }, 
      "slaveDelay" : NumberLong(0), 
      "votes" : 1 
     }, 
     { 
      "_id" : 1, 
      "host" : "10.25.8.159:27018", 
      "arbiterOnly" : false, 
      "buildIndexes" : true, 
      "hidden" : false, 
      "priority" : 1, 
      "tags" : { 

      }, 
      "slaveDelay" : NumberLong(0), 
      "votes" : 1 
     }, 
     { 
      "_id" : 2, 
      "host" : "10.25.2.6:37018", 
      "arbiterOnly" : true, 
      "buildIndexes" : true, 
      "hidden" : false, 
      "priority" : 1, 
      "tags" : { 

      }, 
      "slaveDelay" : NumberLong(0), 
      "votes" : 1 
     }, 
     { 
      "_id" : 3, 
      "host" : "10.47.114.174:27018", 
      "arbiterOnly" : false, 
      "buildIndexes" : true, 
      "hidden" : true, 
      "priority" : 0, 
      "tags" : { 

      }, 
      "slaveDelay" : NumberLong(86400), 
      "votes" : 1 
     } 
    ], 
    "settings" : { 
     "chainingAllowed" : true, 
     "heartbeatIntervalMillis" : 2000, 
     "heartbeatTimeoutSecs" : 10, 
     "electionTimeoutMillis" : 10000, 
     "getLastErrorModes" : { 

     }, 
     "getLastErrorDefaults" : { 
      "w" : 1, 
      "wtimeout" : 0 
     }, 
     "replicaSetId" : ObjectId("5755464f789c6cd79746ad62") 
    } 
} 

There are 4 nodes inside the replica set: one primary, one secondary, one arbiter and one 24-hour delayed secondary. That makes 3 nodes the majority, and since the arbiter has no data present, the balancer has to wait for the delayed secondary to satisfy the write concern (to make sure the recipient shard has received the chunk).

There are several ways to solve the problem. We just removed the arbiter, and the balancer is working fine now; a sketch of that removal is below.
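
For reference, a minimal sketch of what removing the arbiter looks like, run from the shard's primary (the host and port below are the arbiter member from the rs.conf() output above; adjust per shard):

shard1:PRIMARY> // drop the arbiter member from the replica set configuration
shard1:PRIMARY> rs.remove("10.25.2.6:37018")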