2017-03-03 120 views
1

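For the past week I have been trying to set up a replica set for my single-node MongoDB (version 3.4.2), but I keep running into problems. My primary currently holds about 650 GB of data and grows by 90 GB per day. The first time, I added a new secondary with an empty data directory; after almost a day it had fallen too far behind on the oplog. Next I tried copying the data manually. When I restarted the secondary after the copy, it reported errors saying it could not sync from the primary (there is no connectivity problem; I can ping it). I repeated the manual copy, but this time it failed with the error below. Since the WiredTiger problem was with one specific collection file, I copied that file again and retried, but it failed again with the same error. Can someone help me set up the secondary? As the data grows this gets harder every day, and I cannot keep the primary out of normal service for long (during the manual copy I stop all writes to the primary).

Mongo Db secondary setup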

2017-03-02T16:08:16.315+0000 E STORAGE [initandlisten] WiredTiger error (-31802) [1488470896:315136][17051:0x7ffdbd3d7dc0], file:mcse.45trace/collection-16-7756455024301269277.wt, WT_SESSION.open_cursor: /app/data/mcse.45trace/collection-16-7756455024301269277.wt: handle-read: pread: failed to read 4096 bytes at offset 86474874880: WT_ERROR: non-specific WiredTiger error

2017-03-02T16:08:16.315+0000 I - [initandlisten] Invariant failure: ret resulted in status UnknownError: -31802: WT_ERROR: non-specific WiredTiger error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 95
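
The first log line says that a 4096-byte read at a given offset failed. One possible explanation (an assumption, not something the log proves) is that the copied file is shorter than the original, so the read runs past the end of the file. A minimal Python sketch for checking that is below; the helper names `parse_pread_error` and `read_is_past_eof` are made up for illustration.

```python
import os
import re

# Matches the file path and the "failed to read N bytes at offset M" part
# of a WiredTiger pread error line like the one quoted above.
PREAD_RE = re.compile(
    r"(?P<path>/\S+\.wt): handle-read: pread: "
    r"failed to read (?P<length>\d+) bytes at offset (?P<offset>\d+)"
)


def parse_pread_error(line):
    """Return (path, offset, length) from a log line, or None if no match."""
    m = PREAD_RE.search(line)
    if m is None:
        return None
    return m.group("path"), int(m.group("offset")), int(m.group("length"))


def read_is_past_eof(path, offset, length):
    """True if the requested byte range extends beyond the file on disk."""
    return offset + length > os.path.getsize(path)
```

If `read_is_past_eof` returns True on the secondary, comparing the file size with the same file on the primary would show whether the copy was incomplete. If it returns False, the cause lies elsewhere (for example the disk itself).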

Answers

0

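If you can solve the first problem, the replication lag, you may be able to get everything running properly. Take a look at the Troubleshooting Replica Sets guide, which has some useful suggestions: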

Possible causes of replication lag include:

Network Latency

Check the network routes between the members of your set to ensure that there is no packet loss or network routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of packets network endpoints.

Disk Throughput

If the file system and disk device on the secondary is unable to flush data to disk as quickly as the primary, then the secondary will have difficulty keeping state. Disk-related issues are incredibly prevalent on multi-tenant systems, including virtualized instances, and can be transient if the system accesses disk devices over an IP network (as is the case with Amazon’s EBS system.)
Use system-level tools to assess disk status, including iostat or vmstat.

Concurrency

In some cases, long-running operations on the primary can block replication on secondaries. For best results, configure write concern to require confirmation of replication to secondaries. This prevents write operations from returning if replication cannot keep up with the write load.
Use the database profiler to see if there are slow queries or long-running operations that correspond to the incidences of lag.

Appropriate Write Concern

If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the primary, particularly with unacknowledged write concern, the secondaries will not be able to read the oplog fast enough to keep up with changes.
To prevent this, request write acknowledgement write concern after every 100, 1,000, or another interval to provide an opportunity for secondaries to catch up with the primary.
For more information see:
Write Concern
Replica Set Write Concern
Oplog Size
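
The advice under "Appropriate Write Concern" (request acknowledgement every 100 or 1,000 writes) could look roughly like the Python sketch below. It does not depend on any driver: `write` is a hypothetical caller-supplied function, and `bulk_load` and `ack_interval` are names invented for this example.

```python
from typing import Callable, Iterable


def bulk_load(docs: Iterable[dict],
              write: Callable[[dict, bool], None],
              ack_interval: int = 1000) -> int:
    """Write every document, requesting acknowledgement on every
    ack_interval-th write so that secondaries get a chance to catch up.

    `write(doc, acknowledged)` is supplied by the caller (hypothetical).
    With PyMongo it could choose between two handles created via
    collection.with_options(write_concern=WriteConcern(w=0)) and
    collection.with_options(write_concern=WriteConcern(w="majority")).
    Returns the number of documents written.
    """
    if ack_interval < 1:
        raise ValueError("ack_interval must be >= 1")
    count = 0
    for count, doc in enumerate(docs, start=1):
        # Only every ack_interval-th write waits for acknowledgement.
        write(doc, count % ack_interval == 0)
    return count
```

The last few writes may go out unacknowledged if the total is not a multiple of `ack_interval`, so a final acknowledged write at the end of the load is probably worth adding in real use.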

0

WiredTiger error (-31802) file:xxx.wt

This may be related to corrupted .wt files (for example WiredTiger.wt / WiredTiger.turtle), per the SERVER-31076 bug report.

Try running:

mongod --repair --dbpath /path/to/data/db 

Also make sure that all files under data/db have both read and write permissions.
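
A quick way to check the permissions is a small Python sketch like the one below. It assumes you run it as the same user that runs mongod, and `find_inaccessible` is just an illustrative name.

```python
import os


def find_inaccessible(dbpath):
    """Return the paths under dbpath that the current user cannot
    both read and write, sorted for stable output."""
    problems = []
    for root, dirs, files in os.walk(dbpath):
        for name in dirs + files:
            path = os.path.join(root, name)
            # os.access checks against the real uid/gid of this process.
            if not os.access(path, os.R_OK | os.W_OK):
                problems.append(path)
    return sorted(problems)
```

For example, `find_inaccessible("/path/to/data/db")` should return an empty list if every file and directory under the data directory is readable and writable.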