的CentOS 6.7 的PostgreSQL 9.5.3參考WAL包含無效頁面
我已經是上主備份複製數據庫服務器。
突然,備用服務器的postgresql進程停止在該日誌中。
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]WARNING: page 1671400 of relation base/16400/559613 is uninitialized
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]CONTEXT: xlog redo Heap2/VISIBLE: cutoff xid 1902107520
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]PANIC: WAL contains references to invalid pages
2016-07-14 18:14:19.544 JST [][5783e03b.3cdb][0][15579]CONTEXT: xlog redo Heap2/VISIBLE: cutoff xid 1902107520
2016-07-14 18:14:21.026 JST [][5783e038.3cd9][0][15577]LOG: startup process (PID 15579) was terminated by signal 6: Aborted
2016-07-14 18:14:21.026 JST [][5783e038.3cd9][0][15577]LOG: terminating any other active server processes
而且,主服務器的postgresql日誌沒什麼特別的。
但是,主服務器的/ var/log/messages如下所示。
Jul 14 05:38:44 host kernel: sbridge: HANDLING MCE MEMORY ERROR
Jul 14 05:38:44 host kernel: CPU 8: Machine Check Exception: 0 Bank 9: 8c000040000800c0
Jul 14 05:38:44 host kernel: TSC 0 ADDR 1f7dad7000 MISC 90004000400008c PROCESSOR 0:306e4 TIME 1468442324 SOCKET 1 APIC 20
Jul 14 05:38:44 host kernel: EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#0_DIMM#1": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=8 Err=0008:00c0 (ch=0), addr = 0x1f7dad7000 => socket=1, Channel=0(mask=1), rank=4
Jul 14 05:38:44 host kernel:
Jul 14 18:30:40 host kernel: sbridge: HANDLING MCE MEMORY ERROR
Jul 14 18:30:40 host kernel: CPU 8: Machine Check Exception: 0 Bank 9: 8c000040000800c0
Jul 14 18:30:40 host kernel: TSC 0 ADDR 1f7dad7000 MISC 90004000400008c PROCESSOR 0:306e4 TIME 1468488640 SOCKET 1 APIC 20
Jul 14 18:30:41 host kernel: EDAC MC1: CE row 1, channel 0, label "CPU_SrcID#1_Channel#0_DIMM#1": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=8 Err=0008:00c0 (ch=0), addr = 0x1f7dad7000 => socket=1, Channel=0(mask=1), rank=4
Jul 14 18:30:41 host kernel:
內存錯誤在1周前開始。所以,我懷疑內存錯誤導致postgresql的錯誤。
我的問題是在這裏。
1)內核可能導致內存錯誤導致postgresql的「WAL包含對無效頁面的引用」錯誤?
2)爲什麼主服務器的postgresql沒有任何日誌?
thx。
謝謝您的回覆。 我的postgresql的版本是9.5.3 – Jinil
我不認爲已經有一個已知的數據損壞錯誤,因爲複製。 –