我正在嘗試使用C/Cython擴展和multiprocessing
來查找Python/NumPy程序中令人討厭的內存泄漏的起源。調試Python/NumPy內存泄漏
每個子進程處理一張圖像列表,每個子進程通過Queue
向主進程發送輸出數組(通常大約200-300MB)。相當標準的地圖/縮小設置。
正如你可以想象的內存泄漏可以採取巨大的比例與這個巨大的陣列,並有多個進程愉快地超過20GB內存時,他們只需要5-6GB是...煩人。
我已經嘗試通過Valgrind運行Python的調試版本,並對內存泄漏的擴展進行了四重檢查,但沒有發現任何內容。
我檢查了我的Python代碼中對數組的懸掛引用,並且還使用NumPy的allocation tracker來檢查我的數組是否確實被釋放。他們是。
我做的最後一件事是在GDB連接到我的過程中的一個(這個壞男孩現在在27GB RAM和計數運行),傾倒堆到磁盤的很大一部分。讓我驚訝的是,傾銷文件充滿了零!大約7G的零值。
這是Python/NumPy中的標準內存分配行爲嗎?我是否錯過了一些顯而易見的東西,這可以解釋爲什我如何正確管理內存?
編輯:爲了記錄在案,我正在與NumPy 1.7.1和Python 2.7.3。
編輯2:我一直在監測過程與strace
,並且它似乎不斷增加的每個處理(使用系統調用brk()
)的破發點。
CPython實際上是否正確釋放內存? C擴展,NumPy數組呢?誰決定什麼時候調用brk()
,它是Python本身還是它的底層庫(libc
,...)?
以下是帶一個迭代(即一個輸入圖像集)的帶有註釋的樣本strace日誌。請注意,中斷點不斷增加,但我確信(使用objgraph
)Python解釋器中沒有有意義的NumPy數組。
# Reading .inf files with metadata
# Pretty small, no brk()
open("1_tif_all/AIR00642_1.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_2.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_3.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
open("1_tif_all/AIR00642_4.inf", O_RDONLY) = 6
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9387fff000
munmap(0x7f9387fff000, 4096) = 0
# This is where I'm starting the heavy processing
write(2, "[INFO/MapProcess-1] Shot 642: Da"..., 68) = 68
write(2, "[INFO/MapProcess-1] Shot 642: Vi"..., 103) = 103
write(2, "[INFO/MapProcess-1] Shot 642: Re"..., 66) = 66
# I'm opening a .tif image (752 x 480, 8-bit, 1 channel)
open("1_tif_all/AIR00642_3.tif", O_RDONLY) = 6
read(6, "II*\0JC\4\0", 8) = 8
mmap(NULL, 279600, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fbb000
munmap(0x7f9387fbb000, 279600) = 0
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 53) = 53
# Another .tif
open("1_tif_all/AIR00642_4.tif", O_RDONLY) = 6
read(6, "II*\0\266\374\3\0", 8) = 8
mmap(NULL, 261532, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fc0000
munmap(0x7f9387fc0000, 261532) = 0
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 51) = 51
brk(0x1aea97000) = 0x1aea97000
# Another .tif
open("1_tif_all/AIR00642_1.tif", O_RDONLY) = 6
read(6, "II*\0\220\253\4\0", 8) = 8
mmap(NULL, 306294, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fb5000
munmap(0x7f9387fb5000, 306294) = 0
brk(0x1af309000) = 0x1af309000
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 53) = 53
brk(0x1b03da000) = 0x1b03da000
# Another .tif
open("1_tif_all/AIR00642_2.tif", O_RDONLY) = 6
mmap(NULL, 345726, PROT_READ, MAP_SHARED, 6, 0) = 0x7f9387fab000
munmap(0x7f9387fab000, 345726) = 0
brk(0x1b0c42000) = 0x1b0c42000
write(2, "[INFO/MapProcess-1] Shot 642: Pr"..., 51) = 51
# I'm done reading my images
write(2, "[INFO/MapProcess-1] Shot 642: Fi"..., 72) = 72
# Allocating some more arrays for additional variables
# Increases by about 8M at a time
brk(0x1b1453000) = 0x1b1453000
brk(0x1b1c63000) = 0x1b1c63000
brk(0x1b2473000) = 0x1b2473000
brk(0x1b2c84000) = 0x1b2c84000
brk(0x1b3494000) = 0x1b3494000
brk(0x1b3ca5000) = 0x1b3ca5000
# What are these mmap calls doing here?
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9377df1000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9367be2000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93579d3000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93477c4000
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93375b5000
munmap(0x7f93579d3000, 270594048) = 0
munmap(0x7f93477c4000, 270594048) = 0
mmap(NULL, 270594048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f93579d3000
munmap(0x7f93375b5000, 270594048) = 0
mmap(NULL, 50737152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9354970000
munmap(0x7f9354970000, 50737152) = 0
brk(0x1b4cc6000) = 0x1b4cc6000
brk(0x1b5ce7000) = 0x1b5ce7000
EDIT 3:Is freeing handled differently for small/large numpy arrays?可能是相關的。我越來越確信,我只是分配了太多不能釋放到系統中的數組,因爲它實際上是標準行爲。將嘗試事先分配我的數組,並根據需要重用它們。
你用什麼來讀取圖像文件?我在過去使用PIL'Image'對象時遇到了內存泄漏問題 –
我正在使用PyLibTiff綁定。我解決了這個問題,看到我的答案! –