2012-05-18 62 views
1

可能重複:
Big Satellite Image Processing計算凍結

我試圖運行莫特活潑的雙顳上多光譜RapidEye圖像的Python伊馬德實施。它基本上計算兩幅圖像的典型相關性,然後將它們相減。我遇到的問題是圖像是5000 x 5000 x 5(帶)像素。當我在整個圖像上運行此我的電腦死機非常,我不得不將其關閉。

有沒有人有個想法可以讓python 崩潰這樣的電腦?例如,如果我爲每個樂隊選擇2999x2999像素,則一切運行良好。

8 GBS RAM,I7-2617M 1.5 1.5千兆赫,Windows7的64位。即時通訊使用64位版本的一切:python(2.7),numpy,scipy和gdal。

謝謝你提前!

def covw(dm,w): 
    # weighted covariance matrix and means 
    # from (transposed) data array  
     N = size(dm,0) 
     n = size(w) 
     sumw = sum(w) 
     ws = tile(w,(N,1)) 
     means = mat(sum(ws*dm,1)/sumw).T 
     means = tile(means,(1,n)) 
     dmc = dm - means 
     dmc = multiply(dmc,sqrt(ws)) 
     covmat = dmc*dmc.T/sumw 
     return (covmat,means) 

    def main(): 
    # ------------test---------------------------------------------------------------  
    if len(sys.argv) == 1:   
    (sys.argv).extend(['-p','[0,1,2,3,4]','- 
    d','[0,4999,0,4999]', 
'c://users//pythonxy//workspace//1uno.tif','c://users//pythonxy//workspace//2dos.tif']) 
    # -------------------------------------------------------------------------------   

options, args = getopt.getopt(sys.argv[1:],'hp:d:') 
pos = None 
dims = None    
for option, value in options: 
    if option == '-h': 
     print 'Usage: python %s [-p "bandPositions" -d "spatialDimensions"] 
     filename1 filename2' %sys.argv[0] 
     print '  bandPositions and spatialDimensions are quoted lists, 
     e.g., -p "[0,1,3]" -d "[0,400,0,400]" \n' 
     sys.exit(1) 
    elif option == '-p': 
     pos = eval(value) 
    elif option == '-d': 
     dims = eval(value) 
if len(args) != 2: 
    print 'Incorrect number of arguments' 
    print 'Usage: python %s [-p "bandspositions" -d "spatialdimensions"] 
    filename1 filename2 \n' %sys.argv[0] 
    sys.exit(1)          
gdal.AllRegister() 
fn1 = args[0] 
fn2 = args[1] 
path = os.path.dirname(fn1) 
basename1 = os.path.basename(fn1) 
root1, ext = os.path.splitext(basename1) 
basename2 = os.path.basename(fn2) 
outfn = path+'\\MAD['+basename1+'-'+basename2+']'+ext 
inDataset1 = gdal.Open(fn1,GA_ReadOnly)  
inDataset2 = gdal.Open(fn2,GA_ReadOnly) 
cols = inDataset1.RasterXSize 
rows = inDataset1.RasterYSize  
bands = inDataset1.RasterCount 
cols2 = inDataset2.RasterXSize 
rows2 = inDataset2.RasterYSize  
bands2 = inDataset2.RasterCount 
if (rows != rows2) or (cols != cols2) or (bands != bands2): 
    sys.stderr.write("Size mismatch") 
    sys.exit(1) 
if pos is None: 
    pos = range(bands) 
else: 
    bands = len(pos) 
if dims is None: 
    x0 = 0 
    y0 = 0 
else: 
    x0 = dims[0] 
    y0 = dims[2] 
    cols = dims[1]-dims[0] + 1 
    rows = dims[3]-dims[2] + 1      
# initial weights 
wt = ones(cols*rows)  
# data array (transposed so observations are columns) 
dm = zeros((2*bands,cols*rows),dtype='float32') 
k = 0 
for b in pos: 
    band1 = inDataset1.GetRasterBand(b+1) 
    band1 = band1.ReadAsArray(x0,y0,cols,rows).astype(float) 
    dm[k,:] = ravel(band1) 
    band2 = inDataset2.GetRasterBand(b+1) 
    band2 = band2.ReadAsArray(x0,y0,cols,rows).astype(float)   
    dm[bands+k,:] = ravel(band2) 
    k += 1 
print '=========================' 
print '  iMAD' 
print '=========================' 
print 'time1: '+fn1 
print 'time2: '+fn2 
print 'Delta [canonical correlations]' 
# iteration of MAD   
delta = 1.0 
oldrho = zeros(bands) 
iter = 0 
while (delta > 0.001) and (iter < 50):  
#  weighted covariance matrices and means 
    sigma,means = covw(dm,wt)   
    s11 = mat(sigma[0:bands,0:bands]) 
    s22 = mat(sigma[bands:,bands:]) 
    s12 = mat(sigma[0:bands,bands:]) 
    s21 = mat(sigma[bands:,0:bands]) 
#  solution of generalized eigenproblems 
    s22i = mat(linalg.inv(s22)) 
    lama,a = linalg.eig(s12*s22i*s21,s11) 
    s11i = mat(linalg.inv(s11))  
    lamb,b = linalg.eig(s21*s11i*s12,s22) 
#  sort a 
    idx = argsort(lama) 
    a = a[:,idx] 
#  sort b   
    idx = argsort(lamb) 
    b = b[:,idx]   
#  canonical correlations   
    rho = sqrt(real(lamb[idx]))    
#  normalize dispersions 
    a = mat(a) 
    tmp1 = a.T*s11*a 
    tmp2 = 1./sqrt(diag(tmp1)) 
    tmp3 = tile(tmp2,(bands,1)) 
    a = multiply(a,tmp3) 
    b = mat(b) 
    tmp1 = b.T*s22*b 
    tmp2 = 1./sqrt(diag(tmp1)) 
    tmp3 = tile(tmp2,(bands,1)) 
    b = multiply(b,tmp3) 
#  assure positive correlation 
    tmp = diag(a.T*s12*b) 
    b = b*diag(tmp/abs(tmp)) 
#  canonical and MAD variates 
    U = a.T*mat(dm[0:bands,:]-means[0:bands,:])  
    V = b.T*mat(dm[bands:,:]-means[bands:,:])   
    MAD = U-V 
#  new weights   
    var_mad = tile(mat(2*(1-rho)).T,(1,rows*cols))  
    chisqr = sum(multiply(MAD,MAD)/var_mad,0) 
    wt = 1-stats.chi2.cdf(chisqr,[bands]) 
#  continue iteration   
    delta = sum(abs(rho-oldrho)) 
    oldrho = rho 
    print delta 
    iter += 1 
# write results to disk 
driver = inDataset1.GetDriver()  
outDataset = driver.Create(outfn,cols,rows,bands+1,GDT_Float32) 
projection = inDataset1.GetProjection() 
geotransform = inDataset1.GetGeoTransform() 
if geotransform is not None: 
    gt = list(geotransform) 
    gt[0] = gt[0] + x0*gt[1] 
    gt[3] = gt[3] + y0*gt[5] 
    outDataset.SetGeoTransform(tuple(gt)) 
if projection is not None: 
    outDataset.SetProjection(projection)   
for k in range(bands):   
    outBand = outDataset.GetRasterBand(k+1) 
    outBand.WriteArray(resize(MAD[k,:],(rows,cols)),0,0) 
    outBand.FlushCache() 
outBand = outDataset.GetRasterBand(bands+1)  
outBand.WriteArray(resize(chisqr,(rows,cols)),0,0) 
outBand.FlushCache()  
outDataset = None 
inDataset1 = None 
inDataset2 = None 
print 'result written to: '+outfn 
print '---------------------------------'  

如果 == '主要': 的main()

+2

什麼是崩潰的症狀是什麼?你確定這不僅僅是顛簸交換? –

+0

對不起,但我不知道顛簸交換是什麼。我有點新鮮。一切都凍結。不能做任何事情,但關閉電腦。 – JEquihua

+0

你能更具體一點嗎?鼠標指針是否移動?當你敲擊鍵盤上的數字鍵時,鍵盤上的指示燈是否亮起?硬盤活動指示燈是否閃爍?你可以ALT + TAB嗎? –

回答

1

這聽起來像這種操作只是需要更多的內存比您的計算機可以提供。這是一個過於簡單化,但是當系統耗盡實際RAM使用的,有時它寫入似乎是被少使用到硬盤的內存部分,因此它可以使用真正的內存別的東西。硬盤的速度比主內存慢很多個數量級,所以當你的軟件需要部分已寫入磁盤的內存時,一切都會變得很慢。當這種情況發生在一個戲劇性的規模,軟件和操作系統的部分都在不斷嘗試使用的已換出內存條(寫入到磁盤),硬盤驅動器可以得到一個重要的鍛鍊試圖尋求來回,寫很多東西,閱讀很多東西,寫更多的東西等等。在這樣的情況下,系統可能變得非常沒有響應。你可以看看你的系統活動監視器是否真的發生了什麼(我忘記了它們在Windows上被調用的內容,但我知道它們在那裏;有些軟件會顯示你分配了多少內存,使用等,併爲你繪製一個不錯的圖表)。在觀看這些內容時,啓動程序並觀察內存分配率。

有可能以緩解內存使用此代碼,如果較少的事情在同一時間保存在內存中的一些方法,但我沒有看到隨便它們是什麼。你也可以添加更多的RAM給你的系統,希望它能解決這個問題。

+0

具有很多意義。有沒有一種可靠的方法來跟蹤哪部分代碼最終會死?我的意思是,不僅僅是試驗和錯誤,因爲我的計算機如此努力下去將永遠耗盡。除了在問題上拋出更多內存之外,還有其他想法來處理這些數據量嗎?問候,並感謝你... – JEquihua

+0

如果你真的看到交換thrash,沒有任何部分的代碼造成它。任何需要大量內存的事情都是部分原因。您可以嘗試讓代碼輸出週期性狀態消息來感受它到底有多遠,或者在Python探查器下運行代碼,這可以幫助您識別大部分內存。 –

+0

好的,找到它了!執行此操作時,所有事情都會到地獄:covmat = dmc * dmc.T/sumw其中dmc是一個10x25000000矩陣,sumw是一個標量。有任何想法嗎? – JEquihua