How is memory handled for np.ndarray in Cython?

For example, if I do this:

cdef np.ndarray[np.int64_t, ndim=1] my_array

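Where is my_array stored? I would have thought that, since I never told Cython to store it on the heap, it would be stored on the stack. After the experiment below, however, it seems to be stored on the heap, or at least to be memory managed efficiently in some way. How is memory managed with respect to my_array? Maybe I am missing something obvious, but I could not find any documentation on it.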

import numpy as np 
cimport cython 
cimport numpy as np 

from libc.stdlib cimport malloc, free 

def big_sum():
    # freezes up:
    # "a" is created on the stack
    # space on the stack is limited, so it runs out

    cdef int a[10000000]

    for i in range(10000000):
        a[i] = i

    # long long: the total (49999995000000) does not fit in a 32-bit int
    cdef long long my_sum
    my_sum = 0
    for i in range(10000000):
        my_sum += a[i]
    return my_sum

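def big_sum_malloc():
    # runs fine:
    # "a" is stored on the heap, no problem

    cdef int *a
    a = <int *>malloc(10000000*cython.sizeof(int))
    if a == NULL:
        # malloc returns NULL when the allocation fails
        raise MemoryError()

    for i in range(10000000):
        a[i] = i

    # long long: the total does not fit in a 32-bit int
    cdef long long my_sum
    my_sum = 0
    for i in range(10000000):
        my_sum += a[i]

    with nogil:
        free(a)
    return my_sum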

def big_numpy_array_sum():
    # runs fine:
    # I don't know what is going on here
    # but given that the following code runs fine,
    # it seems that entire array is NOT stored on the stack

    cdef np.ndarray[np.int64_t, ndim=1] my_array
    my_array = np.zeros(10000000, dtype=np.int64)

    for i in range(10000000):
        my_array[i] = i

    # long long: the total does not fit in a 32-bit int
    cdef long long my_sum
    my_sum = 0
    for i in range(10000000):
        my_sum += my_array[i]
    return my_sum

Source

2013-11-15 Akavall

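Why don't you look at the generated C file? Anyway, I believe Cython simply calls the numpy functions to allocate the array, and those call PyMalloc, which allocates on the heap. numpy does *not* manage its memory. It simply relies on Python allocation/deallocation. – Bakuriu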

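@Bakuriu, thank you for your comment. It makes sense and helps a lot, but do you know of a source that explains these steps in more detail? I tried looking at the generated C file, but it is more than 6000 lines of code and I could not make sense of it. – Akavall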

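It is almost certainly the heap: the size of the array is not known at declaration time, numpy usually works on large arrays, and the stack is limited. A stack optimization is technically possible, but an `ndarray` can be a view, so a reference to the data can escape the current scope. Implementing it on the heap is therefore simpler. If possible, use a MemoryView, or read http://docs.cython.org/src/tutorial/numpy.html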
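To make the point about views concrete, here is a minimal sketch in plain Python rather than Cython. The helper name `first_half` is made up for illustration. It shows a view outliving the function that created the array, which only works because the data does not live in that function's stack frame.

```python
import numpy as np

def first_half():
    # "local" is just a name in this function's scope; the ndarray object
    # and its data buffer are allocated outside the stack frame.
    local = np.arange(10, dtype=np.int64)
    # Basic slicing returns a view that shares local's buffer.
    return local[:5]

view = first_half()

# The view does not own its data; it holds a reference to the array that does.
print(view.flags.owndata)      # False
print(view.base is not None)   # True
print(int(view.sum()))         # 0 + 1 + 2 + 3 + 4
```

The array created inside `first_half` stays alive for as long as `view` refers to it through `view.base`.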

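Cython is not doing anything magical here. Numpy has a full C API, and that is what Cython is interacting with. Cython does not perform the memory management itself, and the memory in the numpy array is handled exactly the same way as when you use a numpy array from Python. @Bakuriu is right: this is definitely on the heap.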
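A quick way to check this from plain Python is sketched below. It assumes CPython, where `id()` happens to return the address of the object, and uses the same array size as the question. The variable names are only illustrative.

```python
import numpy as np

n = 10_000_000
arr = np.zeros(n, dtype=np.int64)

# CPython detail (an assumption of this sketch): id() is the address
# of the small ndarray object itself.
header_address = id(arr)
# Address of the first element of the separately allocated data buffer.
buffer_address = arr.ctypes.data

print(arr.nbytes)                         # bytes of element data: n * 8
print(arr.flags.owndata)                  # True: this array owns its buffer
print(header_address != buffer_address)   # the data is not stored inside the object
```

The typed variable in the Cython function only holds a pointer to the array object. The 80 MB of element data sits in a separate buffer, which is why the numpy version does not exhaust the stack the way `cdef int a[10000000]` does.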

Consider this Cython code:

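import numpy as np   # runtime module: needed for the np.zeros lookup
cimport numpy as np  # compile-time declarations for the typed buffer
def main():
    zeros = np.zeros
    cdef np.ndarray[dtype=np.double_t, ndim=1] array
    array = zeros(10000)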

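This gets translated to the following C in the body of the main function. I have trimmed the declarations and removed the error-handling code to make it easier to read.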

PyArrayObject *__pyx_v_array = 0;
PyObject *__pyx_v_zeros = NULL;
PyObject *__pyx_t_1 = NULL;
PyObject *__pyx_t_2 = NULL;

// zeros = np.zeros    # <<<<<<<<<<<<<<
// get the numpy module object
__pyx_t_1 = __Pyx_GetModuleGlobalName(__pyx_n_s__np);
// get the "zeros" function
__pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_t_1, __pyx_n_s__zeros);
__pyx_v_zeros = __pyx_t_2;

// array = zeros(10000)    # <<<<<<<<<<<<<<
// (__pyx_k_tuple_1 is a static global variable containing the literal python tuple
// (10000,) that was initialized during the __Pyx_InitCachedConstants function)
__pyx_t_2 = PyObject_Call(__pyx_v_zeros, ((PyObject *)__pyx_k_tuple_1), NULL);
__pyx_v_array = ((PyArrayObject *)__pyx_t_2);
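Read step by step, the generated C is roughly the C-API spelling of the plain Python below. This is only a sketch of the correspondence, not output from Cython, and the names `np_module` and `args` are made up for illustration.

```python
import numpy as np

# __Pyx_GetModuleGlobalName(...): look up the global name "np"
np_module = np
# __Pyx_PyObject_GetAttrStr(...): fetch the "zeros" attribute
zeros = getattr(np_module, "zeros")
# PyObject_Call(zeros, (10000,), NULL): positional-args tuple, no keyword arguments
args = (10000,)
array = zeros(*args)

# The cast to PyArrayObject * in the C code corresponds to the result
# being an ordinary numpy.ndarray.
print(type(array).__name__)
print(array.dtype, array.shape)
```

Nothing in this sequence allocates the element data directly. The allocation happens inside the call to `zeros`, that is, inside numpy.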

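If you look up the numpy C API documentation, you will see that PyArrayObject is the C-API struct for numpy's ndarray. The key point is that Cython does not explicitly handle the memory allocation at all. The same object-oriented design principles apply to the Python and numpy C APIs, and here memory management is the responsibility of PyArrayObject. The situation is no different from using a numpy array in Python.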
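The lifetime rule can be observed from Python with a weak reference. The sketch below assumes CPython, where an object is deallocated as soon as its reference count reaches zero. Other interpreters may free it later.

```python
import weakref
import numpy as np

arr = np.zeros(1000, dtype=np.float64)
alive = weakref.ref(arr)   # observes the array without keeping it alive

alias = arr                # a second reference to the same object
del arr                    # one reference remains, so nothing is freed
still_alive = alive() is not None

del alias                  # the last reference is gone
freed = alive() is None    # CPython assumption: deallocation is immediate

print(still_alive, freed)
```

A typed `cdef np.ndarray` variable in Cython takes part in the same reference counting, so the array and its buffer are released when the variable goes out of scope and no other references remain.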

Source

2013-11-27 12:18:30
