Computing conditional probabilities from joint pmfs in numpy is too slow. Ideas? (python-numpy)

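I have a joint probability mass function stored as an array with a shape such as (1,2,3,4,5,6), and I want to compute the probability table conditional on a value for some of these dimensions (i.e. export CPTs) for decision-making purposes.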
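To make the goal concrete, here is a minimal sketch of what conditioning a joint pmf array on one variable means. The shape (2, 3, 4), the random values, and the choice of axis and state are assumptions made for illustration only; they are not taken from the post.

```python
import numpy as np

# Hypothetical joint pmf over three variables with 2, 3 and 4 states.
rng = np.random.default_rng(0)
joint = rng.random((2, 3, 4))
joint /= joint.sum()  # normalise so all entries sum to 1

# Condition on the variable on axis 1 taking state 2.
# Slicing with 2:3 (rather than indexing with 2) keeps the axis, so the
# remaining variables stay on their original dimensions.
evidence = joint[:, 2:3, :]

# Renormalise to obtain P(axis 0, axis 2 | axis 1 = state 2).
cpt = evidence / evidence.sum()
```

The conditioned axis is kept with length 1, which matches the requirement further down that the original axis order be preserved.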

The code I have come up with so far is below (the input is a dictionary "vdict" of the form {'variable_1': value_1, 'variable_2': value_2, ...}):

for i in vdict:
    dim = self.invardict.index(i)    # index of the dimension this variable lives on
    val = self.valdict[i][vdict[i]]  # the value we want it to take
    d = d.swapaxes(0, dim)
    d = array([d[val]])              # <-- the line in question (bold in the original post)
    d = d.swapaxes(0, dim)

...

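So, what I currently do is:

1. I translate the variable to its corresponding dimension in the CPT.
2. I swap the zeroth axis with the axis I just found.
3. I replace the whole 0-axis with only the desired value.
4. I put the dimension back on its original axis.

Now, the problem is that in order to do the replacement (step 3 above), I have to (a) compute a sub-array and (b) put it in a list and convert that back into an array to get my new array.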

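The thing is, the marked line, d = array([d[val]]), creates new objects instead of just using references to the old ones. If d is very large (which happens to me) and the method that uses d is called many times (which, again, happens to me), the whole thing becomes very slow.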
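The copy-versus-reference distinction described above can be checked directly. The sketch below is one possible alternative, not something proposed in the post: it compares the swap/wrap/swap approach with plain slicing on the chosen axis and uses np.shares_memory to see which result still refers to the original buffer. The array contents and the values of dim and val are hypothetical.

```python
import numpy as np

d = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
dim, val = 1, 2  # hypothetical: fix the variable on axis 1 to state 2

# Approach from the loop above: swap, index, wrap in a new array, swap back.
swapped = d.swapaxes(0, dim)
copied = np.array([swapped[val]]).swapaxes(0, dim)

# Basic slicing on the chosen axis: same values and shape, but a view.
index = [slice(None)] * d.ndim
index[dim] = slice(val, val + 1)
view = d[tuple(index)]

same_values = np.array_equal(copied, view)
copied_shares = np.shares_memory(copied, d)  # False: np.array made a copy
view_shares = np.shares_memory(view, d)      # True: no data was copied
```

Because the slice keeps the selected axis with length 1, the axis order is unchanged and no variable-to-dimension bookkeeping needs updating.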

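So, does anyone have an idea for reducing this small piece of code so that it runs faster? Maybe something that would let me compute the conditionals in place.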

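Note: I have to keep the original axis order (or at least be sure how to update the variable-to-dimension dictionaries when an axis is removed). I would rather not resort to custom dtypes.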


2010-02-04 mhourdakis

OK, after playing around a bit with numpy's in-place array operations, I found the answer myself.

I changed the last 3 lines in the loop to:

d = conditionalize(d, dim, val)

where conditionalize is defined as:

import numpy as np

def conditionalize(arr, dim, val):
    arr = arr.swapaxes(dim, 0)                # bring the chosen axis to the front
    shape = arr.shape[1:]                     # shape of the sub-array without that axis
    count = int(np.prod(shape))               # number of elements in one such sub-array
    arr = arr.reshape(-1)                     # flatten (a view when the layout allows, else a copy)
    arr = arr[val * count:(val + 1) * count]  # take the needed elements
    arr = arr.reshape((1,) + shape)           # the desired sub-array shape, with a length-1 leading axis
    arr = arr.swapaxes(0, dim)                # restore the original axis order

    return arr

This cut my program's execution time from 15 minutes to 6 seconds. A huge gain.

I hope this helps anyone who runs into the same problem.
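As a quick sanity check, the sketch below runs conditionalize on a hypothetical joint pmf with the shape mentioned in the question and compares the result with np.take along the same axis. The function body is repeated only so the snippet runs on its own; the random data and the values of dim and val are assumptions for illustration.

```python
import numpy as np

def conditionalize(arr, dim, val):
    """Same logic as the function above, repeated so this runs alone."""
    arr = arr.swapaxes(dim, 0)
    shape = arr.shape[1:]
    count = int(np.prod(shape))
    arr = arr.reshape(-1)
    arr = arr[val * count:(val + 1) * count]
    arr = arr.reshape((1,) + shape)
    return arr.swapaxes(0, dim)

# Hypothetical joint pmf with the shape from the question.
d = np.random.default_rng(0).random((1, 2, 3, 4, 5, 6))
d /= d.sum()

dim, val = 3, 2
result = conditionalize(d, dim, val)

# Reference: keep axis `dim` but restrict it to the single index `val`.
expected = np.take(d, [val], axis=dim)
```

Both results keep all six axes, with the conditioned axis reduced to length 1, so existing variable-to-dimension lookups remain valid.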


2010-02-07 20:23:28 mhourdakis
