Summing numpy array elements/slices that share the same bin assignment

I have an array A, and the corresponding elements of an array bins contain each row's bin assignment. I want to construct an array S such that

S[0, :] = (A[(bins == 0), :]).sum(axis=0)

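This is fairly easy to do with np.stack and a list comprehension, but that seems overly complicated and not very readable. Is there a more general way to sum (or even apply some generic function to) slices of an array according to bin assignments? scipy.stats.binned_statistic is along the right lines, but it requires the bin assignments and the values the function is computed on to have the same shape (which is not the case here, since I am working with slices).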

For example, if

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])

and

bins = np.array([0, 1, 0, 2])

then the result should be

S = np.array([[10., 10., 10., 10.],
              [2., 3., 4., 5.],
              [8., 7., 6., 5.]])
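For reference, the np.stack plus list comprehension baseline mentioned in the question could look roughly like the sketch below. The helper name bin_sum_stack is hypothetical, and the sketch assumes the bin labels are non-negative integers running from 0 to bins.max().

```python
import numpy as np

def bin_sum_stack(A, bins):
    """Sum the rows of A that share a bin label (hypothetical helper).

    Assumes labels are non-negative integers; emits one output row per
    label in 0..bins.max(), with all-zero rows for unused labels.
    """
    n_bins = bins.max() + 1
    return np.stack([A[bins == b].sum(axis=0) for b in range(n_bins)])

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

S = bin_sum_stack(A, bins)
```

Swapping .sum(axis=0) for another reduction such as .max(axis=0) or .mean(axis=0) gives the "generic function" variant, at the cost of a Python-level loop over the bins. Note that max and mean are not defined for a label with no rows, so that variant needs every label to be used.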

Source

2017-04-11 DathosPachy

Here is an approach that uses matrix multiplication with np.dot:

(bins == np.arange(bins.max()+1)[:,None]).dot(A)

Sample run:

In [40]: A = np.array([[1., 2., 3., 4.], 
    ...:    [2., 3., 4., 5.], 
    ...:    [9., 8., 7., 6.], 
    ...:    [8., 7., 6., 5.]]) 

In [41]: bins = np.array([0, 1, 0, 2]) 

In [42]: (bins == np.arange(bins.max()+1)[:,None]).dot(A) 
Out[42]: 
array([[ 10., 10., 10., 10.], 
     [ 2., 3., 4., 5.], 
     [ 8., 7., 6., 5.]])
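To see why the matrix product works, it can help to look at the intermediate mask. The sketch below (the variable names mask and S are only illustrative) shows that the comparison builds a boolean indicator matrix of shape (number of bins, number of rows), so each output row is the sum of the rows of A whose bin matches.

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

# Broadcasting a (4,) array against a (3, 1) column gives a (3, 4) mask:
# mask[b, i] is True exactly when row i of A belongs to bin b.
mask = bins == np.arange(bins.max() + 1)[:, None]

# Each row of the product adds up the rows of A selected by that mask row.
S = mask.dot(A)
```

Because this relies on a dot product, it only covers sums (or weighted sums); other reductions such as a maximum need a different approach.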

Performance boost

A more efficient way to create the mask (bins == np.arange(bins.max()+1)[:,None]) would be:

mask = np.zeros((bins.max()+1, len(bins)), dtype=bool) 
mask[bins, np.arange(len(bins))] = 1
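As a quick sanity check, the following sketch builds the mask both ways and confirms they agree before taking the product. The names mask_broadcast and mask_indexed are illustrative only, and the sketch assumes the bin labels are non-negative integers.

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

# Mask built by broadcasting a comparison (as in the first snippet).
mask_broadcast = bins == np.arange(bins.max() + 1)[:, None]

# Mask built by allocating zeros and setting one True per column.
mask_indexed = np.zeros((bins.max() + 1, len(bins)), dtype=bool)
mask_indexed[bins, np.arange(len(bins))] = True

same = np.array_equal(mask_broadcast, mask_indexed)
S = mask_indexed.dot(A)
```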

Source

2017-04-11 17:21:12 Divakar

This is about 30% faster than @Psidom's solution, so I'm accepting this one. It is slightly more straightforward to me, but both work. – DathosPachy

You can use np.add.reduceat:

import numpy as np 
# index to sort the bins 
sort_index = bins.argsort() 

# indices where the array needs to be split at 
indices = np.concatenate(([0], np.where(np.diff(bins[sort_index]))[0] + 1)) 

# sum values where the bins are the same 
np.add.reduceat(A[sort_index], indices, axis=0) 

# array([[ 10., 10., 10., 10.], 
#  [ 2., 3., 4., 5.], 
#  [ 8., 7., 6., 5.]])
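Since reduceat is available on binary ufuncs in general, the same sort-and-split idea can be reused for other reductions, which touches on the "generic function" part of the question. The sketch below uses np.maximum as an example; the helper name bin_reduce is hypothetical. Unlike the mask approach, the result has one row per bin label that actually occurs, in sorted label order.

```python
import numpy as np

def bin_reduce(ufunc, A, bins):
    """Reduce rows of A per bin with a binary ufunc (hypothetical helper).

    Returns (labels, reduced), where labels are the sorted distinct bin
    labels and reduced[k] is the reduction over the rows in labels[k].
    """
    sort_index = bins.argsort()
    sorted_bins = bins[sort_index]
    # Start positions of each run of equal labels in the sorted order.
    starts = np.concatenate(([0], np.where(np.diff(sorted_bins))[0] + 1))
    return sorted_bins[starts], ufunc.reduceat(A[sort_index], starts, axis=0)

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

labels, sums = bin_reduce(np.add, A, bins)
_, maxima = bin_reduce(np.maximum, A, bins)
```

The order of rows within a bin after argsort is not guaranteed with the default sort, which does not matter for order-independent reductions such as add or maximum.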

Source

2017-04-11 17:07:57 Psidom
