Vectorization in numpy

I am trying to do the following operation in numpy without using a loop:

I have a matrix X of shape N × d and a vector y of size N. y holds integers ranging from 1 to K.
I want to get a matrix M of size K × d, where M[i, :] = np.mean(X[y == i, :], 0).

Can I achieve this without using a loop?

With a loop, it would look like this.

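import numpy as np

N = 3  # number of samples
d = 3  # number of features
K = 2  # number of classes

X = np.eye(N)
y = np.random.randint(1, K + 1, N)  # labels in 1..K
M = np.zeros((K, d))
for i in np.arange(0, K):
    line = X[y == i + 1, :]  # rows of X whose label is i+1
    if line.size == 0:
        M[i, :] = np.zeros(d)  # no samples with this label
    else:
        M[i, :] = np.mean(line, 0)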

Thanks in advance.

Source

2016-05-15 popuban

Is K == N? Are the values of y unique? –

It would be cool if you showed some code. – Bonifacio2

No and no. For example, if K = 2, X = np.eye(3), y = [1 2 1], I want M to be [[1/2 0 1/2], [0 1 0]]. – popuban

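This solves the problem, but it creates an intermediate K×N boolean matrix and does not use the built-in mean function. In some cases this may lead to worse performance or worse numerical stability. I let the class labels range from 0 to K-1 instead of 1 to K.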

import numpy as np

# Define constants
K, N, d = 10, 1000, 3

# Sample data
Y = np.random.randint(0, K - 1, N)  # K-1 to omit one class to test no-examples case
X = np.random.randn(N, d)

# Calculate means for each class, vectorized

# Map samples to labels by taking a logical "outer product"
mark = Y[None, :] == np.arange(0, K)[:, None]

# Count number of examples in each class
count = np.sum(mark, 1)

# Avoid divide by zero if no examples
count += count == 0

# Sum within each class and normalize
M = (np.dot(mark, X).T / count).T

print(M, np.shape(M), np.shape(mark))

Source

2016-05-15 11:42:18 MRule
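The answer above points out that it builds an intermediate K×N boolean matrix. As a rough sketch of one way to avoid that intermediate, the sums can be accumulated directly with np.add.at and the counts taken with np.bincount. This is a swapped-in technique, not the method from the answer; the helper name class_means_bincount is hypothetical, and labels are assumed to run from 0 to K-1 as in that answer.

```python
import numpy as np

def class_means_bincount(X, y, K):
    """Per-class row means of X for integer labels y in 0..K-1 (hypothetical helper)."""
    d = X.shape[1]
    # Number of samples per class; minlength keeps empty classes in the result
    counts = np.bincount(y, minlength=K)
    # Unbuffered accumulation: each row of X is added to the row of sums named by its label
    sums = np.zeros((K, d))
    np.add.at(sums, y, X)
    # Divide by the count, treating empty classes as count 1 so their rows stay zero
    return sums / np.maximum(counts, 1)[:, None]

X = np.eye(3)
y = np.array([0, 1, 0])  # the example from the comments, shifted to labels 0..K-1
M = class_means_bincount(X, y, K=2)
print(M)
```

Memory use here is O(K·d) for the output plus O(K) for the counts, rather than O(K·N) for the mask. Whether it is actually faster depends on the sizes involved, so it is worth timing on real data.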

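The code basically collects specific rows off X and adds them, for which NumPy has a built-in: np.add.reduceat. So, with that as the focus, the steps to solve it in a vectorized way could be as listed below: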

# Get sort indices of y 
sidx = y.argsort() 

# Collect rows off X based on their IDs so that they come in consecutive order 
Xr = X[sidx]

# Get unique row IDs, start positions of each unique ID 
# and their counts to be used for average calculations 
unq,startidx,counts = np.unique((y-1)[sidx],return_index=True,return_counts=True) 

# Add rows off Xr based on the slices signified by the start positions 
vals = np.true_divide(np.add.reduceat(Xr,startidx,axis=0),counts[:,None]) 

# Setup output array and set row summed values into it at unique IDs row positions 
out = np.zeros((K,d)) 
out[unq] = vals

Source

2016-05-15 11:45:02 Divakar
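To check the np.add.reduceat steps against the loop from the question, a minimal self-contained sketch could look like the following. The function names class_means_reduceat and class_means_loop are hypothetical wrappers added for illustration, and the random test data is an assumption; labels run from 1 to K as in the question, with class K deliberately left empty.

```python
import numpy as np

def class_means_reduceat(X, y, K):
    """Row means of X grouped by labels y in 1..K, via sort + np.add.reduceat."""
    # Sort the rows so that rows sharing a label are contiguous
    sidx = y.argsort()
    Xr = X[sidx]
    # Zero-based labels present, start index of each group, and group sizes
    unq, startidx, counts = np.unique(y[sidx] - 1, return_index=True, return_counts=True)
    # Sum each contiguous group and divide by its size; absent labels stay zero
    out = np.zeros((K, X.shape[1]))
    out[unq] = np.add.reduceat(Xr, startidx, axis=0) / counts[:, None]
    return out

def class_means_loop(X, y, K):
    """Reference implementation following the loop in the question."""
    M = np.zeros((K, X.shape[1]))
    for i in range(K):
        rows = X[y == i + 1]
        if rows.size:
            M[i] = rows.mean(axis=0)
    return M

rng = np.random.default_rng(0)
N, d, K = 50, 4, 6
X = rng.standard_normal((N, d))
y = rng.integers(1, K, size=N)  # upper bound is exclusive, so label K never occurs

M_fast = class_means_reduceat(X, y, K)
M_ref = class_means_loop(X, y, K)
print(np.allclose(M_fast, M_ref))
```

Sorting first is what makes reduceat applicable: it sums contiguous slices, so the rows of each class have to sit next to each other before the start indices mean anything.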
