Constructing a matrix in Python by iterating over arrays

from numpy import genfromtxt, linalg, array, append, hstack, vstack 

#Euclidean distance function 
def euclidean(v1, v2): 
    dist = linalg.norm(v1 - v2) 
    return dist 

#get the .csv files and eliminate heading and unused columns from test 
BMUs = genfromtxt('BMU3.csv', delimiter=',') 
data = genfromtxt('test.csv', delimiter=',') 
data = data[1:, :-2] 

i = 0 
for obj in data: 
    D = 0 
    for BMU in BMUs: 
        Dist = append(euclidean(obj, BMU[: -2]), BMU[-2:])
    D = hstack(Dist) 

Map = vstack(D) 

#iteration counter 
i += 1 
if not i % 1000: 
    print (i, ' of ', len(data)) 

print (Map)

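What I want to do is:

take one object from data
calculate its distance from a BMU (euclidean(obj, BMU[:-2]))
append the last two entries of the BMU array (BMU[-2:]) to that distance
create an array containing all the distances plus those BMU entries for one data object (D = hstack(Dist))
create a matrix of those arrays with length equal to the number of objects in data (Map = vstack(D))

The problem, or at least what I think is the problem, is that hstack and vstack take a tuple of arrays as input rather than a single array. I am trying to use them the way I would use list.append() on a list. Sadly I am a beginner and I don't know how to do it differently.

Any help would be awesome, thanks in advance :)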

Source

2016-12-12 Bradipo Eremita

First, a usage note.

Instead of:

from numpy import genfromtxt, linalg, array, append, hstack, vstack

use

import numpy as np 
.... 
data = np.genfromtxt(....) 
.... 
    np.hstack...

Second, stay away from np.append. It is too easy to misuse. Use np.concatenate instead, so that you get a clear picture of what it is doing.
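To make the difference concrete, here is a small sketch (the arrays a and b are made up for illustration). Without an axis argument, np.append flattens both inputs, whereas np.concatenate takes a sequence of arrays and joins them along an explicit axis:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # illustrative 2x2 array
b = np.array([[5, 6]])           # illustrative 1x2 array

# np.append without an axis flattens both arguments into a 1-D result.
flat = np.append(a, b)

# np.concatenate takes a tuple (or list) of arrays and an explicit axis,
# so the 2-D structure is preserved.
stacked = np.concatenate((a, b), axis=0)

print(flat.shape)     # (6,)
print(stacked.shape)  # (3, 2)
```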

For building something up incrementally, list append works better:

alist = [] 
for .... 
    alist.append(....) 
arr = np.array(alist)
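Applied to the loop in the question, that pattern might look like the sketch below. The small random arrays stand in for the real CSV files (their shapes are an assumption), and the helper name build_map is made up for illustration.

```python
import numpy as np

def build_map(data, bmus):
    """For each row of data, collect [distance, bmu[-2], bmu[-1]] for every BMU."""
    rows = []                                   # one entry per data object
    for obj in data:
        parts = []                              # one entry per BMU
        for bmu in bmus:
            dist = np.linalg.norm(obj - bmu[:-2])
            # concatenate takes a sequence: the distance plus the 2 extra columns
            parts.append(np.concatenate(([dist], bmu[-2:])))
        rows.append(np.hstack(parts))           # hstack receives the whole list at once
    return np.vstack(rows)                      # so does vstack

# Hypothetical stand-ins for test.csv and BMU3.csv
rng = np.random.default_rng(0)
data = rng.random((4, 5))     # 4 objects, 5 features
bmus = rng.random((3, 7))     # 3 BMUs: 5 features + 2 extra columns

result = build_map(data, bmus)
print(result.shape)           # (4, 9): 3 values for each of the 3 BMUs
```

The key point is that hstack and vstack are each called once, on a complete list, rather than once per item inside the loop.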

==================

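Without sample arrays (or at least their shapes) I can only guess, but (n,2) arrays sound reasonable. To get the distance between every pair of "points", I can collect the values in a nested list comprehension: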

In [121]: data = np.arange(6).reshape(3,2) 
In [122]: [[euclidean(d,b) for b in data] for d in data] 
Out[122]: 
[[0.0, 2.8284271247461903, 5.6568542494923806], 
[2.8284271247461903, 0.0, 2.8284271247461903], 
[5.6568542494923806, 2.8284271247461903, 0.0]]

and make an array from that:

In [123]: np.array([[euclidean(d,b) for b in data] for d in data]) 
Out[123]: 
array([[ 0.  , 2.82842712, 5.65685425], 
     [ 2.82842712, 0.  , 2.82842712], 
     [ 5.65685425, 2.82842712, 0.  ]])

The equivalent with nested loops:

alist = [] 
for d in data: 
    sublist=[] 
    for b in data: 
        sublist.append(euclidean(d,b))
    alist.append(sublist) 
arr = np.array(alist)

There is a way of doing this without loops, but let's first make sure the basic Python looping approach works.

===============

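If I want the difference (along the last axis) between every element (row) of data and every element of bmu (or, here, data again), I can use array broadcasting. The result is a (3,3,2) array: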

In [130]: data[None,:,:]-data[:,None,:] 
Out[130]: 
array([[[ 0, 0], 
     [ 2, 2], 
     [ 4, 4]], 

     [[-2, -2], 
     [ 0, 0], 
     [ 2, 2]], 

     [[-4, -4], 
     [-2, -2], 
     [ 0, 0]]])

norm can handle arrays with more than two dimensions, and it accepts an axis parameter.

In [132]: np.linalg.norm(data[None,:,:]-data[:,None,:],axis=-1) 
Out[132]: 
array([[ 0.  , 2.82842712, 5.65685425], 
     [ 2.82842712, 0.  , 2.82842712], 
     [ 5.65685425, 2.82842712, 0.  ]])
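The same broadcasting idea should carry over to two different arrays. The sketch below assumes data has shape (n, 5) and bmus has shape (m, 7), with the last two BMU columns left out of the distance as in the question. The random arrays are placeholders, not the real files.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((4, 5))    # placeholder for the data rows
bmus = rng.random((3, 7))    # placeholder BMUs: 5 features + 2 extra columns

# (4, 1, 5) - (1, 3, 5) broadcasts to (4, 3, 5);
# taking the norm over the last axis leaves a (4, 3) array.
diff = data[:, None, :] - bmus[None, :, :-2]
dist = np.linalg.norm(diff, axis=-1)

print(dist.shape)   # (4, 3); dist[i, j] is the distance from data[i] to bmus[j]
```

Note that the intermediate diff array holds n * m * 5 values, so for large inputs it may be worth processing data in chunks.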

Source

2016-12-12 20:02:10 hpaulj

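Thank you very much, I will wait for your suggestions :) –

What are the 'shape' (and 'dtype') of 'BMU' and 'data'? It is easier to copy and test code when there are samples. Otherwise I have to guess and make up sample arrays (like 'data = np.arange(24).reshape(12,2)'). – hpaulj

BMUs.shape is (243,7) and data.shape is (19219,5) –

Thanks to your help I managed to implement the pseudocode. Here is the final program: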

import numpy as np 


def euclidean(v1, v2): 
    dist = np.linalg.norm(v1 - v2) 
    return dist 


def makeKNN(dataSet, BMUSet, k, fileOut, test=False): 
    # take input files 
    BMUs = np.genfromtxt(BMUSet, delimiter=',') 
    data = np.genfromtxt(dataSet, delimiter=',') 

    # drop the header row; in test mode also drop the last two columns 
    final = data[1:, :] 
    if not test: 
        data = data[1:, :] 
    else: 
        data = data[1:, :-2] 

    # Calculate all the distances between data and BMUs, then reorder the BMUs by distance 

    dist = np.array([[euclidean(d, b[:-2]) for b in BMUs] for d in data]) 
    BMU_K = np.array([BMUs[np.argsort(d)] for d in dist]) 

    # mean of column 5 over the closest k BMUs 
    Z = np.array([[np.sum(b[:k].T[5])/k] for b in BMU_K]) 

    # error propagation 
    Z_err = np.array([[np.sqrt(np.sum(np.power(b[:k].T[5], 2)))] for b in BMU_K]) 

    # Adding z estimates and errors to the data 
    final = np.concatenate((final, Z, Z_err), axis=1) 

    # print output file 
    np.savetxt(fileOut, final, delimiter=',') 
    print('So long, and thanks for all the fish')
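The nested comprehensions in the program above could probably be replaced by broadcasting and a single argsort. The following is only a sketch under assumptions: the function name knn_estimate and the value_col parameter are hypothetical, file reading and writing are left out, the random arrays are placeholders, and the formulas for the estimate and its error are copied unchanged from the program above.

```python
import numpy as np

def knn_estimate(data, bmus, k, value_col=5):
    """Mean and root-sum-of-squares of one BMU column over the k nearest BMUs."""
    # pairwise distances on the feature columns, shape (n, m)
    dist = np.linalg.norm(data[:, None, :] - bmus[None, :, :-2], axis=-1)
    # indices of the k closest BMUs for each data row, shape (n, k)
    nearest = np.argsort(dist, axis=1)[:, :k]
    # chosen column of those BMUs, shape (n, k)
    values = bmus[nearest, value_col]
    z = values.mean(axis=1, keepdims=True)                      # same as sum / k
    z_err = np.sqrt((values ** 2).sum(axis=1, keepdims=True))   # same formula as above
    return z, z_err

# Placeholder inputs: 6 data rows with 5 features, 4 BMUs with 5 + 2 columns
rng = np.random.default_rng(1)
data = rng.random((6, 5))
bmus = rng.random((4, 7))

z, z_err = knn_estimate(data, bmus, k=2)
print(z.shape, z_err.shape)   # (6, 1) (6, 1)
```

Both results keep a trailing axis of length 1, so they can be passed to np.concatenate with axis=1 just like Z and Z_err in the program above.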

Thank you very much. I hope this code will help someone else in the future :)

Source

2016-12-13 12:53:15
