NumPy unique 2D subarrays

I have a 3D numpy array and I want to keep only the unique 2D subarrays.

Input:

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]]

Output:

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]

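I tried converting the subarrays to byte strings (the tostring() method) and then using np.unique, but when the list of byte strings is converted to a numpy array, the trailing \x00 bytes are stripped, so I can't convert them back with np.fromstring().

Example: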

import numpy as np 
a = np.array([[[1,2],[3,4]],[[5,6],[7,8]],[[9,10],[11,12]],[[5,6],[7,8]]]) 
b = [x.tostring() for x in a] 
print(b) 
c = np.array(b) 
print(c) 
print(np.array([np.fromstring(x) for x in c]))

Output:

[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00', b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00'] 
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04' 
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08' 
b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c' 
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08'] 

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-86-6772b096689f> in <module>() 
     5 c = np.array(b) 
     6 print(c) 
----> 7 print(np.array([np.fromstring(x) for x in c])) 

<ipython-input-86-6772b096689f> in <listcomp>(.0) 
     5 c = np.array(b) 
     6 print(c) 
----> 7 print(np.array([np.fromstring(x) for x in c])) 

ValueError: string size must be a multiple of element size

I also tried using a view, but I don't really know how to use it. Can you help me?

Source

2016-11-18 Peťan
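A note on the byte-string attempt in the question: the truncation appears to come from np.array(b) building a fixed-width bytes ('S') array, which drops trailing null bytes. A minimal sketch that avoids this, assuming it is acceptable to keep the serialized subarrays as ordinary Python bytes objects instead of putting them in a numpy array (the names keys, unique_keys and result are only illustrative):

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# Keep the serialized subarrays as plain Python bytes objects;
# np.array(list_of_bytes) would create an 'S' dtype that strips trailing \x00.
keys = [x.tobytes() for x in a]

# dict.fromkeys drops duplicates while preserving first-occurrence order.
unique_keys = list(dict.fromkeys(keys))

# Pass the original dtype explicitly; np.frombuffer defaults to float64.
result = np.array([np.frombuffer(k, dtype=a.dtype).reshape(a.shape[1:])
                   for k in unique_keys])
print(result)
```

Note that the original np.fromstring(x) call also omitted the dtype, so even untruncated strings would have been decoded as floats rather than integers.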

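This is a [new feature](https://github.com/numpy/numpy/pull/7742) in the upcoming 1.13, available as `np.unique(a, axis=0)`. Since 1.13 hasn't been released yet, you can simply copy the new implementation and use it in your code – Eric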
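For readers on NumPy 1.13 or later, a minimal sketch of the axis argument mentioned in the comment above, using the array from the question (the variable name unique_subarrays is only illustrative):

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# axis=0 treats each 2D subarray along the first axis as one item to compare.
unique_subarrays = np.unique(a, axis=0)
print(unique_subarrays)
```

The result comes back sorted rather than in order of first appearance, which happens to coincide for this input.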

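Building on @Jaime's post, and adapting it to our case of finding unique 2D subarrays, I came up with this solution, which mainly adds a reshape step before the view -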

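import numpy as np

def unique2D_subarray(a):
    # View each flattened subarray as one opaque void element so np.unique compares it as a whole.
    dtype1 = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    b = np.ascontiguousarray(a.reshape(a.shape[0], -1)).view(dtype1)
    # return_index gives the position of the first occurrence of each unique subarray.
    return a[np.unique(b, return_index=True)[1]]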

Sample run -

In [62]: a 
Out[62]: 
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]],

       [[ 5,  6],
        [ 7,  8]]])

In [63]: unique2D_subarray(a) 
Out[63]: 
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])

Source

2016-11-18 14:11:52 Divakar

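Thanks for your answer! So, if I understand correctly, the dtype specifies a byte sequence (not any particular type) of size a.dtype.itemsize * the number of elements in a subarray? And the contiguous array is needed because the dtype is specified as a byte sequence? Sorry for repeating the question, but I couldn't work it out from @Jaime's post. –

@Peťan You're right about the first part. As for the second part, about needing "contiguous", I'm not too sure either. It might be worth posting a comment there. If I had to guess, I'd say your second part sounds logical and that the two parts are related. – Divakar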

One solution is to use a set to keep track of which subarrays you have already seen:

import numpy as np

seen = set()
new_a = []

for j in a:
    # Tuples are hashable, so a flattened subarray can be stored in a set.
    f = tuple(j.flatten())
    if f not in seen:
        new_a.append(j)
        seen.add(f)

print(np.array(new_a))

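Or using numpy only:

# Note: np.unique(a) deduplicates scalar values, so this only works
# when distinct subarrays share no values, as in this example.
unique = np.unique(a)
print(unique.reshape((len(unique) // 4, 2, 2)))

>>> [[[ 1  2]
      [ 3  4]]

     [[ 5  6]
      [ 7  8]]

     [[ 9 10]
      [11 12]]]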

Source

2016-11-18 10:45:54 kezzos

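So, [this answer](http://stackoverflow.com/a/22941699/102441) from the duplicate mentioned in the comment above – Eric

You lose the order of the subarrays with that answer – kezzos

It's true that the order would be lost if you simply copied the array into a set and then back into an array, but done the way the code above does it, the order is not lost. –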
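On the ordering point raised in these comments: a sketch of one way to keep first-occurrence order with a vectorized call, assuming NumPy 1.13 or later for the axis argument. The input here is deliberately reordered so that sorted order and first-seen order differ, and the name first_seen is only illustrative:

```python
import numpy as np

a = np.array([[[9, 10], [11, 12]],
              [[1, 2], [3, 4]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# return_index reports where each unique subarray first appears in a.
_, idx = np.unique(a, axis=0, return_index=True)

# Sorting those positions restores the order in which subarrays were first seen.
first_seen = a[np.sort(idx)]
print(first_seen)
```

Without the np.sort step, indexing with idx directly would return the subarrays in sorted order instead.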

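The numpy_indexed package (disclaimer: I am its author) is designed specifically to perform operations like these in an efficient and vectorized way: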

import numpy_indexed as npi 
npi.unique(a)

Source

2016-11-18 10:53:33
