Enumerating over a proper subset of the dimensions of a numpy.ndarray?

(In this post, let np be shorthand for numpy.)

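Suppose a is an (n + k)-dimensional np.ndarray object, for some integers n > 1 and k > 1. (IOW, n + k > 3 is the value of a.ndim.) I want to enumerate a over its first n dimensions; this means that, at each iteration, the enumerator/iterator produces a pair whose first element is a tuple ii of n indices, and whose second element is the k-dimensional sub-ndarray at a[ii].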

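Granted, it is not hard to write a function that does this (in fact, I give an example of such a function below), but I want to know this:

Does numpy provide any special syntax or functions for carrying out this kind of "partial" enumeration?

(Normally, when I want to iterate over a multidimensional np.ndarray object, I use np.ndenumerate, but it wouldn't help here because, as far as I can tell, np.ndenumerate iterates over all n + k dimensions.)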

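Assuming the answer to the question above is yes, there is this follow-up:

What about the case where the n dimensions to iterate over are not contiguous?

(In this case, the first element of the pair returned at each iteration by the enumerator/iterator would be a tuple of r > n elements, some of which would be a special value denoting "all", e.g. slice(None); the second element of this pair would still be an ndarray of length k.)

Thanks!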

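The code below will hopefully clarify the problem specification. The function partial_enumerate does what I would like to achieve with any special numpy constructs available for the purpose. Following the definition of partial_enumerate is a simple example for the case n = k = 2.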

import numpy as np 
import itertools as it 
def partial_enumerate(nda, n): 
    """Enumerate over the first N dimensions of the numpy.ndarray NDA. 

    Returns an iterator of pairs. The first element of each pair is a tuple 
    of N integers, corresponding to a partial index I into NDA; the second element 
    is the subarray of NDA at I. 
    """ 

    # ERROR CHECKING & HANDLING OMITTED 
    for ii in it.product(*[range(d) for d in nda.shape[:n]]): 
        yield ii, nda[ii]

a = np.zeros((2, 3, 4, 5)) 
for ii, vv in partial_enumerate(a, 2): 
    print(ii, vv.shape)

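Each line of the output is a "pair of tuples", where the first tuple represents a partial set of coordinates in a, and the second represents the shape of the k-dimensional subarray of a at those partial coordinates (the value of this second tuple is the same for all lines, as expected from the regularity of the array):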

(0, 0) (4, 5) 
(0, 1) (4, 5) 
(0, 2) (4, 5) 
(1, 0) (4, 5) 
(1, 1) (4, 5) 
(1, 2) (4, 5)

In contrast, iterating over np.ndenumerate(a) in this case would result in a.size iterations, each visiting an individual cell of a.
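To make that contrast concrete, here is a small check on the same example array. The variable names are only illustrative and not part of the original post.

```python
import numpy as np

a = np.zeros((2, 3, 4, 5))

# np.ndenumerate visits every individual cell: one (full index, value) pair per element.
full_items = list(np.ndenumerate(a))
full_count = len(full_items)

# The partial enumeration wanted here visits one subarray per index
# into the first two axes only.
partial_count = int(np.prod(a.shape[:2]))

print(full_count, partial_count)  # 120 6
```

Each item from np.ndenumerate carries a full index of length a.ndim, so there is no subarray left to yield.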


2012-03-05 kjo

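"What about the case where the dimensions to iterate over are not contiguous?" I'm not sure this is possible with pure numpy arrays. Unlike standard Python lists, the entire array is stored in a contiguous block of memory. As an example, do you mean a 2D array for which the result of '[row.shape for row in a]' is '[1,2,1,3,...]'? – Hooked 2012-03-05 16:29:04

"In this case, the first element of the pair returned at each iteration by the enumerator/iterator would be a tuple of r > n elements, some of which would be a special value denoting "all", e.g. slice(None); the second element of this pair would still be an ndarray of length k." This doesn't make sense to me, because it would produce an inconsistent idiom. Instead of consisting only of single-item indices, the index would consist of single-item indices and index sequences. So it would no longer be a bijection between indices and subarrays. One index could refer to many subarrays. – senderle 2012-03-07 17:38:03

I believe this would result in ndarrays of length k + j, where j is the number of sequences (i.e. non-single-item indices) in the index tuple. – senderle 2012-03-07 17:38:41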
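For what it's worth, one possible reading of the non-contiguous case can be sketched as follows. enumerate_axes is a hypothetical helper, not a numpy function, and it assumes that axes is a non-empty collection of distinct, non-negative axis numbers. Positions that are not iterated over are filled with slice(None), as the question suggests.

```python
import itertools

import numpy as np


def enumerate_axes(nda, axes):
    """Yield (index, subarray) pairs, iterating only over the given axes.

    Hypothetical helper: entries of `index` at axes that are not iterated
    over hold slice(None), meaning "all" along that axis.
    """
    axes = sorted(axes)
    length = axes[-1] + 1  # the index only needs to reach the last iterated axis
    ranges = (range(nda.shape[ax]) for ax in axes)
    for combo in itertools.product(*ranges):
        index = [slice(None)] * length
        for ax, i in zip(axes, combo):
            index[ax] = i
        index = tuple(index)
        yield index, nda[index]


a = np.zeros((2, 3, 4, 5))
results = [(idx, sub.shape) for idx, sub in enumerate_axes(a, (0, 2))]
```

With this array and axes (0, 2), there are 2 * 4 = 8 iterations, and each subarray keeps the two axes that were not iterated over, so its shape is (3, 5).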

I think you are looking for the ndindex function in numpy. Just take a slice of the shape for the subarray you want:

from numpy import * 

# Create the array 
A = zeros((2,3,4,5)) 

# Identify the subindex you're looking for 
idx = ndindex(A.shape[:2]) 

# Iterate through the array 
[(x, A[x].shape) for x in idx]

This gives the expected result:

[((0, 0), (4, 5)), ((0, 1), (4, 5)), ((0, 2), (4, 5)), ((1, 0), (4, 5)), ((1, 1), (4, 5)), ((1, 2), (4, 5))]
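If a generator with the same interface as the question's partial_enumerate is preferred, the ndindex approach can be wrapped like this. This is a minimal sketch: the function name and signature are borrowed from the question, and error checking is omitted just as it is there.

```python
import numpy as np


def partial_enumerate(nda, n):
    """Yield (index tuple, subarray) pairs over the first n axes of nda."""
    for ii in np.ndindex(*nda.shape[:n]):
        yield ii, nda[ii]


a = np.zeros((2, 3, 4, 5))
pairs = [(ii, vv.shape) for ii, vv in partial_enumerate(a, 2)]
print(pairs[0], pairs[-1])  # ((0, 0), (4, 5)) ((1, 2), (4, 5))
```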


2012-03-05 16:22:41 Hooked

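You can use numpy's broadcasting rules to generate a cartesian product. The numpy.ix_ function creates a list of the appropriate arrays. It is equivalent to the following: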

>>> import numpy
>>> def pseudo_ix_gen(*arrays): 
...  base_shape = [1 for arr in arrays] 
...  for dim, arr in enumerate(arrays): 
...   shape = base_shape[:] 
...   shape[dim] = len(arr) 
...   yield numpy.array(arr).reshape(shape) 
... 
>>> def pseudo_ix_(*arrays): 
...  return list(pseudo_ix_gen(*arrays))

Or, more succinctly:

>>> def pseudo_ix_(*arrays): 
...  shapes = numpy.diagflat([len(a) - 1 for a in arrays]) + 1 
...  return [numpy.array(a).reshape(s) for a, s in zip(arrays, shapes)]

The result is a list of broadcastable arrays:

>>> numpy.ix_(*[[2, 4], [1, 3], [0, 2]]) 
[array([[[2]], 

     [[4]]]), array([[[1], 
     [3]]]), array([[[0, 2]]])]

Compare this to the result of numpy.ogrid:

>>> numpy.ogrid[0:2, 0:2, 0:2] 
[array([[[0]], 

     [[1]]]), array([[[0], 
     [1]]]), array([[[0, 1]]])]

As you can see, it is the same, but numpy.ix_ allows you to use non-consecutive indices. Now, when we apply numpy's broadcasting rules, we get a cartesian product:

>>> list(numpy.broadcast(*numpy.ix_(*[[2, 4], [1, 3], [0, 2]]))) 
[(2, 1, 0), (2, 1, 2), (2, 3, 0), (2, 3, 2), 
(4, 1, 0), (4, 1, 2), (4, 3, 0), (4, 3, 2)]
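As a quick sanity check, the tuples produced by broadcasting can be compared with itertools.product over the same index lists. The names lists, via_broadcast and via_product are only illustrative.

```python
import itertools

import numpy as np

lists = [[2, 4], [1, 3], [0, 2]]

# Iterating a broadcast object yields one tuple per position, in C order.
via_broadcast = [tuple(int(v) for v in t) for t in np.broadcast(*np.ix_(*lists))]

# itertools.product yields the same tuples in the same order.
via_product = list(itertools.product(*lists))

print(via_broadcast == via_product)  # True
```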

If, instead of passing the result of numpy.ix_ to numpy.broadcast, we use it to index an array, we get this:

>>> a = numpy.arange(6 ** 4).reshape((6, 6, 6, 6)) 
>>> a[numpy.ix_(*[[2, 4], [1, 3], [0, 2]])] 
array([[[[468, 469, 470, 471, 472, 473], 
     [480, 481, 482, 483, 484, 485]], 

     [[540, 541, 542, 543, 544, 545], 
     [552, 553, 554, 555, 556, 557]]], 


     [[[900, 901, 902, 903, 904, 905], 
     [912, 913, 914, 915, 916, 917]], 

     [[972, 973, 974, 975, 976, 977], 
     [984, 985, 986, 987, 988, 989]]]])
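To tie this back to the question, each entry along the leading axes of that result should be the subarray of a at one combination of the chosen indices. Here is a small check under that assumption; the names block and lists are only illustrative.

```python
import itertools

import numpy as np

a = np.arange(6 ** 4).reshape((6, 6, 6, 6))
lists = [[2, 4], [1, 3], [0, 2]]

# Indexing with the ix_ arrays selects every combination at once.
block = a[np.ix_(*lists)]

# Position pos in the leading axes of block pairs up with one index
# combination ii from the cartesian product of the lists.
positions = np.ndindex(*block.shape[:3])
combos = itertools.product(*lists)
matches = [np.array_equal(block[pos], a[ii]) for pos, ii in zip(positions, combos)]

print(block.shape, all(matches))  # (2, 2, 2, 6) True
```

The last axis is left untouched because only three index arrays are supplied for a four-dimensional array.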

However, a caveat. Broadcastable arrays are useful for indexing arrays, but if you literally want to enumerate the values, you may be better off using itertools.product:

>>> %timeit list(itertools.product(range(5), repeat=5)) 
10000 loops, best of 3: 196 us per loop 
>>> %timeit list(numpy.broadcast(*numpy.ix_(*([range(5)] * 5)))) 
100 loops, best of 3: 2.74 ms per loop

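So if you are using a for loop anyway, itertools.product will probably be faster. Still, you can use the methods above to get some similar data structures in pure numpy: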

>>> pgrid_idx = numpy.ix_(*[[2, 4], [1, 3], [0, 2]]) 
>>> sub_indices = numpy.rec.fromarrays(numpy.indices((6, 6, 6))) 
>>> a[pgrid_idx].reshape((8, 6)) 
array([[468, 469, 470, 471, 472, 473], 
     [480, 481, 482, 483, 484, 485], 
     [540, 541, 542, 543, 544, 545], 
     [552, 553, 554, 555, 556, 557], 
     [900, 901, 902, 903, 904, 905], 
     [912, 913, 914, 915, 916, 917], 
     [972, 973, 974, 975, 976, 977], 
     [984, 985, 986, 987, 988, 989]]) 
>>> sub_indices[pgrid_idx].reshape((8,)) 
rec.array([(2, 1, 0), (2, 1, 2), (2, 3, 0), (2, 3, 2), 
      (4, 1, 0), (4, 1, 2), (4, 3, 0), (4, 3, 2)], 
      dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<i8')])


2012-03-05 17:32:34 senderle

I believe your 'product_grid' is the same as 'numpy.ix_'. – 2012-03-06 21:32:56

@Bago, nothing like reinventing the wheel, is there? – senderle 2012-03-06 22:27:46
