Replace a group of values with the group aggregate using numpy/pandas

I have an image stored as a numpy array X:

array([[ 0.01176471, 0.49019608, 0.01568627], 
     [ 0.01176471, 0.49019608, 0.01568627], 
     [ 0.00784314, 0.49411765, 0.00784314], 
     ..., 
     [ 0.03921569, 0.08235294, 0.10588235], 
     [ 0.09411765, 0.14901961, 0.18431373], 
     [ 0.10196078, 0.15294118, 0.21568627]])

I have run a clustering algorithm on that array to find similar colors, and I have another array Y holding the class of each pixel:

array([19, 19, 19, ..., 37, 20, 20], dtype=int32)

What is the fastest, prettiest and most pythonic way to replace the color of every pixel in a cluster with the mean color of that cluster?

I came up with the following code:

import pandas as pd 
import numpy as np 
<...> 
df = pd.DataFrame.from_records(X, columns=list('rgb')) 
df['cls'] = Y 
mean_colors = df.groupby('cls').mean().values 
# as suggested in comments below 
# for cls in range(len(mean_colors)): 
# X[Y==cls] = mean_colors[cls] 
X = mean_colors[Y]

Is there a way to do this with pandas only, or with numpy only?

Source

2016-03-01 Direvius

Assuming 'Y' contains all the labels, how about simply indexing with 'mean_colors[Y]'? – Divakar

With your example, your code doesn't work: 'Y' holds 3 distinct values, and comparing 'Y == cls' matches nothing because the index has no ... (cls only takes the values 0, 1, 2) –

@Divakar Yes, that's pretty, thank you! – Direvius

Assuming all the labels are present in Y, you can index mean_colors directly with Y (NumPy calls this integer array indexing) -

mean_colors[Y]
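To make the assumption concrete: groupby('cls').mean() returns one row per label, sorted by label, so row k is the mean of cluster k only when the labels are exactly 0..K-1. The sketch below uses made-up data with a gap in the labels; the remapping through np.unique(..., return_inverse=True) is an addition for that case and is not part of the original code.

```python
import numpy as np
import pandas as pd

# Toy data (made up): 5 pixels, labels with a gap (no label 1).
X = np.array([[0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.2, 0.4, 0.6],
              [0.4, 0.6, 0.8],
              [0.5, 0.5, 0.5]])
Y = np.array([0, 0, 2, 2, 3], dtype=np.int32)

df = pd.DataFrame(X, columns=list('rgb'))
df['cls'] = Y
mean_colors = df.groupby('cls').mean().values  # 3 rows, for labels 0, 2, 3

# mean_colors[Y] would fail here: label 3 is out of bounds for 3 rows,
# and label 2 would point at the row that belongs to label 3.
# Remap labels to positions 0..K-1 first; np.unique sorts labels
# in the same ascending order that groupby uses by default.
labels, inverse = np.unique(Y, return_inverse=True)
X_new = mean_colors[inverse]
```

If the clustering step already guarantees contiguous labels starting at 0, the remapping is unnecessary and mean_colors[Y] is enough.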

When the same positions are indexed many times, np.take can also be faster than plain indexing, like this -

np.take(mean_colors,Y,axis=0)
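For the numpy-only part of the question, the cluster means themselves can also be computed without pandas. One possible sketch using np.bincount, assuming the labels are non-negative integers; the random data is made up for illustration. A label in 0..max(Y) that never occurs gets a count of zero and a NaN row, which is harmless here because no pixel ever indexes that row.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 3))            # made-up pixel colors
Y = rng.integers(0, 10, size=1000)   # made-up cluster labels

# Number of pixels in each cluster.
counts = np.bincount(Y)

# Per-cluster sum of each color channel, stacked into shape (K, 3).
sums = np.stack(
    [np.bincount(Y, weights=X[:, c]) for c in range(X.shape[1])],
    axis=1,
)

mean_colors = sums / counts[:, None]

# Replace every pixel with the mean color of its cluster.
X_new = np.take(mean_colors, Y, axis=0)
```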

Runtime test -

In [107]: X = np.random.rand(10000,3) 

In [108]: Y = np.random.randint(0,100,(10000)) 

In [109]: np.allclose(np.take(mean_colors,Y,axis=0),mean_colors[Y]) 
Out[109]: True   # Verify approaches 

In [110]: %timeit mean_colors[Y] 
1000 loops, best of 3: 280 µs per loop 

In [111]: %timeit np.take(mean_colors,Y,axis=0) 
10000 loops, best of 3: 63.7 µs per loop
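The session above does not show how mean_colors was built. A self-contained sketch for reproducing the comparison with the standard timeit module follows; it assumes mean_colors comes from a pandas groupby as in the question. Timings depend on the machine and the NumPy version, so no expected numbers are given.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
X = rng.random((10000, 3))
Y = rng.integers(0, 100, size=10000)

# Build mean_colors the same way the question does.
df = pd.DataFrame(X, columns=list('rgb'))
df['cls'] = Y
mean_colors = df.groupby('cls').mean().values

# Both approaches must give the same result before timing them.
assert np.allclose(np.take(mean_colors, Y, axis=0), mean_colors[Y])

n = 1000
t_index = timeit.timeit(lambda: mean_colors[Y], number=n)
t_take = timeit.timeit(lambda: np.take(mean_colors, Y, axis=0), number=n)

print(f"indexing: {t_index / n * 1e6:.1f} µs per loop")
print(f"np.take:  {t_take / n * 1e6:.1f} µs per loop")
```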

Source

2016-03-01 13:11:09 Divakar

On my machine, with my data: plain indexing takes 6.13 ms, take takes 2.08 ms, and 'df.groupby('cls').transform(np.mean).values' takes 65.2 ms. I think those two are the best =) – Direvius

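Ah, I didn't take into account that I need to find mean_colors before indexing. Including that step: plain indexing: 610 ms; indexing with take: 604 ms; pandas transform: 659 ms – Direvius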

You can use transform on the groupby object and then assign the .values of the result to your X:

X = df.groupby('cls').transform(np.mean).values
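A small worked example of what transform does here, on made-up data. It passes the string 'mean' instead of np.mean; both select the same aggregation, and as far as I know newer pandas releases warn about the callable form.

```python
import numpy as np
import pandas as pd

# Toy data (made up): 4 pixels in 2 clusters.
X = np.array([[0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.2, 0.4, 0.6],
              [0.4, 0.6, 0.8]])
Y = np.array([5, 5, 9, 9])  # labels need not be 0..K-1 for transform

df = pd.DataFrame(X, columns=list('rgb'))
df['cls'] = Y

# Every row receives the mean of its own group; the result keeps the
# original row order and omits the grouping column 'cls'.
X_new = df.groupby('cls').transform('mean').values
```

Unlike mean_colors[Y], this does not depend on the labels being contiguous, because the result is aligned with the original rows rather than looked up by label.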

Information about transform, from help:

transform(func, *args, **kwargs) method of pandas.core.groupby.DataFrameGroupBy instance 
    Call function producing a like-indexed DataFrame on each group and 
    return a DataFrame having the same indexes as the original object 
    filled with the transformed values 

    Parameters 
    ---------- 
    f : function 
     Function to apply to each subframe 

    Notes 
    ----- 
    Each subframe is endowed the attribute 'name' in case you need to know 
    which group you are working on. 

    Examples 
    -------- 
    >>> grouped = df.groupby(lambda x: mapping[x]) 
    >>> grouped.transform(lambda x: (x - x.mean())/x.std())

Source

2016-03-01 12:38:58
