Here is another way, using np.argmax:
In [55]: myDataFrame = pd.DataFrame([(True,False,False), (False,False,True), (False,False,True)], index=list('ABC'), columns=list('XYZ'))
In [56]: myDataFrame
Out[56]:
       X      Y      Z
A   True  False  False
B  False  False   True
C  False  False   True
[3 rows x 3 columns]
In [58]: pd.Series(myDataFrame.columns[np.argmax(myDataFrame.values, axis=1)], index=myDataFrame.index)
Out[58]:
A X
B Z
C Z
dtype: object
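For anyone who wants to run this outside an IPython session, here is a minimal standalone sketch of the same idea. The helper name `first_true_column` and the extra all-False row are my own additions for illustration, not part of the answer above. It also covers one caveat: `np.argmax` returns position 0 when a row has no True at all, so those rows would otherwise be reported as the first column.

```python
import numpy as np
import pandas as pd


def first_true_column(frame):
    """Return, per row, the label of the first True column (NaN if the row has none)."""
    values = frame.values
    # argmax gives the position of the first maximum, i.e. the first True in each row
    positions = np.argmax(values, axis=1)
    labels = pd.Series(frame.columns[positions], index=frame.index)
    # argmax yields 0 for an all-False row, so mask those rows out instead of reporting 'X'
    return labels.where(values.any(axis=1))


frame = pd.DataFrame(
    [(True, False, False), (False, False, True), (False, False, False)],
    index=list("ABC"),
    columns=list("XYZ"),
)
result = first_true_column(frame)
print(result)
```

If every row is guaranteed to contain at least one True, as in the example above, the `where` step can be dropped.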
It's longer, but probably faster, especially for large DataFrames:
In [76]: myDataFrame2 = pd.concat([myDataFrame]*10000)
In [77]: %timeit pd.Series(myDataFrame2.columns[np.argmax(myDataFrame2.values, axis=1)], index=myDataFrame2.index)
1000 loops, best of 3: 1.19 ms per loop
In [78]: %timeit pd.Series(np.dot(myDataFrame2, myDataFrame2.columns), index=myDataFrame2.index)
100 loops, best of 3: 5.72 ms per loop
In [79]: %timeit myDataFrame2.apply(lambda row: myDataFrame2.columns[row][0], axis=1)
1 loops, best of 3: 1.15 s per loop
Dotting booleans and strings? How does that even work?
@PhillipCloud It's magic!
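On the "how does that even work" question, my understanding is that it comes down to plain Python semantics: `bool` is a subclass of `int`, so `True * 'X'` is `'X'` and `False * 'X'` is `''`, and with object dtype `np.dot` multiplies and adds the elements using Python's own `*` and `+`. A small sketch, using plain object arrays as a stand-in for the DataFrame and its columns (the variable names are just for illustration):

```python
import numpy as np

# bool is a subclass of int, so a string is repeated 1 or 0 times
assert True * "X" == "X"
assert False * "X" == ""

mask = np.array([[True, False, False], [False, False, True]], dtype=object)
columns = np.array(["X", "Y", "Z"], dtype=object)

# With object dtype, the dot product is computed with Python's * and +,
# so the first row becomes "X" + "" + "" and the second "" + "" + "Z"
labels = np.dot(mask, columns)
print(labels)
```

This also shows why the trick only gives a clean answer when each row has exactly one True: with several, the column names are simply concatenated.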