Here's an approach that gets those indices for the different n's in one go -
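import numpy as np

def numpy_approach(df, reference='A'):
    df0 = df.iloc[:, df.columns != 'Date']  # value columns only
    valid_mask = df0.columns != reference   # True for every column except the reference
    mask = ~np.isnan(df0.values)            # True where a value is present
    # Per row: number of non-NaN other columns, zeroed where the reference is NaN
    count = mask[:, valid_mask].sum(1) * mask[:, (~valid_mask).argmax()]
    # First position at which the running max of count reaches n = 1, 2, 3
    idx0 = np.searchsorted(np.maximum.accumulate(count), [1, 2, 3])
    return df.index[idx0]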
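To see why the last two steps of the function work, here is a small sketch on a hand-made `count` array (the values are made up for illustration and are not taken from the question). The running maximum turns the per-row counts into a non-decreasing sequence, which is what `np.searchsorted` needs in order to locate the first position at which each threshold is reached.

```python
import numpy as np

# Hypothetical per-row counts: number of non-NaN "other" columns on rows
# where the reference column is also non-NaN, and 0 otherwise.
count = np.array([0, 0, 1, 0, 1, 2, 3])

# The running maximum is non-decreasing, so it is valid input for searchsorted.
running_max = np.maximum.accumulate(count)

# With the default side='left', searchsorted returns the first position i
# with running_max[i] >= n, i.e. the first row whose count reaches n.
first_pos = np.searchsorted(running_max, [1, 2, 3])

print(running_max)  # [0 0 1 1 1 2 3]
print(first_pos)    # [2 5 6]
```

One caveat: if a threshold is never reached, `np.searchsorted` returns `len(count)`, which is one past the last row, so indexing `df.index` with it would raise an `IndexError`. A caller may want to check for that case first.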
Sample run -
In [555]: df
Out[555]:
         Date    A    B    C    D
0  2015-01-02  NaN  1.0  1.0  NaN
1  2015-01-02  NaN  2.0  2.0  NaN
2  2015-01-02  NaN  3.0  3.0  NaN
3  2015-01-02  1.0  NaN  4.0  NaN
5  2015-01-02  NaN  2.0  NaN  NaN
6  2015-01-03  1.0  NaN  6.0  NaN
7  2015-01-03  1.0  1.0  6.0  NaN
8  2015-01-03  1.0  1.0  6.0  8.0
In [556]: numpy_approach(df, reference='A')
Out[556]: Int64Index([3, 7, 8], dtype='int64')
In [557]: numpy_approach(df, reference='B')
Out[557]: Int64Index([0, 7, 8], dtype='int64')
In [558]: numpy_approach(df, reference='C')
Out[558]: Int64Index([0, 7, 8], dtype='int64')
In [568]: numpy_approach(df, reference='D')
Out[568]: Int64Index([8, 8, 8], dtype='int64')
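For anyone who wants to reproduce the run, the sketch below rebuilds the sample frame by hand and calls the function once per reference column. The `pd.DataFrame` construction and the `results` dict are my own scaffolding, not part of the original session; the function body is the one from this answer. The results are converted to plain lists because recent pandas versions (2.x, where `Int64Index` no longer exists) display them as a plain `Index` instead.

```python
import numpy as np
import pandas as pd

def numpy_approach(df, reference='A'):
    df0 = df.iloc[:, df.columns != 'Date']
    valid_mask = df0.columns != reference
    mask = ~np.isnan(df0.values)
    count = mask[:, valid_mask].sum(1) * mask[:, (~valid_mask).argmax()]
    idx0 = np.searchsorted(np.maximum.accumulate(count), [1, 2, 3])
    return df.index[idx0]

nan = np.nan
# Same values as the sample frame; note that index label 4 is absent.
df = pd.DataFrame(
    {
        'Date': ['2015-01-02'] * 5 + ['2015-01-03'] * 3,
        'A': [nan, nan, nan, 1.0, nan, 1.0, 1.0, 1.0],
        'B': [1.0, 2.0, 3.0, nan, 2.0, nan, 1.0, 1.0],
        'C': [1.0, 2.0, 3.0, 4.0, nan, 6.0, 6.0, 6.0],
        'D': [nan, nan, nan, nan, nan, nan, nan, 8.0],
    },
    index=[0, 1, 2, 3, 5, 6, 7, 8],
)

# One call per reference column, collected as plain lists of index labels.
results = {ref: list(numpy_approach(df, reference=ref)) for ref in 'ABCD'}
for ref, labels in results.items():
    print(ref, labels)
```

The returned values are index labels, not positions, which is why the gap at label 4 does not shift the results.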
Why do you get 6 instead of 3 for n = 1? –
Mea culpa, you're right, post edited. – Arthurim