2017-04-26 57 views
4

First instance of a value in a DataFrame

I have a df:

     Voltage 
01-02-2017 00:00  13.1 
01-02-2017 00:01  13.2 
01-02-2017 00:02  13.3 
01-02-2017 00:03  14.1 
01-02-2017 00:04  14.3 
01-02-2017 00:04  13.5 

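I want the time (HH:MM) of the first instance where the value in the Voltage column is >= 14.0. The "Time of Full Charge" column should contain only one time value.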

     Voltage Time of Full Charge 
01-02-2017 00:00  13.1 
01-02-2017 00:01  13.2 
01-02-2017 00:02  13.3 
01-02-2017 00:03  14.1   00:03 
01-02-2017 00:04  14.3 
01-02-2017 00:04  13.5 

I was thinking of something along these lines, but can't figure it out:

df.index = pd.to_datetime(df.index) 
df.['Time of Full Charge'] = np.where(df.['Voltage'] >= 14.0), (df.index.hour:df.index.minute))  

Answers

4

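Use idxmax on the boolean condition to get the first index value where it is True; the only requirement is that the index must be unique: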

idx = (df['Voltage'] >= 14.0).idxmax() 
df.loc[idx, 'Time of Full Charge'] = idx.strftime('%H:%M') 
print (df) 
        Voltage Time of Full Charge 
2017-01-02 00:00:00  13.1     NaN 
2017-01-02 00:01:00  13.2     NaN 
2017-01-02 00:02:00  13.3     NaN 
2017-01-02 00:03:00  14.1    00:03 
2017-01-02 00:04:00  14.3     NaN 
2017-01-02 00:04:00  13.5     NaN 
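One caveat worth illustrating: on a boolean Series where no value is True, idxmax() still returns the first label, so a guard is useful when the threshold might never be reached. The following is a minimal sketch under assumptions, not part of the original answer: the sample frame is rebuilt with a unique one-minute index, and the target column is pre-created with object dtype so it can hold strings.

```python
import numpy as np
import pandas as pd

# Sample data modelled on the question; the index is made unique here (assumption).
df = pd.DataFrame(
    {"Voltage": [13.1, 13.2, 13.3, 14.1, 14.3, 13.5]},
    index=pd.date_range("2017-01-02 00:00", periods=6, freq="min"),
)

# Pre-create the target column as object dtype so it can hold strings.
df["Time of Full Charge"] = pd.Series(np.nan, index=df.index, dtype="object")

mask = df["Voltage"] >= 14.0

# idxmax() on an all-False mask would return the first label,
# so only trust it when at least one value meets the condition.
if mask.any():
    idx = mask.idxmax()  # label of the first True
    df.loc[idx, "Time of Full Charge"] = idx.strftime("%H:%M")

print(df)
```

If the threshold is never reached, the column simply stays all NaN.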

Or:

idx = (df['Voltage'] >= 14.0).idxmax() 
df['Time of Full Charge'] = np.where(df.index == idx, idx.strftime('%H:%M'), '') 
print (df) 
        Voltage Time of Full Charge 
2017-01-02 00:00:00  13.1      
2017-01-02 00:01:00  13.2      
2017-01-02 00:02:00  13.3      
2017-01-02 00:03:00  14.1    00:03 
2017-01-02 00:04:00  14.3      
2017-01-02 00:04:00  13.5  

For a non-unique index, you can use a MultiIndex:

df.index = [np.arange(len(df.index)), df.index] 

idx = (df['Voltage'] >= 14.0).idxmax() 
df['Time of Full Charge'] = np.where(df.index.get_level_values(0) == idx[0], 
            idx[1].strftime('%H:%M'), 
            '') 

df.index = df.index.droplevel(0) 
print (df) 
        Voltage Time of Full Charge 
2017-01-02 00:00:00  13.1      
2017-01-02 00:01:00  13.2      
2017-01-02 00:02:00  13.3      
2017-01-02 00:03:00  14.1    00:03 
2017-01-02 00:04:00  14.3      
2017-01-02 00:04:00  13.5      
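As an alternative to the temporary MultiIndex, the first match can be located by integer position, which sidesteps duplicate labels entirely. This is a sketch of a swapped-in technique (NumPy argmax on the boolean array), not the answer's own method; the sample frame and the `times` helper array are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question, including the duplicated 00:04 timestamp.
df = pd.DataFrame(
    {"Voltage": [13.1, 13.2, 13.3, 14.1, 14.3, 13.5]},
    index=pd.to_datetime(
        ["2017-01-02 00:00", "2017-01-02 00:01", "2017-01-02 00:02",
         "2017-01-02 00:03", "2017-01-02 00:04", "2017-01-02 00:04"]
    ),
)

mask = (df["Voltage"] >= 14.0).to_numpy()

# Start with an empty string in every row, then fill in a single position.
times = np.full(len(df), "", dtype=object)
if mask.any():
    pos = mask.argmax()  # integer position of the first True
    times[pos] = df.index[pos].strftime("%H:%M")

df["Time of Full Charge"] = times
print(df)
```

Because the assignment is positional, it does not matter whether the matching timestamp appears more than once in the index.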
+0

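Thanks @jezrael. I only need the first instance where the column reaches 14 or above (there should be only one value in the new column). Is that possible? – wazzahenry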

+0

Please check the edited answer. Is the index unique? – jezrael

+0

Yes, the index is essentially one day in 24-hour time, so it will be unique. Thanks! – wazzahenry

2

You can use numpy.searchsorted() if the Voltage column is sorted:

In [260]: df.index[np.searchsorted(df.Voltage, 14)] 
Out[260]: DatetimeIndex(['2017-01-02 00:03:00'], dtype='datetime64[ns]', freq=None) 
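Note that searchsorted only gives a meaningful answer on ascending data, and the question's sample ends with a drop to 13.5. The sketch below therefore assumes a shortened, sorted series (the last reading is left out) and adds a bounds check for the case where no reading reaches the threshold; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed sorted data: the question's sample without the final 13.5 reading.
s = pd.Series(
    [13.1, 13.2, 13.3, 14.1, 14.3],
    index=pd.date_range("2017-01-02 00:00", periods=5, freq="min"),
    name="Voltage",
)

# searchsorted is only valid when the values are in ascending order.
assert s.is_monotonic_increasing

# side="left" returns the first position whose value is >= 14.0.
pos = np.searchsorted(s.to_numpy(), 14.0, side="left")

# pos == len(s) means every reading is below the threshold.
first_time = s.index[pos] if pos < len(s) else None
print(first_time)
```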