2014-06-12 72 views

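Accessing neighbor rows in a Pandas.Dataframe

I'm trying to compute the local maxima and minima of a series of data: if the value in the current row is greater than (or less than) both the next row and the previous row, set minmax to the current value, otherwise set it to NaN. Is there a more elegant way to do this than the following?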

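import pandas as pd
import numpy as np

rng = pd.date_range('1/1/2014', periods=10, freq='5min')
s = pd.Series([1, 2, 3, 2, 1, 2, 3, 5, 7, 4], index=rng)
df = pd.DataFrame(s, columns=['val'])
df.index.name = "dt"
df['minmax'] = np.nan  # np.NaN was removed in NumPy 2.0

val = df['val']
for i in range(len(df.index)):
    # The first and last rows have only one neighbor, so skip them.
    if i == 0 or i == len(df.index) - 1:
        continue
    # Use .iloc for positional access: the index holds timestamps, not integers.
    if val.iloc[i] >= val.iloc[i - 1] and val.iloc[i] >= val.iloc[i + 1]:
        # Local maximum. Assign through .loc to avoid chained assignment.
        df.loc[df.index[i], 'minmax'] = val.iloc[i]
        continue
    if val.iloc[i] <= val.iloc[i - 1] and val.iloc[i] <= val.iloc[i + 1]:
        # Local minimum.
        df.loc[df.index[i], 'minmax'] = val.iloc[i]
        continue

print(df)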

The result is:

                     val  minmax
dt
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN

Answer

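We can use shift and where to decide which values to assign. The important point is that when comparing Series we must combine the conditions with the bitwise operators & and | rather than and / or, and wrap each comparison in parentheses. shift returns a Series or DataFrame shifted by 1 row (the default) or by the number of rows passed in.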

When using where, we pass a boolean condition; the second argument, NaN, is the value assigned wherever the condition is False.
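To make the two building blocks concrete, here is a minimal sketch on a three-element Series. The variable names (prev, nxt, is_peak, peaks) are only illustrative and do not come from the original answer.

```python
import pandas as pd

s = pd.Series([1, 3, 2])

# shift(1) aligns each row with the previous row's value;
# the first row has no predecessor, so it becomes NaN.
prev = s.shift(1)
# shift(-1) aligns each row with the next row's value;
# the last row has no successor, so it becomes NaN.
nxt = s.shift(-1)

# Element-wise "and" on boolean Series is written with &.
# Each comparison needs its own parentheses because & binds tighter than >.
# Comparisons against NaN evaluate to False, so edge rows never qualify.
is_peak = (s > prev) & (s > nxt)

# where keeps values where the condition is True and fills the rest
# with NaN (the default replacement value).
peaks = s.where(is_peak)
print(peaks)
```

Because comparisons against NaN are False, the first and last rows are excluded automatically, which plays the same role as the explicit skip of the boundary rows in the question's loop.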

In [81]: 

df['minmax'] = df['val'].where(((df['val'] < df['val'].shift(1)) & (df['val'] < df['val'].shift(-1))) | ((df['val'] > df['val'].shift(1)) & (df['val'] > df['val'].shift(-1))), np.nan)
df 
Out[81]: 
                     val  minmax
dt
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN
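
Note that the one-liner above uses strict comparisons (< and >), while the loop in the question uses >= and <=. The two agree on this data but can differ on flat runs. A possible way to package the shift/where approach so that either behavior can be chosen is sketched below; local_extrema and its strict parameter are hypothetical names introduced here for illustration, not part of pandas or of the original answer.

```python
import pandas as pd


def local_extrema(values: pd.Series, strict: bool = True) -> pd.Series:
    """Keep local minima and maxima of `values`; everything else becomes NaN."""
    prev, nxt = values.shift(1), values.shift(-1)
    if strict:
        is_max = (values > prev) & (values > nxt)
        is_min = (values < prev) & (values < nxt)
    else:
        # Non-strict comparisons mirror the >= / <= tests in the question's loop.
        is_max = (values >= prev) & (values >= nxt)
        is_min = (values <= prev) & (values <= nxt)
    # Edge rows compare against NaN, which is always False, so they stay NaN.
    return values.where(is_max | is_min)


rng = pd.date_range('1/1/2014', periods=10, freq='5min')
df = pd.DataFrame({'val': [1, 2, 3, 2, 1, 2, 3, 5, 7, 4]}, index=rng)
df.index.name = 'dt'
df['minmax'] = local_extrema(df['val'], strict=False)
print(df)
```

On a plateau such as [1, 2, 2, 1], the strict variant marks nothing, while the non-strict variant marks both of the equal points, so the choice depends on how flat runs should be treated.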