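python pandas: get rolling values of one DataFrame using the rolling index of another DataFrame

I have two dataframes: one has multi-level columns, and the other has only single-level columns (these are the first level of the first dataframe's columns; in other words, the second dataframe is computed by grouping the first one).

The two dataframes look like this:

(Screenshots in the original post: first dataframe, df1; second dataframe, df2.)

The relationship between df1 and df2 is: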

df2 = df1.groupby(axis=1, level='sector').mean()

Then I get the index of the rolling max of df1 with:

result1=pd.rolling_apply(df1,window=5,func=lambda x: pd.Series(x).idxmax(),min_periods=4)

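Let me explain result1 a bit. For example, during the five days (the window length) from 2016/2/23 to 2016/2/29, the maximum price of stock sh600870 occurred on 2016/2/24, and the index of 2016/2/24 within that five-day window is 1. So, in result1, the value for stock sh600870 on 2016/2/29 is 1.

Now I want to use the indices in result1 to get the sector price for each stock.

Take the same stock as an example: sh600870 is in the sector '家用電器視聽器材白色家電'. So for 2016/2/29, I want the sector price on 2016/2/24, which is 8.770.

How can I do that?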

Asked 2016-05-17 by April

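Welcome to SO. It would help if you inserted the dataframes as text in your question (you can edit it). Please follow this link for useful information on how to ask 'pandas' questions: http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – IanS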

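idxmax (or np.argmax) returns an index relative to the rolling window. To make the index relative to df1, add the index of the left edge of the rolling window: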

index = pd.rolling_apply(df1, window=5, min_periods=4, func=np.argmax) 
shift = pd.rolling_min(np.arange(len(df1)), window=5, min_periods=4) 
index = index.add(shift, axis=0)
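To see why the shift is needed, here is a small sketch on a single made-up Series. It assumes a pandas version that has the .rolling method (0.18 or later) rather than pd.rolling_apply, and the names s, relative, left_edge and absolute are only illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only so the windows are easy to follow by hand.
s = pd.Series([3, 9, 4, 1, 2, 8, 0], dtype=float)
window, min_periods = 5, 4

# Position of the maximum *within* each window (NaN until min_periods is met).
relative = s.rolling(window=window, min_periods=min_periods).apply(np.argmax, raw=True)

# Ordinal position of the left edge of each window.
left_edge = (pd.Series(np.arange(len(s)), index=s.index)
             .rolling(window=window, min_periods=min_periods).min())

# Window-relative position + left edge = ordinal position in the full Series.
absolute = relative + left_edge

print(pd.DataFrame({'value': s, 'relative': relative,
                    'left_edge': left_edge, 'absolute': absolute}))
# relative: NaN NaN NaN 1 1 0 3
# absolute: NaN NaN NaN 1 1 1 5
```

In the last row the window covers positions 2 to 6, the maximum (8) sits at window position 3, and 3 + 2 = 5 is its position in the whole Series.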

Once you have ordinal indices relative to df1, you can use them to index into df1 or df2 with .iloc.

For example,

import numpy as np 
import pandas as pd 
np.random.seed(2016) 
N = 15 
columns = pd.MultiIndex.from_product([['foo','bar'], ['A','B']]) 
columns.names = ['sector', 'stock'] 
dates = pd.date_range('2016-02-01', periods=N, freq='D') 
df1 = pd.DataFrame(np.random.randint(10, size=(N, 4)), columns=columns, index=dates) 
df2 = df1.groupby(axis=1, level='sector').mean() 

window_size, min_periods = 5, 4 
index = pd.rolling_apply(df1, window=window_size, min_periods=min_periods, func=np.argmax) 
shift = pd.rolling_min(np.arange(len(df1)), window=window_size, min_periods=min_periods) 
# alternatively, you could use
# shift = np.pad(np.arange(len(df1)-window_size+1), (window_size-1, 0), mode='constant')
# but this is harder to read/understand, and therefore may be more prone to bugs.
index = index.add(shift, axis=0) 

result = pd.DataFrame(index=df1.index, columns=df1.columns) 
for col in index: 
    sector, stock = col 
    mask = pd.notnull(index[col]) 
    idx = index.loc[mask, col].astype(int) 
    result.loc[mask, col] = df2[sector].iloc[idx].values 

print(result)

yields the following (note that the rolling_apply syntax changed in pandas 0.18):

sector      foo       bar
stock         A    B    A    B
2016-02-01  NaN  NaN  NaN  NaN
2016-02-02  NaN  NaN  NaN  NaN
2016-02-03  NaN  NaN  NaN  NaN
2016-02-04  5.5    5    5  7.5
2016-02-05  5.5    5    5  8.5
2016-02-06  5.5  6.5    5  8.5
2016-02-07  5.5  6.5    5  8.5
2016-02-08  6.5  6.5    5  8.5
2016-02-09  6.5  6.5  6.5  8.5
2016-02-10  6.5  6.5  6.5    6
2016-02-11    6  6.5  4.5    6
2016-02-12    6  6.5  4.5    4
2016-02-13    2  6.5  4.5    5
2016-02-14    4  6.5  4.5    5
2016-02-15    4  6.5    4  3.5

Note: DataFrames and Series now have a rolling method, so you can now use:

index = df1.rolling(window=window_size, min_periods=min_periods).apply(np.argmax) 
shift = (pd.Series(np.arange(len(df1))) 
     .rolling(window=window_size, min_periods=min_periods).min()) 
index = index.add(shift.values, axis=0)
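For readers on a recent pandas, the whole lookup could be sketched end to end roughly as follows. This is not the code from the answer above but an adaptation under a few assumptions: the data is random and hypothetical (seeded with np.random.default_rng(0), so the numbers differ from the output shown earlier), the sector mean is computed on the transposed frame because groupby(axis=1) is deprecated in recent pandas, and raw=True is passed so np.argmax receives plain NumPy arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical random data with the same layout as the example above.
rng = np.random.default_rng(0)
N, window_size, min_periods = 15, 5, 4
columns = pd.MultiIndex.from_product([['foo', 'bar'], ['A', 'B']],
                                     names=['sector', 'stock'])
dates = pd.date_range('2016-02-01', periods=N, freq='D')
df1 = pd.DataFrame(rng.integers(10, size=(N, 4)).astype(float),
                   columns=columns, index=dates)

# Sector mean per date; transposing avoids groupby(axis=1).
df2 = df1.T.groupby(level='sector').mean().T

# Window-relative position of each rolling max, then shifted by the
# left edge of the window to get an ordinal position in df1.
index = df1.rolling(window=window_size, min_periods=min_periods).apply(np.argmax, raw=True)
shift = (pd.Series(np.arange(N), index=df1.index)
         .rolling(window=window_size, min_periods=min_periods).min())
index = index.add(shift, axis=0)

# Look up the sector value at the row where each stock hit its rolling max.
result = pd.DataFrame(np.nan, index=df1.index, columns=df1.columns)
for sector, stock in df1.columns:
    pos = index[(sector, stock)].dropna().astype(int)
    result.loc[pos.index, (sector, stock)] = df2[sector].to_numpy()[pos.to_numpy()]

print(result)
```

The loop is kept from the original answer for readability; each column only needs one positional lookup into its own sector, so the cost is one pass per stock.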

Answered 2016-05-17 10:21:17 by unutbu
