Grouped downsampling and plotting of a pd.DataFrame

I am trying to downsample grouped data to daily means, computed per group, and plot the resulting time series in a single figure. My starting point is the following pd.DataFrame:

value    time        type
0.1234   2013-04-03  A
0.2345   2013-04-05  A
0.34564  2013-04-07  A
...      ...         ...
0.2345   2013-04-03  B
0.1234   2013-04-05  B
0.2345   2013-04-07  C
0.34564  2013-04-07  C

I would like to compute the daily mean for each type and plot the time series of these daily means in a single plot.

Currently, I have this...

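names = list(df['type'].unique())
types = []
for name in names:
    single = df.loc[df['type'] == name]
    single = single.set_index('time')  # assumes 'time' is already a datetime column
    # Daily mean of the numeric column only; resample() needs an explicit aggregation
    single = single[['value']].resample("D").mean()
    types.append(single)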

for single, name in zip(types, names): 
    single.rename(columns={"value":name}, inplace=True) 

combined = pd.concat(types, axis=1) 
combined.plot()

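...which produces a combined DataFrame containing the desired output, and the following plot: [image: What it should look like]

It seems to me that this could be done more easily with groupby on the initial DataFrame, but so far I have not been able to reproduce the desired plot with that approach.

What is the "smart way" to do this?

Edit: a larger data sample (CSV, 1000 rows) is at: http://pastebin.com/gi16nZdh

Thanks, Matthias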

Source

2014-10-01 Matthias

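Could you provide a larger example data set? In CSV format on pastebin. – Ffisegydd 2014-10-01 12:46:23

Added a random 1k-row CSV sample from the data frame. – Matthias 2014-10-01 13:06:44

It certainly helped, thank you. I had to switch from 'pivot' to 'pivot_table' for the full data set, but you definitely pointed me in the right direction. – Matthias 2014-10-01 13:38:52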

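You can easily do what you want with pandas.DataFrame.pivot. I created a random example DataFrame and then used df.pivot to arrange the table as needed.

Note: I resampled to weekly frequency because my example has only one value per day; don't forget to change this for your data.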

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dates = pd.date_range('2013-04-03', periods=50, freq='D')
# One frame per type with random values (pd.np was removed in pandas 2.0, so numpy is used directly)
dfs = [pd.DataFrame(dict(time=dates, value=np.random.randn(len(dates)), type=i)) for i in ['A', 'B', 'C', 'D']]
df = pd.concat(dfs)

# Wide table: one row per time stamp, one column per type
pivoted = df.pivot(index='time', columns='type', values='value')

print(pivoted.head(10))
# type    A   B   C   D 
# time 
# 2013-04-03 0.161839 0.509179 0.055078 -2.072243 
# 2013-04-04 0.323308 0.891982 -1.266360 1.950389 
# 2013-04-05 -2.542464 -0.441849 -2.686183 0.717737 
# 2013-04-06 0.750871 0.438343 -0.002004 0.478821 
# 2013-04-07 -0.118890 1.026121 1.283397 -1.306257 
# 2013-04-08 -0.396373 -1.078925 -0.539617 -1.625549 
# 2013-04-09 0.328076 1.964779 0.194198 0.232702 
# 2013-04-10 -0.178683 0.177359 0.500873 -0.729988 
# 2013-04-11 0.762800 1.576662 -0.456480 0.526162 
# 2013-04-12 -1.301265 -0.586977 -0.903313 0.162008 

# resample() has to be followed by an aggregation, and the result must be kept
weekly = pivoted.resample('W').mean()
weekly.plot()

plt.show()

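This code creates a pivot table named pivoted in which each type is now a column and the dates form the index. We then simply resample it to weekly frequency with resample('W'); in current pandas versions the resample call has to be followed by an aggregation such as .mean().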

Example plot
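Since the question asked about groupby, here is a minimal sketch of that route as an alternative to the pivot approach above. It is not part of the original answer. The sample data (three types, two observations per day) and the names rng, times and daily are made up for illustration; only the column names time, value and type come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two observations per day for each of three types
rng = np.random.default_rng(0)
times = pd.date_range('2013-04-03', periods=20, freq=pd.Timedelta(hours=12))
df = pd.concat(
    [pd.DataFrame({'time': times, 'value': rng.standard_normal(len(times)), 'type': t})
     for t in ['A', 'B', 'C']],
    ignore_index=True,
)

# Group by type and by calendar day, take the mean, then move the types into columns
daily = (
    df.groupby(['type', pd.Grouper(key='time', freq='D')])['value']
      .mean()
      .unstack('type')
)

# daily.plot() would then draw one line per type (requires matplotlib)
```

Because the mean is taken inside the groupby, several observations at the same time stamp should not be a problem here, which may make this a reasonable fit for the full data set.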

Source

2014-10-01 13:01:23 Ffisegydd

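This solves the problem for the small example. For the full data, the 'df.pivot(...)' approach fails because there are duplicate keys, i.e. several observations at the same point in time. However, 'df.pivot_table(...)' works with the same arguments you provided. – Matthias 2014-10-01 13:37:53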
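The difference described in this comment can be reproduced with a tiny example. This is only a sketch: the four rows below are invented so that type A has two observations on the same day, and pivot_failed is a hypothetical helper variable.

```python
import pandas as pd

# Hypothetical data with two observations for type A on 2013-04-03
df = pd.DataFrame({
    'time': pd.to_datetime(['2013-04-03', '2013-04-03', '2013-04-05', '2013-04-03']),
    'value': [0.1, 0.3, 0.2, 0.4],
    'type': ['A', 'A', 'A', 'B'],
})

# pivot() cannot reshape when an (index, column) pair occurs more than once
try:
    df.pivot(index='time', columns='type', values='value')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead; its default aggfunc is the mean
pivoted = df.pivot_table(index='time', columns='type', values='value')

# Daily resampling then fills the calendar gaps with NaN rows
daily = pivoted.resample('D').mean()
```

Note that pivot_table averages the duplicates by default. That matches the daily-mean goal here, but another aggfunc can be passed if a different aggregation is wanted.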
