2016-10-10 118 views

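So, I have a pandas DataFrame named 'df', and I would like to drop the seconds so that the index only has the format YYYY-MM-DD HH:MM. The rows should also be grouped by minute, showing the mean value for each minute. In short: group by minute and compute the average.

So I would like to turn this DataFrame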

     value 
2015-05-03 00:00:00  61.0 
2015-05-03 00:00:10  60.0 
2015-05-03 00:00:25  60.0 
2015-05-03 00:00:30  61.0 
2015-05-03 00:00:45  61.0 
2015-05-03 00:01:00  61.0 
2015-05-03 00:01:10  60.0 
2015-05-03 00:01:25  60.0 
2015-05-03 00:01:30  61.0 
2015-05-03 00:01:45  61.0 
2015-05-03 00:02:00  61.0 
2015-05-03 00:02:10  60.0 
2015-05-03 00:02:25  60.0 
2015-05-03 00:02:40  60.0 
2015-05-03 00:02:55  60.0 
2015-05-03 00:03:00  59.0 
2015-05-03 00:03:15  59.0 
2015-05-03 00:03:20  59.0 
2015-05-03 00:03:35  59.0 
2015-05-03 00:03:40  60.0 

into this DataFrame

     value 
2015-05-03 00:00  60.6 
2015-05-03 00:01  60.6 
2015-05-03 00:02  60.2 
2015-05-03 00:03  59.2 

I've tried things like

df['value'].resample('1Min').mean() 

df.index.resample('1Min').mean() 

but this doesn't seem to work. Any ideas?


For me it works perfectly. Do you get an error? – jezrael


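df.index.resample('1Min').mean() raises the error AttributeError: 'DatetimeIndex' object has no attribute 'resample', and df['value'].resample('1Min').mean() raises no error but doesn't give the desired result: nothing changes, I don't get the mean, and the seconds are still there. –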

Answers


You need to convert the index to a DatetimeIndex first:

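import pandas as pd

# convert the existing index labels to a DatetimeIndex
df.index = pd.DatetimeIndex(df.index)
# alternative:
# df.index = pd.to_datetime(df.index)

# bin the rows into 1-minute intervals and average each bin
print(df['value'].resample('1Min').mean())
# equivalent:
# print(df.resample('1Min')['value'].mean())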
2015-05-03 00:00:00 60.6 
2015-05-03 00:01:00 60.6 
2015-05-03 00:02:00 60.2 
2015-05-03 00:03:00 59.2 
Freq: T, Name: value, dtype: float64 
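
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the approach above. It rebuilds the first two minutes of the question's frame and assumes the index starts out as plain strings (the question does not say what type it really was). The name per_minute is only illustrative, and the frequency is written as "min", which should be equivalent to the '1Min' used above.

```python
import pandas as pd

# First two minutes of the question's data; the index is deliberately
# left as strings here (an assumption about the asker's situation).
df = pd.DataFrame(
    {"value": [61.0, 60.0, 60.0, 61.0, 61.0, 61.0, 60.0, 60.0, 61.0, 61.0]},
    index=[
        "2015-05-03 00:00:00", "2015-05-03 00:00:10", "2015-05-03 00:00:25",
        "2015-05-03 00:00:30", "2015-05-03 00:00:45", "2015-05-03 00:01:00",
        "2015-05-03 00:01:10", "2015-05-03 00:01:25", "2015-05-03 00:01:30",
        "2015-05-03 00:01:45",
    ],
)

# Convert the string labels to a DatetimeIndex so resample() can bin them.
df.index = pd.to_datetime(df.index)

# resample() returns a new object, so the result has to be assigned.
per_minute = df["value"].resample("min").mean()
print(per_minute)
```

Both minutes should come out as 60.6, matching the first two rows of the output shown above.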

Another solution uses astype on the index values, which sets the seconds to 0:

print (df.groupby([df.index.values.astype('<M8[m]')])['value'].mean()) 
2015-05-03 00:00:00 60.6 
2015-05-03 00:01:00 60.6 
2015-05-03 00:02:00 60.2 
2015-05-03 00:03:00 59.2 
Name: value, dtype: float64 
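
A possible variation on the groupby idea, sketched here as an alternative rather than as what the answer uses: DatetimeIndex.floor can truncate the timestamps to the minute instead of the NumPy astype cast. The rows are minutes 00:02 and 00:03 from the question; minute_keys and by_minute are illustrative names.

```python
import pandas as pd

# Rows for minutes 00:02 and 00:03, copied from the question.
index = pd.to_datetime([
    "2015-05-03 00:02:00", "2015-05-03 00:02:10", "2015-05-03 00:02:25",
    "2015-05-03 00:02:40", "2015-05-03 00:02:55", "2015-05-03 00:03:00",
    "2015-05-03 00:03:15", "2015-05-03 00:03:20", "2015-05-03 00:03:35",
    "2015-05-03 00:03:40",
])
df = pd.DataFrame(
    {"value": [61.0, 60.0, 60.0, 60.0, 60.0, 59.0, 59.0, 59.0, 59.0, 60.0]},
    index=index,
)

# floor("min") truncates each timestamp to the start of its minute,
# so all rows from the same minute share one group key.
minute_keys = df.index.floor("min")
by_minute = df.groupby(minute_keys)["value"].mean()
print(by_minute)
```

One difference worth knowing: groupby only returns minutes that actually occur in the data, whereas resample also emits a row (with a NaN mean) for any empty minute in between.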

I already have df.index = df.index.to_datetime() in my code; doesn't that convert it to a DatetimeIndex? –


Have you tried 'df.index = pd.to_datetime(df.index)'? – jezrael


OK, so the code I actually wanted was df = df['value'].resample('1Min').mean(). Thank you, I'll accept your answer in 4 minutes! –
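
To spell out what this last comment boils down to: resample does not change the frame in place, so the result has to be assigned. The sketch below shows that on minute 00:03 from the question, and adds one optional step that is not in the thread: formatting the index as YYYY-MM-DD HH:MM strings with strftime, for the case where the seconds should disappear from the labels as well. The name per_minute is illustrative.

```python
import pandas as pd

# Minute 00:03 from the question.
index = pd.to_datetime([
    "2015-05-03 00:03:00", "2015-05-03 00:03:15", "2015-05-03 00:03:20",
    "2015-05-03 00:03:35", "2015-05-03 00:03:40",
])
df = pd.DataFrame({"value": [59.0, 59.0, 59.0, 59.0, 60.0]}, index=index)

# resample() does not modify df in place, so keep the returned Series.
per_minute = df["value"].resample("min").mean()

# Optional display step (an addition, not part of the thread): turn the
# index into 'YYYY-MM-DD HH:MM' strings. After this the index holds plain
# text and is no longer a DatetimeIndex, so do it last.
per_minute.index = per_minute.index.strftime("%Y-%m-%d %H:%M")
print(per_minute)
```

Keeping the DatetimeIndex and only formatting when printing or exporting is usually the safer choice, since string labels can no longer be resampled or sliced by time.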