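Hourly histogram of datetimes using pandas

Suppose I have a timestamp column `datetime` in a pandas.DataFrame. For the sake of example, the timestamps have one-second resolution. I would like to bucket/bin the events into 10-minute [1] buckets. I understand that I can represent the datetime as an integer timestamp and then use a histogram. Is there a simpler approach? Something built into pandas?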

[1] 10 minutes is only an example. Ultimately, I would like to use different resolutions.

2016-01-15 Dror

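This might get you close: `df.groupby(pd.TimeGrouper(freq='10Min')).mean().plot(kind='bar')`. You could replace 'bar' with 'hist', but I'm not sure that makes much sense. I'm guessing the y axis should be the frequency, but what should the x axis be? Do you have an example of the raw data and an example of what the plot should look like (even if it is only a verbal description)? – johnchase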

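To use a custom frequency such as '10Min', you have to use a TimeGrouper, as @johnchase suggested, which operates on the index (in current pandas the same grouper is spelled `pd.Grouper(freq=...)`).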

import numpy as np
import pandas as pd

# Generate 10000 one-second timestamps and randomly draw 500 of them
df = pd.DataFrame(np.random.choice(pd.date_range(start=pd.to_datetime('2015-01-14'), periods=10000, freq='s'), 500), columns=['date'])
# Set the date as the index, since the grouper works on the index; drop=False keeps the column so there is something to count
df.set_index('date', drop=False, inplace=True)
# Plot the histogram (pd.Grouper replaces pd.TimeGrouper, which was deprecated in pandas 0.21 and later removed)
df.groupby(pd.Grouper(freq='10min')).count().plot(kind='bar')
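For readers on a recent pandas, here is a minimal sketch of the same idea that skips the `set_index` step by passing the column name through `key`. The variable names (`rng`, `pool`, `counts`, `labeled`) and the fixed seed are my own choices for illustration, not part of the original answer. It also checks that the bars add up to 500 and builds `HH:MM` labels, which may help with unreadable x-axis ticks.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sample is reproducible (an assumption of this sketch)
rng = np.random.default_rng(0)
pool = pd.date_range(start="2015-01-14", periods=10000, freq="s")
df = pd.DataFrame({"date": rng.choice(pool, 500)})

# Group directly on the 'date' column; size() counts the rows in each 10-minute bin
counts = df.groupby(pd.Grouper(key="date", freq="10min")).size()

# Every sampled timestamp lands in exactly one bin, so the bars sum to the sample size
print(counts.sum())

# Shorter tick labels for a bar plot: keep only hour and minute of each bin start
labeled = counts.rename(index=lambda ts: ts.strftime("%H:%M"))
# labeled.plot(kind="bar")  # uncomment to draw the histogram (requires matplotlib)
```

`df.resample("10min", on="date").size()` should give the same counts. Changing the bin width only means changing the frequency string, for example `"1h"` or `"30min"`.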

Using `to_period`

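It is also possible to use the `to_period` method, but, as far as I know, it does not work with custom periods such as '10Min'. This example adds an extra column to simulate the category of an item.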

import numpy as np
import pandas as pd

# Number of samples
nb_sample = 500
# Generate a pool of one-second timestamps and randomly draw a subset of them
df = pd.DataFrame({'date': np.random.choice(pd.date_range(start=pd.to_datetime('2015-01-14'), periods=nb_sample*30, freq='s'), nb_sample),
        'type': np.random.choice(['foo','bar','xxx'], nb_sample)})

# Group per hour and type, then pivot the types into columns
df = df.groupby([df['date'].dt.to_period('h'), 'type']).count().unstack()
# Drop the unnecessary column level
df.columns = df.columns.droplevel()
df.plot(kind='bar')
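If you need the per-category counts at a custom resolution such as 10 minutes, one possible workaround is to round each timestamp down to the start of its bin with `Series.dt.floor` and group on the result. This is only a sketch: it swaps `to_period` for `dt.floor`, and the names `bins` and `counts` as well as the fixed seed are mine, not from the answer above.

```python
import numpy as np
import pandas as pd

nb_sample = 500
# Seeded generator so the sample is reproducible (an assumption of this sketch)
rng = np.random.default_rng(0)
pool = pd.date_range(start="2015-01-14", periods=nb_sample * 30, freq="s")
df = pd.DataFrame({
    "date": rng.choice(pool, nb_sample),
    "type": rng.choice(["foo", "bar", "xxx"], nb_sample),
})

# Round every timestamp down to the start of its 10-minute bin
bins = df["date"].dt.floor("10min").rename("bin")

# Count rows per (bin, type) and pivot the types into columns
counts = df.groupby([bins, "type"]).size().unstack(fill_value=0)

# Each row is counted exactly once, so the grand total equals nb_sample
print(counts.to_numpy().sum())
# counts.plot(kind="bar")  # uncomment to draw the grouped bars (requires matplotlib)
```

Note that, unlike a time-based grouper, this only produces rows for bins that actually contain data, so completely empty intervals would be missing from the plot rather than drawn as zero-height bars.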

2016-01-15 22:23:34 Romain

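This gets me closer. Thanks. I still have two issues: 1) the x-axis ticks do not reflect the datetime nature of the data, and 2) shouldn't the sum of the bars be 500? – Dror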

Shouldn't it be `.plot(kind='bar')` rather than `.hist()`, as @johnchase suggested? – Dror

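Sorry, I made a big mistake in my first answer (going too fast is not the solution). I have just edited it, and I think it now solves your problem. The `sum` is now 500 :-) – Romain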
