熊貓 - 由 - 優文庫

我想2列分組後數天數計數爲組天數：熊貓 - 由

groups = df.groupby([df.col1,df.col2])

現在我想指望相關各組的天數：

result = groups['date_time'].dt.date.nunique()

我使用的是類似的，當我想通過天團，但在這裏我得到一個錯誤：

AttributeError: Cannot access attribute 'dt' of 'SeriesGroupBy' objects, try using the 'apply' method

什麼道具呃方式得到天數？

來源

2017-03-20 Guy s

你需要的groupby另一個變化 - 首先定義列：

df['date_time'].dt.date.groupby([df.col1,df.col2]).nunique()

df.groupby(['col1','col2'])['date_time'].apply(lambda x: x.dt.date.nunique())

df['date_time1'] = df['date_time'].dt.date 
a = df.groupby([df.col1,df.col2]).date_time1.nunique()

樣品：

start = pd.to_datetime('2015-02-24') 
rng = pd.date_range(start, periods=10, freq='15H') 

df = pd.DataFrame({'date_time': rng, 'col1': [0]*5 + [1]*5, 'col2': [2]*3 + [3]*4+ [4]*3}) 
print (df) 
    col1 col2   date_time 
0  0  2 2015-02-24 00:00:00 
1  0  2 2015-02-24 15:00:00 
2  0  2 2015-02-25 06:00:00 
3  0  3 2015-02-25 21:00:00 
4  0  3 2015-02-26 12:00:00 
5  1  3 2015-02-27 03:00:00 
6  1  3 2015-02-27 18:00:00 
7  1  4 2015-02-28 09:00:00 
8  1  4 2015-03-01 00:00:00 
9  1  4 2015-03-01 15:00:00

#solution with apply 
df1 = df.groupby(['col1','col2'])['date_time'].apply(lambda x: x.dt.date.nunique()) 
print (df1) 
col1 col2 
0  2  2 
     3  2 
1  3  1 
     4  2 
Name: date_time, dtype: int64 

#create new helper column 
df['date_time1'] = df['date_time'].dt.date 
df2 = df.groupby([df.col1,df.col2]).date_time1.nunique() 
print (df2) 
col1 col2 
0  2  2 
     3  2 
1  3  1 
     4  2 
Name: date_time, dtype: int64 

df3 = df['date_time'].dt.date.groupby([df.col1,df.col2]).nunique() 
print (df3) 
col1 col2 
0  2  2 
     3  2 
1  3  1 
     4  2 
Name: date_time, dtype: int64

來源

2017-03-20 08:21:46 jezrael

這工作得很好。我不明白爲什麼Series沒有像Dataframe那樣的dt屬性。 –

熊貓 - 由

回答

相關問題