2017-06-14 120 views

Group by with different aggregated columns in pandas

I have a pandas DataFrame with an ID and a delivery day (for example, one of the 7 days of the week).

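I want to use the pandas groupby() function to create the following: 7 separate columns, one per day (for example, delivery_day_1, delivery_day_2, and so on), each holding the number of occurrences of that day for each ID in the DataFrame. How can I do this?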

Thanks.

Answer


I think you first need groupby + size + unstack, or crosstab, to reshape the data.

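Then, if necessary, add the missing weekdays with reindex (older pandas versions used reindex_axis for this), and finally rename the columns with add_prefix.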

Sample:

import pandas as pd

df = pd.DataFrame({'subscription_id': [1, 2, 3, 1],
                   'delivery_weekday': [1, 1, 2, 1]})

print(df)
   subscription_id  delivery_weekday
0                1                 1
1                2                 1
2                3                 2
3                1                 1

# Count rows per (ID, weekday) pair, then pivot the weekdays into columns.
# reindex(columns=...) replaces reindex_axis, which was removed in pandas 1.0.
counts = df.groupby(['subscription_id', 'delivery_weekday']) \
           .size() \
           .unstack(fill_value=0) \
           .reindex(columns=range(1, 8), fill_value=0) \
           .add_prefix('delivery_day_')

print(counts)
delivery_weekday  delivery_day_1  delivery_day_2  delivery_day_3  \
subscription_id
1                              2               0               0
2                              1               0               0
3                              0               1               0

delivery_weekday  delivery_day_4  delivery_day_5  delivery_day_6  \
subscription_id
1                              0               0               0
2                              0               0               0
3                              0               0               0

delivery_weekday  delivery_day_7
subscription_id
1                              0
2                              0
3                              0
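To see what each step of the chain contributes, it can be broken apart as in the sketch below. It rebuilds the same toy data under the name data, and the intermediate names (pair_counts, wide, all_days, labelled) are made up for illustration only.

```python
import pandas as pd

# Same toy data as in the sample; the name `data` is only for this sketch.
data = pd.DataFrame({'subscription_id': [1, 2, 3, 1],
                     'delivery_weekday': [1, 1, 2, 1]})

# Step 1: count rows per (subscription_id, delivery_weekday) pair.
# The result is a Series indexed by a two-level MultiIndex.
pair_counts = data.groupby(['subscription_id', 'delivery_weekday']).size()

# Step 2: move the weekday level into columns; pairs that never occur get 0.
wide = pair_counts.unstack(fill_value=0)

# Step 3: make sure all seven weekdays exist as columns, filling new ones with 0.
all_days = wide.reindex(columns=range(1, 8), fill_value=0)

# Step 4: rename columns 1..7 to delivery_day_1..delivery_day_7.
labelled = all_days.add_prefix('delivery_day_')

print(pair_counts)
print(labelled)
```

Only the weekdays that actually occur in the data survive step 2, which is why step 3 is needed whenever some weekday has no deliveries at all.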

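# Alternative: crosstab counts the (ID, weekday) pairs directly.
# It must be called on the original long-format df, not on the reshaped result.
# reindex(columns=...) replaces reindex_axis, which was removed in pandas 1.0.
counts = pd.crosstab(df['subscription_id'], df['delivery_weekday']) \
           .reindex(columns=range(1, 8), fill_value=0) \
           .add_prefix('delivery_day_')

print(counts)
delivery_weekday  delivery_day_1  delivery_day_2  delivery_day_3  \
subscription_id
1                              2               0               0
2                              1               0               0
3                              0               1               0

delivery_weekday  delivery_day_4  delivery_day_5  delivery_day_6  \
subscription_id
1                              0               0               0
2                              0               0               0
3                              0               0               0

delivery_weekday  delivery_day_7
subscription_id
1                              0
2                              0
3                              0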
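As a quick sanity check, the two approaches can be run side by side on the same input and compared. This is a minimal sketch, assuming a pandas version that provides reindex(columns=...); the names data, via_groupby, via_crosstab and same are hypothetical.

```python
import pandas as pd

# Toy data from the sample, rebuilt so this sketch runs on its own.
data = pd.DataFrame({'subscription_id': [1, 2, 3, 1],
                     'delivery_weekday': [1, 1, 2, 1]})
days = range(1, 8)

# Approach 1: groupby + size + unstack.
via_groupby = (data.groupby(['subscription_id', 'delivery_weekday'])
                   .size()
                   .unstack(fill_value=0)
                   .reindex(columns=days, fill_value=0)
                   .add_prefix('delivery_day_'))

# Approach 2: crosstab on the two columns.
via_crosstab = (pd.crosstab(data['subscription_id'], data['delivery_weekday'])
                  .reindex(columns=days, fill_value=0)
                  .add_prefix('delivery_day_'))

# Compare labels and counts of the two results.
same = (list(via_groupby.columns) == list(via_crosstab.columns)
        and list(via_groupby.index) == list(via_crosstab.index)
        and via_groupby.values.tolist() == via_crosstab.values.tolist())
print(same)
```

Which one to use is mostly a matter of taste: crosstab is shorter, while the groupby chain makes each step explicit and is easier to extend with other aggregations.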