2016-09-27 287 views

Pandas: count some values in a column

I have a DataFrame. Here is part of it:

ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home  
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home   
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home  
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home   
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home   

I need to count the number of days in each month for each ID.

If I try df.groupby('ID')['used_at'].count(), I get the number of visits. How can I count the days per month instead?

Answer


I think you need to groupby by ID, month and day, and aggregate with size:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month, df.used_at.dt.day]).size()

print(df1)
ID                                used_at  used_at
574c4969b017ae6481db9a7c77328bc3  5        1          1
                                  6        1          1
                                  7        1          1
                                  8        1          1
                                           2          1
                                           3          1
                                           4          1
e990fae0f48b7daf52619b5ccbec61bc  5        1          2
                                           2          1
                                  6        1          3
dtype: int64
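The .dt accessor used above only works when used_at has a datetime dtype, which the question does not show. The following is a minimal sketch of one way to get there, assuming the data arrives as CSV text. The inlined csv_text holds a few rows copied from the question with the trailing columns dropped for brevity. The names visits, month and day are illustrative only; the two date keys are renamed so the index levels do not share a name.

```python
import io

import pandas as pd

# A few rows from the question, trimmed to five columns for illustration.
csv_text = '''ID,"url","app_name","used_at","active_seconds"
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70
'''

df = pd.read_csv(io.StringIO(csv_text))

# read_csv leaves used_at as strings unless told otherwise; convert it
# explicitly so that the .dt accessor is available.
df['used_at'] = pd.to_datetime(df['used_at'])

# Same grouping as above, with the month and day keys given distinct names.
month = df['used_at'].dt.month.rename('month')
day = df['used_at'].dt.day.rename('day')
visits = df['used_at'].groupby([df['ID'], month, day]).size()

print(visits)
```

Passing parse_dates=['used_at'] to read_csv is an alternative to the explicit to_datetime call.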

Or group by date, which is the same as grouping by year, month and day:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()

print(df1)
ID                                used_at
574c4969b017ae6481db9a7c77328bc3  2015-05-01    1
                                  2015-06-01    1
                                  2015-07-01    1
                                  2015-08-01    1
                                  2015-08-02    1
                                  2015-08-03    1
                                  2015-08-04    1
e990fae0f48b7daf52619b5ccbec61bc  2015-05-01    2
                                  2015-05-02    1
                                  2015-06-01    3
dtype: int64
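Both results above count visits per day. If the goal is instead the number of distinct days with activity in each month for each ID, which is one possible reading of the question, a sketch along these lines might help. The small DataFrame and the names days, month and days_per_month are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the question's data: two IDs, a few visits each.
df = pd.DataFrame({
    'ID': ['a', 'a', 'a', 'a', 'b', 'b'],
    'used_at': pd.to_datetime([
        '2015-05-01 09:29:11', '2015-05-01 09:33:00',
        '2015-05-02 09:38:40', '2015-06-01 09:33:07',
        '2015-08-01 12:23:26', '2015-08-02 12:24:52',
    ]),
})

# Drop the time of day so that visits on the same day compare equal.
days = df['used_at'].dt.normalize()

# Year-month key, so the same month in different years stays separate.
month = df['used_at'].dt.to_period('M').rename('month')

# Number of distinct days per (ID, month).
days_per_month = days.groupby([df['ID'], month]).nunique()

print(days_per_month)
```

Here ID a has two visits on 2015-05-01 and one on 2015-05-02, so May counts as 2 days rather than 3 visits.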

The difference between count and size:

size counts NaN values; count does not.
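A small sketch of that difference, using a made-up two-ID frame in which one used_at value is missing (NaT, the datetime counterpart of NaN):

```python
import pandas as pd

# Hypothetical data: ID 'a' has one valid timestamp and one missing value.
df = pd.DataFrame({
    'ID': ['a', 'a', 'b'],
    'used_at': pd.to_datetime(['2015-05-01', None, '2015-05-02']),
})

grouped = df.groupby('ID')['used_at']

sizes = grouped.size()    # rows per group, missing values included
counts = grouped.count()  # non-missing values per group

print(sizes)
print(counts)
```

For ID a, size gives 2 and count gives 1. With the question's data the two agree, because used_at has no missing values.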