Grouping data by user in Python

I want to group the data in one column by the values in another column, but I only want rows from a specific time range, say 2015-11-01 through 2016-04-30. My data looks like this:

account_id employer_key login_date 
    1111111  google   2016-03-03 20:58:36.000000 
    2222222  walmart   2015-11-18 11:52:56.000000 
    2222222  walmart   2015-11-18 11:53:14.000000 
    1111111  walmart   2016-04-06 23:29:04.000000 
    3333333  walmart   2015-09-05 14:13:53.000000 
    3333333  walmart   2016-01-28 03:20:58.000000 
    2222222  walmart   2015-09-03 00:11:38.000000 
    1111111  walmart   2015-09-03 00:12:25.000000 
    1111111  dell_inc   2015-11-13 01:59:59.000000

I'm trying to get output that looks like this:

account_id    login_date 
    1111111    3 
    2222222    2 
    3333333    1

How do I get the number of logins for each unique account_id within a given time window?

Source

2017-03-05 CoffeeCoffeeBuzzBuzz

You can filter your DataFrame first, then use .groupby().count():

In [213]: df.query("'2015-11-01' <= login_date <= '2016-04-30'") \
      .groupby('account_id')['login_date'] \
      .count() \
      .reset_index()
Out[213]: 
    account_id login_date 
0  1111111   3 
1  2222222   2 
2  3333333   1
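
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the sample rows from the question and assumes login_date should be parsed into a datetime column with pd.to_datetime (the question does not say what dtype the column has). The name result is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "account_id": [1111111, 2222222, 2222222, 1111111, 3333333,
                   3333333, 2222222, 1111111, 1111111],
    "employer_key": ["google", "walmart", "walmart", "walmart", "walmart",
                     "walmart", "walmart", "walmart", "dell_inc"],
    "login_date": ["2016-03-03 20:58:36", "2015-11-18 11:52:56",
                   "2015-11-18 11:53:14", "2016-04-06 23:29:04",
                   "2015-09-05 14:13:53", "2016-01-28 03:20:58",
                   "2015-09-03 00:11:38", "2015-09-03 00:12:25",
                   "2015-11-13 01:59:59"],
})

# Assumption: the column is stored as text and needs parsing to datetime64.
df["login_date"] = pd.to_datetime(df["login_date"])

# Filter to the window, then count logins per account.
result = (
    df.query("'2015-11-01' <= login_date <= '2016-04-30'")
      .groupby("account_id")["login_date"]
      .count()
      .reset_index()
)
print(result)
```

Wrapping the chain in parentheses avoids the trailing backslashes, which break if any whitespace follows them.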

Alternatively, you can use boolean indexing (df.loc[...]) instead of df.query(...), but it looks a bit clunkier...
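
A rough sketch of the boolean-indexing variant, for comparison. It uses only the two relevant columns from the question's sample data; the names start, end, in_window and counts are made up for this example, and login_date is assumed to be a datetime column.

```python
import pandas as pd

# account_id and login_date from the question's sample rows.
df = pd.DataFrame({
    "account_id": [1111111, 2222222, 2222222, 1111111, 3333333,
                   3333333, 2222222, 1111111, 1111111],
    "login_date": pd.to_datetime([
        "2016-03-03 20:58:36", "2015-11-18 11:52:56", "2015-11-18 11:53:14",
        "2016-04-06 23:29:04", "2015-09-05 14:13:53", "2016-01-28 03:20:58",
        "2015-09-03 00:11:38", "2015-09-03 00:12:25", "2015-11-13 01:59:59",
    ]),
})

start = pd.Timestamp("2015-11-01")
end = pd.Timestamp("2016-04-30")

# Boolean mask: True for rows whose login falls inside the window.
in_window = (df["login_date"] >= start) & (df["login_date"] <= end)

# Select the matching rows with .loc, then count logins per account.
counts = (
    df.loc[in_window]
      .groupby("account_id")["login_date"]
      .count()
      .reset_index()
)
print(counts)
```

The mask can be reused for other aggregations, which is one reason to prefer it over a query string when the filter is shared.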

Source

2017-03-05 22:00:47 MaxU

Use between and value_counts:

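# Series.value_counts() is the method form of the top-level pd.value_counts, which newer pandas versions deprecate
v = df.account_id[df.login_date.between('2015-11-01', '2016-04-30')].value_counts()
v.rename_axis('account_id').reset_index(name='login_date')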

    account_id login_date 
0  1111111   3 
1  2222222   2 
2  3333333   1
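
One thing to watch with either answer: when login_date is a datetime column, the bound '2016-04-30' means midnight at the start of that day, so logins later on April 30 fall outside a closed range. If the whole last day should count (an assumption, the question does not say), a half-open range ending at the next day is one option. The rows below are hypothetical and only illustrate the difference.

```python
import pandas as pd

# Hypothetical rows: one login during the last day, one just after the window.
logins = pd.DataFrame({
    "account_id": [1111111, 1111111, 2222222],
    "login_date": pd.to_datetime([
        "2016-04-30 18:00:00",
        "2016-05-01 00:00:00",
        "2016-04-29 09:30:00",
    ]),
})

start = pd.Timestamp("2015-11-01")
end_closed = pd.Timestamp("2016-04-30")  # 2016-04-30 00:00:00
end_open = pd.Timestamp("2016-05-01")    # first instant after the window

# Closed range as in the answers: the 18:00 login on April 30 is dropped.
closed_mask = logins["login_date"].between(start, end_closed)

# Half-open range [start, end_open): all of April 30 is included.
half_open_mask = (logins["login_date"] >= start) & (logins["login_date"] < end_open)

closed_counts = logins.loc[closed_mask, "account_id"].value_counts()
half_open_counts = logins.loc[half_open_mask, "account_id"].value_counts()
print(closed_counts)
print(half_open_counts)
```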

Source

2017-03-05 23:10:03 piRSquared
