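Keep the first N occurrences

The code below (of course) keeps only the first occurrence of 'Item1' in the rows sorted by 'Date'. Any suggestions on how I could keep, say, the first 5 occurrences?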

## Sort the dataframe by Date and keep only the earliest appearance of 'Item1'
## drop_duplicates considers the column 'Item1' and keeps only the first occurrence
## (in current pandas, sort_values replaces sort and subset replaces cols)

coocdates = data.sort_values('Date').drop_duplicates(subset=['Item1'])


2014-06-11 textnet

Maybe '[5]'? – Fabricator

You want to use head, either on the DataFrame itself or on the groupby:

In [11]: df = pd.DataFrame([[1, 2], [1, 4], [1, 6], [2, 8]], columns=['A', 'B'])

In [12]: df
Out[12]:
   A  B
0  1  2
1  1  4
2  1  6
3  2  8

In [13]: df.head(2)  # the first two rows
Out[13]:
   A  B
0  1  2
1  1  4

In [14]: df.groupby('A').head(2)  # the first two rows in each group
Out[14]:
   A  B
0  1  2
1  1  4
3  2  8

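Note: the behaviour of groupby's head changed in 0.14 (before that it did not act like a filter, but modified the index), so if you are using an earlier version you will have to reset the index.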
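Applied to the question, this could look like the sketch below. The column names 'Date' and 'Item1' come from the question; the sample rows and the variable names `N` and `first_n` are made up for illustration, and a reasonably recent pandas (with `sort_values`) is assumed.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Date": pd.to_datetime([
        "2014-01-03", "2014-01-01", "2014-01-02",
        "2014-01-04", "2014-01-05", "2014-01-06",
    ]),
    "Item1": ["a", "a", "b", "a", "b", "a"],
})

N = 2  # how many of the earliest occurrences to keep per item (5 in the question)

# Sort by date, then keep the first N rows of each 'Item1' group.
# head() acts as a filter, so the original row labels are preserved.
first_n = data.sort_values("Date").groupby("Item1").head(N)
print(first_n)
```

With N = 1 this should select the same rows as the drop_duplicates call in the question.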


2014-06-11 20:34:15

Use groupby() and nth():

According to the Pandas docs, nth() will

Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.

So all you need is:

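result = df.groupby('Date').nth([0, 1, 2, 3, 4]).reset_index(drop=False)

(Assign the result rather than passing inplace=True: with inplace=True the chained call returns None and the selection is lost.)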
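A minimal sketch of nth() with a list, reusing the small frame from the earlier answer (the name `first_two` is only for illustration). As far as I know, the index that nth() returns differs between pandas versions (older releases use the group keys as the index, newer ones keep the original row labels), so the columns produced by reset_index() may differ too; the selected rows should be the same either way.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 4], [1, 6], [2, 8]], columns=["A", "B"])

# Positions 0 and 1 within each 'A' group; a list of ints selects several rows at once.
first_two = df.groupby("A").nth([0, 1])

# Assign the result of reset_index(); with inplace=True the call would return None.
first_two = first_two.reset_index()
print(first_two)
```

To answer the original question, group by 'Item1' after sorting by 'Date' instead of grouping by 'Date'.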


2017-09-05 18:39:00 Shoresh
